How to open zip file python

How to open zip file python

Working with zip files in Python

This article explains how one can perform various operations on a zip file using a simple python program.

What is a zip file?

ZIP is an archive file format that supports lossless data compression. By lossless compression, we mean that the compression algorithm allows the original data to be perfectly reconstructed from the compressed data. So, a ZIP file is a single file containing one or more compressed files, offering an ideal way to make large files smaller and keep related files together.

Why do we need zip files?

To work on zip files using python, we will use an inbuilt python module called zipfile.

1. Extracting a zip file

The above program extracts a zip file named “my_python_files.zip” in the same directory as of this python script.
The output of above program may look like this:

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

Let us try to understand the above code in pieces:

ZipFile is a class of zipfile module for reading and writing zip files. Here we import only class ZipFile from zipfile module.

Here, a ZipFile object is made by calling ZipFile constructor which accepts zip file name and mode parameters. We create a ZipFile object in READ mode and name it as zip.

printdir() method prints a table of contents for the archive.

extractall() method will extract all the contents of the zip file to the current working directory. You can also call extract() method to extract any file by specifying its path in the zip file.
For example:

This will extract only the specified file.

If you want to read some specific file, you can go like this:

2. Writing to a zip file

Consider a directory (folder) with such a format:

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

Here, we will need to crawl the whole directory and its sub-directories in order to get a list of all file paths before writing them to a zip file.
The following program does this by crawling the directory to be zipped:

Объект ZipFile модуля zipfile в Python.

Использование объекта ZipFile с описанием.

В данном разделе рассмотрены методы объекта ZipFile с их подробным описанием и примерами. Объект ZipFile получается в результате создания экземпляра класса ‘zipfile.ZipFile()’.

Содержание:

ZipFile.close() :

Метод ZipFile.close() закрывает файл архива. Необходимо вызвать этот метод перед выходом из программы, иначе не вся информация будет записана или используйте менеджер контекста.

ZipFile.getinfo(name) :

ZipFile.infolist() :

Метод ZipFile.infolist() возвращает список, содержащий объект zipfile.ZipInfo() для каждого члена архива.

Объекты находятся в том же порядке, что и в реальном ZIP-файле на диске.

ZipFile.namelist() :

Метод ZipFile.namelist() возвращает список членов архива по имени.

ZipFile.open(name, mode=’r’, pwd=None, *, force_zip64=False) :

Метод ZipFile.open() предоставляет доступ к элементу архива в виде двоичного файлового объекта.

Метод ZipFile.open() является менеджером контекста и поэтому поддерживает оператор with :

При записи файла, если размер файла заранее неизвестен, но может превышать 2 ГБ, то передайте force_zip64=True для поддержки больших файлов. Если размер файла известен заранее, то создавайте объект zipfile.ZipInfo() с установленным аргументом file_size и используйте его в качестве параметра имени.

ZipFile.extract(member, path=None, pwd=None) :

Метод ZipFile.extract() извлекает элемент архива member в текущий рабочий каталог.

Метод ZipFile.extract() возвращает созданный нормализованный путь (каталог или новый файл).

ZipFile.extractall(path=None, members=None, pwd=None) :

Метод ZipFile.extractall() извлекает все элементы архива в текущий рабочий каталог.

ZipFile.printdir() :

ZipFile.setpassword(pwd) :

Метод ZipFile.setpassword() устанавливает pwd в качестве пароля по умолчанию для извлечения зашифрованных файлов.

ZipFile.read(name, pwd=None) :

ZipFile.testzip() :

ZipFile.write(filename, arcname=None, compress_type=None, compresslevel=None) :

Примечания:

ZipFile.writestr(zinfo_or_arcname, data, compress_type=None, compresslevel=None) :

Метод ZipFile.writestr() записывает файл в архив.

A complete guide for working with I/O streams and zip archives in Python 3

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

As part of daily job, sometimes you have to work with zip archives/files. Even though it looks straight-forward, sometimes few custom requirements can force you to the bang-head situation while searching a clean way to manage zip files.

Recently, at my work, I implemented a feature where I have to download a zip file from S3, update its content, and upload it back to S3. The content can be dynamic, and I have to update only the specific part(a file) and retain all others. On the process, I researched a bit about the topic, tried to explore the Python 3 standard library’s zip utilities. I want to share my knowledge here.

After reading this article, you can work with zip files effortlessly in Python. I try to cover possible use cases one might come across along with tests to understand how things work.

Note: We use Python 3.7 for our code samples and API. You can install a Python 3.7 using a virtual environment and activate it.

All the code samples can be found at this GitHub link:
https://github.com/narenaryan/python-zip-howto

Python I/O streams and buffers

Before jumping into Python’s zipfile API, let us discuss the basics of streams and buffers. They are necessary to understand the internals of how Python treats files and data in general. A file recognized by Python can store three types of data:

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

Python considers an object falling in the above three categories as a “file-like object.” They are also called streams from where data can be read from or written. The data stored in streams are called buffers of that stream. The first two, i.e. Text and Binary streams, are buffered I/O streams, and raw type is unbuffered. In this article, we are only interested in buffered streams.

Everyone who worked with Python may have seen operating on files from disk before. In Python, one can open a file like this:

What is precisely the above code doing? It does these:

A stream can be a file-like object in Python

In Python, we can also create in-memory streams that can hold different kinds of buffers. Python’s io package provides two classes:

Let us discuss each buffered I/O type in detail.

Text Streams

A text stream operates on a text buffer. We can create an empty initialized “ file-like object” using StringIO that can store text buffers like this,

The preceding program creates a new text stream, writes some data to its buffer, then prints the buffer content to console. This text stream can be moved freely among Python functions whose signature processes an I/O stream.

One should be aware that, in Python, a file-like object can be used in any I/O operation. The classic example is the print statement.

Python’s print statement takes a keyword argument called file that decides which stream to write the given message/objects. Its value is a “ file-like object.” See the definition of print,

The sys.stdout is a stream, which is a file-like object. That default value makes Python write to the console. We can also change that destination to any custom writable stream. Let us modify our program to change the destination of print to our custom text stream.

Text streams are only useful in operating on UTF-8 buffers(XML, JSON, CSV). There are many other cases where we have to represent binary buffers(ZIP, PDF, Custom Extensions) in program memory. Binary streams come to the rescue.

Binary Streams

One can store any binary data coming from a PDF or ZIP file into a custom binary stream like the preceding one.

StringIO & BytesIO are high-level interfaces for Buffered I/O in Python.

In the next section, we discuss the basics of zipping in Python. We also see many use cases with examples.

Understanding the Python `zipfile` API

A zip file is a binary file. Contents of a zip file are compressed using an algorithm and paths are preserved. All open-source zip tools do the same thing, understand the binary representation, process it. It is a no-brainer that one should use BytesIO while working with zip files. Python provides a package to work with zip archives called zipfile

The zipfile package has two main classes:

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

A ZipInfo object is a path in the zip file. It is the combination of directories plus path. For example, let us say we have a directory called config, and it stores configurations for application, containers, and, some root-level configuration. Assume the content looks like this,

If you zip config directory using your favourite zip tool, I pick this python command,

and then try to list the contents of config.zip using Python command,

It displays all paths in the zip file.

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

What are those paths? Each path listed in the output is a ZipInfo object for Python. To prove that, let us write a small script that creates a zip archive in memory with config.zip,

When you run this script, you see the following output:

This ZipInfo object is critical for modifying a file/path in the archive. It is a high-level wrapper for a file stream. On a ZipInfo object, one can read or modify data. One can also create new ZipInfo objects and add them to the archive. Let us see all variations where we use simple Python programs to create, update zip archives in the next section. All the examples don’t create zip files on disk but in memory.

Use case #1: Create zip archive with files

We can create a zip file with the given name by opening a new ZipFile object with write mode ‘w’ or exclusive create mode ‘x.’

Next, we can add files/paths to the zip archive. There are two approaches to do that:

v1: Add a file as a file-like object

This approach writes independent files as file-like objects.

v2: Add a file as a ZipInfo object

This approach composes files as objects and gives more flexibility to add meta information on file.

Both versions create config.zip file on disk. While creating a file in the archive, they consider relative paths like this,

v2 is slightly flexible as it gives freedom to modify ZipInfo object properties at any point in time.

Use case #2: Read a file from zip archive

Another possible use case is to read a file from an existing zip archive. Let us use the config.zip file created from Use case #1.

For Ex: read the content of docker-compose.yaml from the zip and print it.

Use case #3: Update or Insert a file in zip archive

This use case is the most tricky part of the zipping business in Python. On the first look, it might look simple. Let us try attempting a few solutions.

Attempt #1

The apparent thing that comes to one’s mind is to update a specific file in the archive with the latest data, is this,

If we run the preceding script, it replaces the file in archive config.zip, but, as zipfile is opened in write mode ‘w,’ the other files/paths in archive can vanish. You can check it using this command,

Woah, root config and app config have vanished from the config.zip. It is a side-effect.

Don’t use ‘w’ mode, when you update/replace a single file in a zip archive, or your data is gone for good.

Attempt #2

Can’t we append a file to the existing zip? Will it magically overwrite the file? Yes, it does. Just replace mode in previous code snippet from ‘w’ to ‘a.’

And rerun the script on a fresh config.zip(which has a root, docker and, app configs). You see this warning:

It is just a warning. So go ahead, extract the content like this to see what is inside,

and you find there is only one docker-compose.yaml file in docker directory with all other files/paths preserved. Wonderful!

Even though it seems to be an obvious solution, there is a serious bug here. The extracted archive may not have visible duplicate files, but the underlying file pointer might have duplicated information.

Type zipfile list command, to see those hidden duplicates.

So docker/docker-compose.yaml had appeared twice in the ZipInfo list but only once in extraction. For every update, the zip archive size grows and grows in the magnitude of the updated file size. If you ignore the Python warning, at some point the junk in the archive may occupy more space than actual files.

The two attempts until now couldn’t achieve an acceptable solution. Now comes third, which is a clean and elegant way.

Attempt #3

There is a no easy way to update the contents of a zip archive. A clean way is to create a new zip archive in memory and copy old ZipInfo objects from the old archive into the new archive. In case of a path where data should be inserted or replaced, instead of reading from the old archive, create a custom ZipInfo object and add it to the new archive.

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

The algorithm looks like this:

Here, we are defining a function that takes the path in archive and data to replace. It iterates over the old archive and copies existing stuff into the new archive. When it spots an existing element, it creates a new ZipInfo object and puts that into the new archive. Run the script on a fresh config.zip(created by createzipv1.py), and you see there is no duplication of file objects and docker/docker-compose.yaml is updated as expected.

This solution has a minor drawback of dealing with two streams at a given time, and in the worst case, it can end up consuming double the amount of run-time memory. It won’t happen unless you are talking about Gigabyte sized zip files.

Use the technique of cloning for updating/inserting paths in a zip archive.

Use case #4: Remove an existing file from zip archive

By now, after looking at many use cases, one can guess how to remove a file from the archive. The cleanest way again is to copy contents from the old archive to the new archive and skip the ZipInfo objects that match the given path. Finally, overwrite the old zip file with the new zip file. The algorithm should have only one condition like this,

The delete script now has a function that takes only path argument and skips the respective ZipInfo object while copying.

It finishes all possible use cases that pop up while working with zip files in Python.

Note: The in-memory stream objects created(using BytesIO) in the above scripts can also be used with AWS S3 instead of flushing to a disk.

Final words

As we already discussed, one should monitor the size of a zip file and program memory for a copy operation.

Python provides a module named zipfile that has a bunch of functionalities to work with zip archives. It provides various tools to perform operations like create, read, write and append on zip archives. As a part of this tutorial, we’ll explain the API of zipfile module with simple examples which will make it easy to grasp.

We have created one simple text file that we’ll be using when explaining various examples. Below we have printed the contents of the file. We have also created one archive with only this file using our Linux tools which we’ll try to read in our first example.

Example 1: Reading the Contents of Existing Zip File¶

As a part of our first example, we’ll explain how we can open an existing zip archive and read the contents of the file stored in it. We’ll be using class ZipFile and its methods for this purpose.

Important Methods of ZipFile Object¶

Our code for this example starts by creating an instance of ZipFile with zip archive and then opens text file inside of it using open() method of ZipFile instance. It then reads the contents of the file, prints it, and closes the archive. We have decoded the contents of the file as it’s returned as bytes.

Our second code example for this part explains the usage of read() method of ZipFile object. It has almost the same code as the previous example with the only change that it uses read() method. We also check whether the file is a valid zip file or not using is_zipfile() method.

Example 2: Create Zip Archive¶

As a part of our second example, we’ll explain how we can create zip archives. We’ll be using write() method of ZipFile instance to write files to zip archive.

Important Methods of ZipFile Object¶

Our code for this example creates a new archive by using ‘w’ mode of ZipFile instance. It then writes our text file to the archive and closes the archive. The next part of our code opens the same archive again in reading mode and reads the contents of the text file written to verify it.

Our code for this example is exactly the same as our code for the previous example with the only change that it explains the usage of writestr() method.

Import Methods of ZipFile Object¶

Example 3: Append Files to Existing Zip File¶

As a part of our third example, we’ll explain how we can append new files to an existing archive.

Our code for this example opens an archive that we created in the previous example in append mode. It then writes a new JPEG file to the archive and closes it. The next part of our code again opens the same archive in reading mode and prints the contents of the archive using printdir() method. The printdir() method lists archive contents which contain information about each file of an archive, its last modification date, and size.

Example 4: ZipFile as a Context Manager¶

We can also use ZipFile instance as a context manager (with statement). We don’t need to call close() method on archive if we have opened it as a context manager as it’ll close the archive by itself once we exit from context manager.

Our code for this example opens an existing archive as a context manager, reads the contents of the text file inside of it, and prints them.

It’s recommended to use ZipFile as a context manager because it’s a safe and easy way to work with archives. We’ll be using it as a context manager in our examples going forward.

Example 5: Creating Zip Archive with Different File Types¶

As a part of our fifth example, we are demonstrating how we can write multiple files to a zip archive.

Our code for this example starts by creating an instance of ZipFile with a new zip archive name and mode as ‘w’. It then loops through four file names and writes them one by one to an archive. It then also lists a list of files present in an archive using namelist() method.

Example 6: Trying Different Compression Algorithms¶

As a part of our sixth example, we’ll try different compression algorithms available for creating zip archives.

Our code for this example creates a different archive for each compression type and writes four files of a different type to each archive. We are then listing archive names to see the file size created by each type.

We can notice from our results below that lzma has performed more compression compared to other algorithms.

Below we have created another example that is exactly the same as our above example with the only change that we have provided compresslevel for BZIP2 and DEFLATED compressions. As we had explained earlier, the lower values for compression level will result in compression completing faster but compression will be low hence file sizes will be more compared to more compression levels.

When we don’t provide compression level, it uses the best compression. We can compare file sizes generated by this example with previous ones. We can notice that file sizes generated by DEFLATED and BZIP2 are more compared to the previous example. As we did not provide compression level in our previous example, it used most compression. For this example, we have provided the least compression level hence compression completes fast but file sizes are more due to less compression.

Example 7: Extracting Files from Zip Archive¶

As a part of our seventh example, we’ll explain how we can extract the contents of an archive. We’ll be using extract() and extractall() methods of ZipFile instance for our purpose.

Important Methods of ZipFile Instance¶

Our code for this example starts by opening a previously created zip file in reading mode. It then extracts one single JPEG to the current path. We then extract the same file to a different location by giving the location to path parameter. We have then also extracted all files using extractall() method.

Example 8: Important Properties of Zipped Files¶

As a part of our eighth example, we’ll explain how we can retrieve details of individual members of an archive. The zipfile library has a class named ZipInfo whose object stores information about one file. The information like modification date, compression type, compression size, file size, CRC, and few other important attributes are available through ZipInfo instance. We can get ZipInfo instance using getinfo() or infolist() methods of ZipInfo instance.

Important Methods of ZipFile Object¶

Important Methods/Attributes of ZipInfo Object¶

Our code for this example starts by creating a method that takes as input ZipInfo instance and prints important attributes of members represented by that instance. We have then opened the existing zip archive in reading mode. We have then retrieved ZipInfo instance for the text file using getinfo() method and printed the attributes of the file. We have then retrieved all ZipInfo instances for all members of the archive and printed details of each member.

Example 9: Testing Zip File for Corruption¶

As a part of our ninth example, we are demonstrating how we can check for corrupt archives using testzip() method of ZipFile instance.

Important Methods of ZipFile Object¶

Our code for this example opens two different archives and checks them for corruption. It then prints whether archives are corrupt or not.

As a part of our tenth and last example, we’ll explain how we can compress compiled python files (.pyc) and create an archive from them. The zipfile provides a special class named PyZipFile for this purpose.

Important Methods of PyZipFile Object¶

It has the same methods as ZipFile with one extra method described below.

Our code for this example starts by creating an instance of PyZipFile for the new archive. It then adds files to the zip archive from the path given as input. We then list all files added to the archive using printdir() method.

This ends our small tutorial explaining the usage of zipfile module of Python. Please feel free to let us know your views in the comments section.

Python’s zipfile: Manipulate Your ZIP Files Efficiently

Table of Contents

Python’s zipfile is a standard library module intended to manipulate ZIP files. This file format is a widely adopted industry standard when it comes to archiving and compressing digital data. You can use it to package together several related files. It also allows you to reduce the size of your files and save disk space. Most importantly, it facilitates data exchange over computer networks.

Knowing how to create, read, write, populate, extract, and list ZIP files using the zipfile module is a useful skill to have as a Python developer or a DevOps engineer.

In this tutorial, you’ll learn how to:

If you commonly deal with ZIP files, then this knowledge can help to streamline your workflow to process your files confidently.

To get the files and archives that you’ll use to code the examples in this tutorial, click the link below:

Get Materials: Click here to get a copy of the files and archives that you’ll use to run the examples in this zipfile tutorial.

Getting Started With ZIP Files

ZIP files are a well-known and popular tool in today’s digital world. These files are fairly popular and widely used for cross-platform data exchange over computer networks, notably the Internet.

You can use ZIP files for bundling regular files together into a single archive, compressing your data to save some disk space, distributing your digital products, and more. In this tutorial, you’ll learn how to manipulate ZIP files using Python’s zipfile module.

Because the terminology around ZIP files can be confusing at times, this tutorial will stick to the following conventions regarding terminology:

TermMeaning
ZIP file, ZIP archive, or archiveA physical file that uses the ZIP file format
FileA regular computer file
Member fileA file that is part of an existing ZIP file

Having these terms clear in your mind will help you avoid confusion while you read through the upcoming sections. Now you’re ready to continue learning how to manipulate ZIP files efficiently in your Python code!

What Is a ZIP File?

PKWARE is the company that created and first implemented this file format. The company put together and maintains the current format specification, which is publicly available and allows the creation of products, programs, and processes that read and write files using the ZIP file format.

The ZIP file format is a cross-platform, interoperable file storage and transfer format. It combines lossless data compression, file management, and data encryption.

Data compression isn’t a requirement for an archive to be considered a ZIP file. So you can have compressed or uncompressed member files in your ZIP archives. The ZIP file format supports several compression algorithms, though Deflate is the most common. The format also supports information integrity checks with CRC32.

Even though there are other similar archiving formats, such as RAR and TAR files, the ZIP file format has quickly become a common standard for efficient data storage and for data exchange over computer networks.

You may be familiar with GitHub, which provides web hosting for software development and version control using Git. GitHub uses ZIP files to package software projects when you download them to your local computer. For example, you can download the exercise solutions for Python Basics: A Practical Introduction to Python 3 book in a ZIP file, or you can download any other project of your choice.

ZIP files allow you to aggregate, compress, and encrypt files into a single interoperable and portable container. You can stream ZIP files, split them into segments, make them self-extracting, and more.

Why Use ZIP Files?

Knowing how to create, read, write, and extract ZIP files can be a useful skill for developers and professionals who work with computers and digital information. Among other benefits, ZIP files allow you to:

These features make ZIP files a useful addition to your Python toolbox if you’re looking for a flexible, portable, and reliable way to archive your digital files.

Can Python Manipulate ZIP Files?

Python also provides a high-level module called zipfile specifically designed to create, read, write, extract, and list the content of ZIP files. In this tutorial, you’ll learn about Python’s zipfile and how to use it effectively.

Manipulating Existing ZIP Files With Python’s zipfile

Python’s zipfile provides convenient classes and functions that allow you to create, read, write, extract, and list the content of your ZIP files. Here are some additional features that zipfile supports:

Be aware that zipfile does have a few limitations. For example, the current data decryption feature can be pretty slow because it uses pure Python code. The module can’t handle the creation of encrypted ZIP files. Finally, the use of multi-disk ZIP files isn’t supported either. Despite these limitations, zipfile is still a great and useful tool. Keep reading to explore its capabilities.

Opening ZIP Files for Reading and Writing

In the zipfile module, you’ll find the ZipFile class. This class works pretty much like Python’s built-in open() function, allowing you to open your ZIP files using different modes. The read mode ( «r» ) is the default. You can also use the write ( «w» ), append ( «a» ), and exclusive ( «x» ) modes. You’ll learn more about each of these in a moment.

ZipFile implements the context manager protocol so that you can use the class in a with statement. This feature allows you to quickly open and work with a ZIP file without worrying about closing the file after you finish your work.

Before writing any code, make sure you have a copy of the files and archives that you’ll be using:

Get Materials: Click here to get a copy of the files and archives that you’ll use to run the examples in this zipfile tutorial.

To get your working environment ready, place the downloaded resources into a directory called python-zipfile/ in your home folder. Once you have the files in the right place, move to the newly created directory and fire up a Python interactive session there.

The first argument to the initializer of ZipFile can be a string representing the path to the ZIP file that you need to open. This argument can accept file-like and path-like objects too. In this example, you use a string-based path.

If you want to make sure that you’re targeting a valid ZIP file before you try to open it, then you can wrap ZipFile in a try … except statement and catch any BadZipFile exception:

To check for a valid ZIP file, you can also use the is_zipfile() function:

Note: If you’re using ZipFile with existing files, then you should be careful with the «w» mode. You can truncate your ZIP file and lose all the original content.

If the target ZIP file doesn’t exist, then ZipFile creates it for you when you close the archive:

Note: ZipFile is smart enough to create a new archive when you use the class in writing mode and the target archive doesn’t exist. However, the class doesn’t create new directories in the path to the target ZIP file if those directories don’t already exist.

That explains why the following code won’t work:

Because the missing/ directory in the path to the target hello.zip file doesn’t exist, you get a FileNotFoundError exception.

To try out the «a» mode, go ahead and add the new_hello.txt file to your newly created hello.zip archive:

Reading Metadata From ZIP Files

Here’s a summary of those methods:

Note: By default, ZipFile doesn’t compress the input files to add them to the final archive. That’s why the size and the compressed size are the same in the above example. You’ll learn more about this topic in the Compressing Files and Directories section below.

Note: The example above was adapted from zipfile — ZIP Archive Access.

Reading From and Writing to Member Files

Note: Python’s zipfile supports decryption. However, it doesn’t support the creation of encrypted ZIP files. That’s why you would need to use an external file archiver to encrypt your files.

Some popular file archivers include 7z and WinRAR for Windows, Ark and GNOME Archive Manager for Linux, and Archiver for macOS.

If you’re looking for a more flexible way to read from member files and create and add new member files to an archive, then ZipFile.open() is for you. Like the built-in open() function, this method implements the context manager protocol, and therefore it supports the with statement:

Reading the Content of Member Files as Text

However, when you have a ZIP archive containing text files, you may want to read their content as text instead of as bytes. There are at least two way to do this. You can use:

Extracting Member Files From Your ZIP Archives

Extracting the content of a given archive is one of the most common operations that you’ll do on ZIP files. Depending on your needs, you may want to extract a single file at a time or all the files in one go.

Closing ZIP Files After Use

Sometimes, it’s convenient for you to open a given ZIP file without using a with statement. In those cases, you need to manually close the archive after use to complete any writing operations and to free the acquired resources.

Creating, Populating, and Extracting Your Own ZIP Files

In this section, you’ll code a few practical examples that’ll help you learn how to create ZIP files from several input files and from an entire directory using zipfile and other Python tools. You’ll also learn how to use zipfile for file compression and more.

Creating a ZIP File From Multiple Regular Files

Sometimes you need to create a ZIP archive from several related files. This way, you can have all the files in a single container for distributing them over a computer network or sharing them with friends or colleagues. To this end, you can create a list of target files and write them into an archive using ZipFile and a loop:

Here, you create a ZipFile object with the desired archive name as its first argument. The «w» mode allows you to write member files into the final ZIP file.

Building a ZIP File From a Directory

Now check out the root_dir/ folder in your working directory. In this case, you’ll find the following structure:

Here’s how to zip a complete directory tree, like the one above, using zipfile along with Path.rglob() from the pathlib module:

Compressing Files and Directories

Typically, you’ll use the term stored to refer to member files written into a ZIP file without compression. That’s why the default compression method of ZipFile is called ZIP_STORED, which actually refers to uncompressed member files that are simply stored in the containing archive.

ConstantCompression MethodRequired Module
zipfile.ZIP_DEFLATEDDeflatezlib
zipfile.ZIP_BZIP2Bzip2bz2
zipfile.ZIP_LZMALZMAlzma

As an additional requirement, if you choose one of these methods, then the compression module that supports it must be available in your Python installation. Otherwise, you’ll get a RuntimeError exception, and your code will break.

Note: Binary files, such as PNG, JPG, MP3, and the like, already use some kind of compression. As a result, adding them to a ZIP file may not make the data any smaller, because it’s already compressed to some level.

Now say you want to archive and compress the content of a given directory using the Deflate method, which is the most commonly used method in ZIP files. To do that, you can run the following code:

In this example, you pass 9 to compresslevel to get maximum compression. To provide this argument, you use a keyword argument. You need to do this because compresslevel isn’t the fourth positional argument to the ZipFile initializer.

After running this code, you’ll have a comp_dir.zip file in your current directory. If you compare the size of that file with the size of your original sample.zip file, then you’ll note a significant size reduction.

Creating ZIP Files Sequentially

Creating ZIP files sequentially can be another common requirement in your day-to-day programming. For example, you may need to create an initial ZIP file with or without content and then append new member files as soon as they become available. In this situation, you need to open and close the target ZIP file multiple times.

To solve this problem, you can use ZipFile in append mode ( «a» ), as you have already done. This mode allows you to safely append new member files to a ZIP archive without truncating its current content:

In this example, append_member() is a function that appends a file ( member ) to the input ZIP archive ( zip_file ). To perform this action, the function opens and closes the target archive every time you call it. Using a function to perform this task allows you to reuse the code as many times as you need.

Extracting Files and Directories

Exploring Additional Classes From zipfile

Finding Path in a ZIP File

When you open a ZIP file with your favorite archiver application, you see the archive’s internal structure. You may have files at the root of the archive. You may also have subdirectories with more files. The archive looks like a normal directory on your file system, with each file located at a specific path.

The zipfile.Path class allows you to construct path objects to quickly create and manage paths to member files and directories inside a given ZIP file. The class takes two arguments:

With your old friend sample.zip as the target, run the following code:

It’s clear that zipfile.Path provides many useful features that you can use to manage member files in your ZIP archives in almost no time.

Building Importable ZIP Files With PyZipFile

Since version 2.3, the Python interpreter has supported importing Python code from ZIP files, a capability known as Zip imports. This feature is quite convenient. It allows you to create importable ZIP files to distribute your modules and packages as a single archive.

Note: You can also use the ZIP file format to create and distribute Python executable applications, which are commonly known as Python Zip applications. To learn how to create them, check out Python’s zipapp: Build Executable Zip Applications.

Inside the python-zipfile/ directory, you have a hello.py module with the following content:

Once you have hello.py bundled into a ZIP file, then you can use Python’s import system to import this module from its containing archive:

For this example to work, you need to change the placeholder path and pass the path to hello.zip on your file system. Once your importable ZIP file is in this list, then you can import your code just like you’d do with a regular module.

Finally, consider the hello/ subdirectory in your working folder. It contains a small Python package with the following structure:

The __init__.py module turns the hello/ directory into a Python package. The hello.py module is the same one that you used in the example above. Now suppose you want to bundle this package into a ZIP file. If that’s the case, then you can do the following:

Because your code is in a package now, you first need to import the hello module from the hello package. Then you can access your greet() function normally.

Running zipfile From Your Command Line

What if you need to create a ZIP file to archive an entire directory? For example, you may have your own source_dir/ with the same three files as the example above. You can create a ZIP file from that directory by using the following command:

Note: When you use zipfile to create an archive from your command line, the library implicitly uses the Deflate compression algorithm when archiving your files.

After running this command, you’ll have a new sample/ folder in your working directory. The new folder will contain the current files in your sample.zip archive.

Using Other Libraries to Manage ZIP Files

There are a few other tools in the Python standard library that you can use to archive, compress, and decompress your files at a lower level. Python’s zipfile uses some of these internally, mainly for compression purposes. Here’s a summary of some of these tools:

ModuleDescription
zlibAllows compression and decompression using the zlib library
bz2Provides an interface for compressing and decompressing data using the Bzip2 compression algorithm
lzmaProvides classes and functions for compressing and decompressing data using the LZMA compression algorithm

For example, you can use gzip to create a compressed file containing some text:

Conclusion

Python’s zipfile is a handy tool when you need to read, write, compress, decompress, and extract files from ZIP archives. The ZIP file format has become an industry standard, allowing you to package and optionally compress your digital data.

The benefits of using ZIP files include archiving related files together, saving disk space, making it easy to transfer data over computer networks, bundling Python code for distribution purposes, and more.

In this tutorial, you learned how to:

You also learned how to use zipfile from your command line to list, create, and extract your ZIP files. With this knowledge, you’re ready to efficiently archive, compress, and manipulate your digital data using the ZIP file format.

Get a short & sweet Python Trick delivered to your inbox every couple of days. No spam ever. Unsubscribe any time. Curated by the Real Python team.

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

About Leodanis Pozo Ramos

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

Leodanis is an industrial engineer who loves Python and software development. He’s a self-taught Python developer with 6+ years of experience. He’s an avid technical writer with a growing number of articles published on Real Python and other sites.

Each tutorial at Real Python is created by a team of developers so that it meets our high quality standards. The team members who worked on this tutorial are:

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

Master Real-World Python Skills With Unlimited Access to Real Python

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

Join us and get access to thousands of tutorials, hands-on video courses, and a community of expert Pythonistas:

Master Real-World Python Skills
With Unlimited Access to Real Python

How to open zip file python. Смотреть фото How to open zip file python. Смотреть картинку How to open zip file python. Картинка про How to open zip file python. Фото How to open zip file python

Join us and get access to thousands of tutorials, hands-on video courses, and a community of expert Pythonistas:

What Do You Think?

What’s your #1 takeaway or favorite thing you learned? How are you going to put your newfound skills to use? Leave a comment below and let us know.

Commenting Tips: The most useful comments are those written with the goal of learning from or helping out other students. Get tips for asking good questions and get answers to common questions in our support portal. Looking for a real-time conversation? Visit the Real Python Community Chat or join the next “Office Hours” Live Q&A Session. Happy Pythoning!

Related Tutorial Categories: intermediate python

Источники информации:

Добавить комментарий

Ваш адрес email не будет опубликован. Обязательные поля помечены *