In this short guide, you'll see how to unzip a specific folder from a ZIP archive using Python.
Here you can find the short answer:
(1) Extract specific folder with ZipFile
import zipfile
with zipfile.ZipFile('archive.zip', 'r') as zip_ref:
    for file in zip_ref.namelist():
        if file.startswith('project/data/'):
            zip_ref.extract(file, 'output/')
(2) Extract folder with extractall and members
from zipfile import ZipFile
with ZipFile('archive.zip', 'r') as zip_ref:
    members = [m for m in zip_ref.namelist() if m.startswith('docs/')]
    zip_ref.extractall('output/', members=members)
(3) Extract matching pattern
import fnmatch
from zipfile import ZipFile
with ZipFile('archive.zip', 'r') as zip_ref:
    members = [m for m in zip_ref.namelist() if fnmatch.fnmatch(m, 'reports/*')]
    zip_ref.extractall('output/', members=members)
Let's look at several useful examples of how to extract specific folders from ZIP archives with Python.
Suppose you have a ZIP file structure like:
project.zip
├── project/
│   ├── data/
│   │   ├── sales.csv
│   │   └── customers.csv
│   ├── docs/
│   │   └── readme.md
│   └── src/
│       └── main.py
1: Extract specific folder using ZipFile
Let's start with the most straightforward method - extracting a specific folder by filtering filenames:
import zipfile
import os
zip_path = 'project.zip'
target_folder = 'project/data/'
output_dir = 'output/'
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    for file in zip_ref.namelist():
        if file.startswith(target_folder):
            zip_ref.extract(file, output_dir)
            print(f"Extracted: {file}")
print(f"Files extracted to: {output_dir}")
The result will be:
Extracted: project/data/sales.csv
Extracted: project/data/customers.csv
Files extracted to: output/
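This method extracts only the entries under project/data/ and preserves the full directory structure, so the extracted files end up in output/project/data/. If the archive stores an explicit entry for the directory itself, project/data/ also shows up in the loop and in the printed list.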
What if you want to extract without the parent folders? You can modify the extraction path:
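import os
import shutil
import zipfile
os.makedirs('output/', exist_ok=True)  # open() does not create missing folders
with zipfile.ZipFile('project.zip', 'r') as zip_ref:
    for file in zip_ref.namelist():
        # Skip directory entries, which end with '/'
        if file.startswith('project/data/') and not file.endswith('/'):
            # Files with the same name in different subfolders overwrite each other
            filename = os.path.basename(file)
            # Context managers close both files even if the copy fails
            with zip_ref.open(file) as source, \
                    open(os.path.join('output/', filename), 'wb') as target:
                shutil.copyfileobj(source, target)  # copy in chunks
print("Files extracted without folder structure")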
result:
Files extracted without folder structure
output/
├── sales.csv
└── customers.csv
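Flattening everything into one folder loses any subfolders inside project/data/. A middle ground is to drop only the parent prefix and keep the structure below it. The following is a minimal sketch of that idea; the function name extract_folder_contents and its parameters are hypothetical, not part of zipfile. Because the target path is built by hand instead of by extract(), the sketch also checks that each file stays inside the output folder.

```python
import os
import shutil
import zipfile


def extract_folder_contents(zip_path, folder, output_dir):
    """Extract everything under `folder`, dropping the `folder` prefix itself."""
    # Normalise the prefix so 'project/data' cannot also match 'project/database/'.
    prefix = folder.rstrip('/') + '/'
    dest_root = os.path.realpath(output_dir)
    written = []
    with zipfile.ZipFile(zip_path) as zip_ref:
        for info in zip_ref.infolist():
            name = info.filename
            if not name.startswith(prefix) or info.is_dir():
                continue
            relative = name[len(prefix):]
            target = os.path.realpath(os.path.join(dest_root, *relative.split('/')))
            # We build the path ourselves, so refuse anything that escapes output_dir.
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"Unsafe path in archive: {name}")
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with zip_ref.open(info) as source, open(target, 'wb') as out:
                shutil.copyfileobj(source, out)
            written.append(relative)
    return written
```

Calling extract_folder_contents('project.zip', 'project/data', 'output/') on the sample archive would place sales.csv and customers.csv directly in output/, and a nested file such as project/data/raw/x.csv would land in output/raw/x.csv.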
2: Extract multiple folders using extractall with members
A concise way to extract multiple folders is to pass a filtered list of members to the extractall() method:
import zipfile
zip_path = 'project.zip'
folders_to_extract = ['project/data/', 'project/docs/']
output_dir = 'output/'
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    members = [m for m in zip_ref.namelist()
               if any(m.startswith(folder) for folder in folders_to_extract)]
    zip_ref.extractall(output_dir, members=members)
print(f"Extracted {len(members)} files")
print("Extracted files:", members)
result:
Extracted 3 files
Extracted files: ['project/data/sales.csv', 'project/data/customers.csv', 'project/docs/readme.md']
This approach is more concise than calling extract() in a loop. In CPython, extractall() still extracts the members one at a time internally, so the gain is readability rather than speed. It is a good default for extracting multiple folders or large numbers of files.
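The same filter can be wrapped in a small reusable helper. This is a sketch under a few assumptions: the name extract_folders is hypothetical, directory entries are skipped so the returned list contains only files, and the folder names are normalised to end with a slash. It relies on str.startswith accepting a tuple of prefixes.

```python
import zipfile


def extract_folders(zip_path, folders, output_dir):
    """Extract several folders in one extractall() call; return the file names."""
    # str.startswith accepts a tuple, so one call tests every prefix.
    prefixes = tuple(f.rstrip('/') + '/' for f in folders)
    with zipfile.ZipFile(zip_path) as zip_ref:
        members = [
            info.filename
            for info in zip_ref.infolist()
            if info.filename.startswith(prefixes) and not info.is_dir()
        ]
        zip_ref.extractall(output_dir, members=members)
    return members
```

For the sample archive, extract_folders('project.zip', ['project/data', 'project/docs'], 'output/') would return the three file names under those two folders. Parent directories are still created on disk, because extraction creates them as needed for each file.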
3: Extract folder using pattern matching
You can use pattern matching with fnmatch or regular expressions to extract folders based on flexible criteria:
import zipfile
import fnmatch
zip_path = 'project.zip'
pattern = 'project/*/readme.md'
output_dir = 'output/'
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    members = [m for m in zip_ref.namelist()
               if fnmatch.fnmatch(m, pattern)]
    zip_ref.extractall(output_dir, members=members)
print(f"Extracted files matching pattern: {pattern}")
print("Files:", members)
result:
Extracted files matching pattern: project/*/readme.md
Files: ['project/docs/readme.md']
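One detail worth knowing: in fnmatch, the * wildcard also matches the / character, so project/*/readme.md would match project/a/b/readme.md as well. In addition, fnmatch.fnmatch applies os.path.normcase, so its case sensitivity depends on the platform. If you want * to stay within a single folder level, one option is to compare the path segment by segment. The helper below is a hypothetical sketch of that idea, not a function from the standard library.

```python
import fnmatch


def match_per_segment(name, pattern):
    """Match `pattern` against `name` one path segment at a time."""
    name_parts = name.split('/')
    pattern_parts = pattern.split('/')
    if len(name_parts) != len(pattern_parts):
        return False
    # fnmatchcase is case-sensitive on every platform, unlike fnmatch.fnmatch.
    return all(
        fnmatch.fnmatchcase(part, pat)
        for part, pat in zip(name_parts, pattern_parts)
    )
```

It can replace fnmatch.fnmatch(m, pattern) in the list comprehension above when the pattern should only match files exactly one level below project/.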
For more complex patterns, use regular expressions:
import zipfile
import re
zip_path = 'backup.zip'
pattern = r'^reports/202[4-6]/.*\.csv$'
output_dir = 'output/'
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    members = [m for m in zip_ref.namelist()
               if re.match(pattern, m)]
    zip_ref.extractall(output_dir, members=members)
print(f"Extracted {len(members)} CSV files from 2024-2026 reports")
result:
Extracted 15 CSV files from 2024-2026 reports
This method is excellent for extracting specific file types, date-based folders, or files matching naming conventions.
4: Extract folder with size validation
Before extracting, you might want to check the uncompressed size to avoid disk space issues:
import zipfile
zip_path = 'project.zip'
target_folder = 'project/data/'
max_size_mb = 100
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    members = [m for m in zip_ref.namelist() if m.startswith(target_folder)]
    total_size = sum(zip_ref.getinfo(m).file_size for m in members)
    total_size_mb = total_size / (1024 * 1024)
    if total_size_mb > max_size_mb:
        print(f"Error: Folder size {total_size_mb:.2f}MB exceeds limit {max_size_mb}MB")
    else:
        zip_ref.extractall('output/', members=members)
        print(f"Extracted {total_size_mb:.2f}MB successfully")
result:
Extracted 45.67MB successfully
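This helps avoid running out of disk space when an archive is unexpectedly large, and it is a useful first check against zip bombs. Keep in mind that file_size is the uncompressed size declared in the archive's metadata, so the check is only as reliable as that metadata.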
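A fixed limit such as 100MB does not say whether the target disk actually has room. As a complementary check, the declared size can be compared with the free space reported by shutil.disk_usage. The sketch below assumes two hypothetical helper names, declared_size and fits_on_disk, and an optional reserve_bytes safety margin; none of them come from the original example.

```python
import os
import shutil
import zipfile


def declared_size(zip_path, folder):
    """Sum the uncompressed sizes the archive declares for entries under `folder`."""
    prefix = folder.rstrip('/') + '/'
    with zipfile.ZipFile(zip_path) as zip_ref:
        return sum(
            info.file_size
            for info in zip_ref.infolist()
            if info.filename.startswith(prefix)
        )


def fits_on_disk(zip_path, folder, output_dir, reserve_bytes=0):
    """Return True if the declared size fits in the free space at `output_dir`."""
    os.makedirs(output_dir, exist_ok=True)  # disk_usage needs an existing path
    free = shutil.disk_usage(output_dir).free
    return declared_size(zip_path, folder) + reserve_bytes <= free
```

A call such as fits_on_disk('project.zip', 'project/data/', 'output/') could be used as an extra condition before extractall(). Note that the helper creates the output folder if it is missing.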
5: List folder contents before extraction
To preview what will be extracted without actually extracting files:
import zipfile
zip_path = 'project.zip'
target_folder = 'project/data/'
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    files = [f for f in zip_ref.namelist() if f.startswith(target_folder)]
    print(f"Folder contains {len(files)} files:")
    for file in files:
        info = zip_ref.getinfo(file)
        size_mb = info.file_size / (1024 * 1024)
        print(f"  {file} ({size_mb:.2f}MB)")
result:
Folder contains 2 files:
  project/data/sales.csv (12.45MB)
  project/data/customers.csv (8.92MB)
This is useful for validation, logging, or user confirmation before extraction.
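When you do not yet know which folder to target, it can help to list the folders in the archive first. The following sketch derives them from the entry names, so it also works for archives that contain no explicit directory entries. The function name list_subfolders is hypothetical.

```python
import zipfile


def list_subfolders(zip_path, parent=''):
    """Return the names of the folders directly under `parent`, sorted."""
    prefix = parent.rstrip('/') + '/' if parent else ''
    found = set()
    with zipfile.ZipFile(zip_path) as zip_ref:
        for name in zip_ref.namelist():
            if not name.startswith(prefix):
                continue
            rest = name[len(prefix):]
            # Works even when the archive has no explicit directory entries:
            # any remaining '/' means the first segment is a folder.
            if '/' in rest:
                found.add(rest.split('/', 1)[0])
    return sorted(found)
```

For the sample archive, list_subfolders('project.zip', 'project') would return the data, docs and src folder names, any of which can then be used to build the target_folder prefix.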