File and Directory Handling in Python
Working with files and directories is one of the most common tasks in programming. Python provides built-in functions and powerful modules to handle files efficiently and safely. This article covers file operations (reading, writing, appending), CSV handling, object serialization with pickle, and parsing JSON and XML — all essential skills for real-world Python development.
Introduction to Files
A file is a named location on disk used to store data permanently. Python treats files as either text files (human-readable characters) or binary files (images, executables, pickled objects, etc.).
Core operations include:
- Opening a file
- Reading from / writing to a file
- Closing the file
Opening a File
Use the built-in open() function.
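file = open("example.txt", "r")  # read mode (default)
file.close()  # a manually opened file must be closed explicitly
# or
with open("example.txt", "r", encoding="utf-8") as file:
    content = file.read()  # work with the file inside the block
# the file is closed automatically when the block ends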
Always prefer the with statement — it guarantees the file is properly closed even if an exception occurs.
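As a minimal sketch of that guarantee, the snippet below raises an exception inside a with block and then checks that the file was closed anyway. The scratch path and the simulated ValueError are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical scratch file so the sketch does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

with open(path, "w", encoding="utf-8") as f:
    f.write("first line\n")

try:
    with open(path, "r", encoding="utf-8") as f:
        first = f.readline()
        # Simulate a failure while the file is still open.
        raise ValueError("simulated failure while processing")
except ValueError:
    pass

# The with block closed the file even though its body raised.
print(f.closed)  # True
```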
File Modes
| Mode | Description |
|---|---|
| r | Read (default). The file must exist. |
| w | Write. Creates a new file or overwrites an existing one. |
| a | Append. Creates the file if it does not exist and writes at the end. |
| x | Exclusive creation. Fails if the file already exists. |
| b | Binary mode (add to any above: rb, wb, ab…) |
| t | Text mode (default) |
| + | Update mode (read + write): r+, w+, a+ |
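The differences between x, a, and w are easiest to see side by side. The following sketch uses a hypothetical scratch file (modes.txt in a temporary directory) and only the modes listed in the table.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

# "x" creates the file and fails if it already exists.
with open(path, "x", encoding="utf-8") as f:
    f.write("created\n")

try:
    open(path, "x", encoding="utf-8")
    second_create_failed = False
except FileExistsError:
    second_create_failed = True

# "a" keeps the existing content and writes at the end.
with open(path, "a", encoding="utf-8") as f:
    f.write("appended\n")
with open(path, "r", encoding="utf-8") as f:
    after_append = f.read()

# "w" truncates the file before writing.
with open(path, "w", encoding="utf-8") as f:
    f.write("overwritten\n")
with open(path, "r", encoding="utf-8") as f:
    after_write = f.read()

print(second_create_failed)  # True
print(repr(after_append))    # 'created\nappended\n'
print(repr(after_write))     # 'overwritten\n'
```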
Reading Data from File
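# Method 1: read the entire content as one string
with open("data.txt", "r", encoding="utf-8") as f:
    content = f.read()

# Method 2: read line by line (the file is reopened because
# read() above already consumed it to the end)
with open("data.txt", "r", encoding="utf-8") as f:
    for line in f:
        print(line.strip())

# Method 3: read all lines into a list
with open("data.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()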
Writing Data into File
with open("output.txt", "w", encoding="utf-8") as f:
    f.write("Hello, world!\n")
    f.write("Line 2\n")
    # Writing multiple lines (writelines does not add newlines itself)
    lines = ["Python\n", "is\n", "awesome\n"]
    f.writelines(lines)
Appending Data into File
from datetime import datetime

with open("log.txt", "a", encoding="utf-8") as f:
    f.write(f"New entry at {datetime.now()}\n")
Line Count in File
def count_lines(filename):
    with open(filename, "r", encoding="utf-8") as f:
        return sum(1 for _ in f)

print(count_lines("large_file.txt"))  # very memory efficient
CSV Module
The csv module makes reading and writing CSV files safe and easy (handles quotes, delimiters, etc.).
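To see why this matters, here is a small sketch comparing a naive split(",") with csv.reader on a field that contains a comma inside quotes. The sample text is made up for illustration and is read from memory through io.StringIO rather than from a file.

```python
import csv
import io

# Hypothetical sample: the first field of the data row contains a comma.
raw = 'name,city\n"Hassan, Omar",Cairo\n'

# Naive approach: splitting on commas breaks the quoted field apart.
naive = [line.split(",") for line in raw.splitlines()]

# csv.reader understands the quoting rules and keeps the field whole.
parsed = list(csv.reader(io.StringIO(raw)))

print(naive[1])   # ['"Hassan', ' Omar"', 'Cairo']
print(parsed[1])  # ['Hassan, Omar', 'Cairo']
```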
Reading from CSV file
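import csv

# Option 1: csv.reader yields each row as a list of values
with open("users.csv", newline="", encoding="utf-8") as f:
    reader = csv.reader(f)
    header = next(reader)  # first row = header
    for row in reader:
        print(row)  # list of values

# Option 2 (recommended): csv.DictReader maps each row to the header names.
# The file is opened again because the first reader consumed it.
with open("users.csv", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row["name"], row["email"])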
Writing into CSV file
import csv

data = [
    {"name": "Mahmoud", "age": 30, "city": "Giza"},
    {"name": "Sara", "age": 28, "city": "Cairo"}
]
with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "age", "city"])
    writer.writeheader()
    writer.writerows(data)
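One detail worth knowing when the data is read back: CSV has no type information, so every value returns as a string. The sketch below shows a round trip through an in-memory buffer (io.StringIO stands in for a real file here, which is an assumption made to keep the example self-contained).

```python
import csv
import io

rows = [{"name": "Sara", "age": 28}]

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["name", "age"])
writer.writeheader()
writer.writerows(rows)

# Rewind and read the same data back.
buffer.seek(0)
loaded = list(csv.DictReader(buffer))

print(loaded[0]["age"], type(loaded[0]["age"]).__name__)  # 28 str

# Convert explicitly where numbers are needed.
ages = [int(row["age"]) for row in loaded]
```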
Object Serialization – pickle Module
pickle is Python’s built-in serialization module. It can store almost any Python object (lists, dicts, custom classes, etc.) to disk.
import pickle

data = {
    "name": "Mahmoud",
    "scores": [95, 88, 92],
    "settings": {"theme": "dark", "lang": "en"}
}

# Writing (binary mode is required)
with open("data.pkl", "wb") as f:
    pickle.dump(data, f)

# Reading
with open("data.pkl", "rb") as f:
    loaded = pickle.load(f)

print(loaded)
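pickle also works in memory through pickle.dumps and pickle.loads, and it preserves Python types such as tuples, sets, and dates. The sketch below illustrates this with a made-up state dictionary; the keys and values are assumptions for illustration only.

```python
import pickle
from datetime import date

# Hypothetical application state containing several Python types.
state = {
    "point": (3, 4),
    "tags": {"a", "b"},
    "saved_on": date(2024, 1, 15),
}

# dumps returns bytes; loads rebuilds an equal object from them.
blob = pickle.dumps(state, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(blob)

print(type(blob).__name__)  # bytes
print(restored == state)    # True
print(type(restored["point"]).__name__, type(restored["tags"]).__name__)  # tuple set
```

Only unpickle data from sources you trust, since loading a pickle can execute arbitrary code.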
JSON Parsing
JSON is the most common data interchange format today. Python’s built-in json module converts between JSON text and Python dicts, lists, strings, numbers, booleans, and None.
import json

# String → Python object
json_str = '''{"name": "Mahmoud", "age": 30, "skills": ["Python", "SQL"]}'''
data = json.loads(json_str)
print(data["skills"][0])  # Python

# Python object → JSON string
person = {"name": "Sara", "age": 28}
json_string = json.dumps(person, indent=2, ensure_ascii=False)
print(json_string)

# Reading from file
with open("config.json", encoding="utf-8") as f:
    config = json.load(f)

# Writing to file
with open("output.json", "w", encoding="utf-8") as f:
    json.dump(config, f, indent=2, ensure_ascii=False)
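Two situations come up often in practice: malformed input and values that JSON cannot represent directly. The following sketch handles the first with json.JSONDecodeError and the second with the default parameter of json.dumps. The encode_extra helper and the sample record are hypothetical names introduced for illustration.

```python
import json
from datetime import date

# Malformed JSON raises json.JSONDecodeError (a subclass of ValueError).
try:
    json.loads('{"name": "Sara", "age": }')
    error = None
except json.JSONDecodeError as exc:
    error = exc

print(type(error).__name__)  # JSONDecodeError


def encode_extra(obj):
    """Hypothetical fallback: turn dates into ISO 8601 strings."""
    if isinstance(obj, date):
        return obj.isoformat()
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


# json.dumps calls the default function for objects it cannot encode itself.
record = {"name": "Sara", "joined": date(2024, 1, 15)}
text = json.dumps(record, default=encode_extra)
print(text)  # {"name": "Sara", "joined": "2024-01-15"}
```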
XML Parsing
Python provides several ways to parse XML. The most commonly used are xml.etree.ElementTree (built into the standard library) and lxml (a third-party package that is faster and more powerful).
Using ElementTree
import xml.etree.ElementTree as ET

tree = ET.parse("books.xml")
root = tree.getroot()
for book in root.findall("book"):
    title = book.find("title").text
    author = book.find("author").text
    print(f"{title} by {author}")
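The loop above assumes that every book element has both a title and an author child. The sketch below parses an in-memory document with ET.fromstring and uses findtext with a default so a missing child does not raise an error. The XML layout shown is an assumed structure for books.xml, since the original file is not shown.

```python
import xml.etree.ElementTree as ET

# Assumed layout for books.xml; the second book has no author element.
xml_text = """
<library>
  <book id="1">
    <title>Dune</title>
    <author>Frank Herbert</author>
  </book>
  <book id="2">
    <title>Untitled draft</title>
  </book>
</library>
""".strip()

root = ET.fromstring(xml_text)

books = []
for book in root.findall("book"):
    # findtext returns the default instead of failing on a missing child.
    title = book.findtext("title", default="(no title)")
    author = book.findtext("author", default="(unknown)")
    books.append((book.get("id"), title, author))

print(books)
# [('1', 'Dune', 'Frank Herbert'), ('2', 'Untitled draft', '(unknown)')]
```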
Best Practices Summary
- Always use with open(...) → automatic resource management
- Specify encoding="utf-8" for text files
- Use csv module instead of splitting strings manually
- Use json for modern APIs and configuration
- Avoid pickle with untrusted data
- Prefer pathlib over os.path for modern path handling
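As a brief illustration of the pathlib recommendation, the sketch below builds paths with the / operator, creates nested directories, and lists files recursively. The directory and file names are hypothetical and live in a temporary directory.

```python
import tempfile
from pathlib import Path

# Hypothetical scratch directory so nothing real is modified.
base = Path(tempfile.mkdtemp())

# The / operator joins path segments; mkdir can create parents in one call.
logs = base / "logs" / "2024"
logs.mkdir(parents=True, exist_ok=True)

# write_text and read_text open and close the file for you.
note = logs / "app.log"
note.write_text("started\n", encoding="utf-8")
(base / "readme.txt").write_text("hello\n", encoding="utf-8")

print(note.name, note.suffix, note.parent.name)  # app.log .log 2024

# rglob walks the directory tree recursively.
found = sorted(
    p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()
)
print(found)  # ['logs/2024/app.log', 'readme.txt']
```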
Conclusion
File handling, CSV processing, serialization (pickle), and structured data parsing (JSON, XML) are foundational skills in Python programming. Mastering these tools allows you to read logs, process datasets, save application state, communicate with APIs, and much more.
Practice by:
- Creating a small contact book (CSV + JSON)
- Saving and loading game state with pickle
- Parsing real-world XML/JSON APIs
Happy coding!
