File and Directory Handling in Python

Working with files and directories is one of the most common tasks in programming. Python provides built-in functions and standard-library modules to handle files efficiently and safely. This article covers file operations (reading, writing, appending), CSV handling, object serialization with pickle, and parsing JSON and XML, all of which come up constantly in real-world Python development.

Introduction to Files

A file is a named location on disk used to store data permanently. Python treats files as either text files (human-readable characters) or binary files (images, executables, pickled objects, etc.).

Core operations include:

  • Opening a file
  • Reading from / writing to a file
  • Closing the file

Opening a File

Use the built-in open() function.

file = open("example.txt", "r")   # read mode (default)
file.close()                      # must be closed manually
# or
with open("example.txt", "r", encoding="utf-8") as file:
    content = file.read()         # work with the file here
# the file is closed automatically when the block ends

Always prefer the with statement: it guarantees the file is properly closed even if an exception occurs inside the block.
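
The closing guarantee can be checked directly. The following is a minimal sketch, assuming a scratch file created in a temporary directory; the file name and the simulated error are made up for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical scratch file, created in a temporary directory for illustration.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.txt"
    path.write_text("hello\n", encoding="utf-8")

    try:
        with open(path, "r", encoding="utf-8") as f:
            first_line = f.readline()
            # Simulate a failure in the middle of processing.
            raise ValueError("simulated failure while processing")
    except ValueError:
        pass

    # The with block closed the file even though an exception escaped it.
    closed_after_error = f.closed

print(closed_after_error)
```

With a bare open() call, the same exception would leave the file open until the object is garbage collected or close() is called by hand.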

File Modes

Mode  Description
r     Read (default). File must exist.
w     Write. Creates new file or overwrites existing one.
a     Append. Creates file if not exists, writes at end.
x     Exclusive creation. Fails if file already exists.
b     Binary mode (add to any above: rb, wb, ab…)
t     Text mode (default)
+     Update mode (read + write): r+, w+, a+
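
To make the difference between x, a, and w concrete, here is a small sketch that applies each mode to the same file. The file name notes.txt and the temporary directory are assumptions made for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notes.txt"   # hypothetical file name

    # "x": create the file, fail if it already exists
    with open(path, "x", encoding="utf-8") as f:
        f.write("first\n")

    try:
        open(path, "x", encoding="utf-8")
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

    # "a": keep existing content and add to the end
    with open(path, "a", encoding="utf-8") as f:
        f.write("second\n")
    after_append = path.read_text(encoding="utf-8")

    # "w": discard existing content, then write
    with open(path, "w", encoding="utf-8") as f:
        f.write("third\n")
    after_write = path.read_text(encoding="utf-8")

print(exclusive_failed, repr(after_append), repr(after_write))
```

Mode x is a reasonable choice when overwriting an existing file by accident would be costly.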

Reading Data from File

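with open("data.txt", "r", encoding="utf-8") as f:
    # Method 1: read entire content into one string
    content = f.read()

    # read() leaves the file position at the end, so rewind before reading again
    f.seek(0)

    # Method 2: read line by line (memory efficient)
    for line in f:
        print(line.strip())

    f.seek(0)

    # Method 3: read all lines into a list
    lines = f.readlines()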

Writing Data into File

with open("output.txt", "w", encoding="utf-8") as f:
    f.write("Hello, world!\n")
    f.write("Line 2\n")
    
    # Writing multiple lines
    lines = ["Python\n", "is\n", "awesome\n"]
    f.writelines(lines)

Appending Data into File

from datetime import datetime

with open("log.txt", "a", encoding="utf-8") as f:
    f.write(f"New entry at {datetime.now()}\n")

Line Count in File

def count_lines(filename):
    with open(filename, "r", encoding="utf-8") as f:
        return sum(1 for _ in f)

print(count_lines("large_file.txt"))  # reads one line at a time, so memory use stays low

CSV Module

The csv module makes reading and writing CSV files safe and easy (handles quotes, delimiters, etc.).
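
A short sketch of why manual splitting breaks: the sample row below (made up for illustration) contains both a comma and double quotes inside one field. The data is kept in memory with io.StringIO so that no file is needed.

```python
import csv
import io

# Made-up sample data: the second field contains a comma and quotes.
rows = [["name", "quote"], ["Sara", 'She said "hi", then left']]

# Write the rows as CSV text; the writer quotes and escapes the tricky field.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
raw_text = buffer.getvalue()

# Naive approach: splitting on commas cuts the quoted field apart.
naive = raw_text.splitlines()[1].split(",")

# csv.reader understands the quoting and restores the original fields.
parsed = list(csv.reader(io.StringIO(raw_text, newline="")))

print(len(naive), parsed[1])
```

The naive split produces three pieces for a row that has only two fields, while csv.reader returns the rows exactly as they were written.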

Reading from CSV file

import csv

with open("users.csv", newline="", encoding="utf-8") as f:
    reader = csv.reader(f)
    header = next(reader)           # first row = header
    for row in reader:
        print(row)                  # list of values

    # OR using DictReader (recommended)
    f.seek(0)                       # rewind: the first reader consumed the file
    reader = csv.DictReader(f)      # uses the header row as dictionary keys
    for row in reader:
        print(row["name"], row["email"])

Writing into CSV file

import csv

data = [
    {"name": "Mahmoud", "age": 30, "city": "Giza"},
    {"name": "Sara",    "age": 28, "city": "Cairo"}
]

with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "age", "city"])
    writer.writeheader()
    writer.writerows(data)

Object Serialization – pickle Module

pickle is Python’s built-in serialization module. It can store almost any Python object (lists, dicts, instances of custom classes, etc.) to disk. Only unpickle data you trust, because loading a pickle can execute arbitrary code.

import pickle

data = {
    "name": "Mahmoud",
    "scores": [95, 88, 92],
    "settings": {"theme": "dark", "lang": "en"}
}

# Writing
with open("data.pkl", "wb") as f:
    pickle.dump(data, f)

# Reading
with open("data.pkl", "rb") as f:
    loaded = pickle.load(f)
    print(loaded)
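
Unlike JSON, pickle preserves Python-specific types such as sets, tuples, and dates. The sketch below round-trips a made-up game state in memory with pickle.dumps and pickle.loads; the keys and values are purely illustrative.

```python
import pickle
from datetime import date

# Illustrative state containing types that JSON cannot represent directly.
state = {
    "player": "Mahmoud",
    "visited": {"cave", "forest"},     # set
    "position": (3, 7),                # tuple
    "saved_on": date(2024, 1, 15),     # date object
}

# Serialize to bytes and back, without touching the disk.
blob = pickle.dumps(state, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(blob)

print(restored == state, type(restored["visited"]).__name__)
```

The restored object is an independent copy with the same types. The trade-off is that the pickle format is Python-only and should be loaded only from sources you control.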

JSON Parsing

JSON is one of the most widely used data interchange formats. Python’s json module converts between JSON text and Python dicts, lists, strings, numbers, booleans, and None.

import json

# String → Python object
json_str = '''{"name": "Mahmoud", "age": 30, "skills": ["Python", "SQL"]}'''
data = json.loads(json_str)
print(data["skills"][0])          # Python

# Python object → JSON string
person = {"name": "Sara", "age": 28}
json_string = json.dumps(person, indent=2, ensure_ascii=False)
print(json_string)

# Reading from file
with open("config.json", encoding="utf-8") as f:
    config = json.load(f)

# Writing to file
with open("output.json", "w", encoding="utf-8") as f:
    json.dump(config, f, indent=2, ensure_ascii=False)
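
Two situations come up quickly in practice: input that is not valid JSON, and Python values that json cannot serialize by default. The following sketch shows one way to handle both. The helper names load_settings and encode_extra are hypothetical, not part of the json module.

```python
import json
from datetime import date


def load_settings(text, fallback):
    """Parse JSON text, returning fallback if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return fallback


def encode_extra(value):
    """Called by json.dumps for objects it cannot serialize on its own."""
    if isinstance(value, date):
        return value.isoformat()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


good = load_settings('{"theme": "dark"}', fallback={})
bad = load_settings("{theme: dark}", fallback={})   # keys must be quoted

encoded = json.dumps(
    {"name": "Sara", "joined": date(2024, 1, 15)},
    default=encode_extra,
)

print(good, bad, encoded)
```

The default parameter of json.dumps only runs for values the encoder does not already understand, so ordinary strings and numbers are unaffected.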

XML Parsing

Python provides several ways to parse XML. The most commonly used are xml.etree.ElementTree (part of the standard library) and lxml (a third-party package, installed separately, that is faster and offers more features).

Using ElementTree

import xml.etree.ElementTree as ET

tree = ET.parse("books.xml")
root = tree.getroot()

for book in root.findall("book"):
    title = book.find("title").text
    author = book.find("author").text
    print(f"{title} by {author}")
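
Real XML is often incomplete, and find(...).text raises AttributeError when an element is missing. A minimal sketch of a more defensive variant follows, using findtext with a default value and get for attributes. The XML snippet, the id attribute, and the placeholder strings are made up for the example.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; the second book has no <author> element.
xml_text = """<catalog>
  <book id="b1"><title>Book One</title><author>Author A</author></book>
  <book id="b2"><title>Book Two</title></book>
</catalog>"""

root = ET.fromstring(xml_text)

books = []
for book in root.findall("book"):
    books.append({
        "id": book.get("id"),                                  # attribute value
        "title": book.findtext("title", default="(no title)"),
        "author": book.findtext("author", default="(unknown)"),
    })

for entry in books:
    print(f'{entry["id"]}: {entry["title"]} by {entry["author"]}')
```

ET.fromstring parses XML held in a string, while ET.parse reads it from a file; the element methods are the same in both cases.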

Best Practices Summary

  • Always use with open(...) → automatic resource management
  • Specify encoding="utf-8" for text files
  • Use csv module instead of splitting strings manually
  • Use json for modern APIs and configuration
  • Avoid pickle with untrusted data
  • Prefer pathlib over os.path for modern path handling
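
As an illustration of the last point, here is a small sketch of common directory tasks with pathlib: creating nested folders, writing files, listing a directory, and searching recursively. The folder and file names are assumptions, and everything happens inside a temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)

    # Build a nested directory; the "/" operator joins path segments.
    reports = base / "reports" / "2024"
    reports.mkdir(parents=True, exist_ok=True)

    # Create two small files (hypothetical names).
    (reports / "summary.txt").write_text("ok\n", encoding="utf-8")
    (reports / "data.csv").write_text("a,b\n1,2\n", encoding="utf-8")

    # List the directory and search recursively by pattern.
    names = sorted(p.name for p in reports.iterdir())
    csv_files = [p.name for p in base.rglob("*.csv")]

    # Inspect individual path components.
    summary = reports / "summary.txt"
    parts = (summary.stem, summary.suffix, summary.exists())

print(names, csv_files, parts)
```

Path objects can be passed directly to open(), csv, and json functions, so they combine with everything shown earlier.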

Conclusion

File handling, CSV processing, serialization (pickle), and structured data parsing (JSON, XML) are foundational skills in Python programming. Mastering these tools allows you to read logs, process datasets, save application state, communicate with APIs, and much more.

Practice by:

  • Creating a small contact book (CSV + JSON)
  • Saving and loading game state with pickle
  • Parsing real-world XML/JSON APIs

Happy coding!
