
CSV Module (Python)

Definition

A standard-library module for reading and writing Comma-Separated Values (CSV) files, handling the complexities of delimiters, quoting, and escape characters.
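
To make the "complexities" concrete, here is a minimal sketch of what the module handles when a field itself contains a comma or a double quote. The sample row is made up for illustration, and an in-memory buffer stands in for a real file.

```python
import csv
import io

# Hypothetical sample row: one field contains a comma, another contains double quotes.
row = ["Smith, Jane", 'She said "hi"', "42"]

# Write the row to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
encoded = buffer.getvalue()
print(repr(encoded))

# Naive splitting on commas cuts the first field in two...
naive = encoded.strip().split(",")
print(len(naive), "fields from str.split")

# ...while csv.reader undoes the quoting and restores the original values.
parsed = next(csv.reader([encoded]))
print(parsed)
```

The writer wraps the awkward fields in double quotes and doubles any embedded quote characters, so the reader can recover the original three values where a plain str.split(',') would return four fragments.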

Why It Matters

CSV is the “lowest common denominator” of data interchange. Python’s csv module lets developers move data between simple text files and more complex analysis tools, which makes it a staple of automation and data science work.

Core Concepts

  • Reading:
    • csv.reader(file): Returns an iterator yielding lists of strings.
    • Memory Constraint: list(reader) loads the entire file into RAM; for large files, iterate with "for row in reader" so that only one row is held in memory at a time.
  • Writing:
    • csv.writer(file): Used to write rows.
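    • The “Double Newline” Bug: Always open the file with newline='' (e.g., open('f.csv', 'w', newline='')). Without it, the writer's default '\r\n' line ending is translated to '\r\r\n' on Windows, which shows up as an extra blank line after every row.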
  • Header-based Access:
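    • csv.DictReader: Maps each row to a dictionary whose keys are the column headers, taken from the first row unless fieldnames is passed.
    • csv.DictWriter: Writes dictionaries as rows in the column order given by a fieldnames list; call writeheader() to emit the header row.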
  • Example Usage:
import csv

# Writing a CSV file
with open('data.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['Name', 'Age'])
    writer.writerow(['Alice', '30'])

# Reading and processing (newline='' is recommended for reading as well)
with open('data.csv', 'r', newline='') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(f"{row['Name']} is {row['Age']} years old.")
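  • Customization: delimiter (e.g., '\t' for tab-separated files) and lineterminator (e.g., '\n' instead of the writer's default '\r\n') allow for non-standard formats. lineterminator affects only the writer; the reader recognizes '\r' and '\n' line endings regardless of this setting.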
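
The header-based access and customization points above can be combined in one short sketch. It assumes a tab-separated file; the records, the file name people.tsv, and the temporary directory are hypothetical and exist only for illustration.

```python
import csv
import os
import tempfile

# Hypothetical records and file location, used only for illustration.
records = [
    {"Name": "Alice", "Age": "30"},
    {"Name": "Bob", "Age": "25"},
]
path = os.path.join(tempfile.mkdtemp(), "people.tsv")

# Write tab-separated rows with '\n' line endings.
# writeheader() emits the fieldnames as the first row.
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(
        f, fieldnames=["Name", "Age"], delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)

# Stream the file back one row at a time; only the running totals stay in memory.
row_count = 0
total_age = 0
with open(path, newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f, delimiter="\t"):
        row_count += 1
        total_age += int(row["Age"])

print(row_count, total_age)
```

The reader needs the same delimiter as the writer but not the lineterminator, since it accepts either line ending. Every value comes back as a string, so numeric columns such as Age have to be converted explicitly.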

Connected Concepts