Understanding CSV Files in Python

Comma-separated values (CSV) files are a widely used format for storing tabular data. They are plain text files in which each line holds one record and a delimiter, usually a comma, separates the fields. Because the format is plain text, it is language-independent and can be imported and exported by a wide range of software. This guide covers reading and writing CSV files in Python using the csv module from the standard library.

Understanding the Python CSV Module

Python doesn’t need an external library to read or write CSV files: the built-in csv module provides this functionality. Its reader and writer objects read and write rows as sequences. Let’s import the module to get started.

import csv

Creating a CSV File in Python

Basic Usage of csv.writer()

The csv.writer() function returns a writer object that converts each row of data into a delimited string and writes it to the given file object. Rows are written one at a time with the writerow() method.

Let’s consider a scenario where we have to write the following data into a CSV file:

data = [
    ["SN", "Name", "Contribution"],
    [1, "Linus Torvalds", "Linux Kernel"],
    [2, "Tim Berners-Lee", "World Wide Web"],
    [3, "Guido van Rossum", "Python Programming"]
]

Here’s how we can write this data into a CSV file:

with open('innovators.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    for row in data:
        writer.writerow(row)

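In the above code, we first open the file in write mode ('w'), passing newline='' so that the csv module controls the line endings itself; without it, an extra carriage return can be added to each line on platforms that use \r\n line endings. We then create a writer object using the csv.writer() function and call its writerow() method once for each row in our data.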
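To see exactly what the writer produces, the following minimal sketch writes the same rows to an in-memory buffer instead of a file. Using io.StringIO as a stand-in for the file is an assumption made only so the output can be inspected directly.

```python
import csv
import io

data = [
    ["SN", "Name", "Contribution"],
    [1, "Linus Torvalds", "Linux Kernel"],
    [2, "Tim Berners-Lee", "World Wide Web"],
    [3, "Guido van Rossum", "Python Programming"],
]

# io.StringIO stands in for a file opened with newline='' (illustration only).
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
for row in data:
    writer.writerow(row)

output = buffer.getvalue()
# Each row ends with '\r\n', the writer's default line terminator.
print(repr(output))
```

Note that the integer serial numbers are converted to text on the way out; a CSV file stores no type information.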

Writing Multiple Rows with writerows()

The writerows() method writes every row from an iterable in a single call. This is more concise than looping over writerow(), and it can also be somewhat faster when dealing with large amounts of data.

with open('innovators.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(data)
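writerows() accepts any iterable of rows, not only a list. As a small sketch, the hypothetical helper numbered_rows() below yields rows lazily from a generator, so the full data set never has to be held in memory. An in-memory buffer is used in place of a file for illustration.

```python
import csv
import io


def numbered_rows(names):
    """Yield [serial_number, name] rows one at a time (hypothetical helper)."""
    for serial, name in enumerate(names, start=1):
        yield [serial, name]


buffer = io.StringIO(newline='')  # stand-in for a real file
writer = csv.writer(buffer)
writer.writerow(["SN", "Name"])  # header row
writer.writerows(numbered_rows(["Linus Torvalds", "Tim Berners-Lee", "Guido van Rossum"]))

lines = buffer.getvalue().splitlines()
print(lines)
```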

Customizing Your CSV File Output in Python

Using Different Delimiters

The csv.writer() function uses a comma as the delimiter between fields in a CSV file by default. However, we can use a different delimiter if needed. For example, if we want to use a pipe (|) as the delimiter, we can do so by passing it as the delimiter parameter to the csv.writer() function.

with open('innovators.csv', 'w', newline='') as file:
    writer = csv.writer(file, delimiter='|')
    writer.writerows(data)
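The sketch below shows what pipe-delimited output looks like, again using an in-memory buffer as a stand-in for the file. The last row is made-up sample data, included only to show that a field containing the delimiter is quoted automatically under the default quoting mode.

```python
import csv
import io

rows = [
    ["SN", "Name", "Contribution"],
    [1, "Linus Torvalds", "Linux Kernel"],
    [2, "A|B", "contains the delimiter"],  # made-up row for illustration
]

buffer = io.StringIO(newline='')  # stand-in for a real file
writer = csv.writer(buffer, delimiter='|')
writer.writerows(rows)

lines = buffer.getvalue().splitlines()
for line in lines:
    print(line)
```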

Handling Quotes in CSV Files

Sometimes, we might want to include quotes around certain fields in our CSV file. We can control this behavior using the quoting parameter of the csv.writer() function. This parameter accepts one of the following constants from the csv module:

  • csv.QUOTE_MINIMAL: Only quote fields that contain special characters such as the delimiter, quotechar or any of the characters in lineterminator.
  • csv.QUOTE_ALL: Quote all fields.
  • csv.QUOTE_NONNUMERIC: Quote all non-numeric fields.
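  • csv.QUOTE_NONE: Never quote fields. If a field contains the delimiter or another special character, the writer escapes it with escapechar and raises csv.Error if no escapechar is set.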

For example, if we want to quote all non-numeric fields, we can do so as follows:

with open('quotes.csv', 'w', newline='') as file:
    writer = csv.writer(file, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerows(data)
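To compare the four quoting constants side by side, the following sketch renders one row under each mode. The render() helper and the sample row, which contains a comma inside a field, are assumptions made for illustration.

```python
import csv
import io


def render(row, **fmt):
    """Write a single row with the given formatting and return it as text."""
    buffer = io.StringIO(newline='')
    csv.writer(buffer, **fmt).writerow(row)
    return buffer.getvalue().rstrip('\r\n')


row = [1, "Linus Torvalds", "Linux, Kernel"]  # third field contains the delimiter

minimal = render(row, quoting=csv.QUOTE_MINIMAL)
quote_all = render(row, quoting=csv.QUOTE_ALL)
nonnumeric = render(row, quoting=csv.QUOTE_NONNUMERIC)
# QUOTE_NONE needs an escapechar to handle the embedded comma.
unquoted = render(row, quoting=csv.QUOTE_NONE, escapechar='\\')

# Without an escapechar, QUOTE_NONE cannot represent the embedded comma.
try:
    render(row, quoting=csv.QUOTE_NONE)
    raised = False
except csv.Error:
    raised = True

for label, text in [("MINIMAL", minimal), ("ALL", quote_all),
                    ("NONNUMERIC", nonnumeric), ("NONE", unquoted)]:
    print(f"{label:<11}{text}")
```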

Using Custom Quoting Characters

By default, the csv.writer() function uses the double quote character (") as the quoting character. However, we can use a different character if needed by passing it as the quotechar parameter to the csv.writer() function.

with open('quotes.csv', 'w', newline='') as file:
    writer = csv.writer(file, quoting=csv.QUOTE_NONNUMERIC, quotechar='*')
    writer.writerows(data)

In the above code, we use the asterisk (*) as the quoting character.
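A file written with a custom quoting character has to be read back with the same settings. The sketch below writes two rows with quotechar='*' to an in-memory buffer (a stand-in for quotes.csv) and then parses them again. When a reader uses csv.QUOTE_NONNUMERIC, it converts unquoted fields to floats.

```python
import csv
import io

rows = [
    ["SN", "Name", "Contribution"],
    [1, "Linus Torvalds", "Linux Kernel"],
]

buffer = io.StringIO(newline='')  # stand-in for a real file
writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC, quotechar='*')
writer.writerows(rows)
lines = buffer.getvalue().splitlines()
print(lines)

# Read the text back using the same quoting settings.
reader = csv.reader(io.StringIO(buffer.getvalue(), newline=''),
                    quoting=csv.QUOTE_NONNUMERIC, quotechar='*')
parsed = list(reader)
print(parsed)
```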

Utilizing Dialects for Efficient CSV Handling in Python

When we’re dealing with multiple CSV files that have similar formats, it can be cumbersome to pass the same formatting parameters to the csv.writer() function each time. To avoid this, we can use dialects.

A dialect groups a set of formatting parameters under a single name. We can register a dialect using the csv.register_dialect() function and then pass its name to the csv.writer() function.

csv.register_dialect('myDialect', delimiter='|', quoting=csv.QUOTE_ALL)

with open('office.csv', 'w', newline='') as file:
    writer = csv.writer(file, dialect='myDialect')
    writer.writerows(data)

In the above code, we first define a dialect named 'myDialect' that uses a pipe as the delimiter and quotes all fields. We then use this dialect when creating our writer object.
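A registered dialect can be passed to csv.reader() as well, so the same name describes the format in both directions. The following sketch assumes a hypothetical dialect name, 'pipe_quoted', and uses an in-memory buffer in place of a file.

```python
import csv
import io

# 'pipe_quoted' is a hypothetical name chosen for this example.
csv.register_dialect('pipe_quoted', delimiter='|', quoting=csv.QUOTE_ALL)
registered = 'pipe_quoted' in csv.list_dialects()

rows = [["SN", "Name", "Contribution"], [1, "Linus Torvalds", "Linux Kernel"]]

buffer = io.StringIO(newline='')  # stand-in for a real file
csv.writer(buffer, dialect='pipe_quoted').writerows(rows)
lines = buffer.getvalue().splitlines()
print(lines)

# Reading with the same dialect recovers the fields (all as strings).
reader = csv.reader(io.StringIO(buffer.getvalue(), newline=''), dialect='pipe_quoted')
parsed = list(reader)
print(parsed)

# Remove the dialect again once it is no longer needed.
csv.unregister_dialect('pipe_quoted')
```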

Writing to CSV Files Using Python’s csv.DictWriter()

The csv.DictWriter() class allows us to write to a CSV file from a Python dictionary. This can be useful when our data is stored as a list of dictionaries.

data = [
    {'player_name': 'Magnus Carlsen', 'fide_rating': 2870},
    {'player_name': 'Fabiano Caruana', 'fide_rating': 2822},
    {'player_name': 'Ding Liren', 'fide_rating': 2801}
]

with open('players.csv', 'w', newline='') as file:
    fieldnames = ['player_name', 'fide_rating']
    writer = csv.DictWriter(file, fieldnames=fieldnames)

    writer.writeheader()
    for row in data:
        writer.writerow(row)

In the above code, we first define the fieldnames that will be used as the column headers in our CSV file. We then create a DictWriter object and write the header row using the writeheader() method. Finally, we write each row of our data into the CSV file.
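Real-world dictionaries do not always match the fieldnames exactly. As a sketch of how DictWriter can be configured for that case, the example below uses the restval parameter to fill in missing keys and extrasaction='ignore' to skip keys that are not listed in fieldnames (by default, an extra key raises ValueError). The 'country' key and the 'N/A' placeholder are assumptions made for illustration.

```python
import csv
import io

players = [
    # Has an extra key that is not in fieldnames.
    {'player_name': 'Magnus Carlsen', 'fide_rating': 2870, 'country': 'Norway'},
    # Is missing the 'fide_rating' key.
    {'player_name': 'Ding Liren'},
]

buffer = io.StringIO(newline='')  # stand-in for a real file
writer = csv.DictWriter(
    buffer,
    fieldnames=['player_name', 'fide_rating'],
    restval='N/A',            # written when a key is missing
    extrasaction='ignore',    # silently drop keys not in fieldnames
)
writer.writeheader()
writer.writerows(players)

lines = buffer.getvalue().splitlines()
print(lines)
```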

Reading Data from a CSV File in Python

Basic Usage of csv.reader()

The csv.reader() function returns a reader object that iterates over the rows of the given CSV file, returning each row as a list of strings. Here’s a basic example of how to use this function:

with open('innovators.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)

In the above code, we first open the file in read mode ('r'), passing newline='' as the csv documentation recommends so that newlines embedded inside quoted fields are handled correctly. We then create a reader object using the csv.reader() function. The reader object is an iterator over the rows of the CSV file, and each row we print is a list of strings.
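Since every field comes back as a string, a common follow-up step is to separate the header from the data and convert numeric columns. The sketch below does this on an inline sample that mirrors the comma-delimited innovators.csv; the in-memory buffer is a stand-in for the open file.

```python
import csv
import io

sample = (
    "SN,Name,Contribution\r\n"
    "1,Linus Torvalds,Linux Kernel\r\n"
    "2,Tim Berners-Lee,World Wide Web\r\n"
)

reader = csv.reader(io.StringIO(sample, newline=''))
header = next(reader)  # consume the first row as the header

# Convert the serial number from str to int; keep the other fields as text.
records = [[int(sn), name, contribution] for sn, name, contribution in reader]

print(header)
print(records)
```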

Reading CSV Files with Different Delimiters

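Just like when writing to a CSV file, we can specify a different delimiter when reading from one by passing the delimiter parameter to the csv.reader() function. The delimiter must match the one the file was written with; for example, it is needed here if innovators.csv was last written with the pipe delimiter, as in the earlier example.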

with open('innovators.csv', 'r', newline='') as file:
    reader = csv.reader(file, delimiter='|')
    for row in reader:
        print(row)
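The effect of a mismatched delimiter is easy to miss because no error is raised. In the sketch below, a pipe-delimited sample (an inline stand-in for the file) is parsed twice: once with the default comma and once with delimiter='|'.

```python
import csv
import io

sample = "SN|Name|Contribution\r\n1|Linus Torvalds|Linux Kernel\r\n"

# Default delimiter (comma): each line comes back as a single field.
wrong = list(csv.reader(io.StringIO(sample, newline='')))

# Matching delimiter: the fields are split as intended.
right = list(csv.reader(io.StringIO(sample, newline=''), delimiter='|'))

print(wrong)
print(right)
```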

Reading CSV Files into a Dictionary with csv.DictReader()

The csv.DictReader() class operates like a regular reader but maps each row to a dictionary whose values are strings. The keys for the dictionary can be passed in with the fieldnames parameter; if it is omitted, they are taken from the first row of the CSV file.

with open('players.csv', 'r', newline='') as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row)
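The two ways of supplying keys can be sketched as follows. The first reader takes its keys from the header row and converts the rating to an integer; the second reads a headerless sample and is given the keys through fieldnames. Both samples are inline stand-ins for files.

```python
import csv
import io

with_header = "player_name,fide_rating\r\nMagnus Carlsen,2870\r\nFabiano Caruana,2822\r\n"
reader = csv.DictReader(io.StringIO(with_header, newline=''))

# Values arrive as strings, so convert the rating explicitly.
players = [
    {'player_name': row['player_name'], 'fide_rating': int(row['fide_rating'])}
    for row in reader
]
inferred_keys = reader.fieldnames  # taken from the first row
print(players)

# A file without a header row: supply the keys via fieldnames.
headerless = "Ding Liren,2801\r\n"
reader = csv.DictReader(io.StringIO(headerless, newline=''),
                        fieldnames=['player_name', 'fide_rating'])
extra = [dict(row) for row in reader]
print(extra)
```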

Wrapping Up: Understanding CSV Files in Python

In this guide, we’ve explored how to read and write CSV files in Python using the built-in csv module. We’ve covered how to write data into a CSV file, customize the output format, use dialects to simplify the handling of similar formats, and write data from Python dictionaries into a CSV file. We’ve also learned how to read data from a CSV file, with each row returned either as a list or as a dictionary. With this knowledge, you should be equipped to handle CSV files in your Python projects effectively.
