Pandas provides robust functionality for working with CSV and Excel files, making it easier to load, analyze, and manipulate tabular data. This article covers how to read, write, and manipulate data from CSV and Excel files using Pandas.
Before you can use Pandas, you need to import the library:
import pandas as pd
CSV (Comma Separated Values) files are commonly used for storing tabular data. You can easily read and write CSV files using Pandas.
You can use the read_csv()
function to load a CSV file into a DataFrame:
# Reading a CSV file df = pd.read_csv("data.csv") print(df)
If the CSV file is located at a specific path or URL, you can specify the full path:
# Reading a CSV file from a path df = pd.read_csv("path/to/your/data.csv") print(df)
To write a DataFrame to a CSV file, use the to_csv()
function:
# Writing DataFrame to a CSV file df.to_csv("output.csv", index=False) # index=False to avoid writing row numbers
CSV files may use different delimiters, such as semicolons or tabs. You can specify the delimiter using the sep
parameter:
# Reading a CSV with a different delimiter df = pd.read_csv("data.csv", sep=";") print(df)
Pandas provides excellent support for reading and writing Excel files with the help of the openpyxl
library for .xlsx files.
Use the read_excel()
function to load an Excel file into a DataFrame:
# Reading an Excel file df = pd.read_excel("data.xlsx") print(df)
If the Excel file has multiple sheets, you can specify the sheet name:
# Reading a specific sheet by name df = pd.read_excel("data.xlsx", sheet_name="Sheet1") print(df)
You can also load all sheets into a dictionary of DataFrames:
# Reading all sheets from an Excel file df_dict = pd.read_excel("data.xlsx", sheet_name=None) for sheet_name, df in df_dict.items(): print(f"Data from {sheet_name}:") print(df)
To write a DataFrame to an Excel file, use the to_excel()
function:
# Writing DataFrame to an Excel file df.to_excel("output.xlsx", index=False)
You can write multiple DataFrames to different sheets in an Excel file:
# Writing multiple DataFrames to different sheets with pd.ExcelWriter("output.xlsx") as writer: df.to_excel(writer, sheet_name="Sheet1", index=False) df.to_excel(writer, sheet_name="Sheet2", index=False)
Working with CSV and Excel files in Pandas is straightforward, thanks to the powerful read_csv()
, to_csv()
, read_excel()
, and to_excel()
functions. These tools make it easy to load, analyze, and save tabular data, making Pandas an essential library for data manipulation and analysis in Python.