
Creating and Manipulating DataFrames in Pandas


Pandas is a Python library for data manipulation and analysis. One of its primary data structures is the DataFrame, a two-dimensional table with labeled rows and columns. This article shows how to create and manipulate DataFrames in Pandas, with examples.

Importing Pandas

Before using Pandas, you need to import the library:

    import pandas as pd
        

Creating DataFrames

You can create a DataFrame from various data sources such as dictionaries, lists, or CSV files.

Creating DataFrame from a Dictionary

    # Creating a DataFrame from a dictionary
    data = {
        "Name": ["Alice", "Bob", "Charlie"],
        "Age": [25, 30, 35],
        "City": ["New York", "Los Angeles", "Chicago"]
    }
    df = pd.DataFrame(data)
    print(df)
        

Creating DataFrame from a List of Lists

    # Creating a DataFrame from a list of lists
    data = [
        ["Alice", 25, "New York"],
        ["Bob", 30, "Los Angeles"],
        ["Charlie", 35, "Chicago"]
    ]
    df = pd.DataFrame(data, columns=["Name", "Age", "City"])
    print(df)
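
Row-oriented data often arrives as a list of dictionaries rather than a list of lists. The sketch below assumes that shape; the variable names and the deliberately missing keys are illustrative only. Pandas takes the column names from the dictionary keys and fills any missing value with NaN.

```python
import pandas as pd

# Hypothetical records; the second one has no "Age", the first has no "City"
records = [
    {"Name": "Alice", "Age": 25},
    {"Name": "Bob", "City": "Los Angeles"},
]

# Column names come from the dictionary keys
df_records = pd.DataFrame(records)
print(df_records)

# Missing entries are stored as NaN
print(df_records.isna())
```

Because the "Age" column now contains a missing value, Pandas stores it as a floating-point column rather than an integer one.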
        

Creating DataFrame from a CSV File

    # Creating a DataFrame from a CSV file
    # (assumes a file named data.csv exists in the working directory)
    df = pd.read_csv("data.csv")
    print(df)
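
If no CSV file is at hand, the same call can be tried against in-memory text. This is a minimal sketch: the CSV content below is made up to match the earlier examples, and `io.StringIO` stands in for a real file.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file on disk
csv_text = """Name,Age,City
Alice,25,New York
Bob,30,Los Angeles
Charlie,35,Chicago
"""

# read_csv accepts a file path or any file-like object
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_csv.shape)   # (number of rows, number of columns)
print(df_csv.dtypes)  # the type inferred for each column
```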
        

Basic DataFrame Operations

Once a DataFrame is created, you can perform various operations on it.

Accessing Columns

    # Accessing a single column
    print(df["Name"])

    # Accessing multiple columns
    print(df[["Name", "City"]])
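
The two forms above return different types, which matters when chaining further operations. A small sketch using the same sample data as earlier:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

names = df["Name"]              # single brackets: a Series
subset = df[["Name", "City"]]   # list of names: a DataFrame
one_col_frame = df[["Name"]]    # list with one name: still a DataFrame

print(type(names).__name__)
print(type(subset).__name__)
print(one_col_frame.shape)
```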
        

Adding a New Column

    # Adding a new column (the list must have one value per row)
    df["Salary"] = [50000, 60000, 70000]
    print(df)
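
New columns can also be computed from existing ones. The sketch below is illustrative: the column names `SalaryK` and `AgeNextYear` are invented for the example. Direct assignment changes the DataFrame in place, while `assign` returns a new DataFrame and leaves the original untouched.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

# Derived column, computed element by element (modifies df in place)
df["SalaryK"] = df["Salary"] // 1000

# assign() returns a new DataFrame; df itself is not changed
with_next = df.assign(AgeNextYear=df["Age"] + 1)

print(df)
print(with_next)
```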
        

Deleting a Column

    # Deleting a column
    df = df.drop("Salary", axis=1)
    print(df)
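
`drop` returns a new DataFrame by default, which is why the example above reassigns the result to `df`. A short sketch of the equivalent `columns=` keyword form, along with `errors="ignore"` for columns that may not be present:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
    "Salary": [50000, 60000, 70000],
})

# columns= is equivalent to passing axis=1
trimmed = df.drop(columns=["Salary"])

# The original DataFrame still has the column
print("Salary" in df.columns)

# errors="ignore" skips names that do not exist instead of raising KeyError
same = trimmed.drop(columns=["Salary"], errors="ignore")
print(list(same.columns))
```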
        

Accessing Rows

    # Accessing a single row by integer position (0-based, so this is the second row)
    print(df.iloc[1])

    # Accessing multiple rows by position (the end position is excluded)
    print(df.iloc[0:2])
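
`iloc` selects by integer position, while `loc` selects by index label. With the default 0, 1, 2 index the two look alike, so the sketch below uses hypothetical string labels ("a", "b", "c") to make the difference visible.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bob", "Charlie"],
        "Age": [25, 30, 35],
    },
    index=["a", "b", "c"],  # hypothetical row labels
)

by_position = df.iloc[1]   # second row, whatever its label
by_label = df.loc["b"]     # row whose label is "b"

pos_slice = df.iloc[0:2]        # positions 0 and 1; the end is excluded
label_slice = df.loc["a":"c"]   # labels "a" through "c"; the end is included

print(by_position)
print(len(pos_slice), len(label_slice))
```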
        

Filtering Data

    # Filtering rows based on a condition
    filtered_df = df[df["Age"] > 25]
    print(filtered_df)
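
Conditions can be combined with `&` (and), `|` (or), and `~` (not). Each condition needs its own parentheses because these operators bind more tightly than comparisons. A minimal sketch on the same sample data, with `query` shown as an alternative spelling:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

# Older than 25 AND not in Chicago
mask = (df["Age"] > 25) & (df["City"] != "Chicago")
result = df[mask]
print(result)

# The same filter written as a query string
queried = df.query("Age > 25 and City != 'Chicago'")
print(queried)
```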
        

Updating Data

    # Updating a single value: row with index label 1, column "City"
    df.loc[1, "City"] = "San Francisco"
    print(df)
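
`loc` also accepts a boolean condition in place of a row label, which updates every matching row at once. In this sketch the replacement value "Remote" is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

# Set City for every row where Age is at least 30
df.loc[df["Age"] >= 30, "City"] = "Remote"
print(df)
```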
        

DataFrame Aggregation and Statistics

You can perform aggregation and statistical operations on DataFrames.

Summary Statistics

    # Summary statistics (by default, only numeric columns are summarized)
    print(df.describe())
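
The result of `describe` is itself a DataFrame, so individual statistics can be read from it with `loc`. The sketch below also shows `include="all"`, which adds the non-numeric columns to the summary.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

stats = df.describe()
print(list(stats.columns))        # only the numeric column appears
print(stats.loc["mean", "Age"])   # one statistic read from the summary

# Individual statistics are also available directly on a column
print(df["Age"].min(), df["Age"].max())

# include="all" summarizes text columns as well
all_stats = df.describe(include="all")
print(all_stats)
```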
        

GroupBy Operations

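    # Grouping data and calculating the mean age per city
    # (select the numeric column first; averaging text columns such as "Name" raises an error in recent Pandas versions)
    grouped = df.groupby("City")["Age"].mean()
    print(grouped)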
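
Grouping is easier to follow when some groups contain more than one row. The sketch below adds a hypothetical fourth person ("Dana") and changes the cities so that each city appears twice, then computes several aggregates at once with `agg`. The output column names `mean_age` and `people` are chosen for the example.

```python
import pandas as pd

# Sample data with repeated cities; "Dana" is a made-up extra row
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Dana"],
    "Age": [25, 30, 35, 45],
    "City": ["New York", "Chicago", "Chicago", "New York"],
})

# Mean of one numeric column per group
mean_age = df.groupby("City")["Age"].mean()
print(mean_age)

# Several named aggregations in one call: output_name=(column, function)
summary = df.groupby("City").agg(
    mean_age=("Age", "mean"),
    people=("Name", "count"),
)
print(summary)
```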
        

Sorting Data

    # Sorting by a column
    sorted_df = df.sort_values("Age")
    print(sorted_df)
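
`sort_values` sorts in ascending order by default and returns a new DataFrame whose rows keep their original index labels. A brief sketch of descending and multi-column sorts, with `reset_index` used to renumber the rows afterwards:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

# Oldest first
by_age_desc = df.sort_values("Age", ascending=False)
print(by_age_desc)

# City ascending, then Age descending within each city
by_city_then_age = df.sort_values(["City", "Age"], ascending=[True, False])
print(by_city_then_age)

# Rows keep their old index labels after sorting; renumber from 0 if needed
tidy = by_age_desc.reset_index(drop=True)
print(tidy)
```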
        

Conclusion

DataFrames are a fundamental feature of Pandas, allowing you to store and manipulate structured data easily. By mastering DataFrame creation and manipulation, you can perform efficient data analysis and preprocessing in Python.


