Learn Pandas DataFrames

A Pandas DataFrame is a two-dimensional labeled data structure, similar to a table or spreadsheet. It consists of rows and columns, and each column can hold a different data type. Pandas provides tools for manipulating and analyzing data with DataFrames. Here's a guide to working with Pandas DataFrames:

Creating DataFrames

You can create DataFrames from various data sources, such as dictionaries, lists, and NumPy arrays, or from external sources like CSV files, Excel files, and SQL databases.

From Dictionary:


          import pandas as pd

          data = {'Name': ['Alice', 'Bob', 'Charlie'],
                  'Age': [25, 30, 35],
                  'City': ['New York', 'Los Angeles', 'Chicago']}
          df = pd.DataFrame(data)

From Lists:


          data = [['Alice', 25, 'New York'],
                  ['Bob', 30, 'Los Angeles'],
                  ['Charlie', 35, 'Chicago']]
          df = pd.DataFrame(data, columns=['Name', 'Age', 'City'])
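The other sources mentioned above follow the same pattern. Here is a minimal sketch for a NumPy array and for CSV data. The Height column and the CSV text are made up for illustration, and io.StringIO stands in for a real file so the example runs on its own; pd.read_csv also accepts a file path.

```python
import io

import numpy as np
import pandas as pd

# From a NumPy array: each row of the array becomes a row of the DataFrame.
# The Height values are invented for this example.
values = np.array([[25, 165], [30, 180], [35, 175]])
df_np = pd.DataFrame(values, columns=['Age', 'Height'])

# From CSV: io.StringIO wraps the text so read_csv can treat it like a file.
csv_text = (
    "Name,Age,City\n"
    "Alice,25,New York\n"
    "Bob,30,Los Angeles\n"
    "Charlie,35,Chicago\n"
)
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_np)
print(df_csv)
```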

Viewing Data

head() and tail(): View the first or last rows of the DataFrame (five rows by default).


          print(df.head())  # View the first few rows
          print(df.tail())  # View the last few rows

info(): Get a concise summary of the DataFrame including column names, data types, and non-null counts.


          df.info()  # info() prints the summary itself and returns None

describe(): Generate descriptive statistics for numerical columns.

          
          print(df.describe())
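The viewing methods above also take optional arguments. The sketch below rebuilds the sample DataFrame, passes a row count to head() and tail(), and uses describe(include='all') to include the text columns in the summary.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 35],
                   'City': ['New York', 'Los Angeles', 'Chicago']})

first_two = df.head(2)   # first two rows instead of the default five
last_one = df.tail(1)    # last row only

numeric_stats = df.describe()           # numerical columns only (here: Age)
all_stats = df.describe(include='all')  # also summarizes Name and City

print(first_two)
print(last_one)
print(all_stats)
```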

Selection and Indexing

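Selecting Columns: You can select a single column using square brackets or dot notation. Dot notation only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute or method. To select several columns, pass a list of names inside the square brackets.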


          print(df['Name'])
          print(df.Name)  # Alternative syntax
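Selecting more than one column works by passing a list of names. A short sketch on the same sample data; note that a single name returns a Series, while a list of names returns a DataFrame.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 35],
                   'City': ['New York', 'Los Angeles', 'Chicago']})

name_series = df['Name']         # single name -> Series
name_frame = df[['Name']]        # list with one name -> one-column DataFrame
subset = df[['Name', 'City']]    # list of names -> DataFrame with those columns

print(subset)
```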

Selecting Rows: Use iloc[] to select rows by integer position and loc[] to select rows by index label.


          print(df.iloc[0])   # Select the first row by integer position
          print(df.loc[0])    # Select the row whose index label is 0
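With the default integer index, iloc[0] and loc[0] return the same row, which hides the difference between them. The sketch below uses a made-up string index ('a', 'b', 'c') to show how the two behave differently, including when slicing.

```python
import pandas as pd

# The index labels 'a', 'b', 'c' are invented for this example.
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 35]},
                  index=['a', 'b', 'c'])

first_by_position = df.iloc[0]   # row at position 0
row_by_label = df.loc['b']       # row whose index label is 'b'

# Position slices exclude the end; label slices include it.
by_position = df.iloc[0:2]       # positions 0 and 1
by_label = df.loc['a':'c']       # labels 'a', 'b', and 'c'

print(first_by_position)
print(row_by_label)
print(by_position)
print(by_label)
```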

Operations on DataFrames

Adding/Removing Columns: You can add or remove columns from a DataFrame.


          df['Gender'] = ['Female', 'Male', 'Male']  # Adding a new column
          df.drop(columns=['Gender'], inplace=True)  # Removing a column
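The example above changes df in place. If you would rather keep the original DataFrame untouched, one option is to work with the copies that assign() and drop() return, as in this sketch.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 35],
                   'City': ['New York', 'Los Angeles', 'Chicago']})

# assign() returns a new DataFrame with the extra column; df is unchanged.
df_with_gender = df.assign(Gender=['Female', 'Male', 'Male'])

# drop() without inplace=True also returns a new DataFrame.
df_without_gender = df_with_gender.drop(columns=['Gender'])

print(df_with_gender.columns.tolist())
print(df_without_gender.columns.tolist())
```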

Filtering Data: You can filter rows based on conditions.


          print(df[df['Age'] > 25])  # Filter rows where Age > 25
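Conditions can also be combined. The sketch below uses & (and) and | (or), with each condition wrapped in parentheses, plus isin() to match against a list of values.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 35],
                   'City': ['New York', 'Los Angeles', 'Chicago']})

# Both conditions must hold. Use & rather than the Python keyword "and".
older_not_chicago = df[(df['Age'] > 25) & (df['City'] != 'Chicago')]

# Either condition may hold. Use | rather than the Python keyword "or".
young_or_chicago = df[(df['Age'] < 30) | (df['City'] == 'Chicago')]

# Rows whose City is one of the listed values.
in_cities = df[df['City'].isin(['New York', 'Chicago'])]

print(older_not_chicago)
print(young_or_chicago)
print(in_cities)
```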

Grouping and Aggregating: You can group data using groupby() and perform aggregations like sum(), mean(), count(), etc.

          
          print(df.groupby('City')['Age'].mean())  # Mean Age for each city (Name is text, so select the numeric column first)
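In the sample data every city appears only once, so each group has a single row. The sketch below uses a slightly larger, made-up dataset with repeated cities and applies several aggregations at once with agg().

```python
import pandas as pd

# Invented data with repeated cities so that each group has two rows.
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'Dana'],
                   'Age': [25, 30, 35, 45],
                   'City': ['New York', 'Chicago', 'Chicago', 'New York']})

# Select the numeric column, then compute several statistics per city.
summary = df.groupby('City')['Age'].agg(['mean', 'count', 'max'])

print(summary)
```

The result has one row per city and one column per aggregation.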
