🐼 Pandas DataFrames
A DataFrame is the primary two-dimensional, tabular data structure in the pandas Python library. It stores data in labeled rows and columns, where each column can hold a different type, and lets you filter, aggregate, and join that data efficiently.
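To make "two-dimensional and tabular" concrete, here is a minimal sketch. The data and the row labels ("r1", "r2", "r3") are made up for illustration and are not part of the main example.

```python
import pandas as pd

# Hypothetical sample data, chosen only to illustrate the structure.
df = pd.DataFrame(
    {"name": ["a", "b", "c"], "score": [90, 75, 60]},
    index=["r1", "r2", "r3"],
)

shape = df.shape            # (number of rows, number of columns)
columns = list(df.columns)  # column labels
row = df.loc["r2"]          # one row, returned as a Series
column = df["score"]        # one column, returned as a Series

print(shape)
print(columns)
print(row["score"], column.mean())
```

Rows are addressed by index label with .loc, and columns by name with square brackets. Either way you get back a one-dimensional Series.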
Mastering this concept will significantly boost your Python data science skills!
💻 Code Example:
import pandas as pd

# 1. Create DataFrame from dict
data = {
    "username": ["santoshtvk", "dhruv", "tvk", "alice", "bob"],
    "course": ["Python", "AI", "Python", "DevOps", "AI"],
    "score": [95, 88, 72, 65, 91],
    "premium": [True, True, False, False, True],
    "joined": pd.to_datetime(
        ["2024-01-15", "2024-02-20", "2024-03-01", "2024-03-15", "2024-04-01"]
    ),
}
df = pd.DataFrame(data)

# 2. Basic exploration
print("Shape:", df.shape)
print("\nInfo:")
df.info()  # prints directly; returns None
print("\nDescribe:\n", df.describe())

# 3. Selection & filtering (boolean mask for rows, list of labels for columns)
print("\nPremium users:\n", df.loc[df["premium"], ["username", "score"]])
print("\nScore > 80:\n", df.loc[df["score"] > 80, ["username", "course", "score"]])

# 4. GroupBy aggregation (named aggregation: output_column=(input_column, function))
stats = df.groupby("course").agg(
    count=("username", "count"),
    avg_score=("score", "mean"),
    max_score=("score", "max"),
).round(1)
print("\nCourse stats:\n", stats)

# 5. New columns
df["grade"] = pd.cut(df["score"], bins=[0, 60, 75, 90, 100], labels=["C", "B", "A", "A+"])
df["days_active"] = (pd.Timestamp.now() - df["joined"]).dt.days

# 6. Merge (join): a left join keeps every row of df
plans = pd.DataFrame({
    "username": ["santoshtvk", "dhruv"],
    "plan": ["Annual", "Monthly"],
})
merged = df.merge(plans, on="username", how="left")
print("\nMerged sample:\n", merged[["username", "score", "grade", "plan"]].head())

# 7. Pivot table
pivot = df.pivot_table(
    values="score", index="course", columns="premium", aggfunc="mean"
).round(1)
print("\nPivot (score by course x premium):\n", pivot)
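Two details in steps 5 and 6 above are easy to miss: pd.cut treats bins as right-inclusive by default, and a left merge leaves NaN where no match exists. The sketch below illustrates both. The edge-case scores and the trimmed users table are assumptions made up for this illustration.

```python
import pandas as pd

bins = [0, 60, 75, 90, 100]
labels = ["C", "B", "A", "A+"]

# Hypothetical scores chosen to sit on the bin edges.
scores = pd.Series([0, 60, 61, 100])

# Default bins are (0, 60], (60, 75], (75, 90], (90, 100],
# so a score of exactly 0 falls outside every bin and becomes NaN.
grades = pd.cut(scores, bins=bins, labels=labels)

# include_lowest=True closes the first bin on the left as well.
grades_inclusive = pd.cut(scores, bins=bins, labels=labels, include_lowest=True)

# Trimmed-down users table (assumption: only the join key matters here).
users = pd.DataFrame({"username": ["santoshtvk", "dhruv", "tvk"]})
plans = pd.DataFrame({
    "username": ["santoshtvk", "dhruv"],
    "plan": ["Annual", "Monthly"],
})

# indicator=True adds a "_merge" column recording where each row came from.
merged = users.merge(plans, on="username", how="left", indicator=True)
unmatched = merged[merged["_merge"] == "left_only"]

print(grades.tolist())
print(grades_inclusive.tolist())
print(unmatched["username"].tolist())
```

If a score of 0 is possible in your data, passing include_lowest=True is one way to avoid silently missing grades. The indicator column is a quick way to check how many rows found no matching plan.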
Keep exploring and happy coding! 💻