⏳ Time Series Resampling
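Resampling changes the frequency of a time series. Downsampling (e.g., daily to monthly) aggregates many observations into one, while upsampling (e.g., daily to hourly) creates new time points that must be filled or interpolated.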
It pairs naturally with rolling windows, lag features, and expanding statistics, all of which appear in the example below.
💻 Code Example:
import pandas as pd
import numpy as np

np.random.seed(42)

# 1. Create daily pynfinity user activity series (1 year)
dates = pd.date_range(start="2024-01-01", periods=365, freq="D")
df = pd.DataFrame({
    "sessions": np.random.poisson(lam=120, size=365),
    "signups": np.random.poisson(lam=5, size=365),
    "revenue": np.round(np.random.exponential(scale=500, size=365), 2),
}, index=dates)
print("Daily data (first 5 rows):")
print(df.head())

# 2. Weekly resampling: totals per week ("W" bins end on Sunday)
weekly = df.resample("W").agg({
    "sessions": "sum",
    "signups": "sum",
    "revenue": "sum",
})
print("\nWeekly totals (last 4 weeks):")
print(weekly.tail(4))

# 3. Monthly average revenue
# "ME" (month end) needs pandas 2.2+; older versions use "M"
monthly_avg = df["revenue"].resample("ME").mean().round(2)
print("\nMonthly avg revenue:")
print(monthly_avg)

# 4. Rolling 7-day average (smoothing); the first 6 rows are NaN
df["sessions_7d_avg"] = df["sessions"].rolling(window=7).mean().round(1)

# 5. Lag features (yesterday vs today growth)
df["sessions_prev_day"] = df["sessions"].shift(1)
df["day_over_day_pct"] = (
    (df["sessions"] - df["sessions_prev_day"]) / df["sessions_prev_day"] * 100
).round(2)
print("\nWith rolling avg & lag:")
print(df[["sessions", "sessions_7d_avg", "day_over_day_pct"]].tail(5))

# 6. Expanding cumulative stats
df["cumulative_revenue"] = df["revenue"].expanding().sum().round(2)
print(f"\nTotal revenue for the year: ₹{df['cumulative_revenue'].iloc[-1]:,.2f}")
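🔍 Quick sanity check: random data makes it hard to tell whether the bins are doing what you expect. Here is a minimal sketch on a tiny series you can verify by hand. The toy values (1 to 14) and the variable names are assumptions made up for illustration, not part of the example above.

```python
import pandas as pd

# Hypothetical toy data: 14 daily values (1..14), starting Monday 2024-01-01
dates = pd.date_range(start="2024-01-01", periods=14, freq="D")
sessions = pd.Series(range(1, 15), index=dates, name="sessions")

# Downsampling: "W" bins end on Sunday, so each bin here covers Monday to Sunday
weekly_total = sessions.resample("W").sum()
print(weekly_total)

# Rolling mean: by default a 7-day window yields NaN until 7 values are available
rolling_strict = sessions.rolling(window=7).mean()

# min_periods=1 averages over whatever is available instead of returning NaN
rolling_partial = sessions.rolling(window=7, min_periods=1).mean()
print(pd.DataFrame({"strict": rolling_strict, "partial": rolling_partial}).head(8))

# Upsampling: daily to 12-hourly; the new time points are forward-filled
twice_daily = sessions.resample("12h").ffill()
print(twice_daily.head(4))
```

Whether to keep the leading NaN values or use min_periods depends on the analysis: partial windows give a full-length series, but the first few averages are based on fewer days.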
Keep exploring and happy coding! 💻