All roadmaps
PandasEasy → Hard

Pandas Data Wrangling

The essential Pandas skills for DE and analytics engineering interviews — filtering, grouping, merging, pivoting, time-series, and window functions.

25 problems~10h
1
Filter and Sort Customersbasicpandas

Boolean indexing + sort_values() — the two most fundamental Pandas operations. Start every analysis here.

2
Page With No Likesbasicpandas

merge() with how="left" + isnull() filter — Pandas left join to find rows with no match in another table.

3
Unfinished Partsbasicpandas

Boolean filter on a categorical column — practice filtering on string equality and multiple conditions.

4
Laptop vs Mobile Viewershipbasicpandas

value_counts() — counting distinct categories in a column, the quickest way to profile a column.

5
Histogram of Tweetsbasicpandas

groupby() + count() + rename() — aggregation pipeline to build a frequency distribution.

6
Pharmacy Analyticsbasicpandas

groupby() + agg with multiple functions + sort — combining sum/count aggregations in one groupby call.

7
Average Post Hiatuseasypandas

groupby() + max/min on dates + timedelta arithmetic — date column operations and per-group date range.

8
Teams Power Userseasypandas

groupby() + filter by date + count + sort + head() — multi-step chained transformation pipeline.

9
Cities With Completed Tradeseasypandas

merge() + groupby() + sort — joining two DataFrames then aggregating on the joined result.

10
Data Science Skillseasypandas

groupby() + filter with transform — filtering groups based on aggregate conditions using transform().

11
Compressed Modeeasypandas

groupby() + idxmax() or nlargest — finding the mode in a frequency-encoded dataset.

12
Second Day Confirmationmediumpandas

merge() on two date columns + timedelta comparison — joining on computed date difference.

13
Signup Activation Ratemediumpandas

merge() + isnull().mean() — ratio calculation after a left join. Funnel metric pattern.

14
International Calls Percentagemediumpandas

merge() + boolean condition on joined columns + mean() — join-then-filter-then-aggregate pipeline.

15
Sending vs. Opening Snapsmediumpandas

pivot or conditional groupby + percentage calculation — reshaping data to compare two metrics side by side.

16
Supercloud Customermediumpandas

groupby() + nunique() + filter — counting distinct values per group, then filtering on that count.

17
Odd and Even Measurementsmediumpandas

groupby() + conditional sum using np.where — splitting one column into two based on row parity.

18
Highest-Grossing Itemsmediumpandas

groupby() + rank(method="dense") — top-N per group using dense rank, a key interview pattern.

19
Top 5 Artistshardpandas

merge() + groupby() + rank() + filter — multi-step pipeline combining join, aggregation, and ranking.

20
Users' Third Transactionhardpandas

groupby() + cumcount() to assign row numbers per group + filter on row 3 — ranked row selection pattern.

21
Repeated Paymentshardpandas

sort_values() + shift() per group + timedelta comparison — detecting near-duplicate rows within a time window.

22
Rolling Average Tweetshardpandas

rolling(window, min_periods) + mean() — time-series rolling window calculation, a key analytics pattern.

23
Advertiser Statushardpandas

merge() + np.select() for multi-condition column assignment — complex conditional logic across joined tables.

24
Server Utilization Timehardpandas

pivot + date arithmetic + sum — computing uptime from start/stop events, a real DE infrastructure metric.

25
Maximize Prime Inventoryhardpandas

groupby() + cumsum() + conditional allocation — multi-constraint optimization over grouped data.