Joins & GROUP BY Mastery
The two most-tested SQL concepts in every DE interview. Master every join type and every aggregation pattern — from simple counts to multi-table funnel analysis.
INNER JOIN basics — match rows from two tables on a foreign key. The join you will write every single day.
GROUP BY + COUNT(*) — the most fundamental aggregation. Know this before anything else.
LEFT JOIN + WHERE IS NULL — find rows with no match. Three equivalent approaches: LEFT JOIN, NOT IN, NOT EXISTS.
GROUP BY + HAVING COUNT > 1 — filter after aggregation. HAVING is WHERE for groups.
GROUP BY on a date column — grouping by month/year is in every reporting query you will write.
GROUP BY + SUM() — aggregating a numeric column per group. Foundation for all revenue reporting.
LEFT JOIN to preserve all rows from the left table — when you need every person even without an address.
LEFT JOIN + IS NULL vs NOT IN — classic "find what is missing" pattern asked at every company.
Self-join — joining a table to itself on a relationship column. Standard for hierarchy problems.
GROUP BY + HAVING with a threshold — practice HAVING with COUNT to filter small groups.
GROUP BY + HAVING COUNT > 1 on a string column — deduplication query pattern for data quality checks.
GROUP BY + HAVING COUNT(DISTINCT) = total categories — relational division, a tough but common pattern.
GROUP BY date truncation + AVG() — monthly bucketing with date functions alongside aggregation.
Conditional aggregation: SUM(CASE WHEN ...) — pivot-style counts/sums in a single GROUP BY query.
Two-table ratio: COUNT from join / COUNT from base — funnel and conversion rate calculation pattern.
JOIN to a subquery returning MAX per group — the classic correlated subquery vs window function trade-off.
Self-join + GROUP BY + HAVING — joining the same table to aggregate subordinates per manager.
Multi-table JOIN + GROUP BY + multiple aggregations — the most common DE business metric query.
Self-join on order_id + GROUP BY pair + HAVING — market basket analysis with a self-join.
Self-join + GROUP BY + COUNT — finding co-occurrence patterns. Harder variant of products-bought-together.
Multi-join + date filter + conditional aggregation in one query — the classic hard Uber/Lyft problem.
Self-join on user_id across two consecutive months — retention analysis without window functions.
Multiple LEFT JOINs + conditional COUNT — building a funnel from raw event data, asked at every product company.
Self-join + GROUP BY + HAVING + percentage — full association rule mining with support and confidence.
Recursive CTE — self-referential join to traverse a tree. The advanced join pattern asked at senior-level rounds.