Introduction
SQL can look deceptively simple, but when you’re dealing with complex analytics, query optimization, or advanced aggregations, things get tricky fast. If you’ve ever wondered the difference between GROUP BY and PARTITION BY, or what exactly goes into the OVER() clause, you’re not alone.
This post unpacks 10 most confusing SQL queries—with real-life use cases, simplified explanations, and tips to write better SQL.
1. GROUP BY vs PARTITION BY
GROUP BY groups rows into a single result per group, reducing the number of output rows.
SELECT department, COUNT(*)
FROM employees
GROUP BY department;
This returns one row per department.
PARTITION BY does not reduce rows; instead, it creates a window (group) across rows while keeping each row visible.
SELECT name, department, COUNT(*) OVER (PARTITION BY department) as dept_count FROM employees;
Each row still appears, but now we know how many people are in their department.
Think of GROUP BY as summarizing data, while PARTITION BY adds context without hiding the details.
2. What Goes Inside OVER() Clause?
The OVER() clause is part of window functions like ROW_NUMBER(), RANK(), SUM(), AVG() etc.
Inside OVER() you can specify:
PARTITION BY– splits data into groupsORDER BY– defines the order inside the partitionROWS BETWEEN– defines the window range (optional)
Example:
SELECT name, department, RANK() OVER (PARTITION BY department ORDER BY salary DESC) as rank_in_dept FROM employees;
This ranks employees within each department by salary.
Tip: If you leave OVER() empty, it treats all rows as one big group.
3. Using CASE with Aggregation
You can use CASE inside SUM(), COUNT() and other aggregations to do conditional aggregation.
Example: Count how many employees are in each department, split by gender.
SELECT department, COUNT(CASE WHEN gender = 'M' THEN 1 END) as male_count, COUNT(CASE WHEN gender = 'F' THEN 1 END) as female_count FROM employees GROUP BY department;
Here, CASE selectively counts rows based on a condition.
Also works with SUM:
SUM(CASE WHEN status = 'Active' THEN salary ELSE 0 END)
4. Why Use Common Table Expressions (CTEs)
Instead of writing long nested queries, you can simplify SQL using CTEs with WITH.
WITH active_employees AS ( SELECT * FROM employees WHERE status = 'Active' ) SELECT department, COUNT(*) FROM active_employees GROUP BY department;
CTEs make your queries readable, easy to debug, and reusable.
5. How to Simplify Complex SQL Queries
- Break into CTEs
- Avoid subqueries unless necessary
- Use meaningful aliases
- Keep only needed columns
- Use indentation and formatting
Example:
Instead of this unreadable mess:
SELECT d.name, COUNT(*)
FROM departments d
JOIN employees e
ON d.id = e.department_id
WHERE e.status = 'Active'
GROUP BY d.name;
Write:
WITH active_employees AS ( SELECT * FROM employees WHERE status = 'Active' ) SELECT d.name, COUNT(*) FROM departments d JOIN active_employees e ON d.id = e.department_id GROUP BY d.name;
6. What is ROW_NUMBER vs RANK vs DENSE_RANK?
All return positions in a list but behave differently when ties happen:
- ROW_NUMBER(): No ties – numbers are always 1, 2, 3…
- RANK(): Leaves gaps for ties
- DENSE_RANK(): No gaps – ties share rank, next one is immediate
7. Aggregating Over Partitions
You can aggregate without GROUP BY using SUM(...) OVER (...).
Example: Cumulative sales per month:
SELECT month,
SUM(sales) OVER (ORDER BY month) as running_total
FROM sales_data;
You get a running total while keeping individual months visible.
8. COALESCE vs CASE WHEN
Both handle nulls, but COALESCE is cleaner for fallback values:
COALESCE(phone_number, 'N/A')
This is like saying “If phone_number is NULL, show N/A”.
CASE is more flexible when you have multiple conditions:
CASE WHEN status = 'Active' THEN 'Working' WHEN status IS NULL THEN 'Unknown' ELSE 'Inactive' END
9. Filtering on Aggregated Values: HAVING
You can’t use WHERE after an aggregation; use HAVING.
SELECT department, COUNT(*) as total FROM employees GROUP BY department HAVING COUNT(*) > 5;
HAVING filters after GROUP BY is done.
10. When to Use EXISTS vs IN vs JOIN
- EXISTS: Fast when checking if rows exist
- IN: Good for small fixed lists
- JOIN: Best when you need columns from both tables
EXISTS is generally faster for large subqueries:
SELECT name FROM employees e WHERE EXISTS ( SELECT 1 FROM projects p WHERE p.employee_id = e.id );
Conclusion
SQL is powerful, but the syntax can get confusing, especially when working with window functions, aggregates, or multi-step logic. If you remember:
GROUP BYsummarizesPARTITION BYadds contextCASEmakes conditional logic easyCTEssimplify everything
… then you’ll be writing cleaner, more efficient queries in no time.






