The Most Confusing SQL Queries Made Easy (Beginner to Pro)

Introduction

SQL can look deceptively simple, but when you’re dealing with complex analytics, query optimization, or advanced aggregations, things get tricky fast. If you’ve ever wondered the difference between GROUP BY and PARTITION BY, or what exactly goes into the OVER() clause, you’re not alone.

This post unpacks 10 most confusing SQL queries—with real-life use cases, simplified explanations, and tips to write better SQL.

1. GROUP BY vs PARTITION BY

GROUP BY groups rows into a single result per group, reducing the number of output rows.

SELECT department, COUNT(*)
FROM employees
GROUP BY department;

This returns one row per department.

PARTITION BY does not reduce rows; instead, it creates a window (group) across rows while keeping each row visible.

SELECT name, department,
       COUNT(*) OVER (PARTITION BY department) as dept_count
FROM employees;

Each row still appears, but now we know how many people are in their department.

Think of GROUP BY as summarizing data, while PARTITION BY adds context without hiding the details.

2. What Goes Inside OVER() Clause?

The OVER() clause is part of window functions like ROW_NUMBER(), RANK(), SUM(), AVG() etc.

Inside OVER() you can specify:

PARTITION BY – splits data into groups
ORDER BY – defines the order inside the partition
ROWS BETWEEN – defines the window range (optional)

Example:

SELECT name, department,
       RANK() OVER (PARTITION BY department ORDER BY salary DESC) as rank_in_dept
FROM employees;

This ranks employees within each department by salary.

Tip: If you leave OVER() empty, it treats all rows as one big group.

Top UDEMY Courses

3. Using CASE with Aggregation

You can use CASE inside SUM(), COUNT() and other aggregations to do conditional aggregation.

Example: Count how many employees are in each department, split by gender.

SELECT department,
       COUNT(CASE WHEN gender = 'M' THEN 1 END) as male_count,
       COUNT(CASE WHEN gender = 'F' THEN 1 END) as female_count
FROM employees
GROUP BY department;

Here, CASE selectively counts rows based on a condition.

Also works with SUM:

SUM(CASE WHEN status = 'Active' THEN salary ELSE 0 END)

4. Why Use Common Table Expressions (CTEs)

Instead of writing long nested queries, you can simplify SQL using CTEs with WITH.

WITH active_employees AS (
  SELECT * FROM employees WHERE status = 'Active'
)
SELECT department, COUNT(*) FROM active_employees GROUP BY department;

CTEs make your queries readable, easy to debug, and reusable.

5. How to Simplify Complex SQL Queries

Break into CTEs
Avoid subqueries unless necessary
Use meaningful aliases
Keep only needed columns
Use indentation and formatting

Example:

Instead of this unreadable mess:

SELECT d.name, COUNT(*) 
FROM departments d 
JOIN employees e 
ON d.id = e.department_id 
WHERE e.status = 'Active' 
GROUP BY d.name;

Write:

WITH active_employees AS (
  SELECT * FROM employees WHERE status = 'Active'
)
SELECT d.name, COUNT(*) 
FROM departments d
JOIN active_employees e ON d.id = e.department_id
GROUP BY d.name;

AMAZON BEST SELLERS

6. What is ROW_NUMBER vs RANK vs DENSE_RANK?

All return positions in a list but behave differently when ties happen:

ROW_NUMBER(): No ties – numbers are always 1, 2, 3…
RANK(): Leaves gaps for ties
DENSE_RANK(): No gaps – ties share rank, next one is immediate

7. Aggregating Over Partitions

You can aggregate without GROUP BY using SUM(...) OVER (...).

Example: Cumulative sales per month:

SELECT month, 
SUM(sales) OVER (ORDER BY month) as running_total
FROM sales_data;

You get a running total while keeping individual months visible.

8. COALESCE vs CASE WHEN

Both handle nulls, but COALESCE is cleaner for fallback values:

COALESCE(phone_number, 'N/A')

This is like saying “If phone_number is NULL, show N/A”.

CASE is more flexible when you have multiple conditions:

CASE 
  WHEN status = 'Active' THEN 'Working'
  WHEN status IS NULL THEN 'Unknown'
  ELSE 'Inactive'
END

9. Filtering on Aggregated Values: HAVING

You can’t use WHERE after an aggregation; use HAVING.

SELECT department, COUNT(*) as total
FROM employees
GROUP BY department
HAVING COUNT(*) > 5;

HAVING filters after GROUP BY is done.

10. When to Use EXISTS vs IN vs JOIN

EXISTS: Fast when checking if rows exist
IN: Good for small fixed lists
JOIN: Best when you need columns from both tables

EXISTS is generally faster for large subqueries:

SELECT name FROM employees e
WHERE EXISTS (
  SELECT 1 FROM projects p WHERE p.employee_id = e.id
);

Conclusion

SQL is powerful, but the syntax can get confusing, especially when working with window functions, aggregates, or multi-step logic. If you remember:

GROUP BY summarizes
PARTITION BY adds context
CASE makes conditional logic easy
CTEs simplify everything

… then you’ll be writing cleaner, more efficient queries in no time.

Srini

Data Engineer with deep AI and Generative AI expertise, crafting high-performance data pipelines in PySpark, Databricks, and SQL. Skilled in Python, AWS, and Linux—building scalable, cloud-native solutions for smart applications.