Here is the right answer to differentiate GROUP BY and PARTITION BY in SQL. They serve different purposes and are used in other contexts. Here’s a detailed comparison.

group by vs partition by

GROUP BY

Purpose: Used to group rows that have the same values in specified columns into summary rows, often used with aggregate functions like COUNT(), SUM(), AVG(), etc.

  • Context: Works with aggregate functions to summarize data by groups.
  • Output: Returns one row per group.

Example:

If you want to calculate the total salary per department:

SELECT deptno, SUM(sal) AS total_salary
FROM emp_copy
GROUP BY deptno;

Explanation: Group all rows by deptno and calculate each department’s total salary.

PARTITION BY

  • Purpose: Used within window functions to divide the result set into partitions (subgroups) to perform calculations across these partitions.
  • Context: Used inside window functions like ROW_NUMBER(), RANK(), SUM(), etc., but does not reduce the number of rows; instead, it allows you to calculate values over partitions of the data.
  • Output: Keeps all rows in the result set but performs the calculation within each partition.

Example:

To rank salaries within each department:

SELECT sal, deptno, 
ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY sal DESC) AS rank
FROM emp_copy;

Explanation: Partitions the data by deptno, and then within each partition, it orders the rows by sal in descending order and assigns a rank.

Key Differences:

  • GROUP BY: Aggregates data, reducing the number of rows.
  • PARTITION BY: Partitions data without reducing the number of rows, allowing window functions to operate within those partitions.

References