DB2 SQL Query Performance Tuning (1 of 3)

You can improve SQL query performance, if you use GROUP BY, IN, or EXISTS clauses properly. I am sharing the top ideas on how to use in this post.


If you run an SQL query, it can consume less or more time. It all depends on how you have written the SQL. To improve performance, you need to understand the following ideas.

1. Use Always Group by Clause

SQL Query DISTINCT vs GROUP BY

SELECT DISTINCT E.EMPNO, E.LASTNAME 
FROM EMP E,EMPPROJACT EP
WHERE E.EMPNO = EP.EMPNO

Query with GROUP BY

SELECT E.EMPNO, E.LASTNAME 
FROM EMP E,EMPPROJACT EP
WHERE E.EMPNO = EP.EMPNO
GROUP BY E.EMPNO, E.LASTNAME

Group by eliminates unnecessary sort and so improves query performance.

SQL tuning tutorial
SQL tuning tutorial

2. Use EXISTS Clause instead of IN

Query with ‘IN’

SELECT E.EMPNO, E.LASTNAME FROM EMP E 
WHERE E.EMPNO IN
(SELECT D.MGRNO
FROM DEPARTMENT D
WHERE D.DEPTNO LIKE 'D%')

Query with EXISTS

SELECT E.EMPNO, E.LASTNAME 
FROM EMP E
WHERE EXISTS
(SELECT 1
FROM DEPARTMENT D
WHERE D.MGRNO = E.EMPNO
AND D.DEPTNO LIKE 'D%')

EXISTS is much better than IN. So always use EXISTS.

3. Avoid Duplicating in Where Predicate

SELECT DEPTNO, DEPTNAME 
FROM DEPT WHERE (ADMRDEPT = 'E01'
AND DEPTNAME LIKE 'BRANCH%')
OR (DEPTNO = 'D01' AND DEPTNAME LIKE 'BRANCH%')

In the above query the LIKE statement is having duplicate. That you can avoid it.

SELECT DEPTNO, DEPTNAME 
FROM DEPT WHERE (ADMRDEPT = 'E01' OR DEPTNO = 'D01')
AND DEPTNAME LIKE 'BRANCH%'

Also Read

Author: Srini

Experienced Data Engineer, having skills in PySpark, Databricks, Python SQL, AWS, Linux, and Mainframe