12 Top SQL Joins Interview Questions

Joining more than one table need special knowledge of Joins. These are standard SQL JOIN interview questions to refresh before next interview.

Join methods

SQL interview questions on joins

1. What are the different types of joins available to users?

The user may specify INNER join, LEFT OUTER join, RIGHT OUTER join, or FULL OUTER join. The result of joining tables is an intermediate result table.

2. What is the difference between an INNER join and an OUTER join?

The INNER join combines each row of one table (left) with every row of the right side table (right). Only when the join condition is satisfied in both tables are the rows made part of the answer set. 

3. How does the RDBMS decide which rows to include in the answer set with an OUTER join?

The left outer join includes those rows from the left table even if there is no matching join row in the right table. The right outer join keeps rows from the right table even if no matching join row is in the left table. The full-outer join includes those rows from both the left table and the right table even if no matching join row is in the other table.

4. What is the most efficient type of join?

In practice, all joins are inefficient. Only if needed, you need to use Joins.


5. Must two different tables be used to perform a join?

It is not necessary. You can use the same table to join (self join interview questions). That is called self-join.


6. Which join methods does DB2 use to achieve a join?

There are three methods – nested loop join, a merge scan join, and a hybrid join. The DB2 optimizer chooses the method it calculates to be the most efficient one to perform the join (A multiple table join can involve more than one join method.) 

7. What is the nested loop join method?

A nested loop is the most common method of performing a join. The first or outer composite table is searched. For each row that satisfies any non-join (local) predicates in the WHERE clause conditions, the RDBMS will search for matching rows in the inner or new table.

When a match occurs, the columns requested from the two tables are concatenated together and form the current row of the composite table.

Nested loop join
The nested loop join

When the join is an inner and no inner rows match the outer row being checked, the outer row is discarded.

When the join is an outer and no inner rows match the outer row being checked, a row of null values is concatenated to the current row. Note that the optimizer decides which join method should be used.

8. When is it best to use a nested loop join?

The nested loop is most efficient when an index on the inner table is clustered so that the values of the join column are in sequence.

It is also efficient when the number of rows retrieved from the inner table (via an index) is small. The optimizer also chooses this method when the outer table is small or the local predicates (conditions) significantly reduce the number of qualifying rows in the outer table.

9. What is a merge scan join, and how is it processed?

The merge scan join is similar to any merge. Both tables must have an index on the join column(s) or be sorted in a join column(s) sequence.

A row of the outer table is read, and the sorted inner table is searched for a matching row. If a match occurs, the concatenated row is built.

Merge scan
Merge scan

The scan of the inner table continues as long as there is a match. When no additional match is found, another row of the outer table is read. If that row has the same value as the inner row, the merge continues as previously for all duplicate records in the inner table. If the value in the outer row is new, the inner table is searched to find a new matching value or a lower value. The process then starts again.

10. When is the merge scan join method used?

This type of scan is often used when the optimizer calculates that the join is a many-to-many join, the tables are large, there is no matching index, and/or the join predicates do not provide much filtering.

11. What is a hybrid join?

It is used when the inner table index is non-clustered, and many duplicate rows in the outer table match the condition criteria.

The outer table is scanned to extract all the rows that satisfy any local predicates. The rows that match are placed on a temporary table. An index on the inner table is then searched for matching keys, and the row address on relation IDs (RIDs) of the matching keys are concatenated to the rows of the temporary output table.

Hybrid join
Hybrid join

The temporary output table is then sorted in a RID sequence, and a PREFETCH is used to sequentially read the inner table and finish building the output row by concatenating the data from the inner- table.

12. What is a cartesian join?

Every row of the inner table is concatenated to each row of the outer table because no join predicates exist.

Related posts

Author: Srini

Experienced software developer. Skills in Development, Coding, Testing and Debugging. Good Data analytic skills (Data Warehousing and BI). Also skills in Mainframe.