Here are the SQL and Python(Pandas) interview questions recently asked in an interview.

SQL interview Questions

1️⃣ Top 5 Customers with Highest Total Order Amount

Question:
Find the top 5 customers with the highest total order amount. Include the total sales for each customer and rank them using row numbers.

Table Details

Customers Table

ColumnTypeDescription
CustomerIDINTUnique identifier for the customer
CustomerNameNVARCHARName of the customer

Orders Table

ColumnTypeDescription
OrderIDINTUnique identifier for the order
CustomerIDINTForeign key referencing Customers
OrderDateDATETIMEDate and time when order was placed

OrderDetails Table

ColumnTypeDescription
OrderDetailIDINTUnique identifier for order detail
OrderIDINTForeign key referencing Orders
ProductIDINTForeign key referencing Products
QuantityINTQuantity of product ordered
UnitPriceDECIMALUnit price of product

Products Table

ColumnTypeDescription
ProductIDINTUnique identifier for the product
ProductNameNVARCHARName of the product
UnitPriceDECIMALUnit price of the product

Solution

WITH customer_totals AS (
    SELECT 
        c.CustomerID,
        c.CustomerName,
        SUM(od.Quantity * od.UnitPrice) AS TotalAmount
    FROM Customers c
    JOIN Orders o ON c.CustomerID = o.CustomerID
    JOIN OrderDetails od ON o.OrderID = od.OrderID
    GROUP BY c.CustomerID, c.CustomerName
),
ranked_customers AS (
    SELECT 
        CustomerID,
        CustomerName,
        TotalAmount,
        ROW_NUMBER() OVER (ORDER BY TotalAmount DESC) AS row_num
    FROM customer_totals
)
SELECT *
FROM ranked_customers
WHERE row_num <= 5;

2️⃣ List All Customers with Total Orders and Average Order Amount

Question:
List all customers along with their total number of orders and average order amount. Include customers with zero orders.

WITH order_totals AS (
    SELECT 
        o.OrderID,
        o.CustomerID,
        SUM(od.Quantity * od.UnitPrice) AS OrderTotal
    FROM Orders o
    JOIN OrderDetails od ON o.OrderID = od.OrderID
    GROUP BY o.OrderID, o.CustomerID
)
SELECT 
    c.CustomerID,
    c.CustomerName,
    COUNT(DISTINCT ot.OrderID) AS TotalOrders,
    AVG(ot.OrderTotal) AS AvgOrderAmount
FROM Customers c
LEFT JOIN order_totals ot ON c.CustomerID = ot.CustomerID
GROUP BY c.CustomerID, c.CustomerName;

3️⃣ Sales by Month for a Specific Product in 2023

Question:
Retrieve the total sales by month for a specific product in the year 2023.

WITH product_sales AS (
    SELECT 
        od.ProductID,
        DATE_FORMAT(o.OrderDate, '%Y-%m') AS Month,
        SUM(od.Quantity * od.UnitPrice) AS TotalSales
    FROM OrderDetails od
    JOIN Orders o ON od.OrderID = o.OrderID
    WHERE od.ProductID = 123  -- Replace with your product ID
      AND YEAR(o.OrderDate) = 2023
    GROUP BY od.ProductID, DATE_FORMAT(o.OrderDate, '%Y-%m')
)
SELECT *
FROM product_sales
ORDER BY Month;

4️⃣ Customers Who Have Not Placed Any Orders in the Last 6 Months

Question:
Retrieve a list of customers who have not placed any orders in the last 6 months.

WITH recent_orders AS (
    SELECT CustomerID
    FROM Orders
    WHERE OrderDate >= DATE_SUB(CURDATE(), INTERVAL 6 MONTH)
)
SELECT c.CustomerID, c.CustomerName
FROM Customers c
LEFT JOIN recent_orders ro 
ON c.CustomerID = ro.CustomerID
WHERE ro.CustomerID IS NULL;

Pandas interview questions

data = [
    (1, "USA", "A", 200),
    (2, "Canada", "B", 300),
    (3, "India", "C", 50),
    (4, "USA", "D", 5000),
    (5, "India", "A", 800),
    (6, "Canada", "B", 9000),
    (7, "USA", "C", 2500),
    (8, "India", "D", 12000),
    (9, "Canada", "A", 500),
    (10, "USA", "B", 6000)
]

import pandas as pd

df = pd.DataFrame(data, columns=["ID", "Country", "Product", "Sales"])

1️⃣ Filter records where sales ≥ 100

Question:
Filter out the records where the sales amount is less than 100.

Solution:

# Filter sales >= 100
df_filtered = df[df["Sales"] >= 100]
print(df_filtered)

Output:

   ID Country Product  Sales
0   1     USA       A    200
1   2  Canada       B    300
3   4     USA       D   5000
4   5   India       A    800
5   6  Canada       B   9000
6   7     USA       C   2500
7   8   India       D  12000
8   9  Canada       A    500
9  10     USA       B   6000

2️⃣ Group by country and calculate total sales

Question:
Group the sales by country and calculate the total sales amount for each country.

Solution:

df_grouped = df_filtered.groupby("Country", as_index=False)["Sales"].sum()
df_grouped.rename(columns={"Sales": "Total_Sales"}, inplace=True)
print(df_grouped)

Output:

  Country  Total_Sales
0  Canada         9800
1   India        12800
2     USA        13700

3️⃣ Add a new column categorizing total sales

Question:
Add a new column that categorizes total sales into "High", "Medium", and "Low":

  • "High" if total sales > 10,000
  • "Medium" if total sales > 5,000 and ≤ 10,000
  • "Low" if total sales ≤ 5,000

Solution:

def categorize(sales):
    if sales > 10000:
        return "High"
    elif sales > 5000:
        return "Medium"
    else:
        return "Low"

df_grouped["Category"] = df_grouped["Total_Sales"].apply(categorize)
print(df_grouped)

Output:

  Country  Total_Sales Category
0  Canada         9800   Medium
1   India        12800     High
2     USA        13700     High

4️⃣ Optional: Rank countries by total sales

Question:
Rank the countries by total sales using a numeric rank.

Solution:

df_grouped["Rank"] = df_grouped["Total_Sales"].rank(ascending=False, method="first")
print(df_grouped)

Output:

  Country  Total_Sales Category  Rank
0  Canada         9800   Medium   3.0
1   India        12800     High   2.0
2     USA        13700     High   1.0

Notes:

  • Step 1 filters the dataset to remove small sales.
  • Step 2 aggregates sales per country.
  • Step 3 adds a categorical label based on thresholds.
  • Step 4 shows how to rank items numerically in Pandas.