Top Pandas Interview Questions: CitiusTech

These questions were asked in an interview for the Data Engineer role at CitiusTech. I am sharing the questions along with their solutions.

SQL and Pandas Interview Questions

01. How to read CSV files in pandas?

import pandas as pd
# Assuming the CSV file is named "data.csv" and is in the current directory
df = pd.read_csv("data.csv")
# Displaying the DataFrame
print(df)
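In practice you often pass a few options to read_csv. Here is a minimal sketch, assuming made-up CSV content (the names, ages, and countries are hypothetical) held in memory so the snippet runs without a file on disk:

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file such as "data.csv"
csv_text = "name,age,country\nAsha,31,India\nBen,45,USA\nChen,28,China\n"

# read_csv accepts a file path or any file-like object
df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["name", "age"],  # load only the columns you need
    dtype={"age": "int64"},   # set the column type explicitly
)

print(df.head())   # first rows of the DataFrame
print(df.dtypes)   # column types
```

With a real file, replace io.StringIO(csv_text) with the file path.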

02. How to create a data frame in Pandas?

import pandas as pd
# Each dictionary key becomes a column; each list holds that column's values
data = {'a': ['A', 'A', 'B', 'C'], 'b': [1, 2, 1, 3]}
df = pd.DataFrame(data)
print(df)

03. How to group by column a and show the sum of column b in the format below using Pandas?

## input
data = {'a': ['A', 'A', 'B', 'C'], 'b': [1, 2, 1, 3]}

## expected output
a  Newcol
A  3
B  1
C  3

Solution

import pandas as pd
data = {'a': ['A', 'A', 'B', 'C'], 'b': [1, 2, 1, 3]}
df = pd.DataFrame(data)

# Sum column b within each group of column a and name the result column Newcol
df = df.groupby("a")["b"].sum().reset_index(name="Newcol")
print(df)

Output

   a  Newcol
0  A       3
1  B       1
2  C       3
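The same result can also be written with named aggregation, which sets the output column name inside agg. This is only an alternative sketch, not the answer given in the interview; the variable name result is my own:

```python
import pandas as pd

df = pd.DataFrame({"a": ["A", "A", "B", "C"], "b": [1, 2, 1, 3]})

# Newcol=("b", "sum") means: output column Newcol = sum of column b per group
result = df.groupby("a", as_index=False).agg(Newcol=("b", "sum"))
print(result)
```

Named aggregation is handy when you need several aggregates at once, since each output column gets its own name.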

04. How to add a new column in Pandas with a fixed value or a range of values?

## Adding a fixed value in Newcol
import pandas as pd
data = {'a': ['A', 'A', 'B', 'C'], 'b': [1, 2, 1, 3]}
df = pd.DataFrame(data)
# A scalar is broadcast to every row
df["Newcol"] = 5
print(df)

## Output
   a  b  Newcol
0  A  1       5
1  A  2       5
2  B  1       5
3  C  3       5

Adding a range of values

import pandas as pd
data = {'a': ['A', 'A', 'B', 'C'], 'b': [1, 2, 1, 3]}
df = pd.DataFrame(data)
start_value = 1
end_value = 4
# range() excludes the end value, so add 1; the range length must equal the number of rows
df["Newcol"] = range(start_value, end_value + 1)
print(df)

## Output
   a  b  Newcol
0  A  1       1
1  A  2       2
2  B  1       3
3  C  3       4
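A hard-coded end value only works while the DataFrame has exactly that many rows. Here is a small sketch that derives the range from the row count instead, using the same sample data:

```python
import pandas as pd

df = pd.DataFrame({"a": ["A", "A", "B", "C"], "b": [1, 2, 1, 3]})

start_value = 1
# Derive the end from the row count so the lengths always match
df["Newcol"] = range(start_value, start_value + len(df))
print(df)
```

If the lengths do not match, pandas raises a ValueError instead of filling the column.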

05. Write an SQL query to get the average age for each country.

SELECT country, AVG(age) AS avg_age
FROM Customers
GROUP BY country;
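To try the query without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The Customers rows are made up for illustration, and the ORDER BY is added only to make the output order predictable. The last lines show the pandas equivalent of the same aggregation:

```python
import sqlite3
import pandas as pd

# Hypothetical sample rows for the Customers table
customers = [("Asha", "India", 30), ("Ravi", "India", 40), ("Ben", "USA", 50)]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE Customers (name TEXT, country TEXT, age INTEGER)")
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?)", customers)

rows = conn.execute(
    "SELECT country, AVG(age) AS avg_age "
    "FROM Customers GROUP BY country ORDER BY country"
).fetchall()
conn.close()
print(rows)

# Pandas equivalent: group by country and take the mean of age
df = pd.DataFrame(customers, columns=["name", "country", "age"])
avg_age = df.groupby("country")["age"].mean().reset_index(name="avg_age")
print(avg_age)
```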

06. In Pandas, how to collect the values of column b into a list for each value of column a?

## input
data = {'a': ['A', 'A', 'B', 'C'], 'b': [1, 2, 1, 3]}

## expected output
a  Newcol
A  [1, 2]
B  [1]
C  [3]

Solution

import pandas as pd
data = {'a': ['A', 'A', 'B', 'C'], 'b': [1, 2, 1, 3]}
df = pd.DataFrame(data)
# agg(list) gathers the values of b in each group into a Python list
df = df.groupby("a")["b"].agg(list).reset_index(name="Newcol")
print(df)

## Output
   a  Newcol
0  A  [1, 2]
1  B     [1]
2  C     [3]
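A possible follow-up is how to go back from the list column to one row per value. This was not part of the interview; it is only a sketch using DataFrame.explode, and the variable names grouped and exploded are my own:

```python
import pandas as pd

df = pd.DataFrame({"a": ["A", "A", "B", "C"], "b": [1, 2, 1, 3]})
grouped = df.groupby("a")["b"].agg(list).reset_index(name="Newcol")

# explode turns each list element back into its own row
exploded = grouped.explode("Newcol").reset_index(drop=True)
print(exploded)
```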

Srini

Data Engineer with deep AI and Generative AI expertise, crafting high-performance data pipelines in PySpark, Databricks, and SQL. Skilled in Python, AWS, and Linux—building scalable, cloud-native solutions for smart applications.