These questions were asked in a Data Engineer interview at Smarterp. They are useful for preparing for similar interviews.

Python and SQL Interview Questions
01. How to sum all the digits of an integer?
# Sum the digits of an integer by iterating over its string form
a = 12345
d = 0
for i in str(a):
    d += int(i)
print(d)
Output
15
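The same result can be reached without converting the number to a string. The following is a minimal sketch, assuming a helper named digit_sum (a hypothetical name, not part of the interview answer) and assuming that the sign of a negative input should be ignored:

```python
def digit_sum(n: int) -> int:
    """Sum the decimal digits of n using integer arithmetic only."""
    n = abs(n)  # assumption: the sign of a negative number is ignored
    total = 0
    while n > 0:
        n, digit = divmod(n, 10)  # split off the last decimal digit
        total += digit
    return total


print(digit_sum(12345))  # 15
```

The string version is shorter, while the arithmetic version avoids building an intermediate string, which interviewers sometimes ask for as a follow-up.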
02. How to reverse a string?
s = "srinivas"
# A slice with step -1 walks the string from the last character to the first
k = s[::-1]
print(k)
Output
savinirs
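Slicing is the usual answer, but an interviewer may ask for an alternative. Here is a small sketch using the built-in reversed() together with str.join(); the function name reverse_string is only an illustrative choice:

```python
def reverse_string(s: str) -> str:
    """Return s reversed, built from the reversed() iterator."""
    # reversed() yields the characters from last to first;
    # join() concatenates them back into a single string.
    return "".join(reversed(s))


print(reverse_string("srinivas"))  # savinirs
```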
03. How to write PySpark code that creates a DataFrame from data and a schema, and extracts only the date from a datetime column?
from pyspark.sql import SparkSession
from pyspark.sql.types import StructField, StructType, StringType, TimestampType
from pyspark.sql.functions import col, to_date
spark = SparkSession.builder.appName("Date extract").getOrCreate()
# Schema with a single nullable string column holding the raw datetime text
schema = StructType(
    [StructField("date_column", StringType(), True)]
)
# Sample data
data = [("2024-02-28 08:30:00",),
        ("2024-02-28 12:45:00",),
        ("2024-02-29 10:15:00",)]
df = spark.createDataFrame(data, schema)
# Cast the string column to a timestamp, then keep only its date part
df = df.withColumn("date_column", col("date_column").cast(TimestampType()))
df = df.withColumn("Date_only", to_date("date_column"))
df.show()
Output
+-------------------+----------+
|        date_column| Date_only|
+-------------------+----------+
|2024-02-28 08:30:00|2024-02-28|
|2024-02-28 12:45:00|2024-02-28|
|2024-02-29 10:15:00|2024-02-29|
+-------------------+----------+
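04. What is the “get root directory” in PySpark?
In PySpark, “get root directory” most likely refers to SparkFiles.getRootDirectory(), which returns the root directory that contains the files added through SparkContext.addFile(). This is different from the directory your Spark application was launched from, which is the driver's current working directory. See the PySpark documentation on SparkFiles for more information.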
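PySpark exposes a method with this name on the SparkFiles class: SparkFiles.getRootDirectory() returns the directory that holds files distributed with SparkContext.addFile(). The sketch below shows one way to look at it, under a few assumptions: the function name spark_files_root_demo and the file name lookup.txt are made up for illustration, and a local Spark installation is available when the function is called.

```python
import os
import tempfile


def spark_files_root_demo():
    """Ship a small file with addFile() and return (root_dir, file_path)."""
    # Imported inside the function so the sketch can be loaded
    # on machines where PySpark is not installed.
    from pyspark import SparkFiles
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("Root directory demo").getOrCreate()

    # Create a throwaway file to distribute (hypothetical file name).
    local_path = os.path.join(tempfile.mkdtemp(), "lookup.txt")
    with open(local_path, "w") as f:
        f.write("sample\n")

    # Register the file so Spark copies it to every node.
    spark.sparkContext.addFile(local_path)

    root_dir = SparkFiles.getRootDirectory()  # directory holding added files
    file_path = SparkFiles.get("lookup.txt")  # absolute path of the added file
    return root_dir, file_path
```

Calling spark_files_root_demo() on a machine with Spark should return a temporary directory and the path of lookup.txt, which is expected to sit inside that directory.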
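05. Can we modify tuples and strings in Python?
No. Tuples and strings are immutable, so we cannot modify them in place. We can, however, build a new tuple or a new string from pieces of the original and assign the result to a new name.
my_tuple = (1, 2, 3, 4)
# Build a new tuple with the second element replaced by 7
new_tuple = my_tuple[0:1] + (7,) + my_tuple[2:]
print(new_tuple)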
Output
(1, 7, 3, 4)
my_string = "Amazon Operations"
# Concatenation creates a new string; my_string itself is unchanged
new_string = "India " + my_string
print(new_string)
Output
India Amazon Operations
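To see the immutability directly, it can help to attempt an in-place change and watch it fail. This is a minimal sketch; the helper try_item_assignment is a hypothetical name introduced only for the demonstration:

```python
my_tuple = (1, 2, 3, 4)
my_string = "Amazon Operations"


def try_item_assignment(obj, index, value):
    """Attempt obj[index] = value and report whether it succeeded."""
    try:
        obj[index] = value
    except TypeError as exc:
        return f"TypeError: {exc}"
    return "ok"


tuple_result = try_item_assignment(my_tuple, 1, 7)
string_result = try_item_assignment(my_string, 0, "a")
# A list is mutable, so the same assignment succeeds on a list copy
list_result = try_item_assignment(list(my_tuple), 1, 7)

print(tuple_result)   # TypeError: 'tuple' object does not support item assignment
print(string_result)  # TypeError: 'str' object does not support item assignment
print(list_result)    # ok
```

If in-place updates are needed, a common approach is to convert the tuple to a list, change it, and convert it back with tuple().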






