-
Row Vs. Range SQL Window Functions: Top Differences
The ROWS and RANGE clauses both define frames within a SQL window, but ROWS counts physical rows while RANGE groups peer rows by their ORDER BY value.
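A minimal plain-Python sketch of the difference, using hypothetical data (no SQL engine required): with duplicate ORDER BY values, a ROWS frame gives each physical row its own running total, while a RANGE frame gives peer rows the same total.

```python
# Hypothetical salaries; the duplicates (200, 200) expose the difference.
salaries = [100, 200, 200, 300]

# ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW:
# each physical row gets its own cumulative sum.
rows_sum = [sum(salaries[: i + 1]) for i in range(len(salaries))]

# RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW:
# rows with equal ORDER BY values ("peers") share one total.
range_sum = [sum(s for s in salaries if s <= v) for v in salaries]

print(rows_sum)   # [100, 300, 500, 800]
print(range_sum)  # [100, 500, 500, 800]
```

The peer rows (both 200s) land on the same RANGE total of 500, which is exactly where the two frame types diverge.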
-
External Vs. Managed Tables in Databricks: Top Differences
In Databricks, managed tables store and manage both data and metadata, while external tables store data externally and manage only the metadata.
-
Writing Dataframes into Delta Tables in PySpark: 6 Top Benefits
Writing DataFrames to Delta tables offers data persistence, optimized performance, schema enforcement, transactional consistency, and integration with data systems.
-
A Complete Guide to Databricks Utilities (DBUtils)
Databricks Utilities (DBUtils) provides functionality such as accessing DBFS files, managing clusters, and working with widgets.
-
SQL Query: Extracting Employees Sal > Avg Salary
SQL, Pandas, and PySpark code examples that extract employees earning more than the average salary.
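A minimal plain-Python sketch of the idea, with hypothetical sample data; the SQL equivalent is `SELECT * FROM emp WHERE sal > (SELECT AVG(sal) FROM emp)`.

```python
# Hypothetical (name, salary) records.
employees = [
    ("Alice", 9000),
    ("Bob", 5000),
    ("Carol", 7000),
]

# Compute the average, then keep names strictly above it.
avg_sal = sum(sal for _, sal in employees) / len(employees)  # 7000.0
above_avg = [name for name, sal in employees if sal > avg_sal]
print(above_avg)  # ['Alice']
```

Note that Carol, sitting exactly at the average, is excluded by the strict `>` comparison, matching the SQL predicate.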
-
Using PySpark to Compare Employee Salaries with Their Managers
The PySpark code demonstrates two methods for comparing each employee's salary with their manager's and retrieving the results.
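A plain-Python sketch of the self-join logic behind such a comparison, using hypothetical data; in SQL terms it corresponds to `JOIN emp m ON e.manager = m.name WHERE e.sal > m.sal`.

```python
# Hypothetical (name, salary, manager) rows; None means no manager.
emp = [
    ("Dana", 12000, None),
    ("Eve", 9000, "Dana"),
    ("Frank", 13000, "Dana"),
]

# Index salaries by name, standing in for the "manager side" of a self-join.
by_name = {name: sal for name, sal, _ in emp}

# Keep employees who earn more than their manager.
earn_more = [
    name for name, sal, mgr in emp
    if mgr is not None and sal > by_name[mgr]
]
print(earn_more)  # ['Frank']
```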
-
Free E-book: 30 PySpark Interview Questions with Answers
A free e-book with 30 PySpark interview questions and answers for interview preparation.
-
2 Easy Ways to Read Multiple Files into a Dataframe: PySpark
The Infosys interview question asks how to read multiple files into a DataFrame using wholeTextFiles() or the recursiveFileLookup option.
-
5 Best Ways to Delete Rows in PySpark
In PySpark, rows can be deleted from a DataFrame based on criteria using filter(), where(), na.drop(), drop(), or a SQL expression.
-
Databricks: Essential Interview Questions for Data Engineers
Interview questions for data engineer roles at top companies, covering PySpark file reading, MySQL data retrieval, SQL comparison, Databricks workflows, and notebook sharing across Databricks accounts.
-
How to Add a New Column at a Particular Position: PySpark
In PySpark, use withColumn() to add the column, then rearrange the columns with select() to place it at the desired position in a new DataFrame.
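A plain-Python sketch of the reordering step, with a hypothetical column list: since withColumn() always appends, the new column is moved by selecting columns in the desired order.

```python
# Hypothetical existing columns and the column to insert.
cols = ["id", "name", "city"]
new_col = "age"
position = 1  # place "age" right after "id"

# Build the target order by splicing the new column into the list.
reordered = cols[:position] + [new_col] + cols[position:]
print(reordered)  # ['id', 'age', 'name', 'city']
# In PySpark, this order would then be applied with df.select(*reordered).
```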
-
PySpark: Splitting Text File into Columns Using Substring Function
In PySpark, use substring and select statements to split text file lines into separate columns of fixed length.
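A plain-Python sketch of the fixed-width split, using a hypothetical record layout; PySpark's substring(col, pos, len) performs the same positional slicing on a DataFrame column.

```python
# Hypothetical fixed-width layout: id = 3 chars, name = 6 chars, dept = 2 chars.
line = "001Alice IT"
widths = [("id", 0, 3), ("name", 3, 9), ("dept", 9, 11)]

# Slice each field out of the line and trim the padding.
record = {field: line[start:end].strip() for field, start, end in widths}
print(record)  # {'id': '001', 'name': 'Alice', 'dept': 'IT'}
```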