-
MySQL vs PostgreSQL: Top Differences
MySQL is known for simplicity, speed, and read-heavy workloads, while PostgreSQL offers advanced features, stronger data integrity, and better support for write-heavy workloads.
-
Easy Ways to Work With XML Files: Python
This article explains two ways to work with XML data in Python: parsing XML from strings and parsing XML from files.
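Both approaches can be sketched with the standard library's `xml.etree.ElementTree`; the sample XML and file handling below are illustrative, not taken from the article:

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Method 1: parse XML from a string (sample document is illustrative)
xml_string = "<catalog><book id='1'><title>Spark Basics</title></book></catalog>"
root = ET.fromstring(xml_string)
title = root.find("book/title").text  # text of the nested <title> element

# Method 2: parse XML from a file (written to a temp file here for a
# self-contained demo)
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as f:
    f.write(xml_string)
    path = f.name
tree = ET.parse(path)
book_id = tree.getroot().find("book").get("id")  # attribute lookup
os.remove(path)
```

`fromstring` returns the root element directly, while `parse` returns an `ElementTree` whose root is obtained via `getroot()`.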
-
5 Nice Ways to Convert String to Matrix: Python
In Python, a string can be converted to a matrix in several ways, such as nested lists, NumPy arrays, or pandas DataFrames.
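The simplest of these approaches, a nested-list matrix built with only the standard library, might look like this (the delimiters and sample string are illustrative):

```python
# A delimited string: ';' separates rows, ',' separates columns
s = "1,2,3;4,5,6;7,8,9"

# Split into rows, then split each row into cells, casting to int
matrix = [[int(cell) for cell in row.split(",")] for row in s.split(";")]
# matrix == [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

The same nested list can then be handed to `numpy.array(...)` or `pandas.DataFrame(...)` for the other representations.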
-
3 Ways to Mask a Field in Python
This article covers masking, encryption, and hashing techniques for protecting sensitive data in software.
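Two of the three techniques, masking and one-way hashing, can be sketched with the standard library alone (encryption would need a third-party package such as `cryptography`); the function names and sample values are my own, not from the article:

```python
import hashlib

def mask_field(value: str, visible: int = 4) -> str:
    """Mask all but the last `visible` characters, e.g. a card number."""
    return "*" * (len(value) - visible) + value[-visible:]

def hash_field(value: str) -> str:
    """One-way SHA-256 digest: comparable without storing the raw value."""
    return hashlib.sha256(value.encode()).hexdigest()

masked = mask_field("4111111111111111")    # '************1111'
digest = hash_field("secret@example.com")  # 64-char hex digest
```

Masking is reversible only if the original is stored elsewhere; hashing is deliberately one-way, which makes it suitable for lookups but not for recovering the value.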
-
How to Read a CSV File as Text: Top PySpark Code
This PySpark code demonstrates reading CSV files as plain text, a common interview topic related to schema inference.
-
5 Top Reasons Why Python UDFs Are Slow in PySpark
PySpark UDFs can be slow due to serialization overhead, lack of Catalyst optimization, row-at-a-time processing, and inefficient resource utilization.
-
10 Python Interview Questions: SLK Software
The interview covers Python essentials like sorting, enumeration, Pandas, PySpark, inheritance, decorators, errors, and averaging techniques.
-
5 Must-Know AWS Glue Interview Questions
Here are the top AWS Glue interview questions on jobs and monitoring. These can be expected in any interview and are worth reviewing beforehand.
-
Infogain: 5 Tricky Data Engineer Interview Questions
The Infogain interview Q&A covers PySpark topics such as partitioning, bucketing, reading petabyte-scale files, Delta Lake, and schema-less files.
-
PySpark Dataframe: Skipping First Rows and Counting Null Values
This PySpark guide covers skipping initial rows beyond the header and counting NULL values in each column of a DataFrame.
-
Master PySpark Functions: collect_list, explode, left_anti, split
The article covers PySpark's explode, collect_list, split, and left anti join operations, with code examples and their respective outputs.
-
How to Use Databricks Time Travel for Delta Lake Recovery
Databricks’ time travel feature allows users to recover earlier versions of a Delta Lake table, enabling corrections of incorrect data while adhering to retention policies.