-
How to Compare Hashed Columns Before and After a Change in Databricks
The content explains how to compare old and new MD5 hashed values in Databricks using PySpark SQL after updating the ‘id’ format in a product table. It details creating a sample table, updating hashes, and using Delta Time Travel to check for mismatches, concluding that mismatches are expected due to the new value format.
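The comparison described above can be sketched in plain Python, outside Databricks. This is only an illustration: the ids and the new format (a `P-` prefix) are hypothetical, and two dictionaries stand in for the old table version (which the post reads through Delta Time Travel) and the current one.

```python
import hashlib


def md5_hex(value: str) -> str:
    """Return the MD5 hex digest of a string (UTF-8 encoded)."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


# Hypothetical data: ids before and after a format change. The "P-" prefix
# is an assumption made for this sketch, not the format used in the post.
old_ids = {1: "1001", 2: "1002", 3: "1003"}
new_ids = {1: "P-1001", 2: "P-1002", 3: "1003"}  # row 3 deliberately unchanged

old_hashes = {key: md5_hex(value) for key, value in old_ids.items()}
new_hashes = {key: md5_hex(value) for key, value in new_ids.items()}

# Row keys whose hash differs between the two snapshots.
mismatches = sorted(k for k in old_hashes if old_hashes[k] != new_hashes[k])
print(mismatches)
```

Rows whose id changed format hash to a different value, so they show up as mismatches, while the untouched row does not.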
-
Databricks Time Travel: How to Compare With Previous Versions
In Databricks with Delta Lake, users can utilize time travel and history features to compare old and new versions of tables post-UPDATE. Steps include creating a table, updating it, describing its history, and performing comparisons on salaries. Key points involve using VERSION AS OF and DESCRIBE HISTORY for data retrieval.
-
Start Your Data Engineering Journey (2025)
Start your data engineering career in 2025 with this comprehensive beginner’s guide. Learn essential skills, tools, and proven steps to land your first job fast.
-
Learn Databricks SQL: From Table Creation to Data Validation
This post covers managing data with Databricks SQL. It includes the creation of Users and Orders tables, data insertion, various updating techniques, and validation of these updates. It also discusses the use of Delta tables for change tracking. Together, these methods help maintain data integrity throughout the workflow.
-
Steps to Insert Modified Rows Keeping Original Data Intact: Databricks SQL Simplified
The content outlines a three-step process using Databricks SQL and PySpark to update employee salary records. It involves creating target and lookup tables, inserting data, and forming a temporary table to hold modified rows. Finally, it implements a script to dynamically insert only the updated columns back into the target table.
-
Complete Guide to MERGE INTO in Databricks
This content outlines a MERGE INTO example in Databricks SQL for updating a target table (employees_target) using a lookup table (employees_lookup) with updated employee details. It details steps for table creation, data insertion, and the merge operation, resulting in updated salaries for Alice and Charlie, and the addition of a new employee, Eve.
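The matched/not-matched logic of such a merge can be sketched in plain Python. This is a rough analogy, not Databricks code: the dictionaries, the employee ids, the salary figures, the extra employee Bob, and the helper `merge_into` are all hypothetical. Unlike `MERGE INTO`, which modifies the target table in place, this sketch returns a new dictionary.

```python
# Hypothetical rows keyed by employee id; salary values are made up.
employees_target = {
    1: {"name": "Alice", "salary": 50000},
    2: {"name": "Bob", "salary": 60000},
    3: {"name": "Charlie", "salary": 70000},
}
employees_lookup = {
    1: {"name": "Alice", "salary": 55000},
    3: {"name": "Charlie", "salary": 75000},
    5: {"name": "Eve", "salary": 65000},
}


def merge_into(target: dict, source: dict) -> dict:
    """Return a copy of target with source rows upserted by key."""
    result = {key: dict(row) for key, row in target.items()}
    for key, row in source.items():
        if key in result:
            result[key].update(row)  # analogous to WHEN MATCHED THEN UPDATE
        else:
            result[key] = dict(row)  # analogous to WHEN NOT MATCHED THEN INSERT
    return result


merged = merge_into(employees_target, employees_lookup)
for key in sorted(merged):
    print(key, merged[key])
```

Alice and Charlie receive the salaries from the lookup data, Eve is added as a new row, and rows without a match in the lookup data are left as they were.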
-
Understanding Databricks vs Traditional Databases
Databricks is not a database; it’s a unified analytics platform built on Apache Spark for data engineering, analytics, and machine learning. It supports diverse workloads like ETL and real-time analytics while integrating with various databases. Unlike a traditional database, Databricks uses Delta Lake for efficient data storage and analysis.
-
Top 5 Tricky SQL CASE WHEN Examples You Should Practice
Learn how to use the SQL CASE statement to simplify conditional logic, handle complex scenarios, and write cleaner, more powerful SQL queries easily.
-
Understanding SQL LIKE, ILIKE, and RLIKE Operators
Understanding LIKE, ILIKE, and RLIKE in SQL is essential for effective data querying and reporting. LIKE allows case-sensitive pattern matching, while ILIKE provides case-insensitivity, particularly in PostgreSQL. RLIKE supports regular expressions for advanced patterns. Selecting the appropriate operator enhances query accuracy and user experience in database applications.
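The difference between the three operators can be sketched in Python with the standard `re` module. The helper names `like`, `ilike`, and `rlike` are hypothetical, and the sketch assumes the behavior stated above (case-sensitive `LIKE`, case-insensitive `ILIKE`, and `RLIKE` matching a regular expression anywhere in the value). Actual behavior depends on the database engine and its collation settings, and escape characters are not handled here.

```python
import re


def like_to_regex(pattern: str) -> str:
    """Translate a SQL LIKE pattern into a regular expression string."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # % matches any sequence of characters
        elif ch == "_":
            parts.append(".")   # _ matches exactly one character
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def like(value: str, pattern: str) -> bool:
    """Case-sensitive match of the whole value against a LIKE pattern."""
    return re.fullmatch(like_to_regex(pattern), value, flags=re.DOTALL) is not None


def ilike(value: str, pattern: str) -> bool:
    """Case-insensitive variant of like()."""
    flags = re.DOTALL | re.IGNORECASE
    return re.fullmatch(like_to_regex(pattern), value, flags=flags) is not None


def rlike(value: str, regex: str) -> bool:
    """True if the regular expression matches anywhere in the value."""
    return re.search(regex, value) is not None


print(like("databricks", "Data%"))
print(ilike("databricks", "Data%"))
print(rlike("order-123", r"\d+"))
```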
-
How to Find Matches and Non-matches: Tricky SQL Example
Master the technique of comparing two tables to find matching brand codes and store numbers, and accurately count both matches and non-matches, in preparation for interviews.
-
Notepad++: Convert Comma-Separated Values Easily
Notepad++ offers shortcuts to convert comma-separated values into columns and vice versa. To convert rows into columns, use Ctrl+A and Ctrl+H, replacing commas with line breaks. For the reverse process, replace line breaks with commas using the same shortcuts. This enhances data management efficiency in Notepad++.
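For comparison, the same two conversions can be sketched in Python. The function names are hypothetical, and the sketch also trims whitespace around each value and skips blank lines, which a literal find-and-replace in Notepad++ would not do.

```python
def commas_to_lines(text: str) -> str:
    """Put each comma-separated value on its own line."""
    return "\n".join(part.strip() for part in text.split(","))


def lines_to_commas(text: str) -> str:
    """Join non-empty lines back into a single comma-separated row."""
    return ",".join(line.strip() for line in text.splitlines() if line.strip())


row = "apple, banana,cherry"
column = commas_to_lines(row)
print(column)
print(lines_to_commas(column))
```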