Databricks SQL: Creating, Inserting, Updating, and Validating Data. Let’s go step by step, covering:

  1. Creating two tables (Users and Orders)
  2. Inserting data into the tables
  3. Updating data using different techniques
  4. Validating the update

Step 1: Creating Tables

-- Creating Users Table
CREATE TABLE Users (
UserID INT,
UserName STRING NOT NULL,
Email STRING NOT NULL,
CreatedAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP(),
CONSTRAINT pk_users PRIMARY KEY (UserID),
CONSTRAINT unique_email UNIQUE (Email)
);

-- Creating Orders Table with Foreign Key Reference
CREATE TABLE Orders (
OrderID INT,
UserID INT,
OrderDate TIMESTAMP DEFAULT CURRENT_TIMESTAMP(),
Amount DECIMAL(10,2) CHECK (Amount > 0),
CONSTRAINT pk_orders PRIMARY KEY (OrderID),
CONSTRAINT fk_orders_users FOREIGN KEY (UserID) REFERENCES Users(UserID) ON DELETE CASCADE
);

Step 2: Inserting Data

-- Insert Data into Users Table
INSERT INTO Users (UserID, UserName, Email, CreatedAt)
VALUES
(1, 'Alice', 'alice@example.com', CURRENT_TIMESTAMP()),
(2, 'Bob', 'bob@example.com', CURRENT_TIMESTAMP()),
(3, 'Charlie', 'charlie@example.com', CURRENT_TIMESTAMP());

-- Insert Data into Orders Table
INSERT INTO Orders (OrderID, UserID, OrderDate, Amount)
VALUES
(101, 1, CURRENT_TIMESTAMP(), 250.00),
(102, 2, CURRENT_TIMESTAMP(), 100.00),
(103, 3, CURRENT_TIMESTAMP(), 300.50);

Step 3: Updating Data Using Different Techniques

1️⃣ Basic Update

Update a single column for a specific user:

UPDATE Users 
SET UserName = 'Alice Johnson'
WHERE UserID = 1;

2️⃣ Updating Multiple Columns

UPDATE Users 
SET UserName = 'Robert', Email = 'robert@example.com'
WHERE UserID = 2;

3️⃣ Conditional Update Based on Another Table

If a user’s total order amount is greater than 200, update their username:

UPDATE Users 
SET UserName = 'VIP User'
WHERE UserID IN (SELECT UserID FROM Orders WHERE Amount > 200);

4️⃣ Bulk Update (All Users)

UPDATE Users 
SET UserName = CONCAT(UserName, ' [Updated]');

Step 4: Validating Updates

1️⃣ Check Updated Values

SELECT * FROM Users WHERE UserID = 1;

2️⃣ Count Updated Rows

SELECT COUNT(*) AS UpdatedRowCount 
FROM Users
WHERE UserName LIKE '%Updated%';

3️⃣ Validate Using CASE WHEN

SELECT 
UserID,
UserName,
CASE
WHEN UserName LIKE '%Updated%' THEN 'Updated'
ELSE 'Not Updated'
END AS UpdateStatus
FROM Users;

Step 5: Using Delta Tables for Change Tracking (Optional)

If you’re using Databricks Delta tables, you can track updates using DESCRIBE HISTORY:

DESCRIBE HISTORY Users;
  • This provides audit logs of all updates, including timestamps and who made the changes.

Conclusion

✅ We created tables, inserted data, performed updates, and validated the changes using SQL queries.
🔥 These techniques ensure data integrity while updating records.