-
AWS: 3 Easy to Write Lambda Functions
Here are three examples of AWS Lambda functions for different use cases: a hello-world function, image resizing, and fetching data from DynamoDB. 1. Basic Hello World Function This simple AWS Lambda function returns a “Hello, World!” message and is often used as a first step to learn the Lambda basics. def lambda_handler(event, context): return { 'statusCode':…
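The truncated handler above presumably completes to something like this minimal sketch; the JSON response shape (an API Gateway-style `statusCode`/`body` pair) is an assumption, not quoted from the post:

```python
import json

def lambda_handler(event, context):
    # Minimal "Hello, World!" Lambda handler: returns an
    # API Gateway-style response with a JSON-encoded body.
    return {
        "statusCode": 200,
        "body": json.dumps({"message": "Hello, World!"}),
    }
```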
-
How to Delete Source Object After Glue Job Run Complete
Deleting S3 source objects after a Glue job run completes streamlines data management, frees up storage, and keeps the dataset clean for analysis.
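A minimal sketch of the cleanup step, assuming boto3: `s3_client` would be `boto3.client("s3")`, and the bucket and key names are placeholders. The function is written to take the client as a parameter so it can be exercised without AWS credentials:

```python
def delete_source_objects(s3_client, bucket, keys):
    """Delete the source objects once the Glue job has finished.

    `s3_client` is a boto3 S3 client (boto3.client("s3"));
    bucket and key names here are placeholders.
    """
    # delete_objects accepts up to 1000 keys per call
    response = s3_client.delete_objects(
        Bucket=bucket,
        Delete={"Objects": [{"Key": k} for k in keys]},
    )
    # Return the keys S3 reports as deleted
    return [d["Key"] for d in response.get("Deleted", [])]
```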
-
CSV Column Validation Using PySpark: Step-by-Step Guide
The Python code demonstrates CSV file validation using PySpark: validation rules are applied to columns, and the resulting DataFrames are written to S3 and PostgreSQL.
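The validation idea boils down to applying per-column rules and splitting rows into valid and invalid sets. Here is a plain-Python stand-in (the post itself does this with PySpark DataFrame filters); the `id`/`amount` rules are hypothetical examples:

```python
import csv
import io
import re

# Hypothetical per-column rules: non-empty id, numeric amount.
RULES = {
    "id": lambda v: v.strip() != "",
    "amount": lambda v: re.fullmatch(r"-?\d+(\.\d+)?", v) is not None,
}

def validate_csv(text):
    """Split CSV rows into (valid, invalid) by the column rules."""
    valid, invalid = [], []
    for row in csv.DictReader(io.StringIO(text)):
        ok = all(rule(row[col]) for col, rule in RULES.items())
        (valid if ok else invalid).append(row)
    return valid, invalid
```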
-
20 Python Pandas Interview Questions and Answers
Pandas is a data-manipulation library for Python, offering Series and DataFrame structures, CSV I/O, merging, grouping, and visualization capabilities.
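A quick sketch of the operations such interview questions typically cover (DataFrame construction, grouping, merging); the sales/targets data here is made up for illustration:

```python
import pandas as pd

# Toy data: per-order sales and per-region targets.
sales = pd.DataFrame({"region": ["east", "west", "east"],
                      "amount": [100, 200, 50]})
targets = pd.DataFrame({"region": ["east", "west"],
                        "target": [120, 180]})

# Group: total sales per region.
totals = sales.groupby("region", as_index=False)["amount"].sum()

# Merge: inner join the totals with the targets on region.
report = totals.merge(targets, on="region")
report["met"] = report["amount"] >= report["target"]
```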
-
Group By Vs Partition By: Here’s the Right Answer
In SQL, GROUP BY collapses rows into one summary row per group, while PARTITION BY divides the result set into windows for window functions without collapsing any rows.
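The difference can be seen side by side with SQLite (window functions require SQLite 3.25+); the `orders` table is a made-up example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (dept TEXT, amount INT)")
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [("a", 10), ("a", 20), ("b", 5)])

# GROUP BY: one summary row per dept.
grouped = con.execute(
    "SELECT dept, SUM(amount) FROM orders GROUP BY dept ORDER BY dept"
).fetchall()

# PARTITION BY: every order row survives, each carrying its
# department's total alongside the original amount.
windowed = con.execute(
    "SELECT dept, amount, SUM(amount) OVER (PARTITION BY dept) "
    "FROM orders ORDER BY dept, amount"
).fetchall()
```

`grouped` has two rows (one per department), while `windowed` keeps all three order rows.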
-
How to Create FastAPI in Visual Studio Code
This post explains how to create a FastAPI application with an endpoint that checks the divisibility of a binary number, and how to interact with it through the Swagger UI.
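The core check behind such an endpoint might look like the function below; in the app it would be wrapped in a FastAPI route such as `@app.get("/divisible/{binary_str}")`. The divisor of 5 is a hypothetical choice, not taken from the post:

```python
def binary_divisible(binary_str: str, divisor: int = 5) -> bool:
    """Return True if the binary string is divisible by `divisor`.

    In the FastAPI app this would be the body of a GET route,
    e.g. @app.get("/divisible/{binary_str}").
    """
    # int(..., 2) parses the string as base-2
    return int(binary_str, 2) % divisor == 0
```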
-
Effective Strategies for Databricks Cluster and Job Optimization
Optimizing performance in Databricks involves best practices for Spark, cluster configuration, data management, and code optimization.
-
How to Read Secret Manager Data in AWS Glue
You can read a secret from AWS Secrets Manager in AWS Glue using the boto3 library for Python. Ensure the Glue job's IAM role has the required permissions.
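A minimal sketch, assuming boto3: `sm_client` would be `boto3.client("secretsmanager")` inside the Glue job, the secret id is a placeholder, and the role needs `secretsmanager:GetSecretValue`. Taking the client as a parameter keeps the function testable without AWS:

```python
import json

def read_secret(sm_client, secret_id):
    """Fetch and JSON-parse a secret from AWS Secrets Manager.

    `sm_client` is a boto3 client (boto3.client("secretsmanager"));
    the secret id is a placeholder. The Glue job's IAM role must
    allow secretsmanager:GetSecretValue on this secret.
    """
    resp = sm_client.get_secret_value(SecretId=secret_id)
    return json.loads(resp["SecretString"])
```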
-
PySpark Quiz: Crack Your Interview Effortlessly
PySpark quiz covers main features, distributed computing, DataFrame creation, SparkSession, data manipulation, lazy evaluation, missing values, and data I/O.
-
AWS Logging Best Practices for Effective Monitoring
AWS provides CloudWatch and AWS CloudTrail for log monitoring, troubleshooting, and auditing your cloud environment.
-
How to Check Valid Date in PySpark: 4 Top Methods
The post outlines various methods to validate dates in PySpark using functions, UDFs, SQL queries, and DataFrame filtering.
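The UDF method reduces to a plain-Python validator like the one below, which PySpark would register with `pyspark.sql.functions.udf` and apply to a column; the `%Y-%m-%d` format is an assumed example:

```python
from datetime import datetime

def is_valid_date(value, fmt="%Y-%m-%d"):
    """Return True if `value` parses as a date in `fmt`.

    In PySpark this function would be wrapped with udf() and
    applied to a string column; strptime rejects impossible
    dates (e.g. month 13) as well as malformed strings.
    """
    try:
        datetime.strptime(value, fmt)
        return True
    except (TypeError, ValueError):
        return False
```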
-
PySpark When Check Numeric of Column [Tested]
The post explains how to use PySpark to check whether input values are numeric using a when condition.
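In plain-Python terms, the check mirrors a PySpark expression like `when(col("x").rlike(r"^-?\d+(\.\d+)?$"), "numeric").otherwise("not numeric")`; the regex here is one assumed definition of "numeric":

```python
import re

# Assumed pattern: optional sign, digits, optional decimal part.
NUMERIC = re.compile(r"-?\d+(\.\d+)?")

def flag_numeric(value):
    """Label a value "numeric" or "not numeric", mirroring the
    when/otherwise branches of the PySpark version."""
    if value is not None and NUMERIC.fullmatch(value):
        return "numeric"
    return "not numeric"
```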