Advanced SQL for Data Engineering Pipelines
Master advanced SQL techniques to build robust, efficient, and scalable data engineering pipelines.
SQL Optimization Techniques for Data Pipelines
Unit 1: Understanding Query Execution Plans
Intro to Query Plans
Reading Query Plans
Common Bottlenecks
Tools for Query Analysis
Hands-on: Analyze a Plan
Unit 2: Indexing Strategies
Index Basics
Choosing Index Columns
Composite Indexes
Index Maintenance
Hands-on: Indexing
Unit 3: Query Rewriting Techniques
Rewriting Basics
Subquery Optimization
Join Optimization
Predicate Optimization
Hands-on: Rewriting
Unit 4: Database-Specific Optimizations
PostgreSQL
MySQL
Snowflake
Other DB Systems
Hands-on: Multi-DB
Advanced SQL Features for Data Transformation
Unit 1: Window Functions: Basics and Ranking
Intro to Window Functions
Ranking with Window Functions
NTILE for Data Bucketing
Window Frame Specification
Advanced Ranking Scenarios
Unit 2: Window Functions: Aggregation and Analysis
Window Aggregation
LAG and LEAD
FIRST_VALUE & LAST_VALUE
Ratio to Report
Conditional Window Aggregation
Unit 3: Common Table Expressions (CTEs)
Intro to CTEs
CTEs for Data Prep
Recursive CTEs
Multiple CTEs
CTEs vs Subqueries
Unit 4: User-Defined Functions (UDFs)
Intro to UDFs
Scalar UDFs
Table-Valued UDFs
UDF Best Practices
UDF Security
Unit 5: Pivoting and Unpivoting
Intro to Pivoting
Basic Pivoting
Dynamic Pivoting
Intro to Unpivoting
Unpivoting Techniques
Data Quality and Validation with SQL
Unit 1: SQL-Based Data Validation
Intro to Data Validation
Completeness Checks
Accuracy Checks
Consistency Checks
Data Type Validation
Unit 2: Anomaly and Outlier Detection
Intro to Anomaly Detection
Statistical Outliers
Rule-Based Anomaly Detection
Time-Series Anomaly
Contextual Anomalies
Unit 3: Data Cleansing and Standardization
Intro to Data Cleansing
Removing Duplicates
Correcting Errors
Standardizing Formats
Handling Missing Data
Unit 4: Data Quality Monitoring and Alerting
Intro to Data Monitoring
Setting Up Monitoring
Defining Thresholds
Alerting Systems
Dashboarding
Automation, Integration, and Real-time Processing with SQL
Unit 1: Automating Data Pipelines with SQL
Intro to SQL Automation
SQL Scripting Basics
Stored Procedures: The What
Stored Procedures: The How
Scheduling Automation
Unit 2: Integrating SQL with Data Engineering Tools
SQL & Spark: Overview
Spark SQL: Read & Write
SQL & Kafka: Overview
Kafka Connect Deep Dive
SQL & Cloud Warehouses
Unit 3: Real-time Data Processing with SQL
Real-time SQL: The What
Streaming Platforms
Windowing in Real-time
Real-time Data Ingestion
Real-time Monitoring
Unit 4: Data Lineage and Impact Analysis with SQL
Data Lineage: The What
SQL-based Lineage
Impact Analysis: The What
SQL-based Impact Analysis
Lineage Tools
Data Governance and Security in SQL
Unit 1: Data Masking and Anonymization
Intro to Data Masking
Static Data Masking
Dynamic Data Masking
Data Anonymization
Masking UDFs
Unit 2: SQL-Based Access Control
Intro to Access Control
Roles and Permissions
Row-Level Security
Column-Level Security
Secure Views
Unit 3: Auditing and Logging
Intro to Auditing
SQL Server Auditing
PostgreSQL Auditing
MySQL Auditing
Centralized Logging
Unit 4: Data Governance Compliance
Intro to Governance
Data Lineage
Data Discovery
Data Retention
Compliance Reporting