Airflow for Data Engineering: Core Concepts, Basic Pipelines, and Job Integration
Master Airflow for data engineering: build, schedule, and monitor data pipelines with hands-on examples and real-world integrations.
Introduction to Airflow
Unit 1: Airflow Fundamentals
What is Airflow?
Core Concepts: DAGs
Core Concepts: Tasks
Core Concepts: Operators
DAGs, Tasks, Operators
Unit 2: Airflow UI and Architecture
Airflow UI: Overview
Airflow UI: DAG Views
Airflow UI: Task Details
Airflow Architecture
Airflow: Scheduler
Unit 3: Setting Up Airflow
Airflow: Webserver
Airflow: Metastore
Local Setup: Docker
Local Setup: Virtual Env
Designing Your First DAG
Unit 1: DAG Fundamentals
DAG Anatomy
Your First DAG
Task Basics
BashOperator in Action
PythonOperator Intro
Unit 2: Dependencies and Scheduling
Task Dependencies
More on Dependencies
DAG Scheduling
Cron Expression Examples
Catchup Explained
Unit 3: Error Handling and Best Practices
Basic Error Handling
More Error Handling
DAG Best Practices
Idempotency
Scheduling, Monitoring, and Troubleshooting DAGs
Unit 1: DAG Scheduling Deep Dive
Scheduling Overview
Cron Expressions
Timedelta Scheduling
Schedule Examples
Catchup Explained
Unit 2: Monitoring DAGs in the UI
Airflow UI Overview
DAG Run Statuses
Task Instance Details
Gantt Chart
Unit 3: Troubleshooting and Logging
Airflow Logs
Common Errors
Debugging Strategies
Logging Best Practices
Alerting
Integrating with Data Storage
Unit 1: Connecting to Data Storage
Intro to Data Storage
Airflow Connections
Intro to Airflow Hooks
S3 Hook Deep Dive
GCS Hook Deep Dive
Unit 2: Transferring Data with Operators
S3 Operators
GCS Operators
S3 to Redshift
GCS to BigQuery
Unit 3: Data Validation and Security
Data Validation Intro
Great Expectations
Custom Validation Checks
Secrets Management
IAM Roles
Integrating with Compute Services
Unit 1: Compute Service Integration Overview
Intro to Compute Services
Airflow & Databricks
Airflow & Snowflake
Unit 2: Databricks Integration in Detail
Databricks Setup
Run a Databricks Job
Pass Params to Databricks
Monitor Databricks Jobs
Unit 3: Snowflake Integration in Detail
Snowflake Setup
Run SQL in Snowflake
Run Stored Procedures
Transfer Data to Snowflake
Unit 4: Advanced Integration Concepts
Dynamic Task Generation
Error Handling
Clean Up Resources
Advanced Airflow Concepts: XComs and Dependencies
Unit 1: XComs: Sharing Data Between Tasks
Intro to XComs
Pushing Data with XComs
Pulling Data with XComs
XComs Data Types
XComs Best Practices
Unit 2: Advanced Task Dependencies
Dependency Settings
All Done? One Failed?
Task Groups
TaskFlow API
Unit 3: Conditional Task Execution and Dynamic DAGs
BranchPythonOperator
Dynamic DAGs: Intro
Dynamic DAGs: Example
Dynamic DAGs: Best Practices