ML Engineering for Cloud Architects at Databricks
Master ML engineering on Databricks as a cloud architect: build scalable pipelines, automate model deployment, and optimize performance.
Designing Scalable ML Pipelines with Delta Lake on Databricks
Unit 1: Introduction to Scalable ML Pipelines on Databricks
ML Pipelines: An Overview
Databricks for ML Pipelines
Intro to Delta Lake
End-to-End Pipeline Demo
Best Practices
Unit 2: Data Ingestion and Transformation with Delta Lake
Data Ingestion Strategies
Delta Lake for Storage
Data Transformation
Schema Evolution
Data Partitioning
Unit 3: Optimizing Delta Lake for ML Workloads
Indexing Strategies
Caching Techniques
Compaction and Vacuuming
Delta Lake Performance
Delta Lake and Spark
Unit 4: Feature Engineering with Databricks Feature Store
Feature Store: An Intro
Creating Features
Accessing Features
Feature Store Governance
Feature Store Workflow
Automating ML Model Lifecycle with MLflow on Databricks
Unit 1: MLflow Fundamentals on Databricks
MLflow Overview
Setting Up MLflow
MLflow Tracking Basics
MLflow Projects
MLflow Models
Unit 2: Experiment Tracking and Model Management
Advanced Tracking
Hyperparameter Tuning
MLflow Model Registry
Model Versioning
Model Lineage
Unit 3: Model Deployment and CI/CD Integration
MLflow Model Serving
Custom Model Deployment
CI/CD Pipelines
Automated Testing
Monitoring and Alerting
Optimizing ML Model Performance and Resource Utilization on Databricks
Unit 1: Performance Optimization Techniques
Profiling ML Code
Spark Configuration
Data Partitioning
Caching Strategies
Vectorization
Unit 2: Distributed Training
Horovod Intro
Spark MLlib
Data Shuffling
Parameter Averaging
Fault Tolerance
Unit 3: Hyperparameter Tuning
Hyperparameter Spaces
Automated Tuning
Tuning Algorithms
Early Stopping
Parallel Tuning
Unit 4: Resource Management and Monitoring
Cluster Sizing
Spot Instances
Databricks Advisor
Monitoring Tools
Cost Optimization
Securing and Integrating ML Workloads on Databricks
Unit 1: Securing ML Workloads on Databricks
Databricks Security Basics
Access Control Lists (ACLs)
Data Encryption
Network Security
Compliance
Unit 2: CI/CD Pipelines for ML Models on Databricks
CI/CD Intro
Version Control
Automated Testing
Deployment Strategies
Monitoring and Alerting
Unit 3: Integrating Databricks with Other Cloud Services
Cloud Service Integration
AWS SageMaker
Azure Machine Learning
Data Storage
Real-time Data
Unit 4: Real-time ML Inference with Databricks
Real-time Inference
Model Serving
Low-Latency Serving
Streaming Inference
Monitoring Inference