R for Big Data Analytics: dplyr, data.table, sparklyr, and Parallel Computing
Master R for big data, from data manipulation with dplyr and data.table to parallel processing with future and furrr and distributed computing with sparklyr, so you can analyze and visualize massive datasets efficiently.
Introduction to Big Data Analytics with R
Unit 1: Understanding Big Data
Unit 2: R Ecosystem for Big Data
Unit 3: Setting Up Your R Environment
Unit 4: Course Overview
Data Manipulation with dplyr: Foundations
Unit 1: Introduction to dplyr
Unit 2: Selecting and Filtering Data
Unit 3: Transforming and Arranging Data
Advanced Data Manipulation with dplyr
Unit 1: Grouping and Summarizing Data
Unit 2: Window Functions
Unit 3: Advanced Filtering Techniques
Unit 4: Optimizing dplyr Code
Introduction to data.table
Unit 1: Understanding data.table
Unit 2: Basic Data Manipulation with data.table
Unit 3: data.table vs dplyr
Advanced data.table Techniques
Unit 1: Joins in data.table
Unit 2: Aggregations and Updates by Reference
Unit 3: Optimization and Time Series
Parallel Computing in R with `future` and `furrr`
Unit 1: Introduction to Parallel Computing
Unit 2: Getting Started with `future`
Unit 3: Advanced `future` Techniques
Unit 4: Parallelizing dplyr with `furrr`
Scaling R with sparklyr: Connecting R to Spark
Unit 1: Spark and sparklyr: The Basics
Unit 2: Data Transfer and Spark DataFrames
Unit 3: Lazy Evaluation and Spark SQL
Data Manipulation with sparklyr
Unit 1: sparklyr Data Manipulation Basics
Unit 2: dplyr Verbs in sparklyr
Unit 3: Advanced sparklyr Data Manipulation
Machine Learning with sparklyr
Unit 1: Introduction to Machine Learning with sparklyr
Unit 2: Building and Evaluating Machine Learning Models