R for Big Data Analytics: dplyr, data.table, sparklyr, and Parallel Computing

Master R for big data, from data manipulation with dplyr and data.table to parallel processing with future and furrr and distributed computing with sparklyr, so that you can analyze and visualize massive datasets efficiently.

Introduction to Big Data Analytics with R

Unit 1: Understanding Big Data

Unit 2: R Ecosystem for Big Data

Unit 3: Setting Up Your R Environment

Unit 4: Course Overview

Data Manipulation with dplyr: Foundations

Unit 1: Introduction to dplyr

Unit 2: Selecting and Filtering Data

Unit 3: Transforming and Arranging Data

Advanced Data Manipulation with dplyr

Unit 1: Grouping and Summarizing Data

Unit 2: Window Functions

Unit 3: Advanced Filtering Techniques

Unit 4: Optimizing dplyr Code

Introduction to data.table

Unit 1: Understanding data.table

Unit 2: Basic Data Manipulation with data.table

Unit 3: data.table vs dplyr

Advanced data.table Techniques

Unit 1: Joins in data.table

Unit 2: Aggregations and Updates by Reference

Unit 3: Optimization and Time Series

Parallel Computing in R with `future` and `furrr`

Unit 1: Introduction to Parallel Computing

Unit 2: Getting Started with `future`

Unit 3: Advanced `future` Techniques

Unit 4: Parallelizing dplyr Workflows with `furrr`

Scaling R with sparklyr: Connecting R to Spark

Unit 1: Spark and sparklyr: The Basics

Unit 2: Data Transfer and Spark DataFrames

Unit 3: Lazy Evaluation and Spark SQL

Data Manipulation with sparklyr

Unit 1: sparklyr Data Manipulation Basics

Unit 2: dplyr Verbs in sparklyr

Unit 3: Advanced sparklyr Data Manipulation

Machine Learning with sparklyr

Unit 1: Introduction to Machine Learning with sparklyr

Unit 2: Building and Evaluating Machine Learning Models

Unit 3: Advanced Machine Learning Techniques

Unit 4: Model Tuning, Selection, and Deployment

Advanced Statistical Modeling in R

Unit 1: Generalized Linear Models (GLMs)

Unit 2: Mixed-Effects Models

Unit 3: Time Series Analysis

Data Visualization with ggplot2

Unit 1: ggplot2 Fundamentals

Unit 2: Common Plot Types

Unit 3: Customization and Advanced Techniques

Interactive Data Visualization with Shiny

Unit 1: Introduction to Shiny

Unit 2: Inputs, Outputs, and Reactivity

Unit 3: Data Visualization and Deployment

Big Data Workflow Optimization

Unit 1: Identifying Bottlenecks

Unit 2: Optimizing for Speed

Unit 3: Optimizing for Memory

Unit 4: Best Practices

Case Studies and Real-World Applications

Unit 1: E-commerce Analytics

Unit 2: Financial Services Analytics

Unit 3: Healthcare Analytics