Compiler Optimization for Low-Level Performance
Master low-level compiler optimization techniques to boost code performance, reduce memory footprint, and leverage modern hardware architectures.
Introduction to Low-Level Optimization
Unit 1: Understanding Low-Level Optimization
What is LLO?
The Compilation Pipeline
Key Performance Metrics
Code Size Matters
Optimization Trade-offs
Unit 2: Common Optimization Techniques
Instruction Selection 101
Register Allocation Intro
Peephole Optimization
Instruction Selection Ex
Register Allocation Ex
Unit 3: Compiler-Architecture Interaction
Target Architecture
ISA Impact
Architecture Features
Compiler Feedback
Optimization Summary
Instruction Selection and Scheduling
Unit 1: Instruction Selection Fundamentals
What is Instruction Selection?
IR to Machine Code
Cost Functions
Instruction Set Architecture
Target Architecture
Unit 2: Instruction Selection Strategies
Template Matching
Tree Pattern Matching
Dynamic Programming
Maximal Munch
Code Generation
Unit 3: Instruction Scheduling
What is Instruction Scheduling?
Data Dependencies
Resource Constraints
List Scheduling
Software Pipelining
Register Allocation
Unit 1: Introduction to Register Allocation
Registers: The Basics
Why Register Allocation?
Challenges in Allocation
Unit 2: Register Allocation Algorithms
Liveness Analysis
Interference Graphs
Graph Coloring: Basics
Graph Coloring: Spilling
Unit 3: Spilling and Advanced Techniques
Spill Cost Analysis
Spill Code Insertion
Coalescing
Other Allocation Algorithms
Unit 4: Interaction and Optimization
RA and Instruction Scheduling
RA and Other Optimizations
Real-World Considerations
RA: A Summary
Peephole Optimization
Unit 1: Peephole Optimization Fundamentals
What is Peephole Opt?
Peephole Optimization's Role
Peephole Opt: A Simple View
Peephole Opt Trade-offs
Unit 2: Common Peephole Optimization Techniques
Redundant Load Elimination
Redundant Store Elimination
Unreachable Code Removal
Strength Reduction
Constant Folding
Unit 3: Advanced Peephole Optimization
Jump-to-Jump Optimization
Algebraic Simplification
Special Case Instructions
Unit 4: Target-Specific Peephole Optimization & Evaluation
Custom Rules: An Overview
Evaluating Peephole Opt
Putting it All Together
Compiler Flags and Optimization Levels
Unit 1: Understanding Compiler Flags
Compiler Flag Basics
GCC vs. Clang Flags
Debugging Flags (-g)
Warning Flags (-Wall, -Werror)
Linking Flags (-l, -L)
Unit 2: Optimization Levels
Optimization Levels Intro
-O0: No Optimization
-O1: Basic Optimizations
-O2: Moderate Optimization
-O3: Aggressive Optimization
Unit 3: Advanced Optimization Flags
-Os: Optimize for Size
-Ofast: Extreme Speed
Architecture Flags (-march)
Interprocedural Opts (LTO)
Profile-Guided Opts (PGO)
Performance Analysis and Tuning
Unit 1: Introduction to Performance Analysis
Why Analyze Performance?
Key Performance Metrics
Intro to Profilers
Intro to Benchmarking
Statistical Significance
Unit 2: Using Performance Analysis Tools
Using perf
Using gprof
Using Valgrind
Using Flame Graphs
Choosing the Right Tool
Unit 3: Addressing Performance Bottlenecks
Algorithmic Efficiency
Data Structure Choices
Cache Optimization
Branch Prediction
Reducing Memory Accesses
Optimization for Modern Architectures
Unit 1: Modern Architectures and Optimization
Modern Architectures
SIMD: An Overview
GPU Architecture
Parallelism Concepts
Vectorization Intro
Unit 2: SIMD Optimization Techniques
Manual Vectorization
Auto-Vectorization
Data Layout Matters
SIMD Best Practices
SIMD Case Study
Unit 3: GPU Acceleration Techniques
GPU Offloading
CUDA Programming
OpenCL Programming
Memory Optimization
GPU Trade-offs