Advanced
Reinforcement Learning
Explore RL algorithms, including Q-learning and policy gradients, and build intelligent agents that learn by interacting with environments.
20-25 hours total
16 modules
Certificate included
Course Modules
1
Introduction to RL
Agents, environments, and rewards
30 min
2
Markov Decision Processes
The mathematical framework
50 min
3
Value Functions
State and action values
45 min
4
Bellman Equations
Optimal value functions
55 min
5
Dynamic Programming
Policy and value iteration
60 min
6
Monte Carlo Methods
Learning from episodes
50 min
7
Temporal Difference Learning
TD(0) and TD(λ)
65 min
8
Q-Learning
Off-policy TD control
70 min
9
SARSA
On-policy TD control
45 min
10
Deep Q-Networks (DQN)
Neural networks meet RL
80 min
11
Policy Gradient Methods
Direct policy optimization
75 min
12
REINFORCE Algorithm
Monte Carlo policy gradient
55 min
13
Actor-Critic Methods
Combining value-based and policy-based methods
70 min
14
A2C and A3C
Advantage actor-critic
65 min
15
PPO
Proximal policy optimization
75 min
16
Capstone: Train a Game Agent
Build an RL agent from scratch
120 min