ELL709 Online Learning

Lecture notes are shared on Teams.

ELL 709: Online Learning

I. Contents

Module 1: Review of Probability and Measure Theory.

Reference: “A Probability Path” - Sidney Resnick

- Sets and Events, Probability Spaces, Random Variables

- Independence, Integration, and Expectation

- Convergence Concepts, Laws of Large Numbers, and Sums of Independent Random Variables

- Characteristic Functions and the Central Limit Theorem

Module 2: Martingales

Reference: “Concentration of Measure Inequalities in Information Theory, Communications and Coding” - Maxim Raginsky and Igal Sason

- Introduction to Martingales and stopping times

- Concentration Inequalities via the Martingale Approach

- Chernoff Technique

- Azuma–Hoeffding inequality

- McDiarmid’s inequality

- Martingales with uniformly bounded differences

- Inequalities for sub- and super-martingales

Module 3: Bandit Algorithms

Reference: “Bandit Algorithms” - Tor Lattimore and Csaba Szepesvari

- Stochastic Bandits

- Explore then commit algorithm

- UCB Algorithm

- UCB - Asymptotic Optimality

- UCB - Minimax optimality

- Exp3 Algorithm

- Lower Bounds for Bandits: Minimax Lower bounds

- Pure Exploration

- Foundations of Bayesian learning

- Bayesian bandits

- Thompson Sampling