06 Stochastic Approximation
- Gap: from non-incremental to incremental
- Mean estimation
- Algorithms
- Robbins-Monro (RM) algorithm
- Stochastic Gradient Descent (SGD)
- SGD, Batch GD (BGD), Mini-Batch GD (MBGD)
- Incremental manner and SGD
What do non-incremental and incremental mean?
An example: suppose I have 10,000 samples and want to estimate E(X). If I must wait until all 10,000 samples have been collected before computing E(X), that is non-incremental; the drawback is that no estimate is available while collection is still in progress. If instead I update my estimate of the expectation each time a new sample arrives, that is incremental.
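The incremental idea described above can be sketched as follows. The update rule w_{k+1} = w_k - (1/k)(w_k - x_k) maintains a running mean that is available after every sample; the sample distribution and values here are illustrative, not from the source.

```python
import random

def incremental_mean(samples):
    """Incrementally estimate E[X]: after sample x_k, apply
    w_{k+1} = w_k - (1/k) * (w_k - x_k), i.e. w + (x_k - w) / k.
    An estimate is available at every step, unlike the
    non-incremental sum-then-divide approach."""
    w = 0.0
    for k, x in enumerate(samples, start=1):
        w = w - (1.0 / k) * (w - x)
    return w

random.seed(0)
# Illustrative data: 10,000 samples with true mean 5.0
samples = [random.gauss(5.0, 2.0) for _ in range(10_000)]
est = incremental_mean(samples)
print(round(est, 2))  # should be close to the true mean 5.0
```

The final value agrees with the non-incremental mean sum(samples)/len(samples) up to floating-point error; the difference is only *when* an estimate becomes available.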
Summary
- Mean estimation: compute E[X] using the incremental update w_{k+1} = w_k - (1/k)(w_k - x_k)
- RM algorithm: solve g(w) = 0 using noisy observations g̃(w_k, η_k)
- SGD algorithm: minimize J(w) = E[f(w, X)] using stochastic gradients ∇_w f(w_k, x_k)
- TD learning algorithms can be viewed as stochastic approximation algorithms and hence have similar expressions
- They are important optimization techniques that can be applied to many other fields
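The Robbins-Monro root-finding scheme named in the summary can be sketched as below. The function g and the noise level are hypothetical choices for illustration; the algorithm itself only requires noisy observations of g.

```python
import math
import random

def rm_solve(g_noisy, w0, n_iters):
    """Robbins-Monro: w_{k+1} = w_k - a_k * g_tilde(w_k), with step
    sizes a_k = 1/k (which satisfy the RM conditions:
    sum a_k = inf, sum a_k^2 < inf)."""
    w = w0
    for k in range(1, n_iters + 1):
        w = w - (1.0 / k) * g_noisy(w)
    return w

random.seed(2)
# Hypothetical example: g(w) = tanh(w - 1) has root w* = 1,
# but we only ever observe g(w) corrupted by Gaussian noise.
g_noisy = lambda w: math.tanh(w - 1.0) + random.gauss(0.0, 0.1)
root = rm_solve(g_noisy, w0=3.0, n_iters=100_000)
print(round(root, 2))  # should approach the true root 1.0
```

Mean estimation is the special case g(w) = w - E[X]: the noisy observation w - x_k yields exactly the incremental mean update.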
Later
The incremental manner and SGD will be widely used in later chapters.
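The SGD/BGD/MBGD distinction listed in the outline can be sketched on a toy objective. This assumed example uses J(w) = E[(w - X)^2 / 2], whose stochastic gradient for one sample is (w - x) and whose minimizer is the mean E[X]; only the batch fed to each update differs between the three variants.

```python
import random

random.seed(3)
data = [random.gauss(2.0, 1.0) for _ in range(1_000)]

def grad(w, batch):
    """Gradient of J(w) = E[(w - X)^2 / 2] averaged over a batch."""
    return sum(w - x for x in batch) / len(batch)

def descend(batches, w=0.0):
    """Generic descent w_{k+1} = w_k - a_k * grad; a_k = 1/k."""
    for k, batch in enumerate(batches, start=1):
        w -= (1.0 / k) * grad(w, batch)
    return w

# BGD: the whole dataset at every step.
bgd = descend([data] * 100)
# MBGD: a random mini-batch of 32 samples per step.
mbgd = descend([random.sample(data, 32) for _ in range(1_000)])
# SGD: a single sample per step (here: one pass over the data).
sgd = descend([[x] for x in data])

true_mean = sum(data) / len(data)
print(round(bgd, 2), round(mbgd, 2), round(sgd, 2), round(true_mean, 2))
```

All three converge to the sample mean; with a_k = 1/k, the SGD run is exactly the incremental mean estimate, while BGD reaches it in one step and MBGD trades per-step cost against gradient noise.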