06 Stochastic Approximation

  • Gap: from non-incremental to incremental
  • Mean estimation
  • Algorithms
    • Robbins-Monro (RM) algorithm
    • Stochastic Gradient Descent (SGD)
    • SGD, Batch GD (BGD), Mini Batch GD (MBGD)
  • Incremental manner and SGD

What do non-incremental and incremental mean?

An example: suppose I have 10,000 samples and want to compute their mean E[X]. If I must wait until all 10,000 samples are collected before computing E[X], that is non-incremental; the drawback is that no estimate is available while the samples are still being collected. If instead I update my estimate of the mean continually as samples arrive, that is incremental.
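The incremental idea above can be sketched as a running-mean update, w_{k+1} = w_k + (1/k)(x_k - w_k), which yields exactly the batch mean after the last sample (the sample values here are illustrative):

```python
def incremental_mean(samples):
    """Update the mean estimate as each sample arrives,
    instead of waiting for the full batch."""
    w = 0.0
    for k, x in enumerate(samples, start=1):
        w += (x - w) / k  # running estimate after k samples
    return w

samples = [2.0, 4.0, 6.0, 8.0]
print(incremental_mean(samples))  # equals the batch mean, 5.0
```

At every step k the current w is a valid estimate based on the samples seen so far, which is exactly what the non-incremental approach cannot provide.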

Outline

  1. Motivating example
  2. Robbins-Monro algorithm
  3. Stochastic Gradient Descent
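As a preview of item 2, the Robbins-Monro algorithm finds a root of g(w) = 0 using only noisy observations g̃(w_k, η_k) = g(w_k) + η_k, via w_{k+1} = w_k - a_k g̃(w_k, η_k). A minimal sketch (the target function, noise level, and iteration count are assumptions for illustration):

```python
import random

def robbins_monro(noisy_g, w0=0.0, iters=20000):
    """Find a root of g(w) = 0 from noisy observations noisy_g(w).
    Step sizes a_k = 1/k satisfy the RM conditions:
    sum a_k = inf and sum a_k^2 < inf."""
    w = w0
    for k in range(1, iters + 1):
        w -= noisy_g(w) / k  # RM update with a_k = 1/k
    return w

random.seed(1)
# Example target: g(w) = w - 5 observed with Gaussian noise.
root = robbins_monro(lambda w: (w - 5.0) + random.gauss(0.0, 1.0))
print(root)  # converges near the true root w* = 5.0
```

Note that the algorithm never evaluates g itself, only noisy samples of it; this is the property that later lets TD methods learn from sampled experience.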

Summary

  • Mean estimation: compute E[X] using w_{k+1} = w_k - (1/k)(w_k - x_k)
  • RM algorithm: solve g(w) = 0 using w_{k+1} = w_k - a_k g̃(w_k, η_k)
  • SGD algorithm: minimize J(w) = E[f(w, X)] using w_{k+1} = w_k - α_k ∇_w f(w_k, x_k)
  • TD learning algorithms can be viewed as stochastic approximation algorithms and hence have similar expressions
  • These are important optimization techniques that apply to many fields beyond reinforcement learning
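The three bullets above are closely related: mean estimation can be viewed as SGD on J(w) = E[(w - X)²/2], whose stochastic gradient at sample x_k is (w - x_k), so the SGD update with α_k = 1/k reproduces the incremental mean. A sketch under assumed Gaussian samples:

```python
import random

def sgd_mean(samples):
    """Estimate E[X] by SGD on J(w) = E[(w - X)^2 / 2]."""
    w = 0.0
    for k, x in enumerate(samples, start=1):
        grad = w - x   # stochastic gradient of (w - x)^2 / 2 at w
        w -= grad / k  # step size a_k = 1/k (RM conditions hold)
    return w

random.seed(0)
samples = [random.gauss(5.0, 1.0) for _ in range(10000)]
print(sgd_mean(samples))  # close to the true mean, 5.0
```

On a fixed batch this recursion is algebraically identical to the incremental mean update, which is why mean estimation, RM, and SGD share the same expression pattern.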

Later

The incremental manner and SGD are widely used in later chapters.

Next: 07 Temporal-Difference Methods