General Descent Methods
Definition¶
Formally, the descent direction is defined as follows: a direction $\boldsymbol{d}$ is a *descent direction* of $f$ at $\boldsymbol{x}$ if $\langle\boldsymbol{d},\nabla f(\boldsymbol{x})\rangle<0$.
Also, the following proposition is easy to obtain: if $\boldsymbol{d}$ is a descent direction at $\boldsymbol{x}$, then $f(\boldsymbol{x}+t\boldsymbol{d})<f(\boldsymbol{x})$ for all sufficiently small $t>0$, since $f(\boldsymbol{x}+t\boldsymbol{d})=f(\boldsymbol{x})+t\langle\boldsymbol{d},\nabla f(\boldsymbol{x})\rangle+o(t)$.
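As a concrete illustration, here is a minimal sketch of the resulting generic descent loop with a backtracking (Armijo) line search; the function names and the backtracking constants are illustrative choices, not taken from the notes.

```python
import numpy as np

def descent(f, grad_f, x0, direction, step=1.0, beta=0.5, c=1e-4,
            tol=1e-8, max_iter=1000):
    """Generic descent loop: pick a descent direction, backtrack until the
    Armijo sufficient-decrease condition holds, then update.
    `direction(x, g)` must return a descent direction, i.e. <d, g> < 0."""
    x = x0.astype(float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:          # approximate stationarity
            break
        d = direction(x, g)                  # e.g. d = -g for gradient descent
        t = step
        # Backtracking line search: shrink t until sufficient decrease.
        while f(x + t * d) > f(x) + c * t * g.dot(d):
            t *= beta
        x = x + t * d
    return x

# Usage: minimize a simple quadratic with the negative-gradient direction.
f = lambda x: 0.5 * x.dot(x)
grad_f = lambda x: x
x_star = descent(f, grad_f, np.array([3.0, -4.0]), direction=lambda x, g: -g)
```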
Steepest Descent Direction¶
Given a norm $\|\cdot\|$, the steepest descent direction is defined as follows.
$$ \boxed{ \begin{aligned} &\textbf{Definition 5 (Steepest Descent Direction)} \newline &\newline &\Delta_{\|\cdot\|}\boldsymbol{x}\triangleq\underset{\boldsymbol{v}:\|\boldsymbol{v}\|\leq1}{\operatorname{argmin}}\langle\boldsymbol{v},\nabla f(\boldsymbol{x})\rangle \end{aligned} } $$

*Note*:
- In fact, what we would really like is to make $f(\boldsymbol{x}+\boldsymbol{v})$ itself as small as possible. Since minimizing it directly is impractical, we resort to an approximation strategy: minimize its first-order approximation $f(\boldsymbol{x})+\langle\boldsymbol{v},\nabla f(\boldsymbol{x})\rangle$ over the unit ball of the chosen norm.
- Clearly, if the 2-norm is used, the domain is a high-dimensional sphere, and the scheme is equivalent to gradient descent.
Examples: Different Norms¶
Proof sketch: by the definition of the dual norm, $\min_{\|\boldsymbol{v}\|\leq1}\langle\boldsymbol{v},\nabla f(\boldsymbol{x})\rangle=-\|\nabla f(\boldsymbol{x})\|_*$, and the minimizer is any unit-norm $\boldsymbol{v}$ attaining equality in the Hölder-type inequality $|\langle\boldsymbol{v},\nabla f(\boldsymbol{x})\rangle|\leq\|\boldsymbol{v}\|\,\|\nabla f(\boldsymbol{x})\|_*$.

Therefore, for the three standard norms, the descent direction is:

- $\ell_2$: $\Delta_{\|\cdot\|_2}\boldsymbol{x}=-\nabla f(\boldsymbol{x})/\|\nabla f(\boldsymbol{x})\|_2$, the normalized negative gradient;
- $\ell_1$: $\Delta_{\|\cdot\|_1}\boldsymbol{x}=-\operatorname{sign}\big(\tfrac{\partial f}{\partial x_{i^\star}}\big)\boldsymbol{e}_{i^\star}$ with $i^\star=\operatorname{argmax}_i\big|\tfrac{\partial f}{\partial x_i}\big|$, i.e. a single step along the coordinate with the largest partial derivative (greedy coordinate descent);
- $\ell_\infty$: $\Delta_{\|\cdot\|_\infty}\boldsymbol{x}=-\operatorname{sign}\big(\nabla f(\boldsymbol{x})\big)$.
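A small numerical sketch of these directions, assuming the standard $\ell_1/\ell_2/\ell_\infty$ examples; the helper name and the test gradient are illustrative.

```python
import numpy as np

def steepest_descent_direction(grad, norm="l2"):
    """Minimizer of <v, grad> over the unit ball of the given norm."""
    if norm == "l2":                 # normalized negative gradient
        return -grad / np.linalg.norm(grad)
    if norm == "l1":                 # single coordinate with the largest |partial|
        i = np.argmax(np.abs(grad))
        d = np.zeros_like(grad)
        d[i] = -np.sign(grad[i])
        return d
    if norm == "linf":               # flip the sign of every coordinate
        return -np.sign(grad)
    raise ValueError(norm)

g = np.array([3.0, -1.0, 0.5])
for norm in ("l2", "l1", "linf"):
    d = steepest_descent_direction(g, norm)
    print(norm, d, "  <v, grad> =", d.dot(g))   # all inner products are negative
```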
Example: TRPO¶
TRPO (Trust Region Policy Optimization) is an optimization method used in reinforcement learning.
Let $\pi_\theta$ denote a policy with parameters $\theta$. The theoretical TRPO update is

$$ \theta_{k+1}=\underset{\theta}{\operatorname{argmax}}\ \mathcal{L}(\theta_k,\theta)\quad\text{s.t.}\quad\bar{D}_{\mathrm{KL}}(\theta\,\|\,\theta_k)\leq\delta, $$

where

$$ \mathcal{L}(\theta_k,\theta)=\mathbb{E}_{s,a\sim\pi_{\theta_k}}\!\left[\frac{\pi_\theta(a|s)}{\pi_{\theta_k}(a|s)}A^{\pi_{\theta_k}}(s,a)\right] $$

is the surrogate advantage, measuring how the new policy $\pi_\theta$ performs relative to the old policy $\pi_{\theta_k}$ using data from the old policy, and

$$ \bar{D}_{\mathrm{KL}}(\theta\,\|\,\theta_k)=\mathbb{E}_{s\sim\pi_{\theta_k}}\!\left[D_{\mathrm{KL}}\big(\pi_\theta(\cdot|s)\,\|\,\pi_{\theta_k}(\cdot|s)\big)\right] $$

is the average KL-divergence between the two policies over states visited by the old policy.
The theoretical TRPO update isn't the easiest to work with, so TRPO makes some approximations to get an answer quickly. We Taylor expand the objective and constraint to leading order around $\theta_k$:

$$ \mathcal{L}(\theta_k,\theta)\approx\boldsymbol{g}^\top(\theta-\theta_k),\qquad\bar{D}_{\mathrm{KL}}(\theta\,\|\,\theta_k)\approx\frac{1}{2}(\theta-\theta_k)^\top H(\theta-\theta_k), $$

where $\boldsymbol{g}=\nabla_\theta\mathcal{L}(\theta_k,\theta)\big|_{\theta=\theta_k}$ and $H=\nabla_\theta^2\bar{D}_{\mathrm{KL}}(\theta\,\|\,\theta_k)\big|_{\theta=\theta_k}$.

- By the property of KL-divergence, both its value and its gradient vanish at $\theta=\theta_k$; thus the second-order Taylor expansion of $\bar{D}_{\mathrm{KL}}$ consists only of the Hessian-matrix term $H$.

This results in an approximate optimization problem,

$$ \theta_{k+1}=\underset{\theta}{\operatorname{argmax}}\ \boldsymbol{g}^\top(\theta-\theta_k)\quad\text{s.t.}\quad\frac{1}{2}(\theta-\theta_k)^\top H(\theta-\theta_k)\leq\delta. $$
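Maximizing a linear objective over this ellipsoid has the closed-form solution $\theta_{k+1}=\theta_k+\sqrt{2\delta/(\boldsymbol{g}^\top H^{-1}\boldsymbol{g})}\,H^{-1}\boldsymbol{g}$. The sketch below computes this step with a conjugate-gradient solve so that only Hessian-vector products are needed; the `hvp` callback, the small numerical safeguard, and the toy matrices are assumptions for illustration (a full TRPO implementation also adds damping and a backtracking line search on the true constraint).

```python
import numpy as np

def conjugate_gradient(hvp, g, iters=10, tol=1e-10):
    """Approximately solve H x = g using only Hessian-vector products."""
    x = np.zeros_like(g)
    r = g.copy()                 # residual g - H x (x = 0 initially)
    p = r.copy()
    rs = r.dot(r)
    for _ in range(iters):
        Hp = hvp(p)
        alpha = rs / p.dot(Hp)
        x += alpha * p
        r -= alpha * Hp
        rs_new = r.dot(r)
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def trpo_step(theta, g, hvp, delta):
    """theta_{k+1} = theta_k + sqrt(2*delta / (g^T H^{-1} g)) * H^{-1} g."""
    x = conjugate_gradient(hvp, g)                      # x ~= H^{-1} g
    step_size = np.sqrt(2.0 * delta / (x.dot(hvp(x)) + 1e-8))  # x^T H x ~= g^T H^{-1} g
    return theta + step_size * x

# Toy usage with an explicit positive-definite matrix standing in for the
# KL Hessian; in practice hvp is computed by automatic differentiation.
H = np.array([[2.0, 0.3], [0.3, 1.0]])
g = np.array([1.0, -0.5])
theta_new = trpo_step(np.zeros(2), g, hvp=lambda v: H @ v, delta=0.01)
```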
Randomized Schemes¶
Coordinate Descent¶
For the coordinate descent method, the descent direction is restricted to a single coordinate: pick an index $i$ (uniformly at random in the randomized scheme) and move along $\Delta\boldsymbol{x}=-\frac{\partial f}{\partial x_i}(\boldsymbol{x})\,\boldsymbol{e}_i$.
Note: since $\mathbb{E}[\Delta\boldsymbol{x}]=-\frac{1}{n}\nabla f(\boldsymbol{x})$ under uniform sampling, each step is a descent step in expectation, while only one partial derivative has to be computed per iteration.
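A minimal sketch of the randomized variant, assuming uniform coordinate sampling and a fixed step size (both illustrative choices):

```python
import numpy as np

def randomized_coordinate_descent(grad_i, x0, step=0.1, iters=1000, rng=None):
    """Each iteration: pick a coordinate i uniformly at random and move along
    -df/dx_i * e_i. `grad_i(x, i)` returns the i-th partial derivative."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.astype(float)
    for _ in range(iters):
        i = rng.integers(x.size)
        x[i] -= step * grad_i(x, i)
    return x

# Usage: f(x) = 0.5 * ||x||^2, so df/dx_i = x_i.
x = randomized_coordinate_descent(lambda x, i: x[i], np.array([3.0, -4.0, 2.0]))
```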
Stochastic Gradient¶
For the stochastic gradient method, the objective is a finite sum $f(\boldsymbol{x})=\frac{1}{m}\sum_{j=1}^{m}f_j(\boldsymbol{x})$, and the descent direction is $\Delta\boldsymbol{x}=-\nabla f_j(\boldsymbol{x})$ for a randomly sampled index $j$, an unbiased estimate of the full negative gradient.
Note: the most common instance is mini-batch gradient descent: uniformly sample a batch (without replacement), then take a descent step along the averaged gradient of that batch.
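A minimal mini-batch SGD sketch for a finite-sum objective; the least-squares example, the batch size, and the constant step size are illustrative assumptions.

```python
import numpy as np

def minibatch_sgd(grad_batch, x0, n, batch_size=32, step=0.01, epochs=10, rng=None):
    """Each epoch: shuffle the indices, split them into batches (sampling
    without replacement), and take one step per batch along the averaged
    negative gradient. `grad_batch(x, batch)` returns that averaged gradient."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.astype(float)
    for _ in range(epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            x -= step * grad_batch(x, batch)
        # The step size is kept constant here; decaying it is common in practice.
    return x

# Usage: least squares f(x) = (1/2n) * ||A x - b||^2.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(200, 5)), rng.normal(size=200)
grad_batch = lambda x, B: A[B].T @ (A[B] @ x - b[B]) / len(B)
x_hat = minibatch_sgd(grad_batch, np.zeros(5), n=200)
```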