Aggregated Momentum (AggMo) is a variant of the classical momentum stochastic optimizer that maintains several velocity vectors, each with its own $\beta$ parameter, and averages them when updating the parameters. Rather than requiring a single momentum parameter to be chosen, it takes a linear combination of multiple momentum buffers: each of the $K$ buffers has its own discount factor $\beta^{(i)}$, collected in $\beta \in \mathbb{R}^{K}$, and the buffers are averaged for the update. The update rule is:
$$ \textbf{v}_{t}^{\left(i\right)} = \beta^{(i)}\textbf{v}_{t-1}^{\left(i\right)} - \nabla_{\theta}f\left(\mathbf{\theta}_{t-1}\right) $$
$$ \mathbf{\theta_{t}} = \mathbf{\theta_{t-1}} + \frac{\gamma_{t}}{K}\sum^{K}_{i=1}\textbf{v}_{t}^{\left(i\right)} $$
where $\textbf{v}^{\left(i\right)}_{0} = 0$ for each $i$ and $\gamma_{t}$ is the learning rate at step $t$. The vector $\beta = \left[\beta^{(1)}, \ldots, \beta^{(K)}\right]$ is called the damping vector.
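The update rule above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' reference implementation: the function names `aggmo_step` and `aggmo_minimize`, the use of a fixed learning rate `lr` in place of $\gamma_{t}$, and the toy quadratic objective are all assumptions made for the example.

```python
import numpy as np

def aggmo_step(theta, velocities, grad, betas, lr):
    """Apply one AggMo update (hypothetical helper, not a library API).

    theta:      parameter array
    velocities: array of shape (K, *theta.shape), one buffer per beta
    grad:       gradient of the objective at theta
    betas:      sequence of K damping coefficients
    lr:         learning rate (gamma_t, held constant here)
    """
    betas = np.asarray(betas, dtype=float)
    # Reshape betas to (K, 1, ..., 1) so each buffer is scaled by its own beta.
    b = betas.reshape((-1,) + (1,) * theta.ndim)
    # v_t^(i) = beta^(i) * v_{t-1}^(i) - grad
    velocities = b * velocities - grad
    # theta_t = theta_{t-1} + (lr / K) * sum_i v_t^(i)
    theta = theta + lr * velocities.mean(axis=0)
    return theta, velocities

def aggmo_minimize(grad_fn, theta0, betas, lr, steps):
    """Run AggMo for a fixed number of steps, starting with zero velocities."""
    theta = np.asarray(theta0, dtype=float)
    velocities = np.zeros((len(betas),) + theta.shape)
    for _ in range(steps):
        theta, velocities = aggmo_step(theta, velocities, grad_fn(theta), betas, lr)
    return theta

# Toy objective f(theta) = 0.5 * ||theta||^2, whose gradient is theta itself.
quadratic_grad = lambda theta: theta
result = aggmo_minimize(quadratic_grad, [1.0, -2.0], betas=[0.0, 0.9], lr=0.1, steps=300)
```

With $K = 1$ the average is over a single buffer, so the sketch reduces to classical momentum with that one $\beta$.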
Source: Aggregated Momentum: Stability Through Passive Damping