MachineIntelligenceCore:NeuralNets
Update using AdaDelta - adaptive gradient descent with running average E[g^2] and E[d^2].
#include <AdaDelta.hpp>
Public Member Functions | |
AdaDelta (size_t rows_, size_t cols_, eT decay_=0.9, eT eps_=1e-8) | |
mic::types::MatrixPtr< eT > | calculateUpdate (mic::types::MatrixPtr< eT > x_, mic::types::MatrixPtr< eT > dx_, eT learning_rate_) |
Public Member Functions inherited from mic::neural_nets::optimization::OptimizationFunction< eT > | |
OptimizationFunction () | |
virtual | ~OptimizationFunction () |
Virtual destructor - empty. | |
virtual void | update (mic::types::MatrixPtr< eT > p_, mic::types::MatrixPtr< eT > dp_, eT learning_rate_, eT decay_=0.0) |
virtual void | update (mic::types::MatrixPtr< eT > p_, mic::types::MatrixPtr< eT > x_, mic::types::MatrixPtr< eT > y_, eT learning_rate_=0.001) |
Protected Attributes | |
eT | decay |
Decay ratio, similar to momentum. | |
eT | eps |
Smoothing term that avoids division by zero. | |
mic::types::MatrixPtr< eT > | EG |
Decaying average of the squares of gradients up to time t ("diagonal matrix") - E[g^2]. | |
mic::types::MatrixPtr< eT > | ED |
Decaying average of the squares of updates up to time t ("diagonal matrix") - E[delta Theta^2]. | |
mic::types::MatrixPtr< eT > | delta |
Calculated update. | |
Update using AdaDelta - adaptive gradient descent with running average E[g^2] and E[d^2].
Definition at line 39 of file AdaDelta.hpp.
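For reference, the textbook AdaDelta rule that the running averages E[g^2] and E[d^2] refer to can be written as below. This is the standard formulation, not a transcription of AdaDelta.hpp: the mapping of \rho to decay and \epsilon to eps is an assumption based on the attribute descriptions, and the sign convention (the update is subtracted from the parameters) is also assumed.

```latex
\begin{aligned}
E[g^2]_t &= \rho \, E[g^2]_{t-1} + (1-\rho)\, g_t^2 \\
\Delta\theta_t &= \frac{\sqrt{E[\Delta\theta^2]_{t-1} + \epsilon}}{\sqrt{E[g^2]_t + \epsilon}} \; g_t \\
E[\Delta\theta^2]_t &= \rho \, E[\Delta\theta^2]_{t-1} + (1-\rho)\, \Delta\theta_t^2 \\
\theta_{t+1} &= \theta_t - \Delta\theta_t
\end{aligned}
```

All operations are element-wise, which is why no learning rate appears in the rule.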
AdaDelta (size_t rows_, size_t cols_, eT decay_=0.9, eT eps_=1e-8) | inline |
Constructor. Sets the matrix dimensions and the values of decay (default=0.9) and eps (default=1e-8).
rows_ | Number of rows of the updated matrix/its gradient. |
cols_ | Number of columns of the updated matrix/its gradient. |
Definition at line 47 of file AdaDelta.hpp.
References mic::neural_nets::optimization::AdaDelta< eT >::delta, mic::neural_nets::optimization::AdaDelta< eT >::ED, and mic::neural_nets::optimization::AdaDelta< eT >::EG.
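mic::types::MatrixPtr< eT > calculateUpdate (mic::types::MatrixPtr< eT > x_, mic::types::MatrixPtr< eT > dx_, eT learning_rate_) | inline virtual |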
Calculates the update according to the AdaDelta update rule.
x_ | Pointer to the current matrix. |
dx_ | Pointer to current gradient of that matrix. |
learning_rate_ | Learning rate (default=0.001). Not used by the AdaDelta update rule. |
Implements mic::neural_nets::optimization::OptimizationFunction< eT >.
Definition at line 65 of file AdaDelta.hpp.
References mic::neural_nets::optimization::AdaDelta< eT >::decay, mic::neural_nets::optimization::AdaDelta< eT >::delta, mic::neural_nets::optimization::AdaDelta< eT >::ED, mic::neural_nets::optimization::AdaDelta< eT >::EG, and mic::neural_nets::optimization::AdaDelta< eT >::eps.
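The following is a minimal, self-contained sketch of what an AdaDelta update of this shape typically computes. It is not the code from AdaDelta.hpp: `AdaDeltaSketch` is a hypothetical stand-in that uses flattened `std::vector<double>` storage instead of `mic::types::MatrixPtr< eT >`, assumes the state starts at zero, and assumes the returned update is meant to be subtracted from the parameters. The member names mirror the documented attributes (decay, eps, EG, ED, delta).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the documented class; not the library implementation.
struct AdaDeltaSketch {
    double decay;               // decay ratio (rho)
    double eps;                 // smoothing term that avoids division by zero
    std::vector<double> EG;     // running average of squared gradients, E[g^2]
    std::vector<double> ED;     // running average of squared updates, E[delta^2]
    std::vector<double> delta;  // last calculated update

    // Assumption: all state matrices start at zero.
    AdaDeltaSketch(std::size_t rows, std::size_t cols,
                   double decay_ = 0.9, double eps_ = 1e-8)
        : decay(decay_), eps(eps_),
          EG(rows * cols, 0.0), ED(rows * cols, 0.0), delta(rows * cols, 0.0) {}

    // dx holds the gradient, flattened to rows * cols entries.
    // Returns the step that the caller would subtract from the parameters.
    const std::vector<double>& calculateUpdate(const std::vector<double>& dx) {
        assert(dx.size() == EG.size());
        for (std::size_t i = 0; i < dx.size(); ++i) {
            // Accumulate the squared gradient.
            EG[i] = decay * EG[i] + (1.0 - decay) * dx[i] * dx[i];
            // Scale the gradient by RMS(previous updates) / RMS(gradients).
            delta[i] = std::sqrt(ED[i] + eps) / std::sqrt(EG[i] + eps) * dx[i];
            // Accumulate the squared update for the next call.
            ED[i] = decay * ED[i] + (1.0 - decay) * delta[i] * delta[i];
        }
        return delta;
    }
};
```

In this sketch no learning rate is needed, which matches the note that learning_rate_ is not used: the ratio of the two running averages sets the step size per element.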
eT decay | protected |
Decay ratio, similar to momentum.
Definition at line 101 of file AdaDelta.hpp.
Referenced by mic::neural_nets::optimization::AdaDelta< eT >::calculateUpdate().
mic::types::MatrixPtr< eT > delta | protected |
Calculated update.
Definition at line 113 of file AdaDelta.hpp.
Referenced by mic::neural_nets::optimization::AdaDelta< eT >::AdaDelta(), and mic::neural_nets::optimization::AdaDelta< eT >::calculateUpdate().
mic::types::MatrixPtr< eT > ED | protected |
Decaying average of the squares of updates up to time t ("diagonal matrix") - E[delta Theta^2].
Definition at line 110 of file AdaDelta.hpp.
Referenced by mic::neural_nets::optimization::AdaDelta< eT >::AdaDelta(), and mic::neural_nets::optimization::AdaDelta< eT >::calculateUpdate().
mic::types::MatrixPtr< eT > EG | protected |
Decaying average of the squares of gradients up to time t ("diagonal matrix") - E[g^2].
Definition at line 107 of file AdaDelta.hpp.
Referenced by mic::neural_nets::optimization::AdaDelta< eT >::AdaDelta(), and mic::neural_nets::optimization::AdaDelta< eT >::calculateUpdate().
eT eps | protected |
Smoothing term that avoids division by zero.
Definition at line 104 of file AdaDelta.hpp.
Referenced by mic::neural_nets::optimization::AdaDelta< eT >::calculateUpdate().