LBANN 0.103.0
Livermore Big Artificial Neural Network Toolkit
Abstract base class for gradient-based optimization algorithms.
#include <optimizer.hpp>
Classes | |
| class | GradientHelper |
| Manage gradient information. | |
| class | GradientHelperImpl |
Public Member Functions | |
| virtual std::string | get_type () const =0 |
| Human-readable type name. | |
| virtual description | get_description () const |
| Human-readable description. | |
| virtual double | get_learning_rate () const =0 |
| virtual void | set_learning_rate (double)=0 |
| virtual void | write_proto (lbann_data::Optimizer &proto) const =0 |
| Add optimizer data to prototext. | |
| optimizer (const optimizer &other) | |
| Copy construct/copy assign. | |
| optimizer & | operator= (const optimizer &other) |
| optimizer_gradient_status | get_gradient_status () const |
| Return the current gradient status. | |
| void | set_gradient_status (const optimizer_gradient_status status) |
| std::unordered_set< const void * > & | get_gradient_sources () |
| void | set_comm (lbann_comm &comm) |
| void | set_step_time (EvalType time) |
| void | inc_step_time (EvalType time) |
| virtual std::tuple< El::Int, El::Int, El::DistData > | get_matrix_info () const =0 |
| template<typename TensorDataType > | |
| void | accumulate_all_gradient_contributions (El::AbstractDistMatrix< TensorDataType > &gradient) |
| void | start_gradient_allreduce () |
| Launch non-blocking allreduce on the gradient, if needed. | |
| void | finish_gradient_allreduce () |
| Synchronize non-blocking allreduce on the gradient, if needed. | |
Constructors and Destructor | |
| optimizer () | |
| virtual | ~optimizer ()=default |
Gradient update management | |
| virtual void | setup (weights *w)=0 |
| template<typename TensorDataType > | |
| void | add_to_gradient (El::AbstractDistMatrix< TensorDataType > const &contrib, TensorDataType scale=1.f, bool allreduce_needed=false) |
| Add to the objective function gradient w.r.t. the weights. | |
| void | clear_gradient () |
| Zero out the objective function gradient w.r.t. the weights. | |
| El::Int | get_num_gradient_sources () const |
| Objects that are expected to contribute to the gradient. | |
| void | add_gradient_source (const void *source) |
| Register a gradient source. | |
| void | remove_gradient_source (const void *source) |
| Unregister a gradient source. | |
| virtual void | step ()=0 |
| Perform optimization step. | |
| template<typename TensorDataType > | |
| El::AbstractDistMatrix< TensorDataType > & | get_gradient_buffer (TensorDataType &buf_scale, TensorDataType &in_scale, bool allreduce_needed=false) |
| Get the gradient buffer. | |
| lbann_comm & | get_comm () |
| Communicator access. | |
| const lbann_comm & | get_comm () const |
| Access LBANN communicator. | |
| EvalType | get_step_time () const |
| Statistics access and management. | |
| virtual void | reset_counters () |
| Reset stats counters. | |
Checkpointing | |
| template<class Archive > | |
| void | serialize (Archive &ar) |
| Store state to archive for checkpoint and restart. | |
Public Member Functions inherited from lbann::Cloneable< HasAbstractFunction< optimizer > > | |
| std::unique_ptr< HasAbstractFunction< optimizer > > | clone () const |
| Return an exception-safe, memory-safe copy of this object. | |
Private Types | |
| using | gradient_manager_type = GradientHelper |
| Map from data types to gradient contributions. | |
| using | gradient_manager_ptr = std::unique_ptr< gradient_manager_type > |
Private Attributes | |
| lbann_comm * | m_comm |
| LBANN communicator. | |
| std::unordered_set< const void * > | m_gradient_sources |
| Sources of gradient contributions. | |
| optimizer_gradient_status | m_gradient_status |
| Status of values in objective function gradient. | |
| EvalType | m_step_time = 0 |
| Time spent in optimization step. | |
| std::unordered_map< std::type_index, gradient_manager_ptr > | gradients_ |
Abstract base class for gradient-based optimization algorithms.
Uses a variant of stochastic gradient descent to optimize the values in a weights instance. The weights values are iteratively adjusted to minimize an objective function. Each optimization step requires the objective function gradient w.r.t. the weights.
Definition at line 85 of file optimizer.hpp.
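To make the iterative update concrete, the following is a minimal sketch of one plain gradient-descent step, w <- w - lr * g, on a flat vector. The class `toy_sgd` and the helper `minimize_quadratic` are hypothetical names used only for illustration; they are not part of LBANN and do not derive from `lbann::optimizer`, whose concrete subclasses operate on distributed matrices and may use other update rules.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an optimizer; NOT LBANN's class.
class toy_sgd {
public:
  explicit toy_sgd(double learning_rate) : m_learning_rate(learning_rate) {}

  double get_learning_rate() const { return m_learning_rate; }
  void set_learning_rate(double lr) { m_learning_rate = lr; }

  // One plain gradient-descent step: w <- w - lr * g.
  void step(std::vector<double>& weights,
            const std::vector<double>& gradient) const {
    assert(weights.size() == gradient.size());
    for (std::size_t i = 0; i < weights.size(); ++i) {
      weights[i] -= m_learning_rate * gradient[i];
    }
  }

private:
  double m_learning_rate;
};

// Minimize f(w) = 0.5 * w^2 (whose gradient is w), starting from w = 1
// with a learning rate of 0.5. Returns the weight after num_steps steps.
inline double minimize_quadratic(int num_steps) {
  toy_sgd opt(0.5);
  std::vector<double> w{1.0};
  for (int i = 0; i < num_steps; ++i) {
    const std::vector<double> grad{w[0]};
    opt.step(w, grad);
  }
  return w[0];
}
```

With these assumed values, each step halves the weight, so repeated steps move it toward the minimizer at zero.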
| using lbann::optimizer::gradient_manager_ptr = std::unique_ptr< gradient_manager_type > |
private
Definition at line 329 of file optimizer.hpp.
| using lbann::optimizer::gradient_manager_type = GradientHelper |
private
Map from data types to gradient contributions.
Definition at line 328 of file optimizer.hpp.
| lbann::optimizer::optimizer | ( | ) |
| virtual lbann::optimizer::~optimizer | ( | ) |
virtual default
| lbann::optimizer::optimizer | ( | const optimizer & | other | ) |
Copy construct/copy assign.
| void lbann::optimizer::accumulate_all_gradient_contributions | ( | El::AbstractDistMatrix< TensorDataType > & | gradient | ) |
| void lbann::optimizer::add_gradient_source | ( | const void * | source | ) |
Register a gradient source.
Any object that uses the weights and influences the objective function is expected to contribute to the objective function gradient. These objects should register themselves during forward prop.
| void lbann::optimizer::add_to_gradient | ( | El::AbstractDistMatrix< TensorDataType > const & | contrib, |
| TensorDataType | scale = 1.f, |
||
| bool | allreduce_needed = false |
||
| ) |
Add to the objective function gradient w.r.t. the weights.
| contrib | Contribution to gradient. |
| scale | Scaling factor for gradient contribution. |
| allreduce_needed | Whether the gradient contribution requires an allreduce over its redundant communicator. If false, duplicated data (over the redundant communicator) is assumed to be identical. If true, an allreduce is performed lazily when the gradient is accessed. |
Definition at line 36 of file optimizer_impl.hpp.
| void lbann::optimizer::clear_gradient | ( | ) |
Zero out the objective function gradient w.r.t. the weights.
| void lbann::optimizer::finish_gradient_allreduce | ( | ) |
Synchronize non-blocking allreduce on the gradient, if needed.
Does nothing if an allreduce isn't needed. Throws an exception if an allreduce is needed but hasn't been started.
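| lbann_comm & lbann::optimizer::get_comm | ( | ) |
inline
| const lbann_comm & lbann::optimizer::get_comm | ( | ) | const |
inline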
Access LBANN communicator.
Definition at line 190 of file optimizer.hpp.
| virtual description lbann::optimizer::get_description | ( | ) | const |
virtual
Human-readable description.
| El::AbstractDistMatrix< TensorDataType > & lbann::optimizer::get_gradient_buffer | ( | TensorDataType & | buf_scale, |
| TensorDataType & | in_scale, | ||
| bool | allreduce_needed = false |
||
| ) |
Get the gradient buffer.
This provides access to the underlying gradient buffer, which may be directly summed into. This buffer should be considered ephemeral and not stored. The caller must also ensure the buffer has an appropriate distribution. buf_scale provides the caller with a scale factor that must be applied to the gradient buffer before writing to it, and in_scale provides a scaling factor that must be applied to the user's data. Essentially, this enables computations of the form
    gradient = buf_scale*gradient + in_scale*new_gradient
This is an expert-mode function and is intended to help eliminate copies and facilitate kernel fusion.
| buf_scale | A scale factor provided to the caller to scale the returned buffer by. |
| in_scale | A scale factor provided to the caller to scale their gradient contributions by. |
| allreduce_needed | Whether this gradient contribution will need to be allreduced. |
Definition at line 49 of file optimizer_impl.hpp.
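The scaled update described for get_gradient_buffer can be sketched on a plain `std::vector` as a single fused pass. `fused_accumulate` and `demo_fused_accumulate` are hypothetical helpers, and the scale values in the demo are assumptions chosen for illustration; in LBANN the optimizer supplies `buf_scale` and `in_scale`, and the buffer is an `El::AbstractDistMatrix`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an LBANN function): applies
//   gradient = buf_scale * gradient + in_scale * contrib
// in a single pass, without a temporary copy of contrib.
template <typename T>
void fused_accumulate(std::vector<T>& gradient, T buf_scale,
                      const std::vector<T>& contrib, T in_scale) {
  assert(gradient.size() == contrib.size());
  for (std::size_t i = 0; i < gradient.size(); ++i) {
    gradient[i] = buf_scale * gradient[i] + in_scale * contrib[i];
  }
}

// Usage with assumed scale factors: keep the existing buffer contents
// (buf_scale = 1) and add half of the new contribution (in_scale = 0.5).
inline std::vector<float> demo_fused_accumulate() {
  std::vector<float> gradient{1.f, 2.f};
  const std::vector<float> contrib{10.f, 20.f};
  fused_accumulate(gradient, 1.f, contrib, 0.5f);
  return gradient;
}
```

In this sketch a `buf_scale` of zero overwrites whatever the buffer held, while a `buf_scale` of one accumulates into it, which is why a caller has to apply both factors exactly as given.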
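| std::unordered_set< const void * > & lbann::optimizer::get_gradient_sources | ( | ) |
inline
Definition at line 271 of file optimizer.hpp.
| optimizer_gradient_status lbann::optimizer::get_gradient_status | ( | ) | const |
inline
Return the current gradient status.
Definition at line 263 of file optimizer.hpp.
| virtual double lbann::optimizer::get_learning_rate | ( | ) | const |
pure virtual
| virtual std::tuple< El::Int, El::Int, El::DistData > lbann::optimizer::get_matrix_info | ( | ) | const |
pure virtual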
| El::Int lbann::optimizer::get_num_gradient_sources | ( | ) | const |
Objects that are expected to contribute to the gradient.
| EvalType lbann::optimizer::get_step_time | ( | ) | const |
inline
Statistics access and management.
Time spent in optimization step.
Definition at line 197 of file optimizer.hpp.
| virtual std::string lbann::optimizer::get_type | ( | ) | const |
pure virtual
Human-readable type name.
| void lbann::optimizer::inc_step_time | ( | EvalType | time | ) |
inline
Definition at line 279 of file optimizer.hpp.
| void lbann::optimizer::remove_gradient_source | ( | const void * | source | ) |
Unregister a gradient source.
When an object adds its contribution to the objective function gradient during back prop, it should unregister itself. If there are no more gradient sources remaining, a non-blocking allreduce will be launched on the gradient, if needed.
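The register/unregister protocol can be sketched with a small set-based tracker. `toy_source_tracker` is a hypothetical class, not LBANN code. Its `remove_gradient_source` returns a flag marking the moment the last source is removed, which is where the text above says a non-blocking allreduce would be launched if needed. The real member function returns `void`.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Hypothetical tracker mirroring the protocol described above; NOT LBANN code.
class toy_source_tracker {
public:
  // Called by an object when it uses the weights during forward prop.
  // Registering the same pointer twice has no additional effect.
  void add_gradient_source(const void* source) { m_sources.insert(source); }

  // Called by an object once it has added its gradient contribution
  // during back prop. Returns true only when this call removed the last
  // remaining source, i.e. the point at which an allreduce could start.
  bool remove_gradient_source(const void* source) {
    const bool erased = m_sources.erase(source) > 0;
    return erased && m_sources.empty();
  }

  std::size_t get_num_gradient_sources() const { return m_sources.size(); }

private:
  std::unordered_set<const void*> m_sources;
};
```

Keying the set on `const void*` lets unrelated object types (for example layers and objective function terms) register without sharing a base class, at the cost of type safety.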
| virtual void lbann::optimizer::reset_counters | ( | ) |
inline virtual
Reset stats counters.
Definition at line 200 of file optimizer.hpp.
| void lbann::optimizer::serialize | ( | Archive & | ar | ) |
Store state to archive for checkpoint and restart.
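| void lbann::optimizer::set_comm | ( | lbann_comm & | comm | ) |
inline
Definition at line 275 of file optimizer.hpp.
| void lbann::optimizer::set_gradient_status | ( | const optimizer_gradient_status | status | ) |
inline
Definition at line 267 of file optimizer.hpp.
| virtual void lbann::optimizer::set_learning_rate | ( | double | ) |
pure virtual
| void lbann::optimizer::set_step_time | ( | EvalType | time | ) |
inline
Definition at line 277 of file optimizer.hpp.
| virtual void lbann::optimizer::setup | ( | weights * | w | ) |
pure virtual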
| void lbann::optimizer::start_gradient_allreduce | ( | ) |
Launch non-blocking allreduce on the gradient, if needed.
Does nothing if an allreduce is not needed or has already been started.
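The documented behaviour of start_gradient_allreduce and finish_gradient_allreduce amounts to a small state machine, sketched below. The enumerators of `toy_status` are assumptions, since the values of `optimizer_gradient_status` are not listed on this page, and `toy_allreduce_state` is a hypothetical class that only tracks state without performing any communication.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical status values; the real optimizer_gradient_status
// enumerators are not shown on this page.
enum class toy_status { ready, allreduce_needed, allreduce_started };

// Hypothetical state tracker; NOT LBANN code and performs no communication.
class toy_allreduce_state {
public:
  explicit toy_allreduce_state(toy_status s) : m_status(s) {}

  // Does nothing unless an allreduce is needed and has not been started.
  void start_gradient_allreduce() {
    if (m_status == toy_status::allreduce_needed) {
      // A real implementation would launch the non-blocking allreduce here.
      m_status = toy_status::allreduce_started;
    }
  }

  // Does nothing if no allreduce is needed; throws if one is needed
  // but was never started; otherwise completes the pending allreduce.
  void finish_gradient_allreduce() {
    switch (m_status) {
    case toy_status::ready:
      return;
    case toy_status::allreduce_needed:
      throw std::logic_error("allreduce needed but not started");
    case toy_status::allreduce_started:
      // A real implementation would wait on the outstanding request here.
      m_status = toy_status::ready;
      return;
    }
  }

  toy_status status() const { return m_status; }

private:
  toy_status m_status;
};
```

Making both calls idempotent in the states where they have nothing to do lets callers invoke them unconditionally, while the exception flags the one ordering mistake that would otherwise read an incomplete gradient.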
| virtual void lbann::optimizer::step | ( | ) |
pure virtual
Perform optimization step.
| virtual void lbann::optimizer::write_proto | ( | lbann_data::Optimizer & | proto | ) | const |
pure virtual
Add optimizer data to prototext.
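| std::unordered_map< std::type_index, gradient_manager_ptr > lbann::optimizer::gradients_ |
private
Definition at line 330 of file optimizer.hpp.
| lbann_comm * lbann::optimizer::m_comm |
private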
LBANN communicator.
Definition at line 304 of file optimizer.hpp.
| std::unordered_set< const void * > lbann::optimizer::m_gradient_sources |
private
Sources of gradient contributions.
This set contains pointers to objects (e.g. layers and objective function terms) that contribute to the objective function gradient. Objects should register themselves as they use the weights during forward prop and unregister themselves as they add their gradient contributions. Once this set is empty, it is safe to launch a non-blocking allreduce on the gradient, if needed.
Definition at line 316 of file optimizer.hpp.
| optimizer_gradient_status lbann::optimizer::m_gradient_status |
private
Status of values in objective function gradient.
Definition at line 319 of file optimizer.hpp.
| EvalType lbann::optimizer::m_step_time = 0 |
private
Time spent in optimization step.
Definition at line 323 of file optimizer.hpp.