LBANN  0.103.0
Livermore Big Artificial Neural Network Toolkit
lbann::optimizer Class Reference (abstract)

Abstract base class for gradient-based optimization algorithms.

#include <optimizer.hpp>

Classes

class  GradientHelper
 Manage gradient information.
 
class  GradientHelperImpl
 

Public Member Functions

virtual std::string get_type () const =0
 Human-readable type name.
 
virtual description get_description () const
 Human-readable description.
 
virtual double get_learning_rate () const =0
 
virtual void set_learning_rate (double)=0
 
virtual void write_proto (lbann_data::Optimizer &proto) const =0
 Add optimizer data to prototext.
 
 optimizer (const optimizer &other)
 Copy construct/copy assign.
 
optimizer & operator= (const optimizer &other)
 
optimizer_gradient_status get_gradient_status () const
 Return the current gradient status.
 
void set_gradient_status (const optimizer_gradient_status status)
 
std::unordered_set< const void * > & get_gradient_sources ()
 
void set_comm (lbann_comm &comm)
 
void set_step_time (EvalType time)
 
void inc_step_time (EvalType time)
 
virtual std::tuple< El::Int, El::Int, El::DistData > get_matrix_info () const =0
 
template<typename TensorDataType >
void accumulate_all_gradient_contributions (El::AbstractDistMatrix< TensorDataType > &gradient)
 
void start_gradient_allreduce ()
 Launch non-blocking allreduce on the gradient, if needed.
 
void finish_gradient_allreduce ()
 Synchronize non-blocking allreduce on the gradient, if needed.
 
Constructors and Destructor
 optimizer ()
 
virtual ~optimizer ()=default
 
Gradient update management
virtual void setup (weights *w)=0
 
template<typename TensorDataType >
void add_to_gradient (El::AbstractDistMatrix< TensorDataType > const &contrib, TensorDataType scale=1.f, bool allreduce_needed=false)
 Add to the objective function gradient w.r.t. the weights.
 
void clear_gradient ()
 Zero out the objective function gradient w.r.t. the weights.
 
El::Int get_num_gradient_sources () const
 Number of objects that are expected to contribute to the gradient.
 
void add_gradient_source (const void *source)
 Register a gradient source.
 
void remove_gradient_source (const void *source)
 Unregister a gradient source.
 
virtual void step ()=0
 Perform optimization step.
 
template<typename TensorDataType >
El::AbstractDistMatrix< TensorDataType > & get_gradient_buffer (TensorDataType &buf_scale, TensorDataType &in_scale, bool allreduce_needed=false)
 Get the gradient buffer.
 
lbann_comm & get_comm ()
 Communicator access.
 
const lbann_comm & get_comm () const
 Access LBANN communicator.
 
EvalType get_step_time () const
 Statistics access and management.
 
virtual void reset_counters ()
 Reset stats counters.
 
Checkpointing
template<class Archive >
void serialize (Archive &ar)
 Store state to archive for checkpoint and restart.
 
Public Member Functions inherited from lbann::Cloneable< HasAbstractFunction< optimizer > >
std::unique_ptr< HasAbstractFunction< optimizer > > clone () const
 Return an exception-safe, memory-safe copy of this object.
 

Private Types

using gradient_manager_type = GradientHelper
 Map from data types to gradient contributions.
 
using gradient_manager_ptr = std::unique_ptr< gradient_manager_type >
 

Private Attributes

lbann_comm * m_comm
 LBANN communicator.
 
std::unordered_set< const void * > m_gradient_sources
 Sources of gradient contributions.
 
optimizer_gradient_status m_gradient_status
 Status of values in objective function gradient.
 
EvalType m_step_time = 0
 Time spent in optimization step.
 
std::unordered_map< std::type_index, gradient_manager_ptr > gradients_
 

Detailed Description

Abstract base class for gradient-based optimization algorithms.

Uses a variant of stochastic gradient descent to optimize the values in a weights instance. The weights values are iteratively adjusted to minimize an objective function. Each optimization step requires the objective function gradient w.r.t. the weights.
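
As a rough illustration of the update described above, the following standalone sketch mimics the shape of the interface with a toy base class and a plain SGD subclass. It does not use LBANN: toy_optimizer, toy_sgd, and the std::vector<double> weights are hypothetical stand-ins, and only the method names get_type, get_learning_rate, set_learning_rate, and step are borrowed from this class. Unlike the real step(), which takes no arguments, the toy version receives the weights and gradient explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the abstract interface; not the LBANN class.
class toy_optimizer {
public:
  virtual ~toy_optimizer() = default;
  virtual std::string get_type() const = 0;
  virtual double get_learning_rate() const = 0;
  virtual void set_learning_rate(double lr) = 0;
  // Adjust the weights in place using the objective function gradient.
  virtual void step(std::vector<double>& weights,
                    const std::vector<double>& gradient) = 0;
};

// Plain stochastic gradient descent: w <- w - lr * g.
class toy_sgd : public toy_optimizer {
public:
  explicit toy_sgd(double lr) : m_lr(lr) {}
  std::string get_type() const override { return "toy SGD"; }
  double get_learning_rate() const override { return m_lr; }
  void set_learning_rate(double lr) override { m_lr = lr; }
  void step(std::vector<double>& weights,
            const std::vector<double>& gradient) override {
    assert(weights.size() == gradient.size());
    for (std::size_t i = 0; i < weights.size(); ++i) {
      weights[i] -= m_lr * gradient[i];
    }
  }

private:
  double m_lr;  // learning rate
};
```

A concrete optimizer would typically add state such as momentum buffers; the sketch keeps only the learning rate to show the minimum a subclass has to provide.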

Definition at line 85 of file optimizer.hpp.

Member Typedef Documentation

◆ gradient_manager_ptr

Definition at line 329 of file optimizer.hpp.

◆ gradient_manager_type

Map from data types to gradient contributions.

Todo:
Refactor this out. It's a hack.

Definition at line 328 of file optimizer.hpp.

Constructor & Destructor Documentation

◆ optimizer() [1/2]

lbann::optimizer::optimizer ( )

◆ ~optimizer()

virtual lbann::optimizer::~optimizer ( )
virtual default

◆ optimizer() [2/2]

lbann::optimizer::optimizer ( const optimizer &  other)

Copy construct/copy assign.

Member Function Documentation

◆ accumulate_all_gradient_contributions()

template<typename TensorDataType >
void lbann::optimizer::accumulate_all_gradient_contributions ( El::AbstractDistMatrix< TensorDataType > &  gradient)

Definition at line 112 of file optimizer_impl.hpp.

◆ add_gradient_source()

void lbann::optimizer::add_gradient_source ( const void *  source)

Register a gradient source.

Any object that uses the weights and influences the objective function is expected to contribute to the objective function gradient. These objects should register themselves during forward prop.

◆ add_to_gradient()

template<typename TensorDataType >
void lbann::optimizer::add_to_gradient ( El::AbstractDistMatrix< TensorDataType > const &  contrib,
TensorDataType  scale = 1.f,
bool  allreduce_needed = false 
)

Add to the objective function gradient w.r.t. the weights.

Parameters
contrib  Contribution to gradient.
scale  Scaling factor for gradient contribution.
allreduce_needed  Whether the gradient contribution requires an allreduce over its redundant communicator. If false, duplicated data (over the redundant communicator) is assumed to be identical. If true, an allreduce is performed lazily when the gradient is accessed.

Definition at line 36 of file optimizer_impl.hpp.
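
The accumulation semantics can be pictured with a standalone sketch. It assumes the update is gradient += scale * contrib and that a request for an allreduce stays pending until one is performed; both are illustrative assumptions, not taken from optimizer_impl.hpp. toy_gradient and allreduce_pending are hypothetical names, and std::vector<double> stands in for El::AbstractDistMatrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical gradient holder; a std::vector stands in for a distributed matrix.
struct toy_gradient {
  std::vector<double> values;
  bool allreduce_pending = false;  // set once any contribution asks for an allreduce

  explicit toy_gradient(std::size_t n) : values(n, 0.0) {}

  // gradient += scale * contrib, and remember whether a reduction is owed.
  void add_to_gradient(const std::vector<double>& contrib,
                       double scale = 1.0,
                       bool allreduce_needed = false) {
    assert(contrib.size() == values.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
      values[i] += scale * contrib[i];
    }
    allreduce_pending = allreduce_pending || allreduce_needed;
  }
};
```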

◆ clear_gradient()

void lbann::optimizer::clear_gradient ( )

Zero out the objective function gradient w.r.t. the weights.

◆ finish_gradient_allreduce()

void lbann::optimizer::finish_gradient_allreduce ( )

Synchronize non-blocking allreduce on the gradient, if needed.

Does nothing if an allreduce isn't needed. Throws an exception if an allreduce is needed but hasn't been started.

◆ get_comm() [1/2]

lbann_comm& lbann::optimizer::get_comm ( )
inline

Communicator access.

Access LBANN communicator.

Definition at line 187 of file optimizer.hpp.

◆ get_comm() [2/2]

const lbann_comm& lbann::optimizer::get_comm ( ) const
inline

Access LBANN communicator.

Definition at line 190 of file optimizer.hpp.

◆ get_description()

virtual description lbann::optimizer::get_description ( ) const
virtual

Human-readable description.

◆ get_gradient_buffer()

template<typename TensorDataType >
El::AbstractDistMatrix< TensorDataType > & lbann::optimizer::get_gradient_buffer ( TensorDataType &  buf_scale,
TensorDataType &  in_scale,
bool  allreduce_needed = false 
)

Get the gradient buffer.

This provides access to the underlying gradient buffer, which callers may sum into directly. The buffer should be considered ephemeral and must not be stored. The caller must also ensure the buffer has an appropriate distribution. buf_scale is a scale factor that the caller must apply to the gradient buffer before writing to it, and in_scale is a scale factor that the caller must apply to its own data. Together, these enable computations of the form

    gradient = buf_scale*gradient + in_scale*new_gradient

This is an expert-mode function and is intended to help eliminate copies and facilitate kernel fusion.

Parameters
buf_scale  A scale factor provided to the caller to scale the returned buffer by.
in_scale  A scale factor provided to the caller to scale their gradient contributions by.
allreduce_needed  Whether this gradient contribution will need to be allreduced.

Definition at line 49 of file optimizer_impl.hpp.
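
The caller-side pattern, gradient = buf_scale*gradient + in_scale*new_gradient, can be sketched without LBANN as follows. The choice of scale factors here (overwrite a cleared buffer with buf_scale = 0, otherwise accumulate with buf_scale = 1, and in_scale = 1 in both cases) is an assumption made for illustration. toy_gradient_buffer, contribute, clear, and values are hypothetical names, and the allreduce_needed argument is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical holder that hands out its raw buffer plus two scale factors.
class toy_gradient_buffer {
public:
  explicit toy_gradient_buffer(std::size_t n) : m_values(n, 0.0) {}

  std::vector<double>& get_gradient_buffer(double& buf_scale, double& in_scale) {
    // Assumption: a cleared buffer is overwritten, a live one is accumulated into.
    buf_scale = m_cleared ? 0.0 : 1.0;
    in_scale = 1.0;
    m_cleared = false;
    return m_values;
  }

  // Lazy clear: old values are discarded by the next buf_scale = 0 write.
  void clear() { m_cleared = true; }

  const std::vector<double>& values() const { return m_values; }

private:
  std::vector<double> m_values;
  bool m_cleared = true;
};

// Caller side: a single fused loop applies both scale factors.
inline void contribute(toy_gradient_buffer& g,
                       const std::vector<double>& new_gradient) {
  double buf_scale = 0.0;
  double in_scale = 0.0;
  std::vector<double>& buf = g.get_gradient_buffer(buf_scale, in_scale);
  assert(buf.size() == new_gradient.size());
  for (std::size_t i = 0; i < buf.size(); ++i) {
    buf[i] = buf_scale * buf[i] + in_scale * new_gradient[i];
  }
}
```

Returning the scale factors instead of zeroing the buffer up front lets the caller fold the clear into its own write, which is one way to avoid an extra pass over the data.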

◆ get_gradient_sources()

std::unordered_set<const void*>& lbann::optimizer::get_gradient_sources ( )
inline

Definition at line 271 of file optimizer.hpp.

◆ get_gradient_status()

optimizer_gradient_status lbann::optimizer::get_gradient_status ( ) const
inline

Return the current gradient status.

Definition at line 263 of file optimizer.hpp.

◆ get_learning_rate()

virtual double lbann::optimizer::get_learning_rate ( ) const
pure virtual

◆ get_matrix_info()

virtual std::tuple<El::Int, El::Int, El::DistData> lbann::optimizer::get_matrix_info ( ) const
pure virtual

◆ get_num_gradient_sources()

El::Int lbann::optimizer::get_num_gradient_sources ( ) const

Number of objects that are expected to contribute to the gradient.

◆ get_step_time()

EvalType lbann::optimizer::get_step_time ( ) const
inline

Statistics access and management.

Time spent in optimization step.

Definition at line 197 of file optimizer.hpp.

◆ get_type()

virtual std::string lbann::optimizer::get_type ( ) const
pure virtual

Human-readable type name.

◆ inc_step_time()

void lbann::optimizer::inc_step_time ( EvalType  time)
inline

Definition at line 279 of file optimizer.hpp.

◆ operator=()

optimizer& lbann::optimizer::operator= ( const optimizer &  other)

◆ remove_gradient_source()

void lbann::optimizer::remove_gradient_source ( const void *  source)

Unregister a gradient source.

When an object adds its contribution to the objective function gradient during back prop, it should unregister itself. If there are no more gradient sources remaining, a non-blocking allreduce will be launched on the gradient, if needed.
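
The register/unregister protocol can be sketched as a small standalone tracker. This is a hypothetical model, not the LBANN implementation: toy_source_tracker and allreduce_launches are invented names, and a counter stands in for launching the non-blocking allreduce.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Hypothetical tracker for objects that still owe a gradient contribution.
class toy_source_tracker {
public:
  // Called during forward prop by anything that uses the weights.
  void add_gradient_source(const void* source) {
    assert(source != nullptr);
    m_sources.insert(source);
  }

  // Called during back prop once the object has added its contribution.
  void remove_gradient_source(const void* source) {
    const bool removed = m_sources.erase(source) > 0;
    if (removed && m_sources.empty()) {
      // Last contributor is done: the real class would launch the
      // non-blocking allreduce here, if one is needed.
      ++m_allreduce_launches;
    }
  }

  std::size_t get_num_gradient_sources() const { return m_sources.size(); }
  int allreduce_launches() const { return m_allreduce_launches; }

private:
  std::unordered_set<const void*> m_sources;
  int m_allreduce_launches = 0;
};
```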

◆ reset_counters()

virtual void lbann::optimizer::reset_counters ( )
inline virtual

Reset stats counters.

Definition at line 200 of file optimizer.hpp.

◆ serialize()

template<class Archive >
void lbann::optimizer::serialize ( Archive &  ar)

Store state to archive for checkpoint and restart.

◆ set_comm()

void lbann::optimizer::set_comm ( lbann_comm &  comm)
inline

Definition at line 275 of file optimizer.hpp.

◆ set_gradient_status()

void lbann::optimizer::set_gradient_status ( const optimizer_gradient_status  status)
inline

Definition at line 267 of file optimizer.hpp.

◆ set_learning_rate()

virtual void lbann::optimizer::set_learning_rate ( double  )
pure virtual

◆ set_step_time()

void lbann::optimizer::set_step_time ( EvalType  time)
inline

Definition at line 277 of file optimizer.hpp.

◆ setup()

virtual void lbann::optimizer::setup ( weights *  w)
pure virtual

◆ start_gradient_allreduce()

void lbann::optimizer::start_gradient_allreduce ( )

Launch non-blocking allreduce on the gradient, if needed.

Does nothing if an allreduce is not needed or has already been started.
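
The documented behaviour of start_gradient_allreduce() and finish_gradient_allreduce() amounts to a small state machine, sketched below without LBANN. The enumerators of optimizer_gradient_status are not listed on this page, so toy_status and its values are hypothetical, as are toy_allreduce and mark_allreduce_needed; no communication is performed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical status values, chosen only for this sketch.
enum class toy_status { ready, allreduce_needed, allreduce_started };

class toy_allreduce {
public:
  void mark_allreduce_needed() { m_status = toy_status::allreduce_needed; }
  toy_status status() const { return m_status; }

  // Does nothing if an allreduce is not needed or has already been started.
  void start_gradient_allreduce() {
    if (m_status == toy_status::allreduce_needed) {
      // A real implementation would launch the non-blocking allreduce here.
      m_status = toy_status::allreduce_started;
    }
  }

  // Does nothing if an allreduce is not needed; throws if one is needed
  // but was never started.
  void finish_gradient_allreduce() {
    switch (m_status) {
    case toy_status::ready:
      return;
    case toy_status::allreduce_needed:
      throw std::logic_error("allreduce is needed but has not been started");
    case toy_status::allreduce_started:
      // A real implementation would wait on the outstanding request here.
      m_status = toy_status::ready;
      return;
    }
  }

private:
  toy_status m_status = toy_status::ready;
};
```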

◆ step()

virtual void lbann::optimizer::step ( )
pure virtual

Perform optimization step.

◆ write_proto()

virtual void lbann::optimizer::write_proto ( lbann_data::Optimizer &  proto) const
pure virtual

Add optimizer data to prototext.

Member Data Documentation

◆ gradients_

std::unordered_map<std::type_index, gradient_manager_ptr> lbann::optimizer::gradients_
private

Definition at line 330 of file optimizer.hpp.

◆ m_comm

lbann_comm* lbann::optimizer::m_comm
private

LBANN communicator.

Definition at line 304 of file optimizer.hpp.

◆ m_gradient_sources

std::unordered_set<const void*> lbann::optimizer::m_gradient_sources
private

Sources of gradient contributions.

This set contains pointers to objects (e.g. layers and objective function terms) that contribute to the objective function gradient. Objects should register themselves as they use the weights during forward prop and unregister themselves as they add their gradient contributions. Once this set is empty, it is safe to launch a non-blocking allreduce on the gradient, if needed.

Definition at line 316 of file optimizer.hpp.

◆ m_gradient_status

optimizer_gradient_status lbann::optimizer::m_gradient_status
private
Initial value:

Status of values in objective function gradient.

Definition at line 319 of file optimizer.hpp.

◆ m_step_time

EvalType lbann::optimizer::m_step_time = 0
private

Time spent in optimization step.

Definition at line 323 of file optimizer.hpp.


The documentation for this class was generated from the following files: