|
LBANN
0.103.0
Livermore Big Artificial Neural Network Toolkit
|
An implementation of the LTFB training algorithm.
#include <ltfb.hpp>
Public Types | |
| using | TermCriteriaType = ltfb::LTFBTerminationCriteria |
| using | ExeContextType = ltfb::LTFBExecutionContext |
Public Member Functions | |
Life-cycle management | |
| LTFB (std::string name, std::unique_ptr< TrainingAlgorithm > local_training_algorithm, std::unique_ptr< ltfb::MetaLearningStrategy > meta_learning_strategy, ltfb::LTFBTerminationCriteria stopping_criteria, bool suppress_timer) | |
| Construct LTFB from its component pieces. | |
| ~LTFB () noexcept=default | |
| LTFB (LTFB const &other)=delete | |
| LTFB & | operator= (LTFB const &)=delete |
| LTFB (LTFB &&)=default | |
| LTFB & | operator= (LTFB &&)=default |
| std::string | get_type () const final |
| Queries. | |
Apply interface | |
| void | apply (ExecutionContext &context, model &m, data_coordinator &dc, execution_mode mode) final |
| Apply the training algorithm to refine model weights. | |
Public Member Functions inherited from lbann::TrainingAlgorithm | |
| TrainingAlgorithm (std::string name) | |
| Constructor. | |
| virtual | ~TrainingAlgorithm ()=default |
| std::string const & | get_name () const noexcept |
| A user-defined string identifying the algorithm object. | |
| void | apply (model &model, data_coordinator &dc) |
| Apply the algorithm to the given model. | |
| void | setup_models (std::vector< observer_ptr< model >> const &models, size_t max_mini_batch_size, const std::vector< El::Grid *> &grids) |
| Set up a collection of models. | |
| std::unique_ptr< ExecutionContext > | get_new_execution_context () const |
| Get a default-initialized execution context that fits this training algorithm. | |
Protected Member Functions | |
| ltfb::LTFBExecutionContext * | do_get_new_execution_context () const final |
Covariant return-friendly implementation of get_new_execution_context().
Protected Member Functions inherited from lbann::TrainingAlgorithm | |
| TrainingAlgorithm (const TrainingAlgorithm &other)=delete | |
| TrainingAlgorithm & | operator= (const TrainingAlgorithm &other)=delete |
| TrainingAlgorithm (TrainingAlgorithm &&other)=default | |
| TrainingAlgorithm & | operator= (TrainingAlgorithm &&other)=default |
Private Attributes | |
| std::unique_ptr< TrainingAlgorithm > | m_local_algo |
| The training algorithm for trainer-local training. | |
| std::unique_ptr< ltfb::MetaLearningStrategy > | m_meta_learning_strategy |
| The strategy for postprocessing local training outputs. | |
| ltfb::LTFBTerminationCriteria | m_termination_criteria |
| The LTFB stopping criteria. | |
| bool | m_suppress_timer = false |
| Suppress timer output. | |
An implementation of the LTFB training algorithm.
This is an example of a "meta-learning" training algorithm in which multiple models are trained in parallel – one model per trainer participating in the lbann_comm object. Following local training, some postprocessing strategy is applied to further optimize the solution. In the case of "classical LTFB", the trainers in the communicator are paired off randomly and "compete" in tournaments. The winner of each tournament is returned from the postprocessing to either undergo further local training or to be returned from the training algorithm.
The salient point is that every round of local training is followed by this postprocessing. The output of the postprocessing is therefore expected to be "at least as good" (by some relevant metric) as the model that went in. If, say, you want to "randomize" your model in some way, then train, and then apply some other processing, this class can serve as a useful guide but is unlikely to be an out-of-the-box solution.
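The alternation described above (local training, then a tournament between randomly paired trainers) can be sketched in a few lines. This is a toy illustration, not LBANN code: `ToyModel`, `toy_loss`, `local_train`, `tournament`, `run_ltfb_sketch`, and `best_loss` are hypothetical names, the "model" is a single scalar, and a fixed round count stands in for the stopping criteria. It also assumes both trainers in a pair adopt the same winner, which simplifies whatever a real meta-learning strategy does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical stand-in for a trainer's model: one scalar "weight".
struct ToyModel {
  double weight;
};

// Hypothetical metric (lower is better): squared distance from a target.
inline double toy_loss(ToyModel const& m) {
  double const target = 3.0;
  return (m.weight - target) * (m.weight - target);
}

// Stand-in for trainer-local training: one gradient descent step on toy_loss.
inline void local_train(ToyModel& m, double step_size) {
  double const target = 3.0;
  m.weight -= step_size * 2.0 * (m.weight - target);
}

// Stand-in for the postprocessing step: pair trainers off randomly and let
// the better model of each pair replace the worse one.
inline void tournament(std::vector<ToyModel>& models, std::mt19937& rng) {
  std::vector<std::size_t> order(models.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::shuffle(order.begin(), order.end(), rng);
  // With an odd number of trainers, the last one sits this round out.
  for (std::size_t i = 0; i + 1 < order.size(); i += 2) {
    ToyModel& a = models[order[i]];
    ToyModel& b = models[order[i + 1]];
    ToyModel const winner = (toy_loss(a) <= toy_loss(b)) ? a : b;
    a = winner;
    b = winner;
  }
}

// Alternate local training and postprocessing. The fixed round count plays
// the role of the stopping criteria in this sketch.
inline void run_ltfb_sketch(std::vector<ToyModel>& models,
                            std::size_t num_rounds,
                            std::mt19937& rng) {
  for (std::size_t round = 0; round < num_rounds; ++round) {
    for (auto& m : models) {
      local_train(m, 0.1);
    }
    tournament(models, rng);
  }
}

// Smallest loss over all trainers' models.
inline double best_loss(std::vector<ToyModel> const& models) {
  assert(!models.empty());
  double best = toy_loss(models.front());
  for (auto const& m : models) {
    best = std::min(best, toy_loss(m));
  }
  return best;
}
```

Because the tournament only ever replaces a model with one that scores at least as well, each trainer's model leaves the postprocessing "at least as good" as it entered, which is the property the description asks of any postprocessing strategy.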
Definition at line 66 of file execution_algorithms/ltfb.hpp.
Definition at line 70 of file execution_algorithms/ltfb.hpp.
Definition at line 69 of file execution_algorithms/ltfb.hpp.
|
inline |
Construct LTFB from its component pieces.
| [in] | name | A string identifying this instance of LTFB. |
| [in] | local_training_algorithm | The training algorithm to be used for (trainer-)local training. |
| [in] | meta_learning_strategy | The postprocessing algorithm. |
| [in] | stopping_criteria | When to stop the training algorithm. |
| [in] | suppress_timer | Suppress timer output. |
Definition at line 83 of file execution_algorithms/ltfb.hpp.
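The constructor signature implies an ownership model: the local training algorithm and the meta-learning strategy arrive as `std::unique_ptr`s, so the object owns them, and the class is movable but not copyable. The sketch below mirrors that shape with stand-in types. All `Toy*` names and `make_toy_ltfb` are hypothetical; none of this is the LBANN API, and the real component types need their own construction arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins; these are NOT the LBANN types.
struct ToyTrainingAlgorithm {
  virtual ~ToyTrainingAlgorithm() = default;
};
struct ToySgd final : ToyTrainingAlgorithm {};

struct ToyStrategy {
  virtual ~ToyStrategy() = default;
};
struct ToyTournament final : ToyStrategy {};

struct ToyTerminationCriteria {
  std::size_t max_rounds;
};

class ToyLtfb {
public:
  // Same parameter shape as the documented constructor: by-value sink
  // arguments that are moved into the members.
  ToyLtfb(std::string name,
          std::unique_ptr<ToyTrainingAlgorithm> local_training_algorithm,
          std::unique_ptr<ToyStrategy> meta_learning_strategy,
          ToyTerminationCriteria stopping_criteria,
          bool suppress_timer)
    : m_name{std::move(name)},
      m_local_algo{std::move(local_training_algorithm)},
      m_meta_learning_strategy{std::move(meta_learning_strategy)},
      m_termination_criteria{stopping_criteria},
      m_suppress_timer{suppress_timer}
  {}

  // Move-only, matching the deleted copy and defaulted move operations.
  ToyLtfb(ToyLtfb const&) = delete;
  ToyLtfb& operator=(ToyLtfb const&) = delete;
  ToyLtfb(ToyLtfb&&) = default;
  ToyLtfb& operator=(ToyLtfb&&) = default;

  std::string const& get_name() const noexcept { return m_name; }
  bool has_components() const noexcept {
    return m_local_algo && m_meta_learning_strategy;
  }
  std::size_t max_rounds() const noexcept {
    return m_termination_criteria.max_rounds;
  }
  bool suppress_timer() const noexcept { return m_suppress_timer; }

private:
  std::string m_name;
  std::unique_ptr<ToyTrainingAlgorithm> m_local_algo;
  std::unique_ptr<ToyStrategy> m_meta_learning_strategy;
  ToyTerminationCriteria m_termination_criteria;
  bool m_suppress_timer = false;
};

static_assert(!std::is_copy_constructible<ToyLtfb>::value,
              "copy construction is deleted");
static_assert(std::is_move_constructible<ToyLtfb>::value,
              "move construction is allowed");

// Hypothetical factory showing how the pieces are handed over.
inline ToyLtfb make_toy_ltfb() {
  return ToyLtfb{"my_ltfb",
                 std::make_unique<ToySgd>(),
                 std::make_unique<ToyTournament>(),
                 ToyTerminationCriteria{10},
                 /*suppress_timer=*/true};
}
```

Passing the components by `std::unique_ptr` makes the transfer of ownership explicit at the call site, and it is also why copying cannot be defaulted.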
|
default noexcept |
|
delete |
|
default |
|
final virtual |
Apply the training algorithm to refine model weights.
| [in,out] | context | The persistent execution context for this algorithm. |
| [in,out] | m | The model to be trained. |
| [in,out] | dc | The data source for training. |
| [in] | mode | Completely superfluous. |
Implements lbann::TrainingAlgorithm.
|
inline final protected virtual |
Covariant return-friendly implementation of get_new_execution_context().
Implements lbann::TrainingAlgorithm.
Definition at line 123 of file execution_algorithms/ltfb.hpp.
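"Covariant return-friendly" refers to a common C++ idiom: covariant return types only work with raw pointers and references, not with `std::unique_ptr`, so a public non-virtual function returns the smart pointer and delegates to a protected virtual hook that returns a raw pointer. The sketch below shows the idiom with stand-in types. `BaseContext`, `DerivedContext`, `BaseAlgorithm`, and `DerivedAlgorithm` are hypothetical names, and the function bodies are an assumption about how such a pair is typically written, not a copy of the LBANN source.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for the context types; only the pattern matters.
struct BaseContext {
  virtual ~BaseContext() = default;
  virtual std::string type() const { return "base"; }
};

struct DerivedContext final : BaseContext {
  std::string type() const final { return "derived"; }
};

class BaseAlgorithm {
public:
  virtual ~BaseAlgorithm() = default;

  // Public, non-virtual entry point: takes ownership of the raw pointer
  // immediately, so callers only ever see a smart pointer.
  std::unique_ptr<BaseContext> get_new_execution_context() const {
    return std::unique_ptr<BaseContext>(do_get_new_execution_context());
  }

protected:
  // Virtual hook returning a raw pointer so that overrides may narrow
  // the return type.
  virtual BaseContext* do_get_new_execution_context() const = 0;
};

class DerivedAlgorithm final : public BaseAlgorithm {
protected:
  // Covariant override: returns the derived context type.
  DerivedContext* do_get_new_execution_context() const final {
    return new DerivedContext{};
  }
};
```

Code that holds only a `BaseAlgorithm` reference still receives a context of the matching derived type, while code inside the derived class can call the hook and get the concrete type without a cast.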
|
inline final virtual |
Queries.
Implements lbann::TrainingAlgorithm.
Definition at line 103 of file execution_algorithms/ltfb.hpp.
|
private |
The training algorithm for trainer-local training.
Definition at line 130 of file execution_algorithms/ltfb.hpp.
|
private |
The strategy for postprocessing local training outputs.
Definition at line 133 of file execution_algorithms/ltfb.hpp.
|
private |
Suppress timer output.
Definition at line 143 of file execution_algorithms/ltfb.hpp.
|
private |
The LTFB stopping criteria.
Definition at line 136 of file execution_algorithms/ltfb.hpp.