Layers
Layers in a neural network are arranged as a directed acyclic graph. Each layer takes one input tensor from each parent layer and sends an output tensor to each child layer. Some layers may also receive tensors from weights objects (trainable parameters).
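The DAG structure described above can be sketched with a toy forward pass that visits layers in dependency order, so each layer sees its parents' outputs before computing its own. This is an illustrative sketch only, not LBANN's implementation; the graph, layer names, and functions below are made up for the example.

```python
# Conceptual sketch (not LBANN code): layers form a directed acyclic graph,
# and a forward pass evaluates each layer after all of its parents.
def forward(layers):
    """layers: dict mapping name -> (parent names, fn). Returns name -> output."""
    outputs = {}

    def visit(name):
        if name in outputs:
            return outputs[name]
        parents, fn = layers[name]
        # Recursively evaluate parents first, then this layer.
        outputs[name] = fn([visit(p) for p in parents])
        return outputs[name]

    for name in layers:
        visit(name)
    return outputs

# Toy graph: "sum" has two parents, so it receives one tensor from each.
graph = {
    "input": ([], lambda _: 3),
    "double": (["input"], lambda xs: 2 * xs[0]),
    "sum": (["input", "double"], lambda xs: xs[0] + xs[1]),
}
```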
LBANN performs implicit mini-batching. If the user specifies a layer to handle 3D image data (in channel-height-width format), it is stored internally as a 4D tensor (in NCHW format). This scheme implies that computation is mostly independent between mini-batch samples (with a few exceptions, such as batch normalization).
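The implicit mini-batching scheme can be illustrated with a small sketch: a per-sample CHW shape gains a leading mini-batch dimension N, and most operations (here, ReLU) apply to each sample independently. This is illustrative pure Python, not LBANN's internal tensor representation.

```python
# Sketch of implicit mini-batching (illustrative, not LBANN code): a layer
# declared for 3D channel-height-width data is stored with a leading
# mini-batch dimension, giving an NCHW tensor.
def nchw_shape(mini_batch_size, chw_shape):
    """Prepend the mini-batch dimension N to a per-sample CHW shape."""
    return (mini_batch_size,) + tuple(chw_shape)

def relu_per_sample(batch):
    """Apply ReLU to each sample of a nested-list NCHW batch independently,
    illustrating that most computation does not mix mini-batch samples."""
    relu = lambda x: max(x, 0.0)
    return [[[[relu(v) for v in row] for row in channel] for channel in sample]
            for sample in batch]
```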
The default value for fields in protobuf messages is zero-like (e.g., false for bool, the empty string for string). Thus, all defaults below are zero-like unless otherwise stated.
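The zero-like defaults noted above follow proto3's rule for scalar fields; a toy lookup table makes the rule concrete. This is illustrative only, not generated protobuf code.

```python
# Proto3 scalar fields default to zero-like values (illustrative mapping,
# mirroring the rule stated above).
PROTO3_DEFAULTS = {
    "bool": False,
    "string": "",
    "int64": 0,
    "float": 0.0,
}
```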
Using Layers
Layers are used by adding them to the Python front end with the appropriate arguments and passing them as a list into the model. More information about LBANN's layers can be found in LBANN Layers List. See Common Layer Arguments for a description of layer parameters. For example, the input, ReLU, and mean squared error layers could be included as follows:
Python Front End
images = lbann.Input(data_field='samples', name="images")
labels = lbann.Input(data_field='labels', name="labels")
relu = lbann.Relu(images, name="relu")
mse = lbann.MeanSquaredError([images, relu], name="mse")
layer_list = list(lbann.traverse_layer_graph([images, labels]))
model = lbann.Model(num_epochs,
                    layer_list,
                    objective_function,
                    metrics,
                    callbacks)
Common Layer Arguments
- name (string, optional) Default: layer<index>
  Unique identifier for the layer; must not contain spaces.
- parents (Iterable of Layer, optional) Parent layers, i.e., sources of input tensors (a list of layer names).
- children (Iterable of Layer, optional) Child layers, i.e., destinations for output tensors (a list of layer names).
- weights (Iterable of Weights, optional) Weights objects, typically used as trainable parameters (a list of weights names).
- device_allocation (string, optional) Data tensor device. Options: CPU or GPU. If LBANN has been built with GPU support, the default is GPU; otherwise, CPU.
- datatype (lbann.DataType, optional) Data type used for activations and weights.
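The default-naming rule for name can be sketched with a toy class: an unnamed layer is assigned layer<index>, and explicit names are rejected if they contain spaces. This is a hypothetical illustration of the behavior described above, not LBANN's Python front end, and the exact index assignment in LBANN may differ.

```python
# Hypothetical sketch of the naming rules above (not LBANN code):
# unnamed layers get "layer<index>"; names must not contain spaces.
class Layer:
    _count = 0  # running index shared by all layers

    def __init__(self, name=None):
        if name is None:
            name = f"layer{Layer._count}"
        if " " in name:
            raise ValueError("layer names must not contain spaces")
        Layer._count += 1
        self.name = name
```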
Advanced Layer Options
- hint_layer (Layer, optional) Hint layer for configuring output dimensions; typically used to specify that a layer has the same output dimensions as another.
- data_layout (string, optional) Data tensor layout. Options: data_parallel (default) or model_parallel.
- parallel_strategy (dictionary, optional) Configuration for advanced parallelization strategies.
  Parallel strategy options:
  - sample_groups (int64)
  - sample_splits (int64)
  - height_groups (int64)
  - height_splits (int64)
  - width_groups (int64)
  - width_splits (int64)
  - channel_groups (int64)
  - channel_splits (int64)
  - filter_groups (int64)
  - filter_splits (int64)
  For fully-connected layers:
  - replications (int64)
  - procs_per_replica (int64)
  - depth_groups (int64)
  - depth_splits (int64)
  Sub-grid parallelism:
  - sub_branch_tag (int64)
  - sub_branch_resource_percentage (int64)
  - enable_subgraph (bool)
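The *_splits options above shard a tensor dimension across process groups; the idea can be sketched by dividing one dimension's extent into nearly equal chunks. This is a conceptual sketch under that assumption, not LBANN's actual partitioning logic, which may distribute elements differently.

```python
# Conceptual sketch (not LBANN code) of splitting one tensor dimension,
# as height_splits/width_splits/channel_splits would shard an NCHW tensor.
def split_extent(extent, splits):
    """Return chunk sizes when `extent` is divided into `splits` parts,
    spreading any remainder over the leading chunks."""
    base, rem = divmod(extent, splits)
    return [base + (1 if i < rem else 0) for i in range(splits)]
```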
Deprecated Layer Options
Deprecated:
- num_neurons_from_data_reader (bool)
- freeze (bool)
Deprecated and unused:
- weights_data (repeated WeightsData)
- top (string)
- bottom (string)
- type (string)
LBANN Layers List
- I/O Layers
- Operator Layer
- Transform Layers
- BatchwiseReduceSum
- IdentityZero
- Bernoulli
- Concatenation
- Constant
- Crop
- Cross_Grid_Sum
- Cross_Grid_Sum_Slice
- Dummy
- Evaluation
- Gather
- Gaussian
- Hadamard
- InTopK
- Pooling
- Reduction
- Reshape
- Scatter
- Slice
- Sort
- Split
- StopGradient
- Sum
- Tessellate
- Uniform
- Unpooling
- WeightedSum
- WeightsLayer
- CategoricalRandom (Deprecated)
- DiscreteRandom (Deprecated)
- Learning Layers
- Loss Layers
- Math Layers
- Regularization Layers
- Activation Layers
- Image Layers
- Miscellaneous Layers