NAME

AI::MXNet::Executor::Group - Manager for a group of executors working in different contexts.

DESCRIPTION

DataParallelExecutorGroup is a group of executors that lives on a group of devices.
This is a helper class used to implement data parallelism: each mini-batch is
split across the devices and run in parallel.

Parameters for constructor
----------
symbol : AI::MXNet::Symbol
    The common symbolic computation graph for all executors.
contexts : ArrayRef[AI::MXNet::Context]
    An array ref of contexts.
workload : ArrayRef[Num]
    If defined, an array ref of numbers that specify the workload to be assigned
    to each context. A larger number indicates a heavier workload.
data_shapes : ArrayRef[NameShape|AI::MXNet::DataDesc]
    An array ref of [$name, $shape] array refs, for the shapes of the data. Note that the
    order is important and must match the order in which the `DataIter` provides the data.
label_shapes : Maybe[ArrayRef[NameShape|AI::MXNet::DataDesc]]
    An array ref of [$name, $shape] array refs, for the shapes of the labels. Note that the
    order is important and must match the order in which the `DataIter` provides the labels.
param_names : ArrayRef[Str]
    An array ref of strings, indicating the names of the parameters (e.g. weights, filters, etc.)
    in the computation graph.
for_training : Bool
    Indicates whether the executors should be bound for training. When not training,
    memory for the gradients will not be allocated.
inputs_need_grad : Bool
    Indicates whether the gradients with respect to the input data should be computed. This is
    currently not used; it will be useful for implementing composition of modules.
shared_group : AI::MXNet::DataParallelExecutorGroup
    Default is undef. This is used in bucketing. When defined, it should be an executor
    group corresponding to a different bucket. In other words, it will correspond to a different
    symbol with the same set of parameters (e.g. unrolled RNNs with different lengths).
    In this case the memory regions of the parameters will be shared.
logger : Logger
    Default is AI::MXNet::Logging->get_logger.
fixed_param_names : Maybe[ArrayRef[Str]]
    Indicates the parameters to be fixed during training. No gradient space will be allocated
    for the parameters in this array ref, and no gradients will be computed for them.
grad_req : ArrayRef[GradReq]|HashRef[GradReq]|GradReq
    Requirement for gradient accumulation. Can be 'write', 'add', or 'null'
    (defaults to 'write').
    Can be specified globally (Str) or for each argument (array ref, hash ref).
state_names : Maybe[ArrayRef[Str]]
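
For illustration, here is a minimal sketch of constructing a group directly. In normal
use this class is driven by AI::MXNet::Module, and the toy network, shapes, and parameter
names below are assumptions:

    use AI::MXNet qw(mx);

    ## a toy symbol; 'fc_weight' and 'fc_bias' are the argument names
    ## MXNet derives from name => 'fc'
    my $data = mx->sym->Variable('data');
    my $fc   = mx->sym->FullyConnected(data => $data, name => 'fc', num_hidden => 10);
    my $net  = mx->sym->SoftmaxOutput(data => $fc, name => 'softmax');

    my $group = AI::MXNet::DataParallelExecutorGroup->new(
        symbol       => $net,
        contexts     => [mx->cpu(0), mx->cpu(1)],
        workload     => [1, 1],                  ## split batches evenly
        data_shapes  => [['data', [64, 100]]],   ## batch of 64, 100 features
        label_shapes => [['softmax_label', [64]]],
        param_names  => ['fc_weight', 'fc_bias'],
        for_training => 1
    );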

decide_slices

Decide the slices for each context according to the workload.

Parameters
----------
$data_shapes : ArrayRef[AI::MXNet::DataDesc]
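
A sketch of the intended behavior, using the group above (shapes assumed): with two
contexts of equal workload and a batch size of 64, the group's slices end up covering
rows 0..31 and 32..63.

    $group->decide_slices([
        AI::MXNet::DataDesc->new(name => 'data', shape => [64, 100])
    ]);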

bind_exec

Bind executors on their respective devices.

Parameters
----------
$data_shapes  : ArrayRef[AI::MXNet::DataDesc]
$label_shapes : Maybe[ArrayRef[AI::MXNet::DataDesc]]
$shared_group : Maybe[AI::MXNet::DataParallelExecutorGroup]
$reshape      : Bool
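
A sketch of rebinding the group above for a larger batch, reusing the already
allocated executors via $reshape; all shapes are assumptions:

    $group->bind_exec(
        [AI::MXNet::DataDesc->new(name => 'data', shape => [128, 100])],
        [AI::MXNet::DataDesc->new(name => 'softmax_label', shape => [128])],
        undef,  ## no shared group
        1       ## reshape the existing executors in place
    );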

reshape

Reshape executors.

Parameters
----------
$data_shapes : ArrayRef[AI::MXNet::DataDesc]
$label_shapes : Maybe[ArrayRef[AI::MXNet::DataDesc]]
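
Equivalently, a sketch of switching to a smaller batch size (shapes assumed):

    $group->reshape(
        [AI::MXNet::DataDesc->new(name => 'data', shape => [32, 100])],
        [AI::MXNet::DataDesc->new(name => 'softmax_label', shape => [32])]
    );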

set_params

Assign (i.e. copy) parameters to all the executors.

Parameters
----------
$arg_params : HashRef[AI::MXNet::NDArray]
    A hash ref mapping names to AI::MXNet::NDArray parameters.
$aux_params : HashRef[AI::MXNet::NDArray]
    A hash ref mapping names to AI::MXNet::NDArray auxiliary variables.
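
A sketch of pushing an initial parameter set to every device for the toy network
above; real code would draw the values from an AI::MXNet::Initializer, the constants
here are placeholders:

    my $arg_params = {
        fc_weight => mx->nd->ones([10, 100]) * 0.01,  ## placeholder init
        fc_bias   => mx->nd->zeros([10]),
    };
    $group->set_params($arg_params, {});  ## no auxiliary states here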

get_params

Copy data from each executor to arg_params and aux_params.

Parameters
----------
$arg_params : HashRef[AI::MXNet::NDArray]
    The target parameter arrays.
$aux_params : HashRef[AI::MXNet::NDArray]
    The target auxiliary arrays.

Notes
-----
- This function will update the NDArrays in arg_params and aux_params in place.
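
Because the copy is done in place, the target hash refs must already hold NDArrays of
the right shapes. A sketch matching the toy network above:

    my $arg_params = {
        fc_weight => mx->nd->zeros([10, 100]),
        fc_bias   => mx->nd->zeros([10]),
    };
    my $aux_params = {};
    $group->get_params($arg_params, $aux_params);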

forward

Split the data_batch according to the workload and run the forward pass on each device.

Parameters
----------
$data_batch : AI::MXNet::DataBatch
    Or any object implementing a similar interface.
$is_train : Maybe[Bool]
    A hint for the backend, indicating whether we are in the training phase.
    Default is undef, in which case the value of $self->for_training is used.
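
A sketch of a single forward step; the NDArrayIter setup is only an assumption used
to obtain an AI::MXNet::DataBatch:

    my $iter = mx->io->NDArrayIter(
        data       => mx->nd->ones([64, 100]),
        label      => mx->nd->zeros([64]),
        batch_size => 64
    );
    my $batch = $iter->next;
    $group->forward($batch, 1);  ## 1 => training phase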

get_outputs

Gets outputs of the previous forward computation.

Parameters
----------
$merge_multi_context : Bool
    Default is 1. When data parallelism is used, the outputs are collected
    from multiple devices. A true value indicates that the collected results
    should be merged so that they look as if they came from a single executor.

Returns
-------
If $merge_multi_context is 1, the result is [$out1, $out2]. Otherwise, it
is [[$out1_dev1, $out1_dev2], [$out2_dev1, $out2_dev2]]. All the output
elements are AI::MXNet::NDArray objects.
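
For example (continuing the sketch above):

    my $outputs = $group->get_outputs(1);  ## merged across devices
    my $softmax = $outputs->[0];           ## one NDArray per output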

get_input_grads

Get the gradients with respect to the inputs of the module.

Parameters
----------
$merge_multi_context : Bool
    Default is 1. When data parallelism is used, the gradients are collected
    from multiple devices. A true value indicates that the collected results
    should be merged so that they look as if they came from a single executor.

Returns
-------
If $merge_multi_context is 1, the result is [$grad1, $grad2]. Otherwise, it
is [[$grad1_dev1, $grad1_dev2], [$grad2_dev1, $grad2_dev2]]. All the output
elements are AI::MXNet::NDArray objects.
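
Note that this only makes sense for a group constructed with inputs_need_grad => 1.
A sketch:

    my $input_grads = $group->get_input_grads(1);  ## merged across devices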

backward

Run backward on all devices. backward should be called after
a call to forward. backward cannot be called unless
$self->for_training is 1.

Parameters
----------
$out_grads : Maybe[AI::MXNet::NDArray|ArrayRef[AI::MXNet::NDArray]]
    The gradients with respect to the outputs, to be propagated back.
    This parameter is only needed when bind was called
    on outputs that are not a loss function.
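
The usual ordering in a training step, sketched with the batch from the forward
example:

    $group->forward($batch, 1);  ## forward in training mode
    $group->backward();          ## $out_grads omitted: the symbol ends in a loss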

update_metric

Accumulate the performance according to eval_metric on all devices.

Parameters
----------
$eval_metric : AI::MXNet::EvalMetric
    The metric used for evaluation.
$labels : ArrayRef[AI::MXNet::NDArray]
    Typically comes from the label of an AI::MXNet::DataBatch.
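
A sketch; the choice of metric is an assumption:

    my $metric = mx->metric->create('acc');
    $group->update_metric($metric, $batch->label);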

_sliced_shape

Get the sliced shapes for the i-th executor.

Parameters
----------
$shapes : ArrayRef
    The original [$name, $shape] pairs.
$i : Int
    Which executor we are dealing with.

install_monitor

Install a monitor on all executors.

Parameters
----------
$mon : AI::MXNet::Monitor
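
A sketch of attaching a monitor; the interval attribute is an assumption about the
AI::MXNet::Monitor constructor:

    my $mon = AI::MXNet::Monitor->new(interval => 100);  ## stats every 100 batches
    $group->install_monitor($mon);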