NAME

AI::MXNet::Gluon::Trainer

DESCRIPTION

Applies an `Optimizer` on a set of Parameters. Trainer should
be used together with `autograd`.

Parameters
----------
params : AI::MXNet::Gluon::ParameterDict
    The set of parameters to optimize.
optimizer : str or Optimizer
    The optimizer to use. See
    `help <http://mxnet.io/api/python/optimization/optimization.html#the-mxnet-optimizer-package>`_
    on Optimizer for a list of available optimizers.
optimizer_params : hash ref
    Keyword arguments to be passed to the optimizer constructor. For example,
    {learning_rate => 0.1}. All optimizers accept learning_rate, wd (weight decay),
    clip_gradient, and lr_scheduler. See each optimizer's
    constructor for a list of additional supported arguments.
kvstore : str or KVStore
    kvstore type for multi-GPU and distributed training. See help on
    mx->kvstore->create for more information.
compression_params : hash ref
    Specifies type of gradient compression and additional arguments depending
    on the type of compression being used. For example, 2bit compression requires a threshold.
    Arguments would then be {type => '2bit', threshold => 0.5}
    See AI::MXNet::KVStore->set_gradient_compression method for more details on gradient compression.
update_on_kvstore : Bool, default undef
    Whether to perform parameter updates on kvstore. If undef, then trainer will choose the more
    suitable option depending on the type of kvstore.
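
A minimal construction sketch (the layer, the optimizer choice, and the
hyperparameter values below are illustrative only, not part of the API):

    use AI::MXNet qw(mx);
    use AI::MXNet::Gluon qw(gluon);
    use AI::MXNet::Gluon::NN qw(nn);

    # a tiny example network; any Block with parameters will do
    my $net = nn->Dense(1);
    $net->initialize();

    # optimizer name and optimizer_params as described above
    my $trainer = gluon->Trainer(
        $net->collect_params, 'sgd',
        { learning_rate => 0.1, wd => 1e-4 }
    );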

Properties
----------
learning_rate : float
    The current learning rate of the optimizer. Given an Optimizer object
    optimizer, its learning rate can be accessed as optimizer->learning_rate.
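
For example, with the $trainer constructed in the sketch above:

    printf "current learning rate: %s\n", $trainer->learning_rate;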

step

Makes one step of parameter update. Should be called after
`autograd->backward()` and outside of `record()` scope.

For normal parameter updates, `step()` should be used, which internally calls
`allreduce_grads()` and then `update()`. However, if you need to get the reduced
gradients to perform certain transformations, such as gradient clipping, then
you may want to manually call `allreduce_grads()` and `update()` separately.

Parameters
----------
$batch_size : Int
    Batch size of data processed. Gradients will be normalized by `1/$batch_size`.
    Set this to 1 if you normalized the loss manually with `$loss = mean($loss)`.
$ignore_stale_grad : Bool, optional, default=False
    If true, ignores Parameters whose gradient is stale (i.e. has not been
    updated by `backward` since the last step) and skips their update.
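
A sketch of a typical training loop around `step()`, assuming the $net and
$trainer from the construction example above, a list of ($data, $label)
batches in @batches, and a batch size in $batch_size:

    use AI::MXNet::AutoGrad qw(autograd);

    my $loss_fn = gluon->loss->L2Loss();
    for my $batch (@batches)
    {
        my ($data, $label) = @$batch;
        my $loss;
        autograd->record(sub {
            my $output = $net->($data);
            $loss = $loss_fn->($output, $label);
        });
        $loss->backward();
        # gradients are normalized by 1/$batch_size inside step()
        $trainer->step($batch_size);
    }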

allreduce_grads

For each parameter, reduce the gradients from different contexts.

Should be called after `autograd->backward()`, outside of `record()` scope,
and before `trainer->update()`.

For normal parameter updates, `step()` should be used, which internally calls
`allreduce_grads()` and then `update()`. However, if you need to get the reduced
gradients to perform certain transformations, such as gradient clipping, then
you may want to manually call `allreduce_grads()` and `update()` separately.
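
A structural sketch of the manual path (equivalent in effect to a single
`step()` call when no transformation is applied to the gradients):

    $loss->backward();
    $trainer->allreduce_grads();
    # ... inspect or transform the reduced gradients here,
    #     e.g. apply gradient clipping, before applying them ...
    $trainer->update($batch_size);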

set_learning_rate

Sets a new learning rate of the optimizer.

Parameters
----------
lr : float
    The new learning rate of the optimizer.
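
For example, a simple manual decay inside an epoch loop (the schedule and the
$epoch variable are illustrative only):

    # halve the learning rate every 10 epochs
    if($epoch > 0 and $epoch % 10 == 0)
    {
        $trainer->set_learning_rate($trainer->learning_rate * 0.5);
    }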

update

Makes one step of parameter update.

Should be called after `autograd->backward()` and outside of `record()` scope,
and after `trainer->allreduce_grads()`.

For normal parameter updates, `step()` should be used, which internally calls
`allreduce_grads()` and then `update()`. However, if you need to get the reduced
gradients to perform certain transformations, such as gradient clipping, then
you may want to manually call `allreduce_grads()` and `update()` separately.

Parameters
----------
$batch_size : Int
    Batch size of data processed. Gradient will be normalized by `1/$batch_size`.
    Set this to 1 if you normalized the loss manually with `$loss = mean($loss)`.
$ignore_stale_grad : Bool, optional, default=False
    If true, ignores Parameters whose gradient is stale (i.e. has not been
    updated by `backward()` since the last step) and skips their update.
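
A sketch of the manually normalized case, reusing the assumed names from the
`step()` example above:

    autograd->record(sub {
        $loss = mx->nd->mean($loss_fn->($net->($data), $label));
    });
    $loss->backward();
    $trainer->allreduce_grads();
    # batch size of 1: the loss was already normalized with mean()
    $trainer->update(1);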

save_states

Saves trainer states (e.g. optimizer, momentum) to a file.

Parameters
----------
fname : str
    Path to output states file.
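
For example (the file name is illustrative):

    $trainer->save_states('trainer.states');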

load_states

Loads trainer states (e.g. optimizer, momentum) from a file.

Parameters
----------
fname : str
    Path to input states file.
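
For example, restoring the state saved above when resuming training:

    $trainer->load_states('trainer.states');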