NAME

AI::MXNet::KVStore - Key value store interface of MXNet.

DESCRIPTION

Key-value store interface of MXNet for parameter synchronization across multiple devices.

init

Initialize a single key-value pair or a sequence of key-value pairs in the store. Each key must be initialized before it can be pushed or pulled. Only the data from worker 0 (rank == 0) are used. This function returns after the data have been initialized successfully.

Parameters
----------
key : int or an array ref of int
    The keys.
value : NDArray or an array ref of NDArray objects
    The values.

Examples
--------
>>> # init a single key-value pair
>>> $shape = [2,3]
>>> $kv = mx->kv->create('local')
>>> $kv->init(3, mx->nd->ones($shape)*2)
>>> $a = mx->nd->zeros($shape)
>>> $kv->pull(3, out=>$a)
>>> print $a->aspdl
[[ 2  2  2]
[ 2  2  2]]

>>> # init a list of key-value pairs
>>> $keys = [5, 7, 9]
>>> $kv->init($keys, [map { mx->nd->ones($shape) } 0..@$keys-1])

push

Push a single key-value pair or a sequence of key-value pairs into the store.

Data consistency:

1. This function returns after adding an operator to the engine.
2. A push is always executed after all previous push and pull operations on the same key have finished.
3. There is no synchronization between workers. One can use _barrier() to sync all workers.

Parameters
----------
key : int or array ref of int
value : NDArray or array ref of NDArray or array ref of array refs of NDArray
priority : int, optional
    The priority of the push operation. The higher the priority, the more likely this action is to be executed before other push actions.

Examples
--------
>>> # push a single key-value pair
>>> $kv->push(3, mx->nd->ones($shape)*8)
>>> $kv->pull(3, out=>$a) # pull out the value
>>> print $a->aspdl()
    [[ 8.  8.  8.]
    [ 8.  8.  8.]]

>>> # aggregate the values and then push
>>> $gpus = [map { mx->gpu($_) } 0..3]
>>> $b = [map { mx->nd->ones($shape, ctx => $_) } @$gpus]
>>> $kv->push(3, $b)
>>> $kv->pull(3, out=>$a)
>>> print $a->aspdl
    [[ 4.  4.  4.]
    [ 4.  4.  4.]]

>>> # push a list of keys.
>>> # single device
>>> $kv->push($keys, [map { mx->nd->ones($shape) } 0..@$keys-1])
>>> $b = [map { mx->nd->zeros($shape) } 0..@$keys-1]
>>> $kv->pull($keys, out=>$b)
>>> print $b->[1]->aspdl
    [[ 1.  1.  1.]
    [ 1.  1.  1.]]

>>> # multiple devices:
>>> $b = [map { [map { mx->nd->ones($shape, ctx => $_) } @$gpus] } 0..@$keys-1]
>>> $kv->push($keys, $b)
>>> $kv->pull($keys, out=>$b)
>>> print $b->[1][1]->aspdl()
    [[ 4.  4.  4.]
    [ 4.  4.  4.]]
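
The argument shapes accepted by push, and the aggregation shown in the examples above, can be summarized in a small stand-alone sketch. It is a toy model in plain Perl, not the code AI::MXNet uses: the helper names pair_keys_with_values and merged are hypothetical, and plain numbers stand in for NDArray objects.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AI::MXNet): pair every key with the
# list of values pushed for it. Plain numbers stand in for NDArrays.
sub pair_keys_with_values {
    my ($key, $value) = @_;
    if (not ref($key)) {
        # Single key: the value is one item or an array ref of items
        # (one per device).
        my @vals = ref($value) eq 'ARRAY' ? @$value : ($value);
        return ([$key, \@vals]);
    }
    # Array ref of keys: $value must hold one entry per key.
    die "keys and values differ in length\n" unless @$key == @$value;
    return map {
        my $v = $value->[$_];
        [$key->[$_], ref($v) eq 'ARRAY' ? [@$v] : [$v]];
    } 0 .. $#$key;
}

# Hypothetical helper: values pushed for the same key are summed
# before they reach the store.
sub merged {
    my ($vals) = @_;
    my $sum = 0;
    $sum += $_ for @$vals;
    return $sum;
}

# Three keys, each pushed from four devices holding the value 1.
my @pairs = pair_keys_with_values([5, 7, 9], [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]);
for my $pair (@pairs) {
    my ($k, $vals) = @$pair;
    printf "key %d gets %d\n", $k, merged($vals);
}
```

Each key ends up with the sum of its per-device values, which mirrors the 4s printed in the multiple-device example above.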

pull

Pull a single value or a sequence of values from the store.

Data consistency:

1. This function returns after adding an operator to the engine, but any further read on out will be blocked until it is finished.
2. pull is always called after all previous push and pull on the same key are finished.
3. It pulls the newest value from the store.

Parameters
----------
key : int or array ref of int
    The keys.
out : NDArray or array ref of NDArray or array ref of array refs of NDArray
    The corresponding values.
priority : int, optional
    The priority of the pull operation. The higher the priority, the more likely this action is to be executed before other pull actions.

Examples
--------
>>> # pull a single key-value pair
>>> $a = mx->nd->zeros($shape)
>>> $kv->pull(3, out=>$a)
>>> print $a->aspdl
    [[ 2.  2.  2.]
    [ 2.  2.  2.]]

>>> # pull into multiple devices
>>> $b = [map { mx->nd->ones($shape, ctx => $_) } @$gpus]
>>> $kv->pull(3, out=>$b)
>>> print $b->[1]->aspdl()
    [[ 2.  2.  2.]
    [ 2.  2.  2.]]

>>> # pull a list of key-value pairs.
>>> # On single device
>>> $keys = [5, 7, 9]
>>> $b = [map { mx->nd->zeros($shape) } 0..@$keys-1]
>>> $kv->pull($keys, out=>$b)
>>> print $b->[1]->aspdl()
    [[ 2.  2.  2.]
    [ 2.  2.  2.]]
>>> # On multiple devices
>>> $b = [map { [map { mx->nd->ones($shape, ctx => $_) } @$gpus ] } 0..@$keys-1]
>>> $kv->pull($keys, out=>$b)
>>> print $b->[1][1]->aspdl()
    [[ 2.  2.  2.]
    [ 2.  2.  2.]]

set_optimizer

Register an optimizer with the store.

If there are multiple machines, this process (should be a worker node) will pack this optimizer and send it to all servers. It returns after this action is done.

Parameters
----------
optimizer : Optimizer
    The optimizer.

type

Get the type of this kvstore.

Returns
-------
type : str
    The string type.

rank

Get the rank of this worker node.

Returns
-------
rank : int
    The rank of this node, which is in [0, num_workers).

num_workers

Get the number of worker nodes.

Returns
-------
size : int
    The number of worker nodes.

save_optimizer_states

Save optimizer (updater) state to file.

Parameters
----------
fname : str
    Path to output states file.

load_optimizer_states

Load optimizer (updater) state from file.

Parameters
----------
fname : str
    Path to input states file.

_set_updater

Set a push updater into the store.

This function only changes the local store. Use set_optimizer for multi-machine setups.

Parameters
----------
updater : function
    The updater function.

Examples
--------
>>> my $update = sub { my ($key, $input, $stored) = @_;
...     print "update on key: $key\n";
...     $stored += $input * 2; };
>>> $kv->_set_updater($update)
>>> $kv->pull(3, out=>$a)
>>> print $a->aspdl()
    [[ 4.  4.  4.]
    [ 4.  4.  4.]]
>>> $kv->push(3, mx->nd->ones($shape))
    update on key: 3
>>> $kv->pull(3, out=>$a)
>>> print $a->aspdl()
    [[ 6.  6.  6.]
    [ 6.  6.  6.]]

_barrier

Global barrier among all worker nodes.

For example, assume there are n machines and we want machine 0 to initialize the values first, and then have all machines pull the initialized values. Placing a barrier before the pull guarantees that the initialization has finished.

_send_command_to_servers

Send a command to all server nodes, which will make each server node run KVStoreServer.controller. This function returns after the command has been executed on all server nodes.

Parameters
----------
head : int
    The head of the command.
body : str
    The body of the command.

create

Create a new KVStore.

Parameters
----------
name : {'local'}
    The type of KVStore.
    - local works for multiple devices on a single machine (single process)
    - dist works for multi-machines (multiple processes)

Returns
-------
kv : KVStore
    The created AI::MXNet::KVStore.