NAME
Sub::Slice - split long-running tasks into manageable chunks
SYNOPSIS
    # Client
    # Assume methods in the Server:: package are magically remoted
    my $token = Server::create_token();
    for (1 .. MAX_ITERATIONS) {
        Server::do_work($token);
        last if $token->{done};
    }

    # Server
    # Imagine this is on a remote machine
    package Server;
    use Sub::Slice;

    sub create_token {
        # create a new job:
        my $job = new Sub::Slice(
            backend         => 'Filesystem',
            storage_options => {
                path => '/var/tmp/myproject/',
            }
        );
        return $job->token;
    }

    sub do_work {
        my $token = shift;

        # loading an existing job:
        my $job = new Sub::Slice(
            token           => $token,
            backend         => 'Filesystem',
            storage_options => {
                path => '/var/tmp/myproject/',
            }
        );

        at_start $job
            sub {
                # store data, initialise
                $job->store('foo', '1');
                $job->store('bar', { abc => 'def' });
                $job->set_estimate(10); # estimate number of steps
                return ( $job->fetch('foo') );
            };

        my $foo = $job->fetch('foo');

        at_stage $job "stage_one",
            sub {
                my $bar = $job->fetch('bar');
                # do stuff
                # $some_condition is a placeholder for your own test
                $job->next_stage('stage_two') if $some_condition;
            };

        at_stage $job "stage_two",
            sub {
                # ...do more stuff...
                # mark job as ready to be deleted
                $job->done() if $job->count() == $job->estimate();
            };

        return $job->return_value(); # Pass back any return value from coderefs
    }
DESCRIPTION
Sub::Slice breaks up a long process into smaller chunks that can be executed one at a time over a stateless protocol such as HTTP/SOAP so that progress may be reported. This means that the client can display progress or cancel the operation part-way through.
The client requests a token from the server and passes that token back to the server on each iteration. The token returned to the client contains status information which the client can use to determine whether the job has completed or failed and to display status or error messages.
Within the routine called on each iteration, the server defines a set of coderefs, one of which will be called for a given iteration. In addition the server may define coderefs to be called at the start and end of the job. The server may provide the client with an estimate of the number of iterations the job is likely to take.
You can trade performance against usability by changing the number of iterations that are executed before progress is returned to the client: more iterations per call means fewer round trips, but less frequent progress updates.
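The request/iterate cycle described above can be sketched from the client's side. This is only an illustration: DemoServer is a hypothetical in-process stand-in for a remoted server (it does not use Sub::Slice at all), and percent_complete and run_job are made-up helper names. The token fields it reads and writes (count, estimate, done, abort, error, status, iterations) are the ones documented under $job->token().

```perl
use strict;
use warnings;

# Hypothetical stand-in for a remoted server. A real server would build
# and update the token with Sub::Slice; this one just fakes the fields.
package DemoServer;

sub create_token {
    return {
        count => 0, estimate => 5, done => 0,
        abort => 0, error => '', status => 'queued',
    };
}

sub do_work {
    my ($token) = @_;
    my $chunks = $token->{iterations} || 1;   # client-requested chunk count
    for (1 .. $chunks) {
        last if $token->{done};
        $token->{count}++;
        $token->{status} = "processed chunk $token->{count}";
        $token->{done} = 1 if $token->{count} >= $token->{estimate};
    }
    return $token;
}

package main;

# Turn count/estimate into a percentage, capped at 100 because the
# estimate may turn out to be too low.
sub percent_complete {
    my ($token) = @_;
    return 0 unless $token->{estimate};
    my $pct = int(100 * $token->{count} / $token->{estimate});
    return $pct > 100 ? 100 : $pct;
}

# Call the server until the job is done, aborted, or we give up.
sub run_job {
    my ($max_calls) = @_;
    my $token = DemoServer::create_token();
    $token->{iterations} = 2;                 # two chunks per round trip
    my @progress;
    for (1 .. $max_calls) {
        $token = DemoServer::do_work($token);
        die "job aborted: $token->{error}\n" if $token->{abort};
        push @progress, percent_complete($token);
        last if $token->{done};
    }
    return ($token, @progress);
}

my ($token, @progress) = run_job(10);
print "progress: @progress\n";   # prints "progress: 40 80 100"
```

Raising iterations in the token reduces the number of round trips at the cost of coarser progress reporting, which is the trade-off described above.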
METHODS
- new( %options )
-
Create a new job object. Valid options are:
- token
-
A token for an existing job (optional)
- iterations
-
The number of chunks to execute before saving the state and returning. Defaults to '1'. This value may be overridden later on by setting the value in the token. Set to 0 for unlimited.
- backend
-
The storage backend. This should either be a fully qualified package name or if no namespace is included it's assumed to be in the Sub::Slice::Backend namespace (e.g. Database would be interpreted as Sub::Slice::Backend::Database). Defaults to Sub::Slice::Backend::Filesystem.
- pin_length
-
The size of the random PIN used to sign the token. Default is 1e9.
- random_pin ($l)
-
Generates a random PIN of length $l. We do this using rand(). You might want to override this method if you require cryptographic-quality randomness for your environment.
- auto_blob_threshold
-
If this is set, any strings longer than this number of bytes will be stored as BLOBs automatically (possibly taking advantage of a more efficient BLOB storage mechanism offered by the backend). Note that this does not apply when you store references, only to strings of characters/bytes.
- storage_options
-
A hash of configuration options for the backend storage. See the POD of the backend module (default is Sub::Slice::Backend::Filesystem).
Returns an existing job object with session data for $token
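The backend naming rule described under the backend option can be sketched as a small helper. This is not Sub::Slice's own code: resolve_backend is a hypothetical name, and treating any name containing "::" as fully qualified is an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: map a backend option to a package name.
# Assumption: a name containing '::' is already fully qualified.
sub resolve_backend {
    my ($name) = @_;
    $name = 'Filesystem' unless defined $name && length $name;   # documented default
    return $name =~ /::/ ? $name : "Sub::Slice::Backend::$name";
}

print resolve_backend('Database'), "\n";      # Sub::Slice::Backend::Database
print resolve_backend(), "\n";                # Sub::Slice::Backend::Filesystem
print resolve_backend('My::Backend'), "\n";   # My::Backend
```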
METHODS DEFINING STAGES OF ITERATION
- at_start $job \&coderef
-
Code to initialise the job. This isn't counted as an iteration and will only run once per job.
- at_stage $job $stage_name, \&coderef
-
Executes \&coderef up to iterate times, if $stage_name is the current stage and if the number of executions in the current session is not greater than iterate. It is currently required that you have at least one at_stage defined.
If the current stage hasn't been set with next_stage(), it will implicitly be set to the first at_stage block that is seen.
- at_end $job \&coderef
-
Code to run after the last iteration (unless the job is aborted before then). This isn't counted as an iteration and will only run once per job. It's typically used as a "commit" stage.
If a job dies in one of these blocks, Sub::Slice sets $job->abort($@) and rethrows the exception. Note that at_end may not be run if a job is aborted during one of the earlier stages. See Sub::Slice::Manual for an example of defensive coding to prevent resources allocated in at_start leaking if the job is aborted part-way through.
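The "record the failure, then rethrow" behaviour described above follows a common Perl pattern, sketched here on a plain hash. This is not the module's implementation: run_block is a hypothetical helper, and the hash merely stands in for the token's abort and error fields.

```perl
use strict;
use warnings;

# Hypothetical helper: run a block, and if it dies, mark the token as
# aborted with the error message before rethrowing the exception.
sub run_block {
    my ($token, $code) = @_;
    my @result = eval { $code->() };
    if (my $err = $@) {
        $token->{abort} = 1;
        $token->{error} = "$err";
        die $err;                      # rethrow so the caller still sees it
    }
    return @result;
}

my %token = (abort => 0, error => '');

# The caller has to be ready for the rethrown exception.
my $survived = eval { run_block(\%token, sub { die "disk full\n" }); 1 };

print "aborted: $token{abort}, error: $token{error}";
# prints "aborted: 1, error: disk full"
```

Because the exception is rethrown, code after the failing block (including any cleanup you expected from at_end) is not reached, so resources allocated earlier need their own cleanup path.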
ACCESSOR METHODS
- $job->token()
-
Returns the token object for this job. The token object will be updated automatically as stages of the sub execute. The token has the following properties which the client can make use of:
- done
-
Read/write boolean value. Is the job done? Setting this to 1 on the client will cause iterations on the server to cease, and any at_end cleanup to be done.
- abort
-
Read-only boolean value. Was the job aborted on the server?
- error
-
Read-only. Error message if the job was aborted.
- count
-
Read-only. Number of iterations performed so far.
- estimate
-
Read-only. An estimate of the total number of iterations that will be performed. This may not be totally accurate, depending on whether new work is "discovered" as the iterations proceed.
- status
-
Read-only. Status message.
- iterations
-
A write-only property the client can use to control the number of iterations run on the server in the next call. This overrides the default number of iterations set in the Sub::Slice constructor.
- $job->id()
-
Returns the ID of the job (issued by the new_id function in the backend). This is mainly of interest if you are writing a backend and need to get the ID from a job.
- $job->count()
-
Returns the total number of iterations that have been executed.
- $job->estimate()
-
Returns an estimate of how many iterations are required for the job.
- $job->is_done()
-
Returns a boolean value. Is the job done?
- $job->stage()
-
Returns the name of the executing code block, as set by next_stage().
- $job->fetch( $key )
-
Returns the user data stored under $key. If no data is found against $key, it automatically tries fetch_blob to see if data was stored as a blob.
- $job->fetch_blob($key)
-
Returns a lump of data stored using store_blob - see OTHER MUTATOR METHODS.
- $job->return_value()
-
return_value() returns the return value of the stage. This return_value() method will help you avoid mistakes like this:
    sub do_work {
        my $job = new Sub::Slice(token => shift());
        at_stage $job 'mystage', sub {
            # do stuff
            return 'abc'; # only returns 1 level up
        };
        # nowt returned from do_work
    }
The caller of do_work() will not receive the return value from inside the 'mystage' sub {}. This might be better written as:
    sub do_work {
        my $job = new Sub::Slice(token => shift());
        at_stage $job 'mystage', sub {
            # do stuff
            return 'abc'; # only returns 1 level up
        };
        return $job->return_value(); # 'abc'
    }
MUTATOR METHODS THAT SET VALUES IN THE TOKEN
- $job->set_estimate( $int )
-
Populates the estimate field in the token with an estimate of how many iterations are required for this job to complete.
- $job->done()
-
Mark the job as completed successfully. This sets the done flag in the token. Serialised object data will be removed when the object is destroyed.
- $job->abort( $reason )
-
Mark the job as aborted. This sets the abort flag in the token. The optional $reason message will be stored in the token's error string. Serialised object data will be removed when the object is destroyed.
- $job->status( $status_text )
-
Set the status field in the token. This might be useful to inform users about what is about to happen in the next iteration of the job.
OTHER MUTATOR METHODS
- $job->next_stage( $stage_name )
-
Tell the $job object that the next time the routine is called, it should execute the block named $stage_name. Unless next_stage is set, the first at_stage block will be executed.
- $job->store( $key => $value, $key2 => $value2, ... )
-
Store some user data in the object. $value can be a scalar containing any Perl data type (such as hash/array references) - it will be automatically serialised.
Note that some objects may not be suited to serialisation. For example if an object is blessed into a package that is required at runtime, when it is deserialised, the required package may not actually be loaded.
There may also be issues serialising some objects like DBI database handles and XML::Parser objects, although this is potentially backend-specific (Filesystem uses Storable, and some objects may provide serialisation hooks).
$value is optional (if not specified, $value will be set to undef).
- $job->store_blob($key => $blob)
-
Allows large lumps of data to be stored efficiently by the back end.
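The serialisation caveats noted under store() can be tried out with Storable on its own, since the Filesystem backend is said to use it. The sketch below calls Storable directly rather than going through Sub::Slice, so it only illustrates what Storable accepts with its default settings; other backends may behave differently.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Plain nested data survives a freeze/thaw round trip.
my $frozen = freeze({ abc => 'def', list => [1, 2, 3] });
my $copy   = thaw($frozen);
print "abc = $copy->{abc}, items = ", scalar @{ $copy->{list} }, "\n";

# Code references are rejected by Storable's default settings,
# so freeze() dies and the eval returns false.
my $ok = eval { freeze({ callback => sub { 1 } }); 1 };
print "coderef stored: ", ($ok ? "yes" : "no"), "\n";
```

Values that hold code references, open handles or database connections are therefore better rebuilt on each call than passed to store().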
VERSION
$Revision: 1.47 $ on $Date: 2005/01/12 16:51:18 $ by $Author: simonf $
AUTHOR
Simon Flack and John Alden with additions by Tim Sweetman <cpan _at_ bbc _dot_ co _dot_ uk>
COPYRIGHT
(c) BBC 2005. This program is free software; you can redistribute it and/or modify it under the GNU GPL.
See the file COPYING in this distribution, or http://www.gnu.org/licenses/gpl.txt