NAME

Disbatch - a scalable distributed batch processing framework using MongoDB.

VERSION

version 4.102

SUBROUTINES

new(class => $class, ...)

"class" defaults to "Disbatch", and the value is then lowercased.

"node" is the hostname.

Anything else is put into $self.
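
The constructor behaviour described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the package name Sketch::Disbatch is hypothetical, and obtaining the node name from Sys::Hostname is an assumption.

```perl
package Sketch::Disbatch;    # hypothetical package, not the real module
use strict;
use warnings;
use Sys::Hostname qw(hostname);

sub new {
    my ($pkg, %args) = @_;
    # "class" defaults to "Disbatch" and is then lowercased.
    my $class = lc(delete $args{class} // 'Disbatch');
    # Every other argument is copied into the object unchanged.
    # Assumption: "node" is always taken from the local hostname.
    my $self = { %args, class => $class, node => hostname() };
    return bless $self, $pkg;
}

package main;

my $sketch = Sketch::Disbatch->new(class => 'Disbatch::Web', config_file => '/tmp/example.json');
print "$sketch->{class}\n";          # the lowercased class
print "$sketch->{config_file}\n";    # an extra argument, stored as given
```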

logger($type)

Parameters: type (string, optional)

Returns a Log::Log4perl object.

mongo

Parameters: none

Returns a MongoDB::Database object.

nodes

Parameters: none

Returns a MongoDB::Collection object for collection "nodes".

queues

Parameters: none

Returns a MongoDB::Collection object for collection "queues".

tasks

Parameters: none

Returns a MongoDB::Collection object for collection "tasks".

balance

Parameters: none

Returns a MongoDB::Collection object for collection "balance".

load_config

Parameters: none

Loads $self->{config_file} only if $self->{config} is undefined.

Anything in the config file at startup is static and cannot be changed without restarting disbatchd.

Returns nothing.

ensure_indexes

Parameters: none

Ensures the proper MongoDB indexes are created for the tasks, tasks.files, and tasks.chunks collections.

Returns nothing.

validate_plugins

Parameters: none

Validates plugins for defined queues.

Returns nothing.

revalidate_plugins

Parameters: none

Clears plugin validation and re-runs validate_plugins().

Returns nothing.

scheduler_report

Parameters: none

Used by the Disbatch Command Interface to get queue information.

Returns an ARRAY containing HASHes of queue information.

Throws errors.

update_node_status

Parameters: none

Updates the node document with the current timestamp and queues as returned by scheduler_report().

Returns nothing.

claim_task($queue)

Parameters: queue document

Claims a task (sets status to -1 and sets node to hostname) for the given queue.

Returns a task document, or undef if no queued task is found.

unclaim_task($task_id)

Parameters: MongoDB::OID object for a task

If the task has status -1 and this node's hostname, sets its node to null and its status to -2, and updates its mtime.

Returns a task document, or undef if a matching task is not found.
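
The state changes made by claim_task() and unclaim_task() can be sketched against an in-memory list. This is an illustration only: the function names, the "queue" field on a task, the use of -2 as the queued status, and epoch seconds for mtime are all assumptions. A real implementation against MongoDB would also need each change to happen atomically on the server, which a plain loop does not provide.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for the "tasks" collection.
my @task_docs = (
    { _id => 1, queue => 'q1', status => -2, node => undef, mtime => 0 },
    { _id => 2, queue => 'q2', status => -2, node => undef, mtime => 0 },
);

# Claim the first queued task for a queue: status -1, node set to this host.
sub claim_task_sketch {
    my ($tasks, $queue_id, $node) = @_;
    for my $task (@$tasks) {
        next unless $task->{queue} eq $queue_id && $task->{status} == -2;
        @$task{qw(status node)} = (-1, $node);
        return $task;
    }
    return;    # no queued task found
}

# Release a task, but only if it is claimed (status -1) by this node.
sub unclaim_task_sketch {
    my ($tasks, $task_id, $node) = @_;
    for my $task (@$tasks) {
        next unless $task->{_id} == $task_id && $task->{status} == -1;
        next unless defined $task->{node} && $task->{node} eq $node;
        @$task{qw(status node mtime)} = (-2, undef, time);
        return $task;
    }
    return;    # no matching task found
}

my $claimed  = claim_task_sketch(\@task_docs, 'q1', 'node1');
my $released = unclaim_task_sketch(\@task_docs, $claimed->{_id}, 'node1');
my $nothing  = unclaim_task_sketch(\@task_docs, 2, 'node1');    # task 2 was never claimed
```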

orphaned_tasks

Parameters: none

Sets status to -6 for all tasks belonging to this node that have status -1 and an mtime more than 300 seconds in the past.

Returns nothing.
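
The selection rule for orphaned tasks can be sketched over an in-memory list. The function name is hypothetical, mtime is assumed to be epoch seconds, and the sketch returns a count purely so the example can show its effect (the documented subroutine returns nothing).

```perl
use strict;
use warnings;

my $ORPHAN_AFTER = 300;    # seconds, as stated in the description

# Mark tasks that this node claimed (status -1) too long ago as orphaned (-6).
sub mark_orphans_sketch {
    my ($tasks, $node, $now) = @_;
    my $marked = 0;
    for my $task (@$tasks) {
        next unless defined $task->{node} && $task->{node} eq $node;
        next unless $task->{status} == -1;
        next unless $now - $task->{mtime} > $ORPHAN_AFTER;
        $task->{status} = -6;
        $marked++;
    }
    return $marked;
}

my $now = 1_000_000;    # fixed "current time" for a repeatable example
my @claimed_docs = (
    { _id => 1, node => 'node1', status => -1, mtime => $now - 301 },  # stale claim
    { _id => 2, node => 'node1', status => -1, mtime => $now - 300 },  # not more than 300s
    { _id => 3, node => 'node2', status => -1, mtime => $now - 900 },  # another node's task
    { _id => 4, node => 'node1', status => 0,  mtime => $now - 900 },  # status is not -1
);
my $orphaned = mark_orphans_sketch(\@claimed_docs, 'node1', $now);
```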

start_task($queue, $task)

Parameters: queue document, task document

Forks and execs $self->{config}{task_runner} to start the given task. If the exec fails, sets threads to 0 for the given queue and calls unclaim_task().

Returns nothing.

count_tasks($queue_id, $status, $node)

Parameters: queue ID (a MongoDB::OID object for a queue, a query operator value, or undef), status (a status, a query operator value, or undef), node (a node or undef).

Counts all tasks for the given $queue_id with the given $status and $node.

Used by the count_* subroutines below. Any parameter that is undef is left out of the query.

Returns a non-negative integer, or undef on error.
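
How undef parameters drop out of the query can be sketched as below. The helper name is hypothetical and the "queue" field name is an assumption. The sketch only builds the filter document and does not talk to MongoDB.

```perl
use strict;
use warnings;

# Build a filter, adding a key only for each parameter that is defined.
sub build_count_query_sketch {
    my ($queue_id, $status, $node) = @_;
    my %query;
    $query{queue}  = $queue_id if defined $queue_id;
    $query{status} = $status   if defined $status;
    $query{node}   = $node     if defined $node;
    return \%query;
}

# A queue ID plus a query operator value for status; node is omitted.
my $running_query = build_count_query_sketch('q1', { '$in' => [-1, 0] }, undef);

# No parameters at all: an empty filter, which would match every task.
my $all_query = build_count_query_sketch();

print join(',', sort keys %$running_query), "\n";
```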

count_queued($queue_id)
count_running($queue_id)
count_node_running($queue_id)
count_completed($queue_id)
count_total($queue_id)

Parameters: MongoDB::OID object for a queue or a query operator value or undef

Counts tasks for the given queue: queued (status <= -2), running (status of 0 or -1), running on this node, completed (status >= 1), or all tasks regardless of status.

Returns a non-negative integer, or undef on error.
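
The status ranges used by these counters can be sketched as predicates over a list of status values. The names are hypothetical, and treating the total as having no status restriction is an assumption.

```perl
use strict;
use warnings;

# One predicate per counter, mirroring the status ranges described above.
my %status_filter = (
    queued    => sub { $_[0] <= -2 },
    running   => sub { $_[0] == 0 || $_[0] == -1 },
    completed => sub { $_[0] >= 1 },
    total     => sub { 1 },    # assumption: no status restriction
);

sub count_by_sketch {
    my ($kind, @statuses) = @_;
    my $match = $status_filter{$kind} or die "unknown kind: $kind";
    return scalar grep { $match->($_) } @statuses;
}

# Example statuses for seven tasks in one queue.
my @example_statuses = (-6, -2, -2, -1, 0, 1, 2);
my %counts = map { $_ => count_by_sketch($_, @example_statuses) }
    qw(queued running completed total);
printf "%s=%d\n", $_, $counts{$_} for sort keys %counts;
```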

is_active_queue($queue_id)

Parameters: MongoDB::OID object for a queue

If config.activequeues has entries, returns 1 if the given queue is listed in it and 0 if not. Otherwise, if config.ignorequeues has entries, returns 0 if the given queue is listed in it and 1 if not.

Returns 1 or 0.
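
The precedence between the two lists can be sketched as follows. The function name is hypothetical, queue IDs are compared as plain strings for simplicity, and returning 1 when neither list has entries is an assumption.

```perl
use strict;
use warnings;

sub is_active_queue_sketch {
    my ($config, $queue_id) = @_;
    my @active = @{ $config->{activequeues} // [] };
    my @ignore = @{ $config->{ignorequeues} // [] };
    # activequeues wins whenever it has entries.
    if (@active) {
        return (grep { $_ eq $queue_id } @active) ? 1 : 0;
    }
    # ignorequeues is consulted only when activequeues is empty.
    if (@ignore) {
        return (grep { $_ eq $queue_id } @ignore) ? 0 : 1;
    }
    return 1;    # assumption: with neither list set, every queue is active
}

my $only_a   = { activequeues => ['a'], ignorequeues => ['a'] };
my $ignore_b = { activequeues => [],    ignorequeues => ['b'] };
print is_active_queue_sketch($only_a, 'a'), "\n";     # listed in activequeues
print is_active_queue_sketch($ignore_b, 'b'), "\n";   # listed in ignorequeues
```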

process_queues

Parameters: none

Claims and starts as many tasks for each queue as allowed by the current node's maxthreads and each queue's threads.

Returns nothing.

put_gfs($content, $filename, $metadata)

Parameters: UTF-8 content to store, optional filename to store it as, optional metadata HASH

Stores UTF-8 content in a custom GridFS format that stores data as strings instead of as BinData.

Returns a MongoDB::OID object for the ID inserted in the tasks.files collection.

get_gfs($filename_or_id, $metadata)

Parameters: filename or MongoDB::OID object, optional metadata HASH

Gets UTF-8 content from the custom GridFS format. Metadata is only used if given a filename instead of a MongoDB::OID object.

Returns the content string.
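
The idea of storing content as string chunks can be sketched in memory. This is not the module's storage code: it assumes the custom format follows the usual GridFS layout of one file document plus numbered chunk documents, and the function names, the tiny chunk size, and the integer IDs are hypothetical.

```perl
use strict;
use warnings;

my $CHUNK_SIZE = 4;            # hypothetical, kept tiny for demonstration
my (@file_docs, @chunk_docs);  # stand-ins for tasks.files and tasks.chunks
my $next_id = 1;               # stand-in for a generated MongoDB::OID

# Split the content into string chunks and record one file document.
sub put_gfs_sketch {
    my ($content, $filename, $metadata) = @_;
    my $id = $next_id++;
    push @file_docs, {
        _id      => $id,
        filename => $filename,
        metadata => $metadata,
        length   => length($content),
    };
    my $n = 0;
    for (my $pos = 0; $pos < length($content); $pos += $CHUNK_SIZE) {
        push @chunk_docs,
            { files_id => $id, n => $n++, data => substr($content, $pos, $CHUNK_SIZE) };
    }
    return $id;
}

# Reassemble the content by joining a file's chunks in order.
sub get_gfs_sketch {
    my ($id) = @_;
    return join '', map { $_->{data} }
        sort { $a->{n} <=> $b->{n} }
        grep { $_->{files_id} == $id } @chunk_docs;
}

my $file_id   = put_gfs_sketch('hello, world', 'greeting.txt', { task => 'demo' });
my $roundtrip = get_gfs_sketch($file_id);
print "$roundtrip\n";
```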

SEE ALSO

Disbatch::Web

Disbatch::Roles

Disbatch::Plugin::Demo

disbatchd

disbatch.pl

task_runner

disbatch-create-users

AUTHORS

Ashley Willis <awillis@synacor.com>

Matt Busigin

COPYRIGHT AND LICENSE

This software is Copyright (c) 2016, 2019 by Ashley Willis.

This is free software, licensed under:

The Apache License, Version 2.0, January 2004