NAME
Helios::Service - base class for services in the Helios job processing system
DESCRIPTION
Helios::Service is the base class for all services intended to be run by the Helios parallel job processing system. It handles the underlying TheSchwartz job queue system and provides additional methods to handle configuration, job argument parsing, logging, and other functions.
A Helios::Service subclass must implement only one method: the run() method. The run() method will be passed a Helios::Job object representing the job to be performed. Before it ends, the run() method should mark the job as completed successfully, failed, or permanently failed by calling completedJob(), failedJob(), or failedJobPermanent(), respectively.
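The run() contract described above can be sketched as follows. This is an illustration only: My::StubService is a stand-in base class invented here so the sketch runs without Helios installed, MyApp::WordCount and its 'text' argument are hypothetical, and a real service would inherit from Helios::Service and receive a real Helios::Job object.

```perl
use strict;
use warnings;

# Stand-in for Helios::Service (an assumption of this sketch). It only
# records which completion method was called; the real base class marks
# the job in the job queue instead.
package My::StubService;
sub new          { my ($class) = @_; return bless { outcome => [] }, $class }
sub getJobArgs   { return $_[1]{args} }    # the real method parses the job's argument XML
sub completedJob { push @{ $_[0]{outcome} }, 'completed';       return 1 }
sub failedJob    { push @{ $_[0]{outcome} }, "failed: $_[2]";   return 1 }

# Hypothetical service class. A real one would say: use parent 'Helios::Service';
package MyApp::WordCount;
our @ISA = ('My::StubService');

sub run {
    my ($self, $job) = @_;
    my $args = $self->getJobArgs($job);

    # Do the work inside eval so that every path ends by marking the job.
    my $ok = eval {
        die "missing 'text' argument\n" unless defined $args->{text};
        my $count = () = $args->{text} =~ /\S+/g;    # count whitespace-separated words
        $self->{last_count} = $count;
        1;
    };
    if ($ok) {
        $self->completedJob($job);
    }
    else {
        my $error = $@ || 'unknown error';
        chomp $error;
        $self->failedJob($job, $error);
    }
    return;
}

package main;

my $service = MyApp::WordCount->new;
$service->run({ args => { text => 'three short words' } });    # plain hashref standing in for a job
$service->run({ args => {} });
print join("\n", @{ $service->{outcome} }), "\n";
```

The point of the eval wrapper is that run() never returns without calling one of the job completion methods, whichever branch is taken.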
TheSchwartz HANDLING METHODS
The following 3 methods are used by the underlying TheSchwartz job queuing system to determine what work is to be performed and, if a job fails, how it should be retried.
YOU DO NOT NEED TO TOUCH THESE METHODS TO CREATE HELIOS SERVICES. These methods manage interaction between Helios and TheSchwartz. You only need to be concerned with these methods if you are attempting to extend core Helios functionality.
max_retries()
Controls how many times a job will be retried.
retry_delay()
Controls how long (in secs) before a failed job will be retried.
These two methods should return the number of times a job can be retried if it fails and the minimum interval between those retries, respectively. If you don't define them in your subclass, they default to zero, and your job(s) will not be retried if they fail.
work()
The work() method is the method called by the underlying TheSchwartz::Worker (which in turn is called by the helios.pl service daemon) to perform the work of a job. Effectively, work() sets up the worker process for the Helios job, and then calls the service subclass's run() method to run it.
The work() method is passed a job object from the underlying TheSchwartz job queue system. The service class is instantiated, and the job is recast into a Helios::Job object. The service's configuration parameters are read from the system and made available as a hashref via the getConfig() method. The job's arguments are parsed from XML into a Perl hashref, and made available via the job object's getArgs() method. Then the service object's run() method is called, and is passed the Helios::Job object.
Once the run() method has completed the job and returned, work() determines whether the worker process should exit or stay running. If OVERDRIVE mode is enabled and the service hasn't been HALTed or told to HOLD, the worker process will stay running, and work() will be called to set up and run another job. If the service is not in OVERDRIVE mode, the worker process will exit.
metarun($job)
Given a metajob, the metarun() method runs the job, returning 0 if the metajob was successful and nonzero otherwise.
This is the default metarun() for Helios. In the default Helios system, metajobs consist of multiple simple jobs. These jobs are defined in the metajob's argument XML at job submission time. The metarun() method will burst the metajob apart into its constituent jobs, which are then run by another service.
Metajobs' primary use in the base Helios system is to speed the job submission process of large job batches. One metajob containing a batch of thousands of jobs can be submitted and burst apart by the system much faster than thousands of individual jobs can be submitted. In addition, the faster jobs enter the job queue, the faster Helios workers can be launched to handle them. If you have thousands (or tens of thousands, or more) of jobs to run, especially if you are running your service in OVERDRIVE mode, you should use metajobs to greatly increase system throughput.
ACCESSOR METHODS
These accessors will be needed by subclasses of Helios::Service.
get/setConfig()
get/setHostname()
get/setIniFile()
get/setJob()
get/setJobType()
get/setAltJobTypes(), addAltJobType()
get/setJobTypeid()
get/setAltJobtypeids(), addAltJobtypeid()
errstr()
debug()
Most of these are handled behind the scenes simply by calling the prep() method.
After calling prep(), calling getConfig() will return a hashref of all the configuration parameters relevant to this service class on this host.
If debug mode is enabled (the HELIOS_DEBUG env var is set to 1), debug() will return a true value; otherwise, it will return a false value. Some of the Helios::Service methods will honor this value and log extra debugging messages either to the console or the Helios log (helios_log_tb table). You can also use it within your own service classes to enable/disable debugging messages or behaviors.
CONSTRUCTOR
new()
The new() method creates a new service class instance. It initializes all of the underlying attribute values and sets the instance's jobType to the name of the class.
INTERNAL SERVICE CLASS METHODS
When writing normal Helios services, the methods listed in this section will have already been dealt with before your run() method is called. If you are extending Helios itself or instantiating a Helios service outside of Helios (for example, to retrieve a service's config params), you may be interested in some of these, primarily the prep() method.
prep()
The prep() method is designed to call all the various setup routines needed to get the service ready to do useful work. It:
Pulls in the contents of the HELIOS_DEBUG and HELIOS_INI env vars, and sets the appropriate instance variables if necessary.
Calls the getConfigFromIni() method to read the appropriate configuration parameters from the INI file.
Calls the getConfigFromDb() method to read the appropriate configuration parameters from the Helios database.
Normally it returns a true value if successful, but if one of the getConfigFrom*() methods throws an exception, that exception will be raised to your calling routine.
getConfigFromIni([$inifile]) DEPRECATED
The getConfigFromIni() method opens the helios.ini file, grabs global params and config params relevant to the current service class, and returns them in a hash to the calling routine. It also sets the class's internal {config} hashref, so the config parameters are available via the getConfig() method.
Typically service classes will call this once near the start of processing to pick up any relevant parameters from the helios.ini file. However, calling the prep() method takes care of this for you, and is the preferred method.
getConfigFromDb() DEPRECATED
The getConfigFromDb() method connects to the Helios database, retrieves config params relevant to the current service class, and returns them in a hash to the calling routine. It also sets the class's internal {config} hashref, so the config parameters are available via the getConfig() method.
Typically service classes will call this once near the start of processing to pick up any relevant parameters from the Helios database. However, calling the prep() method takes care of this for you.
There's an important subtle difference between getConfigFromIni() and getConfigFromDb(): getConfigFromIni() erases any previously set parameters from the class's internal {config} hash, while getConfigFromDb() merely updates it. This is due to the way helios.pl uses the methods: the INI file is only read once, while the database is repeatedly checked for configuration updates. For individual service classes, the best thing to do is just call the prep() method; it will take care of things for the most part.
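The difference between erasing and updating can be illustrated with plain hashes. This is a minimal sketch of the two behaviours, not the Helios implementation; the function names and the parameter names in the sample data are hypothetical.

```perl
use strict;
use warnings;

# INI-style load: the new parameters replace the whole hash, so anything
# set earlier is lost.
sub replace_config {
    my ($config, $new) = @_;
    %$config = %$new;
    return $config;
}

# Database-style load: only the keys present in the new data are
# overwritten; everything else is left in place.
sub merge_config {
    my ($config, $new) = @_;
    @{$config}{ keys %$new } = values %$new;
    return $config;
}

my %replaced = (OVERDRIVE => 1, old_param => 'x');
replace_config(\%replaced, { batch_size => 10 });

my %merged = (OVERDRIVE => 1, old_param => 'x');
merge_config(\%merged, { batch_size => 10, OVERDRIVE => 0 });

print 'after replace: ', join(', ', sort keys %replaced), "\n";
print 'after merge:   ', join(', ', sort keys %merged),   "\n";
```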
getFuncidFromDb() DEPRECATED
Queries the collective database for the funcid of the service class and returns it to the calling routine. The service name used in the query is the value returned from the getJobType() accessor method.
This method is most commonly used by helios.pl to get the funcid associated with a particular service class, so it can scan the job table for waiting jobs. If there are jobs for the service waiting, helios.pl may launch new worker processes to perform these jobs.
As of Helios 2.80, getFuncidFromDb() has been replaced by lookupJobtypeid(). This method is thus deprecated.
lookupAltJobtypeids(@jobtypenames)
The lookupAltJobtypeids() method uses the lookupJobtypeid() method to determine the jobtypeids of all of the service instance's alternate jobtypes. If given a list of jobtype names, these will override any jobtypes previously set with the setAltJobTypes() or addAltJobType() methods.
Usually, "alternate" jobtypes are jobtypes specified on the helios.pl command line using the --jobtypes option. The "primary" jobtype is the jobtype matching the service class's name.
lookupJobtypeid($jobtypename)
Given the name of a jobtype, lookupJobtypeid() uses the Helios::JobType class to find the jobtypeid of the jobtype and returns it to the calling routine. If the jobtype does not exist, the method returns undef.
jobsWaiting()
Scans the job queue for jobs that are ready to run. Returns the number of jobs waiting. Only meant for use with the helios.pl service daemon.
initDriver()
Creates a Data::ObjectDriver object connected to the Helios database and returns it to the calling routine. Normally called by getDriver() if a D::OD object has not already been initialized.
The initDriver() method calls setDriver() to cache the D::OD object for use by other methods. This will greatly reduce the number of open connections to the Helios database.
shouldExitOverdrive()
Determines whether or not the worker process should exit when OVERDRIVE mode is enabled. The config params will be checked for HOLD, HALT, or OVERDRIVE values. If HALT is defined or HOLD == 1, this method will return a true value, indicating the worker process should exit().
This method is used by helios.pl and Helios::Service->work(). Normal Helios services do not need to use this method directly.
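The HALT/HOLD rule stated above can be written as a small predicate. This is a sketch of that one rule only, assuming the config is a plain hashref as returned by getConfig(); the function name is hypothetical, and the real method may check more than this.

```perl
use strict;
use warnings;

# Returns 1 when the worker should exit: HALT is defined (with any value),
# or HOLD is numerically equal to 1. Otherwise returns 0.
sub should_exit_overdrive {
    my ($config) = @_;
    return 1 if defined $config->{HALT};
    return 1 if defined $config->{HOLD} && $config->{HOLD} == 1;
    return 0;
}

for my $config (
    { OVERDRIVE => 1 },
    { OVERDRIVE => 1, HOLD => 1 },
    { OVERDRIVE => 1, HOLD => 0 },
    { OVERDRIVE => 1, HALT => 1 },
) {
    my $label = join ', ', map { "$_=$config->{$_}" } sort keys %$config;
    printf "%-22s %s\n", $label, should_exit_overdrive($config) ? 'exit' : 'keep running';
}
```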
METHODS AVAILABLE TO SERVICE SUBCLASSES
The methods in this section are available for use by Helios services. They allow your service to interact with the Helios environment.
dbConnect($dsn, $user, $password, $options)
Method to connect to a database in a "safe" way. If the connection parameters are not specified, a connection to the Helios collective database will be returned. If a connection to the given database already exists, dbConnect() will return a database handle to the existing connection rather than create a new connection.
The dbConnect() method uses the DBI->connect_cached() method to reuse database connections and thus reduce open connections to your database (often important when you potentially have hundreds of active worker processes working in a Helios collective). It "tags" the connections it creates with the current PID to prevent reusing a connection that was established by a parent process. That, combined with helios.pl clearing connections after the fork() to create a worker process, should allow for safe database connection/disconnection in a forking environment.
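The PID tagging idea can be sketched without Helios. DBI's connect_cached() keys its handle cache on the connection parameters, including attribute values, and DBI permits custom attributes whose names start with private_. Adding the current PID as such an attribute therefore gives each process its own cache entry. The attribute name private_owner_pid, the function name, and the DSN below are hypothetical; this is not necessarily how dbConnect() names its tag.

```perl
use strict;
use warnings;

# Build the argument list for DBI->connect_cached(), tagging the attribute
# hash with a PID so a handle cached by a parent process is never matched
# by a child process. $pid defaults to the current process id.
sub cached_connect_args {
    my ($dsn, $user, $password, $options, $pid) = @_;
    $pid //= $$;
    my %attr = (%{ $options || {} }, private_owner_pid => $pid);
    return ($dsn, $user, $password, \%attr);
}

# Simulate a parent (PID 100) and a forked child (PID 200) asking for the
# same database: the attribute hashes differ, so the cache keys differ.
my @parent = cached_connect_args('dbi:mysql:database=example_db', 'user', 'secret', { RaiseError => 1 }, 100);
my @child  = cached_connect_args('dbi:mysql:database=example_db', 'user', 'secret', { RaiseError => 1 }, 200);

print "parent tag: $parent[3]{private_owner_pid}\n";
print "child tag:  $child[3]{private_owner_pid}\n";

# With DBI and a suitable driver installed, the handle would be obtained with:
#   my $dbh = DBI->connect_cached(cached_connect_args($dsn, $user, $password, $options));
```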
logMsg([$job,] [$priority_level,] $message)
Given a message to log, an optional priority level, and an optional Helios::Job object, logMsg() will record the message in the logging systems that have been configured. The internal Helios logging system is the only system enabled by default.
In addition to the log message, there are two optional parameters:
$job
The current Helios::Job object being processed. If specified, the jobid will be logged in the database along with the message.
$priority_level
The priority level of the message as defined by Helios::LogEntry::Levels. These are really integers, but if you import Helios::LogEntry::Levels (with the :all tag) into your namespace, your logMsg() calls will be much more readable. There are 8 log priority levels, corresponding (for historical reasons) to the log priorities defined by Sys::Syslog:
    name           priority
    LOG_EMERG      0
    LOG_ALERT      1
    LOG_CRIT       2
    LOG_ERR        3
    LOG_WARNING    4
    LOG_NOTICE     5
    LOG_INFO       6
    LOG_DEBUG      7
LOG_DEBUG, LOG_INFO, LOG_NOTICE, LOG_WARNING, and LOG_ERR are the levels most commonly used by Helios itself; LOG_INFO is the default.
The host, process id, and service class are automatically recorded with your log message. If you supplied either a Helios::Job object or a priority level, these will also be recorded with your log message.
This method returns a true value if successful and throws a Helios::Error::LoggingError if errors occur.
LOGGING SYSTEM CONFIGURATION
Several parameters are available to configure Helios logging. Though these options can be set either in helios.ini or in the Ctrl Panel, it is strongly recommended these options only be set in helios.ini. Changing logging configurations on-the-fly could potentially cause a Helios service (and possibly your whole collective) to become unstable!
The following options can be set in either a [global] section or in an application section of your helios.ini file.
loggers
loggers=HeliosX::Logger::Syslog,HeliosX::Logger::Log4perl
A comma delimited list of interface classes to external logging systems. Each of these classes should implement (or otherwise extend) the Helios::Logger class. Each class will have its own configuration parameters to set; consult the documentation for the interface class you're trying to configure.
internal_logger
internal_logger=on|off
Whether to enable the internal Helios logging system as well as the loggers specified with the 'loggers=' line above. The default is on. If set to off, the only logging your service will do will be to the external logging systems.
log_priority_threshold
log_priority_threshold=1|2|3|4|5|6
You can specify a logging threshold to better control the logging of your service on-the-fly. Unlike the above parameters, log_priority_threshold can be safely specified in your Helios Ctrl Panel. Specifying a 'log_priority_threshold' config parameter in your helios.ini or Ctrl Panel will cause log messages of a lower priority (higher numeric value) to be discarded. For example, a line in your helios.ini like:
log_priority_threshold=6
will cause any log messages of priority 7 (LOG_DEBUG) to be discarded.
This configuration option is supported by the internal Helios logger (Helios::Logger::Internal). Other Helios::Logger systems may or may not support it; check the documentation of the logging module you plan to use.
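The threshold comparison described above amounts to a single numeric test. The following sketch defines its own copy of the priority values from the table earlier in this section rather than importing Helios::LogEntry::Levels, so it runs on its own. The function name is hypothetical, and treating a missing threshold as "log everything" is an assumption of the sketch.

```perl
use strict;
use warnings;

# Local copy of the syslog-style priorities listed in the logMsg() section.
my %LEVEL = (
    LOG_EMERG   => 0, LOG_ALERT  => 1, LOG_CRIT => 2, LOG_ERR   => 3,
    LOG_WARNING => 4, LOG_NOTICE => 5, LOG_INFO => 6, LOG_DEBUG => 7,
);

# A message is kept when its numeric priority is at or below the threshold.
# Higher numbers mean lower priority, so they are the ones discarded.
sub should_log {
    my ($priority, $threshold) = @_;
    return 1 unless defined $threshold;    # assumption: no threshold means no filtering
    return $priority <= $threshold ? 1 : 0;
}

my $threshold = 6;    # as in log_priority_threshold=6
for my $name (sort { $LEVEL{$a} <=> $LEVEL{$b} } keys %LEVEL) {
    printf "%-12s %s\n", $name, should_log($LEVEL{$name}, $threshold) ? 'logged' : 'discarded';
}
```

With a threshold of 6, only LOG_DEBUG falls on the discarded side of the comparison.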
If anything goes wrong when calling the configured loggers' logMsg() methods, the service's logMsg() method will attempt to catch the error and log it to the internal logger (Helios::Logger::Internal). It will then rethrow the error as a Helios::Error::LoggingError exception.
initConfig()
The initConfig() method is called to initialize the configuration parsing class. This method is normally called by the prep() method before a service's run() method is called; most Helios application developers do not need to worry about this method.
The normal Helios config parsing class is Helios::Config. This can be changed by specifying another config class with the ConfigClass() method in your service.
This method will throw a Helios::Error::ConfigError if anything goes wrong with config class initialization.
initLoggers()
The initLoggers() method is called to initialize all of the configured Helios::Logger classes. This method is normally called by the prep() method before a service's run() method is called.
This method sets up the Helios::Logger subclass's configuration by calling setConfig(), setHostname(), setJobType(), and setDriver(). It then calls the logger's init() method to finish the initialization phase of the logging class.
This method will throw a Helios::Error::LoggingError if anything goes wrong with the initialization of a logger class. It will also try to fall back to the Helios::Logger::Internal logger to log the initialization error.
getJobArgs($job)
Given a Helios::Job object, getJobArgs() returns a hashref representing the parsed job argument XML. It actually calls the Helios::Job object's parseArgs() method and returns its value.
JOB COMPLETION METHODS
These methods should be called in your Helios service class's run() method to mark a job as successfully completed, failed, or failed permanently. They actually call the appropriate methods of the given Helios::Job object.
completedJob($job)
Marks $job as completed successfully.
failedJob($job [, $error][, $exitstatus])
Marks $job as failed. Allows job to be retried if your subclass supports that (see max_retries()).
failedJobPermanent($job [, $error][, $exitstatus])
Marks $job as permanently failed (no more retries allowed).
deferredJob($job)
Defers processing of a job until its grabbed_until interval expires (default is 60 minutes). This feature requires TheSchwartz 1.10.
burstJob($metajob)
Given a metajob, burstJob bursts it into its constituent jobs for other Helios workers to process. Normally Helios::Service's internal methods will take care of bursting jobs, but the method can be overridden if a job service needs special bursting capabilities.
SERVICE CLASS DEFINITION
These are the basic methods that define your Helios service. The run() method is the only one required.
run($job)
This is a default run method for class completeness. You have to override it in your own Helios service class.
MaxRetries(), RetryInterval(), and JobLockInterval()
The MaxRetries(), RetryInterval(), and JobLockInterval() methods specify to Helios the number of reattempts it should make at running a job and the frequency of those attempts. If you don't define these, jobs will not be retried if they fail.
MaxRetries() is straightforward; set it to the number of times you want a job to be retried if it fails.
RetryInterval() is the amount of time (in seconds) to wait after a job fails before a job is available to try again.
JobLockInterval() is the amount of time (in seconds) a job is locked for processing. This should be enough time to make sure a job can be completed or at least marked as failed. The default is 3600 sec (1 hour).
RetryInterval() and JobLockInterval() can interact in an odd way: for example, if you want to retry a job every 60 secs, you can add:
sub RetryInterval { 60 }
to your service class. However, your jobs will still be locked for an hour, because 3600 is the JobLockInterval() default. If you want to retry jobs more frequently than once an hour, you need to add a JobLockInterval() method to your service class as well as a RetryInterval() method. So, to retry jobs every 60 secs, add both of the following methods to your service class:
sub RetryInterval { 60 }
sub JobLockInterval { 60 }
Keep in mind this will reduce the amount of time available for your service to mark a job as completed or failed. If it has not done so by the time the JobLockInterval() value has expired, the job will be seen by the Helios system as available for processing again, and another worker process will pick up and attempt to run the job. So always make sure your JobLockInterval() allows enough time to actually complete a job. Another rule of thumb is to set RetryInterval() and JobLockInterval() to the same value if RetryInterval() is less than 3600.
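The interaction described above can be modelled roughly as "a failed job becomes available again only after both intervals have passed". The sketch below is a simplified model inferred from this section, not Helios's scheduling code; the function name is hypothetical, and the defaults mirror the ones stated above (no retry interval, a 3600 second lock).

```perl
use strict;
use warnings;
use List::Util qw(max);

# Rough model: the wait before a job can be picked up again is the larger
# of the retry interval and the job lock interval.
sub seconds_until_available {
    my (%interval) = @_;
    my $retry = $interval{RetryInterval}   // 0;
    my $lock  = $interval{JobLockInterval} // 3600;
    return max($retry, $lock);
}

printf "RetryInterval 60 only:         %d sec\n",
    seconds_until_available(RetryInterval => 60);
printf "RetryInterval 60, lock 60:     %d sec\n",
    seconds_until_available(RetryInterval => 60, JobLockInterval => 60);
printf "RetryInterval 7200, lock 3600: %d sec\n",
    seconds_until_available(RetryInterval => 7200, JobLockInterval => 3600);
```

Under this model, setting only RetryInterval() to 60 still yields a one hour wait, which is why both methods need to be defined for short retry intervals.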
JobClass()
Defines which job class to instantiate the job as. The default is Helios::Job, which should be fine for most purposes. If necessary, however, you can create a subclass of Helios::Job and set your JobClass() method to return that subclass's name. The service's work() method will instantiate the job as an instance of the class you specified rather than the base Helios::Job.
NOTE: Please remember that "jobs" in Helios are most often only used to convey arguments to services, and usually only contain enough logic to properly parse those arguments and mark jobs as completed. It should be rare to need to extend the Helios::Job object. On the other hand, if you are attempting to extend Helios itself to provide new abilities and not just writing a normal Helios application, you can use JobClass() to use your extended job class rather than the default.
ConfigClass()
Defines which configuration class to use to parse your service's configuration. The default is Helios::Config, which should work fine for most applications. If necessary, you can create a subclass of Helios::Config and set your ConfigClass() method to return that subclass's name. The service's prep() method will initialize your custom config class and use it to parse your service's configuration information.
See the Helios::Config documentation for more information about creating custom config classes.
SEE ALSO
Helios, helios.pl, Helios::Job, Helios::Error, Helios::Config, Helios::JobType
AUTHOR
Andrew Johnson, <lajandy at cpan dot org>
COPYRIGHT AND LICENSE
Copyright (C) 2008-9 by CEB Toolbox, Inc., except as noted.
Portions of this software, where noted, are Copyright (C) 2009 by Andrew Johnson.
Portions of this software, where noted, are Copyright (C) 2011-2012 by Andrew Johnson.
Portions of this software, where noted, are Copyright (C) 2012-3 by Logical Helion, LLC.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.0 or, at your option, any later version of Perl 5 you may have available.
WARRANTY
This software comes with no warranty of any kind.