NAME

TaskForest::Family - A collection of jobs

SYNOPSIS

use TaskForest::Family;

my $family = TaskForest::Family->new(name => 'Foo');
# the associated job dependencies are read within new();

$family->getCurrent();
# get the status of all jobs, what's failed, etc.

$family->cycle();
# runs any jobs that are ready to be run

$family->display();
# print to stdout a list of all jobs in the family
# and their statuses

DOCUMENTATION

If you're just looking to use the taskforest application, the only documentation you need to read is that for TaskForest. You can do this in either of two ways:

perldoc TaskForest

OR

man TaskForest

If you're a developer and you want to understand the code, I would recommend that you read the pods in this order:

  • TaskForest

  • TaskForest::Job

  • TaskForest::Family

  • TaskForest::TimeDependency

  • TaskForest::LogDir

  • TaskForest::Options

  • TaskForest::StringHandleTier

  • TaskForest::StringHandle

Finally, read the documentation in the source. Great efforts have been made to keep it current and relevant.

DESCRIPTION

A family is a group of jobs that share the following characteristics:

  • They all start on or after a common time known as the family start time.

  • They run only on the days specified in the family file.

  • They can be dependent on each other. These dependencies are represented by the location of jobs with respect to each other in the family file.

For more information about jobs, please look at the documentation for the TaskForest class.

ATTRIBUTES

The following are attributes of objects of the family class:

name

The name is the same as the name of the config file that contains the job dependency information.

start

The family start time in 'HH:MM' format using the 24-hour clock. For example, '17:30' is 5:30 p.m.

tz

The time zone in which the family start time is to be interpreted.

days

An array reference of days of the week on which this family's jobs may run. Valid days are 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat' and 'Sun'. Anything else will be ignored.
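
To illustrate the "anything else will be ignored" rule, here is a minimal sketch of filtering a list of day names. The helper name valid_days is hypothetical and is not part of TaskForest; the module's own parsing may well differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of TaskForest): keep only recognized day names.
my %VALID_DAY = map { $_ => 1 } qw(Mon Tue Wed Thu Fri Sat Sun);

sub valid_days {
    my @candidates = @_;
    # Unknown names are silently dropped, mirroring the rule described above.
    return [ grep { $VALID_DAY{$_} } @candidates ];
}

my $days = valid_days(qw(Mon Wed Funday Fri));
print join(',', @$days), "\n";
```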

options

A hash reference that contains the values of the options retrieved from the command line or the environment.

jobs

A hash reference of all the jobs that are members of this family. The keys of this hash are the names of the jobs. The names of the jobs are in the family configuration file and they're the same as the filenames of the jobs on disk.

current

A boolean that is set to true after all the details of the family's jobs are read from status files in the log directory. This boolean is set to false when an attempt is made to run any jobs, and when the family config file is first read (before getCurrent() is called).

ready_jobs

A temporary hash reference of jobs that are ready to be run - jobs whose dependencies have been met.

dependencies

A hash reference of dependencies of all jobs (things that the jobs depend ON). The keys of this hash are the job names. The values are array references. Each array reference can contain 1 or more references to objects of type TaskForest::Job or TaskForest::TimeDependency.

All jobs have at least one dependency - a TimeDependency that's set to the start time of the Family. In other words, after the start time of the Family passes, the check() method of the TimeDependency will return 1. Before that, it will return 0.
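
The shape of this structure can be sketched roughly as follows. Sketch::TimeDependency is a hypothetical stand-in, not the real TaskForest::TimeDependency, and its check() takes the current time as an argument purely so the example is deterministic; that argument is an assumption of this sketch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a time dependency. The real class is
# TaskForest::TimeDependency; its internals are not shown in this document.
package Sketch::TimeDependency;

sub new {
    my ($class, %args) = @_;
    return bless { start => $args{start} }, $class;   # 'HH:MM', 24-hour clock
}

# Returns 1 once the given 'HH:MM' time has reached the start time, else 0.
# Zero-padded 'HH:MM' strings compare correctly as plain strings.
sub check {
    my ($self, $now) = @_;
    return $now ge $self->{start} ? 1 : 0;
}

package main;

my $family_start = Sketch::TimeDependency->new(start => '17:30');

# Keys are job names; values are array refs of dependency objects.
# Every job gets at least the family start time dependency.
my %dependencies = (
    J1 => [ $family_start ],
    J2 => [ $family_start ],
);

for my $job (sort keys %dependencies) {
    my @met = map { $_->check('18:00') } @{ $dependencies{$job} };
    print "$job: @met\n";
}
```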

time_dependencies

For convenience, all time dependencies encountered in this family (including that of the family start time) are saved in this array reference. The other types of time dependencies are those that apply to individual jobs.

family_time_dependency

This is the TaskForest::TimeDependency that refers to the family start time.

year, mon, mday and wday

These attributes refer to the current day. They're saved within the Family object so that we don't have to call localtime over and over again. I really should have cached this somewhere else. Oh, well.
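
The idea of calling localtime once and keeping the pieces can be sketched like this. The function name today_fields is hypothetical, and whether the real attributes hold the adjusted year and month (as here) or the raw localtime values is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: call localtime once and keep the fields we need.
sub today_fields {
    my ($time) = @_;
    $time = time unless defined $time;
    my (undef, undef, undef, $mday, $mon, $year, $wday) = localtime($time);
    return {
        year => $year + 1900,   # assumption: four-digit year
        mon  => $mon + 1,       # assumption: 1-based month
        mday => $mday,
        wday => $wday,          # 0 = Sunday, as returned by localtime
    };
}

my $today = today_fields();
printf "%04d-%02d-%02d\n", @{$today}{qw(year mon mday)};
```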

filehandle

The readFromFile function was *really* long, so I refactored it into smaller functions. Since at least two of the functions read from the file, I saved the file handle within the object.

current_dependency, last_dependency

These are temporary attributes that are used to build dependency lists while parsing the file.

METHODS

new()
Usage     : my $family = TaskForest::Family->new(name => 'Foo');
Purpose   : The Family constructor is passed the family name.  It
            uses this name along with the location of the family
            directory to find the family configuration file and
            reads the file.  The family object is configured with
            the data read in from the file.
Returns   : Self
Argument  : A hash that has the properties of the family.  Of these,
            the only required one is the 'name' property.
Throws    : "No family name specified" if the name property is
             blank.  
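
A minimal sketch of the name check described above, using a hypothetical Sketch::Family package rather than the real class. It only shows the validation and the croak; reading the family file is left out.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the constructor; not the TaskForest::Family code.
package Sketch::Family;
use Carp;

sub new {
    my ($class, %args) = @_;
    croak "No family name specified" unless $args{name};
    my $self = bless { name => $args{name}, current => 0 }, $class;
    # The real constructor would go on to read the family file here.
    return $self;
}

package main;

my $family = Sketch::Family->new(name => 'Foo');
print $family->{name}, "\n";

# A missing name is reported through an exception.
my $ok = eval { Sketch::Family->new(); 1 };
print "error: $@" unless $ok;
```
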
display()
Usage     : $family->display()
Purpose   : This method displays the status of all jobs in all
            families that are scheduled to run today. 
Returns   : Nothing
Argument  : None
Throws    : Nothing
getCurrent()
Usage     : $family->getCurrent()
Purpose   : This method reads all the semaphore files in the log
            directory and gets the current status of the entire
            family.  Each run job can have succeeded or failed.  As
            a result of this, other jobs may be Ready to be run.  If
            a job's dependencies have not yet been met, it is said
            to be in the Waiting state.  Once a family is current,
            the only thing that makes it 'uncurrent' is if any jobs
            are run, or if its configuration file changes.
Returns   : Nothing
Argument  : None
Throws    : Nothing
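
The Ready/Waiting distinction can be sketched as below. This is not the module's code: the classify function, the 'Success' and 'Failure' labels, and the rule that a dependency counts as met only when the job succeeded are all assumptions, and time dependencies are ignored for brevity.

```perl
use strict;
use warnings;

# $status: jobs that have already run, mapped to 'Success' or 'Failure'.
# $deps:   every job name mapped to an array ref of job names it depends on.
sub classify {
    my ($status, $deps) = @_;
    my %state = %$status;
    for my $job (keys %$deps) {
        next if exists $state{$job};          # already ran
        my @unmet = grep { ($status->{$_} // '') ne 'Success' } @{ $deps->{$job} };
        $state{$job} = @unmet ? 'Waiting' : 'Ready';
    }
    return \%state;
}

my $state = classify(
    { J1 => 'Success', J2 => 'Failure' },
    { J1 => [], J2 => [], J3 => ['J1'], J4 => ['J2'], J5 => ['J3'] },
);
print "$_: $state->{$_}\n" for sort keys %$state;
```
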
cycle()
Usage     : $family->cycle()
Purpose   : This is the main method that is invoked once in every
            loop, to run any jobs that are in a Ready state.  It
            gets the current status of the family and runs any jobs
            that are in the Ready state.
Returns   : Nothing
Argument  : None
Throws    : Nothing
updateJobStatuses()
Usage     : $family->updateJobStatuses()
Purpose   : This method looks at all the semaphore files in the
            current day's log directory and updates job statuses
            based on those semaphore files. 
Returns   : Nothing
Argument  : None
Throws    : Nothing
runReadyJobs()
Usage     : $family->runReadyJobs()
Purpose   : This method uses the fork and exec model to run all jobs
            currently in the Ready state.  The script that is
            actually exec'ed is the run wrapper.  The wrapper takes
            a whole bunch of arguments, some of which can be derived
            from others.  The intent is to make it flexible and make
            it easy for others to write custom wrappers.  The code
            that's executed in the child process before the exec is
            rather paranoid and is taken from perldoc perlsec.
Returns   : Nothing
Argument  : None
Throws    : "Can't drop privileges" if the userids cannot be
            changed
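
The basic fork and exec pattern looks roughly like this. The spawn function is hypothetical, the command run is just the current perl binary, and the sketch deliberately omits the run wrapper's arguments and the privilege-dropping code from perlsec.

```perl
use strict;
use warnings;
use POSIX ();

# Fork a child, exec the given command in it, and return the child's pid.
sub spawn {
    my @command = @_;
    my $pid = fork();
    die "Can't fork: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: replace this process with the command.
        exec { $command[0] } @command or do {
            warn "Can't exec $command[0]: $!";
            POSIX::_exit(127);    # leave the child without running parent cleanup
        };
    }
    return $pid;                  # parent
}

my $pid = spawn($^X, '-e', 'exit 3');
waitpid($pid, 0);
print "child exit code: ", $? >> 8, "\n";
```
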
checkAllTimeDependencies()
Usage     : $family->checkAllTimeDependencies()
Purpose   : Runs td->check() on all time dependencies, to see
            whether they have been met or not
Returns   : Nothing
Argument  : None
Throws    : Nothing
getAllWaitingJobs()
Usage     : $family->getAllWaitingJobs()
Purpose   : This method gets a hash of all jobs that are currently
            in the Waiting state
Returns   : Nothing
Argument  : None
Throws    : Nothing
readFromFile()
Usage     : $family->readFromFile
Purpose   : This is the most crucial method of the application.  It
            reads the Family configuration file and constructs a
            data structure that represents all the configuration
            parameters of the family.
Returns   : Nothing
Argument  : None
Throws    : "Can't read dir/file" if the config file cannot be read
            "No start time specified for Family",
            "No time zone specified for Family",
            "No run days specified for Family",
               if any of the 3 required headers are not present in
               the file
            Generic croak if the data cannot be extracted after an
            eval.
okToRunToday()
Usage     : $family->okToRunToday($wday)
Purpose   : This method checks whether today is in the list of days
            of the week that this family is eligible to run
Returns   : 1 if it is, 0 if it's not.
Argument  : $wday - the day of the week today
Throws    : Nothing
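
As a rough sketch of this check: the function below is a standalone helper, not the real method, and it assumes $wday is a three-letter day name. Whether the real method receives a name or a numeric index is not stated here.

```perl
use strict;
use warnings;

# Hypothetical helper: $days is an array ref such as ['Mon', 'Wed', 'Fri'].
sub ok_to_run_today {
    my ($days, $wday) = @_;
    my $hits = grep { $_ eq $wday } @$days;   # count of matching days
    return $hits ? 1 : 0;
}

print ok_to_run_today([qw(Mon Wed Fri)], 'Wed'), "\n";
print ok_to_run_today([qw(Mon Wed Fri)], 'Sun'), "\n";
```
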
_initializeDataStructures()
Usage     : $self->_initializeDataStructures
Purpose   : Used in readFromFile, before a file is opened for reading
Returns   : Nothing
Argument  : None
Throws    : Nothing
_getSections()
Usage     : $self->_getSections
Purpose   : Read concurrent sections from the family file 
Returns   : A list of sections, or () if the file is empty
Argument  : None
Throws    : Nothing
_parseHeaderLine()
Usage     : $self->_parseHeaderLine()
Purpose   : Read the first non-empty line from the family file.
            If this family is not scheduled to run today, then just
            close the file and return 0.  This means that you
            could change the days in the header file in the middle
            of the day, and add today to the list of valid
            days. This would cause the family to now become
            eligible to run today, when earlier in the  day it was
            not. 
Returns   : 1 if the family is to run today, 0 otherwise.
Argument  : None
Throws    : Nothing
_parseLine()
Usage     : $self->_parseLine($line)
Purpose   : Get a list of all jobs on the line and parse them,
            creating the data structure.
            As we process each line, we add to each job's
            dependencies the dependencies in
            $self->{last_dependency}.  We also add each job to the
            list of 'current' dependencies.  When we're done parsing
            the line, we set 'last' to 'current', for the benefit of
            the next line.
Returns   : Nothing
Argument  : $line - the line to parse
Throws    : Nothing
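
The 'last' and 'current' bookkeeping can be sketched as follows. This is a simplification under stated assumptions: lines are given as array refs of job names that are already split, dependencies are recorded as names rather than objects, and time dependencies are left out. The function name build_dependencies is hypothetical.

```perl
use strict;
use warnings;

# Each argument is an array ref of the job names found on one line.
# Jobs on a line depend on every job from the previous line.
sub build_dependencies {
    my @lines = @_;
    my %dependencies;
    my @last;                               # jobs from the previous line
    for my $line (@lines) {
        my @current;
        for my $job (@$line) {
            $dependencies{$job} ||= [];
            push @{ $dependencies{$job} }, @last;
            push @current, $job;
        }
        @last = @current;                   # 'last' becomes 'current'
    }
    return \%dependencies;
}

my $deps = build_dependencies([qw(J1 J2)], [qw(J3)], [qw(J4 J5)]);
print "$_ depends on: @{ $deps->{$_} }\n" for sort keys %$deps;
```
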
_parseJob()
Usage     : $self->_parseJob($job)
Purpose   : Parse the job definition, create additional dependencies
            if necessary, and create the job.  If it's a recurring
            job, then create a bunch of 'repeat' jobs that are not
            dependent on the original job's predecessors, but on
            time dependencies only.
Returns   : Nothing
Argument  : $job - the job definition to parse
Throws    : Nothing
_createRecurringJobs()
Usage     : $self->_createRecurringJobs($job_name, $args)
Purpose   : If a job is a recurring job, create new jobs with a
            prefix of --Repeat_$n-- where $n specifies the
            cardinality of the repeat job.  The newly created jobs
            are *not* dependent on each other. They're only
            dependent on their start times. 
Returns   : Nothing
Argument  : $job_name, $args
Throws    : Nothing
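
A small sketch of generating the --Repeat_$n-- markers. The repeat_markers function and its $count parameter are hypothetical; how the real method decides the number of repeats, and how it combines the marker with the job name, is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: build the --Repeat_$n-- markers for $count repeats.
# Combining a marker with a job name is left to the caller.
sub repeat_markers {
    my ($count) = @_;
    return map { "--Repeat_${_}--" } 1 .. $count;
}

print "$_\n" for repeat_markers(3);
```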