NAME

RCGI - Remote CGI distributed processing

SYNOPSIS

    use RCGI;
    @result = Invoke('jobone',@arguments);
    $result = Invoke('jobtwo',@arguments);

    $remote_subroutine = new RCGI($base_url,$library_path,$module,$subroutine);
    @my_result = $remote_subroutine->Call(@arguments);
    if ($remote_subroutine->Success()) {
         print @my_result,'';
    } else {
         print STDERR "Call to " . $remote_subroutine->Base_URL() .
	   " failed with status: " . $remote_subroutine->Status() .
	          ' ' . $remote_subroutine->Error_Message() . "\n";
    }
    $remote_subroutine->Async(1);
    $remote_subroutine->Wantarray(1);
    $remote_subroutine->Call(@arguments);
    while(! $remote_subroutine->Done()) {
        # This should really be something useful--like calls to other servers!
        sleep 1;
    }
    @my_result = $remote_subroutine->Read();
    if ($remote_subroutine->Success()) {
         print @my_result,'';
    } else {
         print STDERR "Call to " . $remote_subroutine->Base_URL() .
	   " failed with status: " . $remote_subroutine->Status() .
	          ' ' . $remote_subroutine->Error_Message() . "\n";
    }

    $result = RCGI::run_cgi_command($base_url, \%cgi_form,
	                            method => $method,
                                    username => $username,
                                    password => $password,
                                    timeout => $timeout,
                                    user_agent => $user_agent,
                                    nph => $bool_nph);

    # In a CGI script
    ($cgi_form, %options) = RCGI::Process_Parameters( new CGI );
    $result = RCGI::run_cgi_command($base_url, $cgi_form, %options);

ABSTRACT

This perl library provides remote execution using CGI on remote web servers.

INSTALLATION

Installation overview

The installation of RCGI for full functionality consists of the following steps:

1) Edit RCGI/Config.pm to change the location of the configuration directory to an appropriate place.

2) Install the RCGI library itself by running:

perl Makefile.PL
make
make test
make install

3) Put the perlcall.cgi CGI script and the SAR.pm module on every computer which will be running remote subroutines in the computer's webserver's cgi-bin directory.

4) Create sard.conf, server.conf, and jobs.conf files in the /usr/rcgi directory.

5) Start the sard daemon running on a computer which has read and write access to the /usr/rcgi directory.

6) (Optional) Edit the line in sardcheck ($sard_user = 'sard_user') so that it names the user who started the sard daemon in step five. As that same user, add a crontab entry similar to:

30 * * * * /usr/local/bin/perl /path_to_sardcheck/sardcheck

Steps 2 and 3 are the only steps necessary if the load balancing calls (Invoke, Async_Invoke, and new_job RCGI) will not be used. Step 3 may also be skipped if only RCGI::run_cgi_command will be used.

The /usr/rcgi directory

If you wish to change the location of the configuration directory from the default value of /usr/rcgi, edit RCGI/Config.pm. The configuration directory must then be created (mkdir /usr/rcgi) and given the correct permissions: chgrp rcgi /usr/rcgi ; chmod g+rw /usr/rcgi. The DBM files load.dir and load.pag are created in this directory, so it must be writable by any user process attempting to Invoke remote subroutines.

The sard.conf, server.conf, and jobs.conf files need to then be created in the /usr/rcgi directory. Following is the format for those files:

sard.conf

# machine URL_of_perlcall.cgi path_to_SAR.pm_module
# Items on a line must be separated by a single tab
machine_name	http://www.webserver.url/cgi-bin/perlcall.cgi	path_to_SAR.pm_module

sard daemon

Usage is: sard /usr/rcgi/sard.conf /usr/rcgi/sar [timeperiod_in_seconds] [bool_verbose]

The sard (System Activity Report Daemon) runs in the background to collect usage from the machines configured in the sard.conf file. It uses the RCGI library to call (via perlcall.cgi) the SAR.pm module which, on Unix, uses the sar program to collect system activity over, by default, 10 minute periods. This information is stored in the DBM sar file located in the /usr/rcgi directory.

This system activity information is used by the RCGI library to implement load balancing of job requests.

server.conf

# machine number_of_processors processes_per_processor reserve_idle(in percent)
# the high reserve_idle should prevent those machines from being used
# Items on a line must be separated by a single tab
medium	4	2	10
shared	4	1	50
dud	1	2	100000
mine	1	1	100000
super	12	1	10
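As an illustration of the file format only, the following sketch reads server.conf-style text into a hash. The helper name parse_server_conf and the hash keys are hypothetical and are not part of RCGI; the column order is taken from the sample above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RCGI): parse server.conf-style text
# into a hash keyed by machine name.
sub parse_server_conf {
    my ($text) = @_;
    my %servers;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*(#|$)/;    # skip comments and blank lines
        # Columns are separated by a single tab, as the format requires.
        my ($machine, $processors, $per_processor, $reserve_idle) =
            split /\t/, $line;
        $servers{$machine} = {
            processors              => $processors,
            processes_per_processor => $per_processor,
            reserve_idle            => $reserve_idle,
        };
    }
    return \%servers;
}

my $servers = parse_server_conf(
    "# machine\tprocessors\tper_processor\treserve_idle\n"
  . "medium\t4\t2\t10\n"
  . "dud\t1\t2\t100000\n"
);
print "$_: reserve_idle=$servers->{$_}{reserve_idle}\n"
    for sort keys %$servers;
```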

jobs.conf

# job_type server task_url library_path module subroutine option option_value
# where option can be: timeout, username, password, user_agent
# Items on a line must be separated by a single tab
jobone	machine1	http://webserver1/cgi-bin/perlcall.cgi	module_path	Module	subroutine	option_name	option_value
jobone	machine2	http://webserver2/cgi-bin/perlcall.cgi	module_path	Module	subroutine	option_name	option_value
jobtwo	machine2	http://webserver2/cgi-bin/perlcall.cgi	module_path	Module	subroutine	option_name	option_value
jobtwo	machine3	http://webserver3/cgi-bin/perlcall.cgi	module_path	Module	subroutine	option_name	option_value
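The sketch below shows one way such lines could be grouped by job type. It is not the parser RCGI uses; parse_jobs_conf is a hypothetical name, and it assumes that any trailing columns form option/option_value pairs.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RCGI): group jobs.conf entries by job type.
sub parse_jobs_conf {
    my ($text) = @_;
    my %jobs;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*(#|$)/;    # skip comments and blank lines
        my ($job_type, $server, $task_url, $library_path, $module,
            $subroutine, @option_pairs) = split /\t/, $line;
        # Assumption: trailing columns come in option/option_value pairs.
        my %options = @option_pairs;
        push @{ $jobs{$job_type} }, {
            server       => $server,
            task_url     => $task_url,
            library_path => $library_path,
            module       => $module,
            subroutine   => $subroutine,
            options      => \%options,
        };
    }
    return \%jobs;
}

my $jobs = parse_jobs_conf(join "\n",
    join("\t", 'jobone', 'machine1', 'http://webserver1/cgi-bin/perlcall.cgi',
         'module_path', 'Module', 'subroutine', 'timeout', '60'),
    join("\t", 'jobone', 'machine2', 'http://webserver2/cgi-bin/perlcall.cgi',
         'module_path', 'Module', 'subroutine'),
);
print "jobone can run on: ",
    join(', ', map { $_->{server} } @{ $jobs->{jobone} }), "\n";
```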

perlcall.cgi and SAR.pm and other user modules installation

The perlcall.cgi perl CGI script and the SAR.pm module will need to be installed in a cgi-bin directory of the web server of every computer which will be set up to allow jobs to be Invoke'ed or Call'ed. The SAR.pm module can alternatively be installed anywhere in the standard perl @INC path.

Perl modules to call must be in the standard perl @INC path or in the library path given in the calls or in jobs.conf.

RCGI libraries installation

To install this package, just change to the directory in which this file is found and type the following:

perl Makefile.PL
make
make test
make install

In order for a job to be invoked, the sard daemon must be running to collect computer processor loads. The perlcall.cgi CGI script and the SAR.pm module must be installed properly on each computer.

DESCRIPTION

The RCGI library allows calling Perl subroutines in modules on remote computers with load balancing.

Load balancing using RCGI

RCGI calculates which machine to invoke a job on by using the machine which has the maximum idle time as determined by:

1) Take the measured idle time for each machine if it is newer than the last calculated idle time for the machine.

2) Subtract the reserve_idle for each machine.

3) If two machines have similar resulting idle times, use the machine with the largest increase in measured idle time.

A process usage amount is then subtracted from the resulting idle time of the chosen machine, and the new value is stored in the DBM load file in the /usr/rcgi directory, where it is used for subsequent job requests.

The process usage for a machine is calculated according to the following formula:

process_usage = (100 / (machine_processors * processes_per_processor));
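A simplified sketch of the selection described above follows. It is not RCGI's implementation: the tie-break of step 3 and the DBM load file are left out, the idle percentages are made up, and pick_machine and the hash keys are hypothetical names. The server values come from the server.conf sample.

```perl
use strict;
use warnings;

# process_usage = 100 / (machine_processors * processes_per_processor)
sub process_usage {
    my ($processors, $per_processor) = @_;
    return 100 / ($processors * $per_processor);
}

# Hypothetical helper: pick the machine with the largest
# (idle - reserve_idle), then charge it for the job it receives.
# %$idle stands in for the values RCGI keeps in its DBM load file.
sub pick_machine {
    my ($servers, $idle) = @_;
    my ($best, $best_idle);
    for my $name (sort keys %$servers) {
        next unless defined $idle->{$name};
        my $available = $idle->{$name} - $servers->{$name}{reserve_idle};
        if (!defined $best_idle || $available > $best_idle) {
            ($best, $best_idle) = ($name, $available);
        }
    }
    return unless defined $best;
    $idle->{$best} -= process_usage(
        @{ $servers->{$best} }{qw(processors per_processor)});
    return $best;
}

my %servers = (
    medium => { processors => 4,  per_processor => 2, reserve_idle => 10 },
    shared => { processors => 4,  per_processor => 1, reserve_idle => 50 },
    super  => { processors => 12, per_processor => 1, reserve_idle => 10 },
);
my %idle = (medium => 80, shared => 95, super => 75);   # made-up percentages

my $first  = pick_machine(\%servers, \%idle);
my $second = pick_machine(\%servers, \%idle);
print "first job: $first, second job: $second\n";
```

With these numbers the first job goes to medium (80 - 10 = 70 available), which is then charged 100 / (4 * 2) = 12.5, so the second job goes to super (65 available versus 57.5).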

Usage of RCGI

A perl program which is written as:

use lib 'module_path';
use Module;
$result = Module::subroutine(@arguments); # or
@result = Module::subroutine(@arguments);

can be converted to use a job, jobone:

jobs.conf entry:

jobone	machine1	http://webserver1/cgi-bin/perlcall.cgi	module_path	Module	subroutine
jobone	machine2	http://webserver2/cgi-bin/perlcall.cgi	module_path	Module	subroutine

by being rewritten as:

use RCGI;
$remote_subroutine = new_job RCGI('jobone');
$result = $remote_subroutine->Call(@arguments); # or
@result = $remote_subroutine->Call(@arguments);

or

use RCGI;
$result = Invoke('jobone',@arguments); # or
@result = Invoke('jobone',@arguments);

or can be rewritten to directly call a specific machine, machine1, as:

    use RCGI;
    $remote_subroutine = new RCGI('http://webserver1/cgi-bin/perlcall.cgi',
				  'module_path',
				  'Module',
				  'subroutine');
    $result = $remote_subroutine->Call(@arguments); # or
    @result = $remote_subroutine->Call(@arguments);

with the error checking for failure of the remote call by:

    if ($remote_subroutine->Success()) {
	print $result;
    } else {
	print "Call to " . $remote_subroutine->Base_URL() .
	    " failed with status: " . $remote_subroutine->Status() .
		' ' . $remote_subroutine->Error_Message() . "\n";
    }
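Because the same failure message is built after every call, it may be convenient to factor it into a helper. The sketch below is one possible shape: failure_message is a hypothetical name, and FakeRemote is a stand-in class so the example runs without RCGI or a web server. Only the Base_URL, Status, Error_Message and Success methods documented here are assumed.

```perl
use strict;
use warnings;

# Hypothetical helper: build the failure message from any object that
# offers Base_URL(), Status() and Error_Message().
sub failure_message {
    my ($remote) = @_;
    return sprintf "Call to %s failed with status: %s %s\n",
        $remote->Base_URL(), $remote->Status(), $remote->Error_Message();
}

# Stand-in for an RCGI object (not the real class).
package FakeRemote;
sub new           { my ($class, %args) = @_; return bless { %args }, $class }
sub Base_URL      { $_[0]{base_url} }
sub Status        { $_[0]{status} }
sub Error_Message { $_[0]{error} }
sub Success       { $_[0]{status} == 200 }

package main;

my $remote = FakeRemote->new(
    base_url => 'http://webserver1/cgi-bin/perlcall.cgi',
    status   => 404,
    error    => 'Not Found',
);
print STDERR failure_message($remote) unless $remote->Success();
```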

RCGI Structure

There are four possible uses or layers for RCGI:

1. Invoking a module subroutine as a job via perlcall.cgi on the least busy computer defined for that job type.  This may be either synchronous or asynchronous.

2. Getting a RCGI remote subroutine object for the least busy computer defined for that job type.

3. Calling a module subroutine via perlcall.cgi on a particular computer.  This may be either synchronous or asynchronous.

4. Retrieving HTML pages from static HTML pages or CGI scripts using RCGI::run_cgi_command.

RCGI Structure Diagram

The arrows, '# ==>', show the usable API of RCGI.

                                                            sard.conf
                                                           /
                                                          L
                                                      sard
                                                      |
1 ==> Invoke or Async_Invoke                          V
                |             jobs.conf, server.conf, DBM file sar
                |            /
                V           L
2 ==>           new_job RCGI <-----> DBM file load
                     |
                     |
                     V
3 ==>             new RCGI
                     |
                     |
                     V
                $remote_subroutine->Call ==> http://www/cgi-bin/perlcall.cgi
                |                     /|         A
                |                     /|         |
                |                    / |         V
                V                   /  |     return( eval '
4 ==> RCGI::run_cgi_command        /   |     use lib 'library_path';
                                  /    |     use Module;
                                 /     |     Module::subroutine(@arguments);
                                L      |     ' );
                $result or @result     V
       $remote_subroutine->Success     $remote_subroutine->Done
 $remote_subroutine->Error_Message            |
                                              |
                                              V
                                       $remote_subroutine->Read
                                              |
                                              |
                                              V
                                            result
                                      $remote_subroutine->Success
                                      $remote_subroutine->Error_Message

Functions and Methods

Invoke a job request

@my_result = Invoke('job_name',@arguments);

Invoke a job to synchronously call a remote subroutine.

Where @arguments is the normal list of arguments for the remote subroutine.

Async_Invoke a job request

$remote_subroutine = Async_Invoke('job_name',@arguments);

Invoke a job to asynchronously call a remote subroutine.

Where @arguments is the normal list of arguments for the remote subroutine.

Get a new RCGI object using a job type

$remote_subroutine = new_job RCGI('job_name');

 OR

$remote_subroutine = new_job RCGI('job_name',$minimum_load);

This will create a new object which will allow a remote subroutine call for a particular job type. $minimum_load is the minimum percentage of idle time to leave when assigning jobs.

Creating a new RCGI object:

    $remote_subroutine = new RCGI($base_url,$library_path,$module,$subroutine)

     OR

    $remote_subroutine = new RCGI($base_url,$library_path,$module,$subroutine,
				  -option => value)

The arguments are:

$base_url -- the base URL for the remote subroutine call. This is the URL for perlcall.cgi on the remote web server.

$library_path -- the location of the module which contains the subroutine for the remote subroutine call. This is optional--undef may be passed instead if the module is located relative to the perl @INC path. A '.' may be passed to specify the cgi-bin directory on the remote web server.

$module -- the module which contains the subroutine for the remote subroutine call.

$subroutine -- the name of the subroutine to call for the remote subroutine call. This subroutine must be callable in the form Module_Name::subroutine();. Please remember that no execution state is maintained by default on the remote computer.

Options are passed as: -option => value, where -option is one of:

-async          Do an asynchronous call.
-wantarray      Force array or scalar result (useful for using with async).
-username       Username to login to remote web server, if any.
-password       Password to login to remote web server, if any.
-user_agent     User_agent to use for remote web server.
-timeout        Timeout in seconds for web connection (default is 180).

This will create a new object which allows remote subroutine calls.

Calling the remote subroutine with Call

Synchronous Call

@my_result = $remote_subroutine->Call(@arguments);

 OR

$my_result = $remote_subroutine->Call(@arguments);

Where @arguments is the normal list of arguments for the remote subroutine.

Asynchronous Call

    $remote_subroutine->Call(@arguments);

    while(! $remote_subroutine->Done()) {
        # This should really be something useful
        sleep 1;
    }
    @my_result = $remote_subroutine->Read();

     OR

    $my_result = $remote_subroutine->Read();

Where @arguments is the normal list of arguments for the remote subroutine.
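The polling loop becomes more useful when several asynchronous calls are outstanding at once. The following sketch polls a list of objects until each reports Done and then collects its result with Read. collect_results is a hypothetical helper, and FakeAsync is a stand-in class that simulates a call finishing after a number of polls; with real RCGI objects, Async(1) and Call would have been used first.

```perl
use strict;
use warnings;

# Hypothetical helper: poll objects offering Done() and Read() until all
# have finished, collecting each result as soon as it is ready.
sub collect_results {
    my (@pending) = @_;
    my @results;
    while (@pending) {
        my @still_running;
        for my $remote (@pending) {
            if ($remote->Done()) {
                push @results, [ $remote->Read() ];
            } else {
                push @still_running, $remote;
            }
        }
        @pending = @still_running;
        sleep 1 if @pending;    # real code might do other useful work here
    }
    return @results;
}

# Stand-in for an asynchronous RCGI object (not the real class):
# Done() becomes true after the given number of polls.
package FakeAsync;
sub new {
    my ($class, $polls, @value) = @_;
    return bless { polls => $polls, value => [@value] }, $class;
}
sub Done { return $_[0]{polls}-- <= 0 }
sub Read { return @{ $_[0]{value} } }

package main;

my @results = collect_results(
    FakeAsync->new(0, 'fast'),
    FakeAsync->new(1, 'slow'),
);
print join(',', @$_), "\n" for @results;
```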

Check to see if an asynchronous call is Done

$remote_subroutine->Done();

Return true when the asynchronous call has completed.

Read the results from an asynchronous call

@my_result = $remote_subroutine->Read();

 OR

$my_result = $remote_subroutine->Read();

Fetch the result from an asynchronous call.

Success or failure of the remote subroutine call

$remote_subroutine->Success()

This returns true if the remote subroutine call was completed with no errors.

The return Status of the remote subroutine call

$remote_subroutine->Status()

This returns the status code from the remote subroutine call. Possible values are:

 -30 -- Machines are busy, the load is less than the load minimum (default is zero idle)

 -27 -- Missing task_url, module, or subroutine for job type definition for assigned machine

 -26 -- Missing job type definition for assigned machine

 -25 -- Missing sar measurement for assigned machine

 -24 -- Unable to assign machine

 -20 -- No job types defined match asked for job type

 -13 -- Unable to open load file

 -12 -- Unable to open sar file

 -11 -- Unable to open server.conf file

 -10 -- Unable to open jobs.conf file

  -1 -- The base URL, the module, or the subroutine were not given.

  -2 -- Unable to fork background process for asynchronous call.

 200 -- Successful call.

>200 -- Error code from remote web server or CGI script
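For logging, the codes listed above can be kept in a lookup table. The sketch below copies the descriptions from this list; describe_status is a hypothetical helper and is not provided by RCGI.

```perl
use strict;
use warnings;

# Status descriptions copied from the list above.
my %STATUS_TEXT = (
    -30 => 'Machines are busy, the load is less than the load minimum',
    -27 => 'Missing task_url, module, or subroutine for job type definition for assigned machine',
    -26 => 'Missing job type definition for assigned machine',
    -25 => 'Missing sar measurement for assigned machine',
    -24 => 'Unable to assign machine',
    -20 => 'No job types defined match asked for job type',
    -13 => 'Unable to open load file',
    -12 => 'Unable to open sar file',
    -11 => 'Unable to open server.conf file',
    -10 => 'Unable to open jobs.conf file',
     -2 => 'Unable to fork background process for asynchronous call',
     -1 => 'The base URL, the module, or the subroutine were not given',
    200 => 'Successful call',
);

# Hypothetical helper: turn a Status() value into readable text.
sub describe_status {
    my ($status) = @_;
    return $STATUS_TEXT{$status} if exists $STATUS_TEXT{$status};
    return 'Error code from remote web server or CGI script' if $status > 200;
    return 'Unknown status';
}

print "$_: ", describe_status($_), "\n" for (200, -12, 500);
```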

Error_Message

$remote_subroutine->Error_Message()

This returns the associated error message, if any, from an unsuccessful remote subroutine call. If the Status is greater than 200, then the error message is from the remote web server.

Base_URL

$base_url = $remote_subroutine->Base_URL();

 OR

$remote_subroutine->Base_URL($base_url);

Get or set the base URL for the remote subroutine call. This is the URL for perlcall.cgi on the remote web server.

Library_Path

$library_path = $remote_subroutine->Library_Path();

 OR

$remote_subroutine->Library_Path($library_path);

Get or set the location of the module which contains the subroutine for the remote subroutine call. This is optional--undef may be passed instead if the module is located relative to the perl @INC path. A '.' may be passed to specify the cgi-bin directory on the remote web server.

Module

$module = $remote_subroutine->Module();

 OR

$remote_subroutine->Module($module);

Get or set the module which contains the subroutine for the remote subroutine call.

Subroutine

$subroutine = $remote_subroutine->Subroutine();

 OR

$remote_subroutine->Subroutine($subroutine);

Get or set the name of the subroutine to call for the remote subroutine call. This subroutine must be callable in the form Module_Name::subroutine();. Please remember that no execution state is maintained by default on the remote computer.

Async

$async = $remote_subroutine->Async();

 OR

$remote_subroutine->Async($async);

Get or set whether the call is asynchronous.

Wantarray

$wantarray = $remote_subroutine->Wantarray();

 OR

$remote_subroutine->Wantarray($wantarray);

Get or set whether the call returns a scalar or an array (or associative array).

Username

$username = $remote_subroutine->Username();

 OR

$remote_subroutine->Username($username);

Get or set the username, if any, used to login to the remote server.

Password

$password = $remote_subroutine->Password();

 OR

$remote_subroutine->Password($password);

Get or set the password, if any, used to login to the remote server.

User_Agent

$user_agent = $remote_subroutine->User_Agent();

 OR

$remote_subroutine->User_Agent($user_agent);

Get or set the user_agent used when connecting to the remote server.

Timeout

$timeout = $remote_subroutine->Timeout();

 OR

$remote_subroutine->Timeout($timeout);

Get or set the timeout in seconds used in the connection to the remote server. Default is 180 seconds.

Process_Parameters

($cgi_form, %options) = RCGI::Process_Parameters( new CGI , \%TRANSLATE, \%IGNORE );

This processes the CGI parameters, using the passed CGI object reference. The optional %TRANSLATE associative array allows passing CGI parameters with a different parameter field name (i.e., translate parameter foo=1 to bar=1). The optional %IGNORE associative array specifies CGI parameters which should not be passed on.

Returned are $cgi_form which is a reference to an associative array which contains the CGI parameters in a form ready to pass to RCGI::run_cgi_command() and the %options associative array of options to pass to RCGI::run_cgi_command().
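The effect of the two optional arrays can be pictured with plain hashes. The sketch below is not RCGI's code: filter_parameters is a hypothetical helper, and it assumes %TRANSLATE maps an old field name to a new one and %IGNORE maps a field name to a true value.

```perl
use strict;
use warnings;

# Hypothetical helper: rename and drop parameters in the way the
# TRANSLATE and IGNORE arrays are described above.
sub filter_parameters {
    my ($params, $translate, $ignore) = @_;
    my %form;
    for my $name (keys %$params) {
        next if $ignore->{$name};              # parameter is not passed on
        my $new_name = exists $translate->{$name}
                     ? $translate->{$name}     # foo=1 becomes bar=1
                     : $name;
        $form{$new_name} = $params->{$name};
    }
    return \%form;
}

my $form = filter_parameters(
    { foo => 1, debug => 1, seq => 'AAAAA' },  # incoming CGI parameters
    { foo => 'bar' },                          # translate foo to bar
    { debug => 1 },                            # do not forward debug
);
print "$_=$form->{$_}\n" for sort keys %$form;
```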

run_cgi_command

$result = RCGI::run_cgi_command($base_url, \%cgi_form, %options);

This fetches an HTML page from either a static HTML page or a CGI script.

$base_url is the URL of the page to get. \%cgi_form is a reference to an associative array whose keys are the CGI parameter names and whose values are the CGI parameter values to pass to the remote CGI script. If a parameter's name has 'upload:' prepended to it, then its value will be passed using the multipart/form-data file upload method. (Example: $cgi_form = { 'upload:seq_file' => "> sequence\nAAAAA\n" }.)

Options are passed as: -option => value, where -option is one of:

-method         CGI method to use (GET is default).
                Values are 0 or undef for GET and 1 for POST
-nph            Use 1 to treat the remote CGI script as NPH.
-username       Username to login to remote web server, if any.
-password       Password to login to remote web server, if any.
-user_agent     User_agent to use for remote web server.
-timeout        Timeout in seconds for web connection (default is 180).
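To make the 'upload:' naming convention concrete, the sketch below builds a form hash and separates ordinary fields from upload fields by that prefix. It only illustrates the convention and is not how RCGI handles the form internally; split_form and the 'program' field are hypothetical.

```perl
use strict;
use warnings;

my %cgi_form = (
    'program'         => 'blastn',                 # hypothetical ordinary field
    'upload:seq_file' => "> sequence\nAAAAA\n",    # sent as a file upload
);

# Hypothetical helper: split a form into ordinary fields and upload
# fields, stripping the 'upload:' prefix from the latter.
sub split_form {
    my ($form) = @_;
    my (%fields, %uploads);
    for my $name (keys %$form) {
        if ($name =~ /^upload:(.+)$/) {
            $uploads{$1} = $form->{$name};
        } else {
            $fields{$name} = $form->{$name};
        }
    }
    return (\%fields, \%uploads);
}

my ($fields, $uploads) = split_form(\%cgi_form);
print "fields: ",  join(', ', sort keys %$fields),  "\n";
print "uploads: ", join(', ', sort keys %$uploads), "\n";
```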

Example Job Invoke Script

     #!/usr/local/bin/perl

     use RCGI;
     @result = Invoke('jobtest1','one');
     print @result;

     $result = Invoke('jobtest2','two');
     print $result;

     # The same job type, using a remote subroutine object
     $remote = new_job RCGI('jobtest1');
     @out = $remote->Call('one');
     if (! $remote->Success()) {
         print " Failed with error: " .
             $remote->Error_Message() . "\n";
         undef @out;
     }
     exit;

Example Remote Subroutine Call Script

#!/usr/local/bin/perl
#
#

use RCGI;

$base_url = 'http://www.sandrock.edu/cgi-bin/perlcall.cgi';
$library_path = '/my/module/directory';
$module = 'MyModule';
$subroutine = 'my_subroutine';
$remote_subroutine = new RCGI($base_url,$library_path,$module,$subroutine);

@my_result = $remote_subroutine->Call(0, 'a', 'b');
$, = "\n";
if ($remote_subroutine->Success()) {
    print @my_result,'';
} else {
    print STDERR "Call to " . $remote_subroutine->Base_URL() .
	" failed with status: " . $remote_subroutine->Status() .
	    ' ' . $remote_subroutine->Error_Message() . "\n";
}

$my_result = $remote_subroutine->Call(0, 'a', 'b', 'c');
if ($remote_subroutine->Success()) {
    print $my_result,'';
} else {
    print STDERR "Call to " . $remote_subroutine->Base_URL() .
	" failed with status: " . $remote_subroutine->Status() .
	    ' ' . $remote_subroutine->Error_Message() . "\n";
}

$remote_subroutine->Async(1);
$remote_subroutine->Wantarray(1);
$remote_subroutine->Call(5, 'async', 'hronous');
$| = 1;
while(! $remote_subroutine->Done()) {
    # This should really be something useful--like calls to other servers!
    sleep 1;
    print ".";
}
@my_result = $remote_subroutine->Read();
$, = "\n";
if ($remote_subroutine->Success()) {
    print @my_result,'';
} else {
    print STDERR "Call to " . $remote_subroutine->Base_URL() .
	" failed with status: " . $remote_subroutine->Status() .
	    ' ' . $remote_subroutine->Error_Message() . "\n";
}

REPORTING BUGS

When reporting bugs/problems please include as much information as possible.

A small script which yields the problem will probably be of help. If you cannot include a small script then please include a Debug trace from a run of your program which does yield the problem.

AUTHOR INFORMATION

Brian H. Dunford-Shore brian@ibc.wustl.edu

David J. States states@ibc.wustl.edu

Copyright 1998, Washington University School of Medicine, Institute for Biomedical Computing. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

Address bug reports and comments to: www@ibc.wustl.edu

TODO

Save the result dump in a file for batch mode
Save the arguments in a file for queued batch mode

SEE ALSO

CREDITS

BUGS

You really mean 'extra' features ;). None known.

COPYRIGHT

Copyright (c) 1997 Washington University, St. Louis, Missouri. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.