Restarting techniques

All of these techniques require that you know the server PID (Process ID). The easiest way to find the PID is to look it up in the httpd.pid file. It's easy to discover where to look: open the httpd.conf file and locate the PidFile entry. Here is the line from one of my own httpd.conf files:

PidFile /usr/local/var/httpd_perl/run/httpd.pid

As you see, with my configuration the file is /usr/local/var/httpd_perl/run/httpd.pid.
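Once you know where the PID file is, you can check that the server process is actually alive by sending it the null signal (0), which performs no action but verifies that the process exists:

% kill -0 `cat /usr/local/var/httpd_perl/run/httpd.pid` && echo "server is up"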

Another way is to use the ps and grep utilities. Assuming that the binary is called httpd_perl, we would do:

% ps auxc | grep httpd_perl

or maybe:

% ps -ef | grep httpd_perl

This will produce a list of all the httpd_perl processes (parent and children). You are looking for the parent process. If you run your server as root, you will easily locate it, since it belongs to root. If you run the server as some other user (when you don't have root access), the processes will belong to that user, unless defined differently in httpd.conf. It's still easy to find the parent - it's the one with the smallest PID.

You will see many httpd executables running on your system, but you should never need to send signals to any of them except the parent, whose pid is in the PidFile. There are three signals that you can send to the parent: SIGTERM, SIGHUP, and SIGUSR1.

Some folks prefer to specify signals using numerical values rather than using symbols. If you are looking for these, check out your kill(1) man page. My page points to /usr/include/sys/signal.h; the relevant entries are:

#define SIGHUP     1    /* hangup, generated when terminal disconnects */ 
#define SIGKILL    9    /* last resort */
#define SIGTERM   15    /* software termination signal */
#define SIGUSR1   30    /* user defined signal 1 */

Note that signal numbers vary from platform to platform (on Linux, for example, SIGUSR1 is usually 10), so check your own system's header file. Also note that to send these signals from the command line the SIG prefix must be omitted.
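For example, assuming the PidFile location shown earlier, these two commands are equivalent ways of sending SIGTERM to the parent:

% kill -TERM `cat /usr/local/var/httpd_perl/run/httpd.pid`
% kill -15   `cat /usr/local/var/httpd_perl/run/httpd.pid`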

Implications of sending TERM, HUP, and USR1 to the server

We will concentrate here on the implications of sending these signals to a mod_perl enabled server. See http://www.apache.org/docs/stopping.html for documentation on the implications of sending these signals to a plain Apache server.

TERM Signal: Stop Now

Sending the TERM signal to the parent causes it to immediately attempt to kill off all its children. Any requests in progress are terminated, and no further requests are served. This process may take quite a few seconds to complete. To stop a child, the parent sends it the SIGHUP signal. If the child fails to die, it sends another SIGHUP. If that fails, it sends the SIGTERM signal, and as a last resort it sends the SIGKILL signal. For each failed attempt to kill a child it makes an entry in the error_log.

Finally the parent itself exits and any open log files are closed. This is when all the accumulated END blocks, except the ones from Apache::Registry or Apache::PerlRun scripts, will be executed. The latter are executed after each request is served.
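As a minimal sketch of the difference (the warn messages here are hypothetical):

# in a module preloaded at server startup (e.g. from startup.pl):
END { warn "runs once, when the parent exits\n" }

# in an Apache::Registry script:
END { warn "runs after every request\n" }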

HUP Signal: Restart Now

Sending the HUP signal to the parent causes it to kill off its children as if you had sent TERM (any requests in progress are terminated) but the parent doesn't exit.

The parent will reread its configuration files, close and re-open any log files, flush all the compiled and preloaded modules, and rerun any startup files. Then it spawns a new set of children and continues serving hits. It's equivalent to stopping then restarting the server.
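For example:

% kill -HUP `cat /usr/local/var/httpd_perl/run/httpd.pid`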

Note: if your configuration files have errors when you issue a restart, the parent will not restart but will exit with an error, and your server will be stopped. See below for a way of avoiding this.

USR1 Signal: Graceful Restart

The USR1 signal causes the parent process to advise the children to exit after serving the current request, or to exit immediately if they're not serving a request. The parent re-reads its configuration files and re-opens its log files. As each child dies off the parent replaces it with a child from the new generation (the new children use the new configuration) and it begins serving new requests immediately.

The only difference between USR1 and HUP is that USR1 allows the children to complete any current requests prior to killing them off.

By default, if a server is restarted (using kill -USR1 `cat logs/httpd.pid` or with the HUP signal), Perl scripts and modules are not reloaded. To reload PerlRequire's, PerlModule's, other use()'d modules and flush the Apache::Registry cache, use this directive in httpd.conf:

PerlFreshRestart On

Make sure you read Evil things might happen when using PerlFreshRestart.

We've already mentioned that restart or termination can sometimes take quite a long time for a mod_perl server. You have the option of setting the PERL_DESTRUCT_LEVEL parameter during the perl Makefile.PL stage, or of simply setting the PERL_DESTRUCT_LEVEL environment variable to -1 directly. This can speed things up, and can lead to more robust operation in the face of problems such as running out of memory.
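For example, at build time (a sketch; this assumes your mod_perl version accepts PERL_DESTRUCT_LEVEL as a Makefile.PL parameter, with your other usual build options on the same line):

% perl Makefile.PL PERL_DESTRUCT_LEVEL=-1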

Using apachectl to control the server

Apache's distribution provides a script to control the server. It's called apachectl and it is installed into the same location as the httpd executable. Let's say in our scenario it's /usr/local/sbin/httpd_perl/apachectl.

To start httpd_perl:

% /usr/local/sbin/httpd_perl/apachectl start 

To stop httpd_perl:

% /usr/local/sbin/httpd_perl/apachectl stop

To restart httpd_perl (if it is running, send SIGHUP; if it is not already running just start it):

% /usr/local/sbin/httpd_perl/apachectl restart

Do a graceful restart by sending a SIGUSR1, or start if not running:

% /usr/local/sbin/httpd_perl/apachectl graceful    

To do a configuration test:

% /usr/local/sbin/httpd_perl/apachectl configtest 

Replace httpd_perl with httpd_docs in the above calls to control the httpd_docs server.

There are other options for apachectl; use the help option to see them all.

It's important to remember that apachectl uses the PID file, which is specified by the PidFile directive in httpd.conf. If you delete the PID file by hand, apachectl will not be able to stop or restart the server.

Also note that apachectl is suitable for use from within a Unix system's startup files so that the Web server is automatically restarted at system reboot.

Either copy the apachectl file to the appropriate location (/etc/rc.d/rc3.d/S99apache works on my RedHat Linux system) or create a symlink with that name pointing to the canonical location. (If you do this, make certain that the script is writable only by root! The startup scripts have root privileges during initialisation, and you don't want to open any security holes.)
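For example, on the RedHat system mentioned above:

% ln -s /usr/local/sbin/httpd_perl/apachectl /etc/rc.d/rc3.d/S99apache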

Safe Code Updates on a Live Production Server

You have prepared a new version of the code, uploaded it to the production server, restarted it, and it doesn't work. What could be worse than that? You also cannot go back, because you have overwritten the good working code.

It's quite easy to prevent this: just don't overwrite the previous working files!

Personally I do all updates on the live server with the following sequence. Assume that the server root directory is /home/httpd/perl/rel. When I'm about to update the files I create a new directory /home/httpd/perl/beta, copy the old files from /home/httpd/perl/rel into it and update it with the new files. Then I do some last sanity checks: I check that the file permissions are correct (readable and executable), and run perl -c on the new modules to make sure there are no errors in them. When I think I'm ready I do:

% cd /home/httpd/perl
% mv rel old && mv beta rel && stop && sleep 3 && restart && err

Let me explain what I'm doing.

Firstly, note that I put all the commands on one line, separated by &&, and only then press the Enter key. As I am working remotely, this ensures that if I suddenly lose the connection (sadly that happens sometimes) I won't leave the server down if only the stop command managed to run. && also ensures that if any command fails, the rest won't be executed. I use aliases (which I have already defined) to make the typing easier:

% alias | grep apachectl
graceful /usr/local/apache/bin/apachectl graceful
rehup   /usr/local/apache/sbin/apachectl restart
restart /usr/local/apache/bin/apachectl restart
start   /usr/local/apache/bin/apachectl start
stop    /usr/local/apache/bin/apachectl stop

% alias err
tail -f /usr/local/apache/logs/error_log
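For reference, these aliases can be defined like this (csh syntax, matching the listing above; in bash you would write, e.g., alias stop='/usr/local/apache/bin/apachectl stop'):

% alias stop    '/usr/local/apache/bin/apachectl stop'
% alias restart '/usr/local/apache/bin/apachectl restart'
% alias err     'tail -f /usr/local/apache/logs/error_log'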

Taking the line apart piece by piece:

mv rel old &&

back up the working directory to old

mv beta rel &&

put the new one in its place

stop &&

stop the server

sleep 3 &&

give it a few seconds to shut down (it might take even longer)

restart &&

restart the server

err

view the tail of the error_log file in order to check that
everything is OK

apachectl generates its status messages a little too early. For example, when you issue apachectl stop it says the server has been stopped, while in fact it's still running. So don't rely on the status messages, rely on the error_log file instead.

You may also have noticed that I use restart and not just start. I do this because of Apache's potentially long stopping times (it depends on what you do with it, of course!). If you use start and Apache hasn't yet released the port it's listening on, the start will fail and the error_log will tell you that the port is in use, e.g.:

Address already in use: make_sock: could not bind to port 8080

But if you use restart, it will wait for the server to quit and then will cleanly restart it.

Now what happens if the new modules are broken? First of all, I see an immediate indication of the problems in the error_log file, which I tail -f right after the restart command. If there's a problem, I just put everything back as it was before:

% mv rel bad && mv old rel && stop && sleep 3 && restart && err

Usually everything will be fine, and I have had only about 10 seconds of downtime, which is pretty good!

An Intentional Disabling of Live Scripts

What happens if you really must take down the server or disable the scripts? This situation might arise when you need to do some maintenance work on your database server, for example. If you have to take the database down, the scripts that use it will fail.

If you do nothing, the user will see either the grey An Error has happened message or perhaps a customized error message if you have added code to trap and customize the errors. See Redirecting Errors to the Client instead of to the error_log for the latter case.

A much friendlier approach is to confess to your users that you are doing some maintenance work and plead for patience, promising (keep the promise!) that the service will become fully functional in X minutes. There are a few ways to do this:

The first doesn't require messing with the server. It works when you have to disable a script, but not a module! Just prepare a little script like this:

/home/http/perl/construction.pl
----------------------------
#!/usr/bin/perl -wT

use strict;
use CGI;
my $q = new CGI;
print $q->header,
"Sorry, the service is temporarily down for maintenance. 
 It will be back in ten to fifteen minutes.
 Please, bear with us.
 Thank you!";

Now if you have to disable a script at /home/http/perl/chat.pl, just do this:

% mv /home/http/perl/chat.pl /home/http/perl/chat.pl.orig
% ln -s /home/http/perl/construction.pl /home/http/perl/chat.pl

Of course your server configuration should allow symbolic links for this trick to work. Make sure you have the directive

Options FollowSymLinks

in the <Location> or <Directory> section of your httpd.conf.

When you're done, it's easy to restore the previous setup. Just do this:

% mv /home/http/perl/chat.pl.orig /home/http/perl/chat.pl

which overwrites the symbolic link.

Now make sure that the script will have the current timestamp:

% touch /home/http/perl/chat.pl

Apache will automatically detect the change and will use the restored script from now on.

The second approach is to change the server configuration and configure a whole directory to be handled by a Construction handler (which you must write). For example if you write something like this:

Construction.pm
---------------
package Construction;
use strict;
use CGI;
use Apache::Constants qw(:common);  # exports OK among others

sub handler {
  my $r = shift;  # the Apache request object
  my $q = new CGI;
  print $q->header,
"Sorry, the service is temporarily down for maintenance. 
 It will be back in ten to fifteen minutes.
 Please, bear with us.
 Thank you!";
  return OK;
}

1;  # a module must return a true value

and put it in a directory that is in the server's @INC. Then, to disable all the scripts in Location /perl, you would replace:

<Location /perl>
  SetHandler perl-script
  PerlHandler Apache::Registry
  [snip]
</Location>

with

<Location /perl>
  SetHandler perl-script
  PerlHandler Construction
  [snip]
</Location>

Now restart the server. Your users will be happy to go and read slashdot.org for ten minutes, knowing that you are working on a much better version of the service.

If you need to disable a location handled by some module, the second approach would work just as well.

SUID Start-up Scripts

For those who want to use SUID startup scripts, here is an example. This script is SUID root, and should be executable only by members of some special group at your site. Note the line that sets the real UID to the effective UID ($< = $>), which fixes an obscure error when starting apache/mod_perl. As others have pointed out, a mismatch between the real and the effective UIDs causes Perl to croak on the -e switch.

Note that you must be using a version of Perl that recognizes and emulates the suid bits in order for this to work. The script will do different things depending on whether it is named start_http, stop_http or restart_http. You can use symbolic links for this purpose.

#!/usr/bin/perl

# These constants will need to be adjusted.
$PID_FILE = '/home/www/logs/httpd.pid';
$HTTPD = '/home/www/httpd -d /home/www';

# These prevent taint warnings while running suid
$ENV{PATH}='/bin:/usr/bin';
$ENV{IFS}='';

# This sets the real to the effective ID, and prevents
# an obscure error when starting apache/mod_perl
$< = $>;
$( = $) = 0; # set the group to root too

# Do different things depending on our name
($name) = $0 =~ m|([^/]+)$|;

if ($name eq 'start_http') {
    system $HTTPD and die "Unable to start HTTP";
    print "HTTP started.\n";
    exit 0;
}

# extract the process id and confirm that it is numeric
$pid = `cat $PID_FILE`;
$pid =~ /(\d+)/ or die "PID $pid not numeric";
$pid = $1;

if ($name eq 'stop_http') {
    kill 'TERM',$pid or die "Unable to signal HTTP";
    print "HTTP stopped.\n";
    exit 0;
}

if ($name eq 'restart_http') {
    kill 'HUP',$pid or die "Unable to signal HTTP";
    print "HTTP restarted.\n";
    exit 0;
}

die "Script must be named start_http, stop_http, or restart_http.\n";

Preparing for Machine Reboot

When you run your own development box, it's OK to start the webserver by hand when you need it. On a production system, there is a chance that the machine the server is running on will have to be rebooted. When the reboot is completed, who is going to remember to start the server? It's easy to forget this task, and what happens if you aren't around when the machine is rebooted?

After the server installation is complete, don't forget that you need to put a script to perform the server startup and shutdown into the standard system location, for example /etc/rc.d.

This is the directory which contains scripts to start and stop all the other daemons. The directory and file names vary from one Operating System to another, and even between different distributions of the same OS.

Generally the simplest solution is to copy the apachectl script to your startup directory. You will find apachectl in the same directory as the httpd executable after Apache installation. If you have more than one Apache server you need a script for each one, and of course you will have to rename them.

For example on a RedHat Linux machine with two servers, I have the following setup:

/etc/rc.d/init.d/httpd_docs
/etc/rc.d/init.d/httpd_perl
/etc/rc.d/rc3.d/S86httpd_docs -> ../init.d/httpd_docs
/etc/rc.d/rc3.d/S87httpd_perl -> ../init.d/httpd_perl
/etc/rc.d/rc6.d/K86httpd_docs -> ../init.d/httpd_docs
/etc/rc.d/rc6.d/K87httpd_perl -> ../init.d/httpd_perl

The scripts themselves reside in the init.d directory. There are symbolic links to these scripts in the other directories; the names are the same as the script names, but they have numbers prepended, which are used to execute the scripts in a particular order: the lower numbers are executed earlier.

When a machine is booted and its runlevel set to 3 (multiuser+network), Linux goes into /etc/rc.d/rc3.d/ and executes the scripts the symbolic links point to with the start argument. When it sees S87httpd_perl, it executes:

/etc/rc.d/init.d/httpd_perl start

When the machine is shut down, the scripts are executed through links from the /etc/rc.d/rc6.d/ directory. This time the scripts are called with the stop argument, like this:

/etc/rc.d/init.d/httpd_perl stop

Most systems have GUI utilities to automate the creation of the symbolic links. For example RedHat Linux includes the control-panel utility, which amongst other things includes the RunLevel Manager. This will help you to create the proper symbolic links. Of course, before you use it, you should put apachectl or a similar script into the init.d or equivalent directory.
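If you prefer to create the links by hand, the layout above can be reproduced like this (shown for the httpd_perl server):

% cp /usr/local/sbin/httpd_perl/apachectl /etc/rc.d/init.d/httpd_perl
% ln -s ../init.d/httpd_perl /etc/rc.d/rc3.d/S87httpd_perl
% ln -s ../init.d/httpd_perl /etc/rc.d/rc6.d/K87httpd_perl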

Monitoring the Server. A watchdog.

With mod_perl many things can happen to your server. It is possible that the server might die when you are not around. As with any other critical service, you need to run some kind of watchdog.

One simple solution is to use a slightly modified apachectl script, which I've named apache.watchdog. Call it from the crontab every 30 minutes -- or even every minute -- to make sure the server is up all the time.

The crontab entry for 30 minute intervals:

0,30 * * * * /path/to/the/apache.watchdog >/dev/null 2>&1

The script:

#!/bin/sh
  
# This script is a watchdog that checks whether the server
# is up. If it's down, it tries to start it, and sends an
# email alert to the admin.

# admin's email
EMAIL=webmaster@somewhere.far
#EMAIL=root@localhost
  
# the path to your PID file
PIDFILE=/usr/local/var/httpd_perl/run/httpd.pid
  
# the path to your httpd binary, including options if necessary
HTTPD=/usr/local/sbin/httpd_perl/httpd_perl
      
# check for pidfile
if [ -f $PIDFILE ] ; then
  PID=`cat $PIDFILE`
  
  if kill -0 $PID 2>/dev/null; then
    STATUS="httpd (pid $PID) running"
    RUNNING=1
  else
    STATUS="httpd (pid $PID?) not running"
    RUNNING=0
  fi
else
  STATUS="httpd (no pid file) not running"
  RUNNING=0
fi
    
if [ $RUNNING -eq 0 ]; then
  echo "$0: httpd not running, trying to start"
  if $HTTPD ; then
    echo "$0: httpd started"
    mail -s "$0: httpd started" $EMAIL < /dev/null > /dev/null 2>&1
  else
    echo "$0: httpd could not be started"
    mail -s "$0: httpd could not be started" $EMAIL < /dev/null > /dev/null 2>&1
  fi
fi

Another approach, probably even more practical, is to use the cool LWP Perl package to test the server by trying to fetch a document (script) served by the server. Why is it more practical? Because even though the server might be up as a process, it could be stuck and not actually serving anything. Failing to fetch the document will trigger a restart, and "probably" the problem will go away. This is why the $restart_command in the script below calls apachectl restart rather than start.

Again we put this script into the crontab, calling it every 30 minutes. Personally I call it every minute, fetching some very light script. Why so often? If your server starts to spin and fills your disk with multiple error messages, in five minutes you might run out of free disk space, which might bring your system to its knees. Chances are that no other child will be able to serve requests, since the system will be too busy writing to the error_log file. Think big -- if you are running a heavy service (which is very fast, since you are running under mod_perl) adding one more request every minute will not be felt by the server at all.

So we end up with a crontab entry like this:

* * * * * /path/to/the/watchdog.pl >/dev/null 2>&1

And the watchdog itself:

#!/usr/local/bin/perl -w

use strict;
use diagnostics;

my $VERSION = '0.01';
use vars qw($ua $proxy);
$proxy = '';

require LWP::UserAgent;
use HTTP::Status;

###### Config ########
my $test_script_url = 'http://www.stas.com:81/perl/test.pl';
my $monitor_email   = 'root@localhost';
my $restart_command = '/usr/local/sbin/httpd_perl/apachectl restart';
my $mail_program    = '/usr/lib/sendmail -t -n';
######################

$ua  = new LWP::UserAgent;
$ua->agent("$0/Stas " . $ua->agent);
# Set and uncomment the proxy if you use one
#  $proxy="http://www-proxy.com";
$ua->proxy('http', $proxy) if $proxy;

# If the fetch succeeds the server is alive and there is
# nothing more to do
exit 0 if checkurl($test_script_url);

# Houston, we have a problem.
# The server seems to be down, try to restart it. 
my $status = system $restart_command;
#  print "Status $status\n";

my $message = ($status == 0) 
            ? "Server was down and successfully restarted!" 
            : "Server is down. Can't restart.";
  
my $subject = ($status == 0) 
            ? "Attention! Webserver restarted"
            : "Attention! Webserver is down. Can't restart";

# email the monitoring person
my $to = $monitor_email;
my $from = $monitor_email;
send_mail($from,$to,$subject,$message);

# input:  URL to check 
# output: 1 for success, 0 for failure
#######################  
sub checkurl{
  my ($url) = @_;

  # Fetch document 
  my $res = $ua->request(HTTP::Request->new(GET => $url));

  # Check the result status
  return 1 if is_success($res->code);

  # failed
  return 0;
} #  end of sub checkurl

# send email about the problem 
#######################  
sub send_mail{
  my($from,$to,$subject,$messagebody) = @_;

  open MAIL, "|$mail_program"
      or die "Can't open a pipe to a $mail_program :$!\n";
 
  print MAIL <<__END_OF_MAIL__;
To: $to
From: $from
Subject: $subject

$messagebody

__END_OF_MAIL__

  close MAIL;
} 
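The "very light script" fetched by the watchdog can be as trivial as this (a hypothetical test.pl, running under Apache::Registry):

#!/usr/bin/perl -wT

use strict;
use CGI;
my $q = new CGI;
print $q->header, "OK";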

Running a Server in Single Process Mode

Often while developing new code, you will want to run the server in single process mode. See Sometimes it works Sometimes it does Not and Names collisions with Modules and libs. Running in single process mode inhibits the server from "daemonizing", and this allows you to run it under the control of a debugger more easily.

% /usr/local/sbin/httpd_perl/httpd_perl -X

When you use the -X switch the server will run in the foreground of the shell, so you can kill it with Ctrl-C.
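Running in the foreground also makes it easier to run the server under a debugger, as mentioned above. For example, with gdb (assuming gdb is installed on your system):

% gdb /usr/local/sbin/httpd_perl/httpd_perl
(gdb) run -X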

Note that in -X (single-process) mode the server will run very slowly while fetching images.

Note for Netscape users:

If you use Netscape while your server is running in single-process mode, HTTP's KeepAlive feature gets in the way. Netscape tries to open multiple connections and keep them open. Because there is only one server process listening, each connection has to time out before the next succeeds. Turn off KeepAlive in httpd.conf to avoid this effect while developing. If you use the image size parameters, Netscape will be able to render the page without the images so you can press the browser's STOP button after a few seconds.

In addition you should know that when running with -X you will not see the control messages that the parent server normally writes to the error_log ("server started", "server stopped" etc). Since httpd -X causes the server to handle all requests itself, without forking any children, there is no controlling parent to write the status messages.

Starting a Personal Server for Each Developer

If you are the only developer working on the specific server:port you have no problems, since you have complete control over the server. However, often you will have a group of developers who need to develop mod_perl scripts and modules concurrently. This means that each developer will want to have control over the server - to kill it, to run it in single server mode, to restart it etc., as well as having control over the location of the log files, configuration settings like MaxClients, and so on.

You can work around this problem by preparing a few httpd.conf files and forcing each developer to use

httpd_perl -f /path/to/httpd.conf  

but I approach it in a different way. I use the -Dparameter startup option of the server. I start my personal version of the server like this:

% httpd_perl -Dsbekman

In httpd.conf I write:

# Personal development Server for sbekman
# sbekman uses the server running on port 8000
<IfDefine sbekman>
Port 8000
PidFile /usr/local/var/httpd_perl/run/httpd.pid.sbekman
ErrorLog /usr/local/var/httpd_perl/logs/error_log.sbekman
Timeout 300
KeepAlive On
MinSpareServers 2
MaxSpareServers 2
StartServers 1
MaxClients 3
MaxRequestsPerChild 15
</IfDefine>

# Personal development Server for userfoo
# userfoo uses the server running on port 8001
<IfDefine userfoo>
Port 8001
PidFile /usr/local/var/httpd_perl/run/httpd.pid.userfoo
ErrorLog /usr/local/var/httpd_perl/logs/error_log.userfoo
Timeout 300
KeepAlive Off
MinSpareServers 1
MaxSpareServers 2
StartServers 1
MaxClients 5
MaxRequestsPerChild 0
</IfDefine>

With this technique each developer has full control over his own server: start/stop, number of children, a separate error log file, and port selection. This saves me from getting called every few minutes - "Stas, I'm going to restart the server".

In the above technique, you need to know the PID of your parent httpd_perl process, which is written to /usr/local/var/httpd_perl/run/httpd.pid.userfoo. To make things even easier we change the apachectl script to do the work for us. We make a copy for each developer, called apachectl.username, and we change two lines in each script:

PIDFILE=/usr/local/var/httpd_perl/run/httpd.pid.sbekman
HTTPD='/usr/local/sbin/httpd_perl/httpd_perl -Dsbekman'

You might think you could use just one control file and detect who is calling from the uid, but since you have to be root to start the server it is not so simple.

The last thing was to give the developers the option of running in single process mode:

/usr/local/sbin/httpd_perl/httpd_perl -Dsbekman -X

In addition to making life easier, we decided to use relative links everywhere in the static documents, including the calls to CGIs. You may ask how a relative link gets to the right server. It's very simple: we use mod_rewrite.

To use mod_rewrite you have to configure your httpd_docs server with --enable-module=rewrite and recompile, or use DSO and load the module in httpd.conf. In the access.conf of our httpd_docs server we have the following code:

# sbekman' server
# port = 8000
RewriteCond  %{REQUEST_URI} ^/(perl|cgi-perl)	 
RewriteCond  %{REMOTE_ADDR} 123.34.45.56
RewriteRule ^(.*)           http://nowhere.com:8000/$1 [R,L]

# userfoo's server
# port = 8001
RewriteCond  %{REQUEST_URI} ^/(perl|cgi-perl)	 
RewriteCond  %{REMOTE_ADDR} 123.34.45.57
RewriteRule ^(.*)           http://nowhere.com:8001/$1 [R,L]

# all the rest
RewriteCond  %{REQUEST_URI} ^/(perl|cgi-perl)	 
RewriteRule ^(.*)           http://nowhere.com:81/$1 [R]

The IP addresses are the addresses of the developers' client machines (where they are running their web browsers). I tried to use REMOTE_USER, since we have all the users authenticated, but it did not work for me.

So if in some file.html I have a relative URL like /perl/test.pl, or even http://www.nowhere.com/perl/test.pl, and the request comes from sbekman's machine, it will be redirected by httpd_docs to http://www.nowhere.com:8000/perl/test.pl.

There is another problem: the CGI script may generate some HTML which the client may then use to request further action from the server. If the script generates a URL with a hard coded port, the above scheme will not work. There are two solutions:

First, generate only relative URLs, so the technique above is reused, with a redirect (which is transparent to the user). But this will not work if you have something to POST, because the redirect loses all the data!

Second, use a general configuration module which generates a correct full URL according to REMOTE_USER. So if $ENV{REMOTE_USER} eq 'sbekman', I return http://www.nowhere.com:8000/perl/ as cgi_base_url. Again, this works only if the user is authenticated.
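A minimal sketch of such a configuration module (the package name and the port map are hypothetical; the ports match the setup above):

package My::DevConfig;
use strict;

# map each authenticated developer to his personal server's port
my %port_of = (
    sbekman => 8000,
    userfoo => 8001,
);

sub cgi_base_url {
    my $user = $ENV{REMOTE_USER} || '';
    my $port = $port_of{$user} || 81;   # fall back to the main mod_perl port
    return "http://www.nowhere.com:$port/perl/";
}

1;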

All this is good for development. It is better to use full URLs in production, since if you have a static form whose Action is relative, but the static document is located on another server, pressing the form's submit button will cause a redirect to the mod_perl server, and all the form's data will be lost during the redirect.

Wrapper to Emulate the Server Environment

Often you will start off debugging your script by running it from your favorite shell. Sometimes you encounter a very weird situation: the script runs from the shell but dies when called as a CGI. The real problem often lies in the difference between the environment used by your server and the one used by your shell. For example, you may have a different Perl path, or a PERL5LIB environment variable which includes paths that are not in the @INC array of the Perl which is linked into the mod_perl server and configured during startup.

The best debugging approach is to write a wrapper that emulates the exact environment of the server, first deleting environment variables like PERL5LIB and then calling the same perl binary that is used by the server. Next, the wrapper should set the environment identical to the server's, by copying the Perl run directives from the server startup and configuration files. This will also allow you to remove the first line of the script completely, since mod_perl skips it and the wrapper knows how to call the script.

Below is an example of such a script. Note that we force the use of -Tw when we call the real script. I have also added the ability to pass parameters, which will not happen when you call the CGI script from the Web.

  #!/usr/local/bin/perl -w    
   
  # This is a wrapper example 
   
  # It simulates the web server environment by setting @INC and other
  # stuff, so what will run under this wrapper will run under Web and
  # vice versa. 
  
  #
  # Usage: wrap.pl some_cgi.pl
  #
  
  BEGIN{
    use vars qw($basedir);
    $basedir = "/usr/local";
  
    # we want to make a complete emulation, 
    # so we must remove the user's environment
    @INC = ();
  
    # local perl libs
    # (note: qw() does not interpolate $basedir, so we must
    # use ordinary quoted strings here)
    push @INC,
      "$basedir/lib/perl5/5.00502/aix",
      "$basedir/lib/perl5/5.00502",
      "$basedir/lib/perl5/site_perl/5.005/aix",
      "$basedir/lib/perl5/site_perl/5.005";
  }
  
  use strict;
  use File::Basename;
  
    # process the passed params
  my $cgi = shift || '';
  my $params = (@ARGV) ? join(" ", @ARGV) : '';
  
  die "Usage:\n\t$0 some_cgi.pl\n" unless $cgi;
  
    # Set the environment
  my $PERL5LIB = join ":", @INC;
  
    # if the path includes the directory 
    # we extract it and chdir there
  if ($cgi =~ m|/|) {
    my $dirname = dirname($cgi);
    chdir $dirname or die "Can't chdir to $dirname: $! \n";
    $cgi =~ m|$dirname/(.*)|;
    $cgi = $1;
  }
  
    # run the cgi from the script's directory
    # Note that we set Warning and Taint modes ON!!!
  system qq{$basedir/bin/perl -I$PERL5LIB -Tw $cgi $params};

Log Rotation

A little bit off topic, but useful to know and use with mod_perl, where your error_log can grow by 10-100Mb per day if your scripts spit out lots of warnings...

To rotate the logs do this:

mv access_log access_log.renamed
kill -HUP `cat httpd.pid`
sleep 10; # allow some children to complete requests and logging
# now it's safe to use access_log.renamed
.....

The effect of SIGUSR1 and SIGHUP is detailed in http://www.apache.org/docs/stopping.html.

I use this script:

#!/usr/local/bin/perl -Tw

# This script does log rotation. Called from crontab.

use strict;
$ENV{PATH}='/bin:/usr/bin';

### configuration
my @logfiles = qw(access_log error_log);
umask 0;
my $server = "httpd_perl";
my $logs_dir = "/usr/local/var/$server/logs";
my $restart_command = "/usr/local/sbin/$server/apachectl restart";
my $gzip_exec = "/usr/bin/gzip";

my ($sec,$min,$hour,$mday,$mon,$year) = localtime(time);
my $time = sprintf "%0.4d.%0.2d.%0.2d-%0.2d.%0.2d.%0.2d", $year+1900,++$mon,$mday,$hour,$min,$sec;
$^I = ".".$time;

# rename log files
chdir $logs_dir;
@ARGV = @logfiles;
while (<>) {
  close ARGV;
}

# now restart the server so the logs will be restarted
system $restart_command;

# compress log files
foreach (@logfiles) {
    system "$gzip_exec $_.$time";
}

Randal L. Schwartz contributed this:

Cron fires off a setuid script called log-roller that looks like this:

#!/usr/bin/perl -Tw
use strict;
use File::Basename;

$ENV{PATH} = "/usr/ucb:/bin:/usr/bin";

my $ROOT = "/WWW/apache"; # names are relative to this
my $CONF = "$ROOT/conf/httpd.conf"; # master conf
my $MIDNIGHT = "MIDNIGHT";  # name of program in each logdir

my ($user_id, $group_id, $pidfile); # will be set during parse of conf
die "not running as root" if $>;

chdir $ROOT or die "Cannot chdir $ROOT: $!";

my %midnights;
open CONF, "<$CONF" or die "Cannot open $CONF: $!";
while (<CONF>) {
  if (/^User (\w+)/i) {
    $user_id = getpwnam($1);
    next;
  }
  if (/^Group (\w+)/i) {
    $group_id = getgrnam($1);
    next;
  }
  if (/^PidFile (.*)/i) {
    $pidfile = $1;
    next;
  }
  next unless /^ErrorLog (.*)/i;
  my $midnight = (dirname $1)."/$MIDNIGHT";
  next unless -x $midnight;
  $midnights{$midnight}++;
}
close CONF;

die "missing User definition" unless defined $user_id;
die "missing Group definition" unless defined $group_id;
die "missing PidFile definition" unless defined $pidfile;

open PID, $pidfile or die "Cannot open $pidfile: $!";
<PID> =~ /(\d+)/;
my $httpd_pid = $1;
close PID;
die "missing pid definition" unless defined $httpd_pid and $httpd_pid;
kill 0, $httpd_pid or die "cannot find pid $httpd_pid: $!";


for (sort keys %midnights) {
  defined(my $pid = fork) or die "cannot fork: $!";
  if ($pid) {
    ## parent:
    waitpid $pid, 0;
  } else {
    my $dir = dirname $_;
    ($(,$)) = ($group_id,$group_id);
    ($<,$>) = ($user_id,$user_id);
    chdir $dir or die "cannot chdir $dir: $!";
    exec "./$MIDNIGHT";
    die "cannot exec $MIDNIGHT: $!";
  }
}

kill 1, $httpd_pid or die "Cannot sighup $httpd_pid: $!";

And then individual MIDNIGHT scripts can look like this:

#!/usr/bin/perl -Tw
use strict;

die "bad guy" unless getpwuid($<) =~ /^(root|nobody)$/;
my @LOGFILES = qw(access_log error_log);
umask 0;
$^I = ".".time;
@ARGV = @LOGFILES;
while (<>) {
  close ARGV;
}

Can you spot the security holes? Our trusted user base can't or won't. :) But these shouldn't be used in hostile situations.

Preventing mod_perl Processes From Going Wild

Sometimes calling an undefined subroutine in a module can cause a tight loop that consumes all memory. Here is a way to catch such errors. Define an autoload subroutine:

sub UNIVERSAL::AUTOLOAD {
  my $class = shift;
  # no trailing newline, so warn appends the file name and line number
  warn "$class can't $UNIVERSAL::AUTOLOAD!";
}

This will produce a nice error in error_log, giving the line number of the call and the name of the undefined subroutine.
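For example, with the handler above installed, a call to a misspelled method (the module and method names here are hypothetical):

My::Module->connnect;   # note the typo

would log something like:

My::Module can't My::Module::connnect! at /home/httpd/perl/script.pl line 9.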

Sometimes an error happens which causes the server to write millions of lines into your error_log file, bringing your server to its knees in a few minutes. For example, sometimes I get bursts of the error Callback called exit showing up in my error_log. The file can grow by 300 Mbytes in a few minutes. You should run a cron job to make sure this does not happen, and if it does, to take care of it. Andreas J. Koenig runs this shell script every minute:

S=`ls -s /usr/local/apache/logs/error_log | awk '{print $1}'`
if [ "$S" -gt 100000 ] ; then
  mv  /usr/local/apache/logs/error_log /usr/local/apache/logs/error_log.old
  /etc/rc.d/init.d/httpd restart
  date | /bin/mail -s "error_log $S kB on inx" myemail@domain.com
fi

On my server I run a watchdog every five minutes which restarts the server if it gets stuck. This always works, since when some mod_perl child process goes wild, the I/O it causes is so heavy that its sibling processes cannot serve requests. See Monitoring the Server for more hints.

Also check out the daemontools from ftp://koobera.math.uic.edu/www/daemontools.html :

,-----
| cyclog writes a log to disk.  It automatically synchronizes the log
| every 100KB (by default) to guarantee data integrity after a crash.
| It automatically rotates the log to keep it below 1MB (by default).
| If the disk fills up, cyclog pauses and then tries again, without
| losing any data.
`-----