Curing The "Internal Server Error"
You have just installed this new CGI script and when you try it out you see the grey screen of death saying "Internal Server Error"... Or even worse you have a script running on production server for a long time without problems, when suddenly the same grey screen occasionally shows up.
What are you going to do? How can find out what the problem is? You code in Perl for years, and whenever an error was occuring you always saw it as it was displayed in the same terminal window you've started the script from. But when you work with webserver, there is no terminal to look for errors, since the server in most cases has no terminal to send the error messages to.
Actually, the error messages don't disappear, there end up in the error_log
file, that located in the directory specified by an ErrorLog
directive in the httpd.conf
file. The default setting is generally:
ErrorLog /usr/local/apache/logs/error_log
So whenever you see the "Internal Server Error" it's a time to look at this file. We have solved the first problem, where to look for error messages.
There is a chance that seeing the error message doesn't really help to spot and fix the error. The error message can be of immediate help, but it might not help at all. The usefulness of the error message depends solely on the programmers coding style.
Let's take an example of the call to a function that opens a file passed as a parameter and does nothing with it. The first version of the code:
my $r = shift;
$r->send_http_header('text/plain');
sub open_file{
my $filename = shift || '';
die "No filename passed!" unless $filename;
open FILE, $filename or die;
}
open_file("/tmp/test.txt");
I assume that /tmp/test.txt
doesn't exist so the open() would fail to open the file. When we call this script from our browser, the browser returns an "internal error" message and we see the following error at at the end of error_log
file:
Died at /home/httpd/perl/test.pl line 9.
So we use the hint, Perl kindly gave to us to find where in the code the die() was called. What we still don't know is what filename that was passed to this subroutine caused the program termination. When we have only once function call like in the example above -- the task of finding the problematic file is trivial.
Now let's add two more open_file() function calls and assume that among tree files only /tmp/test2.txt
exists:
open_file("/tmp/test.txt");
open_file("/tmp/test2.txt");
open_file("/tmp/test3.txt");
When you execute the above call, you will see the same error message for two times.
Died at /home/httpd/perl/test.pl line 9.
Died at /home/httpd/perl/test.pl line 9.
Based on this error message, can you tell what files your program failed to open? Probably not. Let's fix it by passing to die() the name of the file is question.
sub open_file{
my $filename = shift || '';
die "No filename passed!" unless $filename;
open FILE, $filename or die "failed to open $filename";;
}
open_file("/tmp/test.txt");
When we execute the above code, we see:
failed to open /tmp/test.txt at /home/httpd/perl/test.pl line 9.
Which makes a big difference, since we know what file we should be checking on.
By the way, if you append a newline at the end of the message you pass to die(), perl wouldn't report the line number the error has happened at, so if you code:
open FILE, $filename or die "failed to open a file\n";
The error message in case of failure would be:
failed to open a file
Which gives you no debug information at all. It's very hard to debug this kind of code.
The warn() function, a kinder sister of die(), which logs the message but doesn't cause the program termination, behaves in the same way -- if you don't add a newline at the end of the message, the line number warn() was called at would be logged, otherwise it wouldn't.
You might want to use warn() instead of die(), if the file opening failure isn't critical, consider the following code:
if(open FILE, $filename){
# do something with file
} else {
warn "failed to open a $filename";
}
# more code here...
So, we improved our code to report to us the names of the problematic files, but we still don't know the reason for open()'s failure. Let's try to improve the warn() example:
if(-r $filename){}
open FILE, $filename;
# do something with file
} else {
warn "$filename doesn't exist or is not readable";
}
We see the warning in the error_log file:
/tmp/test.txt doesn't exist or is not readable
at /home/httpd/perl/test.pl line 9.
Since it tells us the reason for failure and we don't have to go to the code and check what it was trying to do with a file: open it for writing, reading or else. -r
operator tests whether the file is readable.
It could by quite an overhead to explain the possible failure that way But why reinvent the wheel, when we already have the reason of failure stored in $!
variable. Let's go back to the open_file() function:
sub open_file{
my $filename = shift || '';
die "No filename passed!" unless $filename;
open FILE, $filename or die "failed to open $filename: $!";
}
open_file("/tmp/test.txt");
We see:
failed to open /tmp/test.txt: No such file or directory
at /home/httpd/perl/test.pl line 9.
Now we have all the information we ever need to debug this problems: we know what line of code triggered die(), we know what file was attempted to be opened and the last but not least -- the reason, which an operational system gladly tells to us thru $!
variable.
Now let's create the /tmp/test.txt
file, so it would exist
% touch /tmp/test.txt
Now when we execute the latest version of the code, we see:
failed to open a /tmp/test.txt: Permission denied
at /home/httpd/perl/test.pl line 9.
We see a different reason: I've created the file that doesn't belong to user nobody, the server runs as. So it has no permission to read the file.
Now you understand that it's much easier to debug your code if you validate the return values of the system calls, and properly code arguments to die() and warn() calls. open() function is just one of the many system calls perl provides to your convenience.
So now you can code and debug CGI scripts and modules, as easy as if they were plain perl scripts that you used to execute from a shell.
Helping error_log to Help Us
It's a good idea to keep it open all the time in a dedicated terminal with help of tail -f.
% tail -f /usr/local/apache/logs/error_log
So you will see all the errors and warning immediately showing up as they happen.
Another tips is to create an shell alias, to make it easier to execute the above command. In tcsh you would do:
% alias err "tail -f /usr/local/apache/logs/error_log"
and from now on in the shell you set the alias in, executing err will call the tail -f /usr/local/apache/logs/error_log. Since you want this alias to be available to you all the time, you should put it into a .tcshrc file or its equivalent if you don't use tcsh. (.bashrc for bash users)
The Importance of Warnings
Just like errors, perl's mandatory warnings are going to the error_log
file, if the they are enabled.
The code you write lives a dual life. In the first life it's being written, tested, debugged, improved, tested, debugged, rewritten, tested, debugged. In the second life it's being used, period.
A significant part of the first life the script spends at the developers, its personal God's machine. The other part is being spent at the production server where the developer's creature is supposed to be perfect, since it was created in his own image...
So when you develop the code you want all the help in the world, to help you spot possible problems, and that's where enabling warning is a must mode to enable. It's very important to get rid of all or at least most of the warnings that appear in the error_log
file. Why?
If there are warnings -- your code is not clean, and if they are waved away -- expect them to hit back on production server, when it's too late.
The other not less important reason, is that when each script's invocation generates more than 5 lines of warnings, it's very hard to catch real problems, as you just cannot see them among all these warnings you believe are unimportant.
On the other hand, on production server, you really *want* to turn warnings off. And there are good reasons for that:
There is no added value in having the same warning showing up, when triggered by thousands of script invocations. If your code isn't very clean and generates even a single warning per script invocation , you will end up with a huge
error_log
file in a short time on the heavily loaded server. Imagine what happens when you've got more than one warning appended to the log file. The warnings elimination phase is supposed to be a part of the development process, and should be done before the code goes live.Enabling runtime warning checking has a small performance impact (in any perl script, not just under mod_perl).
mod_perl gives you a very simple solution to this warnings saga, don't enable warnings in the scripts unless you really have to. Let mod_perl to control this mode globally. All it takes is having a:
PerlWarn On
directive added to httpd.conf
on your development machine and having a:
PerlWarn Off
directive at the live box's configuration file.
If there is a piece of code that generates warnings and really want to disable them only in this code, you can do that too. $^W
special variable allows you to dynamically turn on and off the warnings mode. So just embrace the code into a block, and disable the warnings through the scope of this block. The original value of $^W
will be restored upon exit from the block.
{
local $^W=0;
# some code that generates warnings
}
Again, unless you have a really good reason, for your own sake the advise is avoid this workaround.
Don't forget the local()
operand, as if you do, $^W
will affect all the requests processed by the same process that globally changed this variable.
diagnostics
pragma can shed more light on the errors and warnings as you will see in a moment.
diagnostics pragma
This module extends the terse diagnostics normally emitted by both the perl compiler and the perl interpreter, augmenting them with the more explicative and endearing descriptions found in the perldiag
manpage. Like the other pragmata, it affects the compilation phase of your program rather than merely the execution phase.
To use in your program as a pragma, merely invoke
use diagnostics;
at the start (or near the start) of your program. This pragma turns on the -w
mode as well.
Note that generally this pragma is useful, when you are new to perl, and want a better explanation of the errors and warnings, or when you encounter some warning you've never seen before, e.g. when this new warning was introduced in a newer version of Perl.
If leaving the warnings On on production server, might consume your hard disk space much faster, with diagnostics
pragma you will run out of space about ten times faster if your code generates warnings. Since for each line of text generated by mere warnings mode, diagnostics
generates ten times more.
The other reason, is a huge performance overhead that is being added in comparison with just having warnings On. Let's see some numbers. We will run the same benchmark, once with enabled diagnostics and once disabled on a subroutine test_code which does nothing, but doing a power of two numbers in the loop, a numeric comparison of two strings and assignment of one string to another which never happens, because the conditions is the same all the time and it's false. The wrong comparison choice is intentional and you will understand the choice in a second. By the way, the choice of the rest of the code inside test_code subroutine was absolutely at random.
use Benchmark;
use diagnostics;
my $count = 10000;
disable diagnostics;
$t1 = timeit($count,\&test_code);
enable diagnostics;
$t2 = timeit($count,\&test_code);
print "Diagnostics off:",timestr($t1),"\n";
print "Diagnostics on :",timestr($t2),"\n";
sub test_code{
for my $i (1..10) {
my $j = $i**2;
}
$a = "Hi";
$b = "Bye";
if ($a == $b) {
$c = $a;
}
}
For only a few lines of code we get:
Diagnostics off: 2 wallclock secs ( 1.77 usr + 0.02 sys = 1.79 CPU)
Diagnostics on :17 wallclock secs (13.16 usr + 0.08 sys = 13.24 CPU)
Result: the code running with enabled diagnostics runs seven times slower!!!
Now let's fix the comparison the way it should be, by replacing ==
with eq
, so we get:
$a = "Hi";
$b = "Bye";
if ($a eq $b) {
$c = $a;
}
and run the same benchmark again:
Diagnostics off: 1 wallclock secs ( 1.43 usr + 0.01 sys = 1.44 CPU)
Diagnostics on : 2 wallclock secs ( 1.41 usr + 0.01 sys = 1.42 CPU)
Amazing, but now there is no overhead at all. And why is that? As we find out, that diagnostics
pragma slows things down only when something is wrong with the code.
It was just a little example, but it's obvious that you wouldn't benchmark all your scripts to check whether you have to remove this pragma or not. Just remember to remove it, when your code goes live.
Monitoring the error_log file
While debugging my mod_perl and general CGI code, I keep the error_log
file open in a dedicated terminal window (xterm), so I can see errors and warnings as soon as they are appended to the file. I do it with:
tail -f /usr/local/apache/logs/error_log
which shows all the lines that are being added lately into the file.
If you cannot access your error_log
file because you are unable to telnet to your machine (generally a case with some ISPs who provides user CGI support but no telnet access), you might want to use a CGI script I wrote to fetch the latest lines from the file (with a bonus of colored output for an easier reading). You might need to ask your ISP to install this script for a general usage. See Watching the error_log file without telneting to the server
Hanging processes: Detection and Diagnostics
Sometimes a httpd process might hang in a middle of a request processing, either because there is a bug in your code (i.e. the code is stuck in a while loop, blocked by some system call or because of a resource deadlock) or for some other reason. There are two things we want to know: when and why this happens.
# META: handle this
#=head1 Spinning httpds
#To see where an httpd is "spinning", try adding this to your script or #a startup file:
# use Carp (); # $SIG{'USR1'} = sub { # Carp::confess("caught SIGUSR1!"); # };
#Then issue the command line:
# kill -USR1 <spinning_httpd_pid>
An Example of the Code that Might Hang the Process
Just to give you an idea of what kind of bug might cause the code to hang, let's look at the following example. Your process have to gain lock on some resource (o.e. file) before it continues, so it makes an attempt and if fails (no lock gained), it sleep()s for a second and increment the counter of attempts.
until(gain_lock()){
$tries++;
sleep 1;
}
Either because there are many processes competing on this resource or because there is a deadlock (a situation when two processes X and Y need resources A and B to continue, where X process holds on A and Y on B. There is no possibility for Y process to continue before X releases the resource A. But X cannot release A before it gets Y. Therefore this event is being known as deadlock.
A real world situation that you may encounter very often is an exclusive lock starvation. Generally there are two lock types in use: SHARED lock which allows many processes to perform simultaneously READ operation and EXCLUSIVE lock which ensures an access by a single process, which makes possible a safe WRITE operation.
You can lock any kind of resource, in our example we talk about files.
If there is a READ lock request, it is granted as soon as file becomes unlocked or already READ locked. Lock status becomes READ on success.
If there is a WRITE lock request, it is granted as soon as file becomes unlocked. Lock status becomes WRITE on success.
What happens to the WRITE lock request, is the most important. If the file is being READ locked, a process that requests to write will poll until there will be no reading or writing process left. Lots of processes can successfully read the file, since they do not block each other from doing so. This means that a process that wants to write to the file (first obtaining an exclusive lock) never gets a chance to squeeze in. The following diagram represents a possible scenario where everybody read but no one can write:
[-p1-] [--p1--]
[--p2--]
[---------p3---------]
[------p4-----]
[--p5--] [----p5----]
Let's look at the real code and see it in action. The following script imports flock() related parameters from the Fcntl
module, opens a file that will be locked and we define and set two variables: $lock_type
and $lock_type_verbose
which are set to LOCK_EX
and EX
if the first command line argument ($ARGV[0]) is defined and equal to w indicating that this process will try to gain WRITE (exclusive) lock, otherwise the two are set to
for SHARED (read) lock.LOCK_SH
and <SH
Once the variables are set, we enter the never ending while(1)
loop that attempts to lock the file by the mode set in $lock_type
, report success and type of lock that was gained, then sleeps for a random period between 0 to 9 seconds and unlocks the file. Then the loop starts from the beginning.
lock.pl
-------------------
#!/usr/bin/perl -w
use Fcntl qw(:flock);
$lock = "/tmp/lock";
open LOCK, ">$lock" or die "Cannot open $lock for writing: $!";
my $lock_type = LOCK_SH;
my $lock_type_verbose = 'SH';
if (defined $ARGV[0] and $ARGV[0] eq 'w'){
$lock_type = LOCK_EX;
$lock_type_verbose = 'EX';
}
while(1){
flock LOCK,$lock_type;
# start of critical section
print "$$: $lock_type_verbose\n";
sleep int(rand(10));
# end of critical section
flock LOCK, LOCK_UN;
}
close LOCK;
When spawning a few of the above scripts simultaneously and making sure that the first processes to start are READ processes and there is majority of them, it's very easy to see the WRITE processes starvation. Execute three read and one write processes like:
% ./lock.pl r & ; ./lock.pl r & ; ./lock.pl r & ; ./lock.pl w &
You see something like:
24233: SH
24232: SH
24232: SH
24233: SH
24232: SH
24233: SH
24231: SH
24231: SH
24231: SH
and not a single EX
line... When you kill off the reading processes, then the write lock will be gained. Note that this is a rough example, since I've used sleep() function. To emulate a real situation you need to use Time::HiRes
module which allows you to sleep for microseconds.
The interval between lock and unlock is being called a Critical Section, which should be kept as little as possible in terms of time, and not in terms of amount of the code. As you just saw, a single sleep statement can make the critical section long.
To summarize the presented case, if you have a script that uses both READ and WRITE locks and the critical section isn't very short, The writing process might get into a starvation mode and after a while a browser that initiated this request will timeout the connection and abort the request, but it's more likely that user will press the Stop or Reload button before it happens. Since the process in question just waits, there is no way for Apache to know that the request was aborted and it will hang till the lock will be gained and only when a write to a client's broken connection will be attempted, Apache will terminate the script.
So this was a single example of how the process can hang.
Detecting hanging processes
It's not so easy to detect the hanging process. There is no way you can tell how long the request is being processed by using plain system utilities like ps() and top(). The reason is that each Apache process serves many requests without quitting. System utilities can tell how long the process is running since its creation, but this information is useless in our case, since the long running Apache process is a normal and expected behavior.
However there are a few approaches that can help to detect the hanging process.
If the process hangs and demands lots of resources it's quite easy to bust it by monitoring the output of top() utility. You will see the same process show up in the first few lines of the automatically refreshed report. But many times the hanging process, uses little or close to zero resources, e.g. when waiting for some event to happen.
Another easy spotting is when some process trashes the error_log
and writes millions of error messages there... Generally this process uses lots of resources and spotted by using top() as described above.
What we have to use are the tools that report the status of the Apache processes. You can use either a mod_status module, which usually accessed from /server_status
location, or an Apache::VMonitor
module. Both tools provide counters of processed requests per Apache process. So what you can do is to watch the report for about 5-10 minutes spotting which process number has the same number of processed requests while its status is 'W' (Which means that it hangs), but when you have about 50 processes, it's quite hard to spot such a process. So let's write a watchdog to do the work for us:
.....META??? Apache::SafeHang code
When you've got a real problem and the processes hang one after the other, the moment comes when the number of hanging processes becomes equal to the value of MaxClients
directive, which means that no more processes will be spawned and your service is halted from the point of user. This is easy to detect, attempt to resolve and notify the administrator by a simple crontab watchdog that requests some very light script an every minute or so. (See Monitoring the Server. A watchdog.)
In the watchdog you set a timeout you think is appropriate for your service, which may vary between a few seconds and 1 minute. If the server fails to respond before the timeout expires, watchdog has spotted a trouble and attempts to restart the server. After a restart an email report is being sent to administrator reporting first that there was a problem, second whether the restart was successful or not.
If you get such reports constantly something is wrong with your web service and you should revise your code. Note that it's possible that your server is overloaded when being hit by more requests that it can handle, so the requests are being queued and not processed for awhile, which triggers the watchdog's alarm. If this is a case you need to add more servers, memory and probably to split your single machine across a cluster of webserver machines.
Determination of the reason
Given the process pid, there are two ways to find out where it's hanging. Depending on operating system you should have either truss
or strace
utilities available within your code development software. The usage is simple:
% truss -p PID
or
% strace -p PID
Replace PID with a process number you want to check on.
Let's write a program that hangs and deploy strace to find out the point it hangs at:
hangme.pl
---------
$|=1;
my $r = shift;
$r->send_http_header('text/plain');
print "PID = $$\n";
while(1){
$i++;
sleep 1;
}
The reason this simple code hangs is obvious from its examination -- the program never breaks from the while loop. As you have noticed, I print the PID of the current process to the browser, to learn what process to look after. Of course in a real situation, you cannot do the same trick. In the previous section I have presented a few ways to detect the runaway processes and their PIDs.
I save the above code in a file and execute it from the browser. Note that I've made the STDOUT unbuffered with $|=1;
so I would immediately see the process ID. Once the script make a request the script prints its process PID and obviously hangs. So we press the 'Stop'
button, but the process continues to hang in this code. Isn't apache supposed to detect the broken connection and abort the request processing? Yes and No, you will understand soon what's really happening.
First let's attach to the process and see what's it doing. I use the PID the script printed to the browser, which is 10045 in this case:
% strace -p 10045
[...truncated an identical output...]
SYS_175(0, 0xbffff41c, 0xbffff39c, 0x8, 0) = 0
SYS_174(0x11, 0, 0xbffff1a0, 0x8, 0x11) = 0
SYS_175(0x2, 0xbffff39c, 0, 0x8, 0x2) = 0
nanosleep(0xbffff308, 0xbffff308, 0x401a61b4, 0xbffff308, 0xbffff41c) = 0
time([940973834]) = 940973834
time([940973834]) = 940973834
[...truncated the identical output...]
It doesn't what we have expected to see, does it? These are some system calls we don't see in our little example. What we actually see is how Perl translates our code into a system calls. Since we know that our code hangs in this snippet:
while(1){
$i++;
sleep 1;
}
We "easily" figure out that the first three system calls implement the $i++
, while the other other three are responsible for the sleep 1
call.
Generally the situation is quite opposite. You detect the hanging process, you attach to it and watch the trace of calls it does (or the last commands if the process hangs waiting for something, e.g. when blocking on file lock request). From watching the trace you should figure out what actually it's doing and probably find the corresponding lines in your perl code. For example let's see how one process "hangs" while requesting an exclusive lock on the file exclusively locked by another process:
excl_lock.pl
---------
use Fcntl qw(:flock);
use Symbol;
if ( fork() ) {
my $fh = gensym;
open $fh, ">/tmp/lock" or die "cannot open /tmp/lock $!";
print "$$: I'm going to obtain the lock\n";
flock $fh, LOCK_EX;
print "$$: I've got the lock\n";
sleep 20;
close $fh;
} else {
my $fh = gensym;
open $fh, ">/tmp/lock" or die "cannot open /tmp/lock $!";
print "$$: I'm going to obtain the lock\n";
flock $fh, LOCK_EX;
print "$$: I've got the lock\n";
sleep 20;
close $fh;
}
The code is simple. The process executing the code forks a second process, and both are doing the same thing: generate an unique symbol to be used as a file handler open the lock file for writing using the generated symbol, lock the file in an exclusive mode sleep for 20 seconds, pretending doing some lengthy operations and close the lock file, which also unlocks the file.
gensym
function is a courtesy of Symbol
module the code imports it from. Fcntl
module provides us with a symbolic constant LOCK_EX
which is being imported with :flock
tag, which imports this an other flock() function attributes.
The code used by both processes is identical, therefore we cannot predict which one will get its hands on the lock file and succeed to lock it first, so we add print() statements to find out the PID of the blocking on lock request process.
When the above code executed from the command line, we see that one of the processes gets the lock:
% ./excl_lock.pl
3038: I'm going to obtain the lock
3038: I've got the lock
3037: I'm going to obtain the lock
We see that process 3037 is blocking (waiting to get the lock), so we attach to it:
% strace -p 3037
about to attach c10
flock(3, LOCK_EX
It's clear from the above trace, that the process waits for exclusive lock.
The more you watch traces of different processes, the easier the understanding of what actually happens would be
Another approach to see another kind of trace of the running code is to use gdb
(GNU debugger) (or another debugger). It's supposed to work at any platform the GNU development tools were ported to. Its purpose is to allow you to see what is going on ``inside'' another program while it executes--or what another program was doing at the moment it crashed. gdb
requires the path to the binary program that the process you want to examine is executing, in addition to the process ID. In case of perl code it's /usr/bin/perl
or a different path, for httpd process it would be the path to your httpd executable. I will show a few examples of using gdb to get a better understanding.
For example let's go back to our last locking example, execute it as before and attach to the process that didn't get the lock and waits:
% gdb /usr/bin/perl 3037
The moment the debugger was started, we execute where command to see the trace:
(gdb) where
#0 0x40131781 in __flock ()
#1 0x80a5421 in Perl_pp_flock ()
#2 0x80b148d in Perl_runops_standard ()
#3 0x80592b8 in perl_run ()
#4 0x805782f in main ()
#5 0x400a6cb3 in __libc_start_main (main=0x80577c0 <main>, argc=2,
argv=0xbffff7f4, init=0x8056af4 <_init>, fini=0x80b14fc <_fini>,
rtld_fini=0x4000a350 <_dl_fini>, stack_end=0xbffff7ec)
at ../sysdeps/generic/libc-start.c:78
Again, that's not what we've expected to see and now it's a different trace. #0
tells us the most recent call that was executed, which is a C language level flock()'s implementation, but the previous call (#1
) isn't print() as we would expect, but a higher level of Perl's internal flock(). If we follow the trace of calls, what we actually see is an Opcodes tree, which can be better presented as:
__libc_start_main
main ()
perl_run ()
Perl_runops_standard ()
Perl_pp_flock ()
__flock ()
So I would say that it's less useful than strace
, since it's almost impossible to know which of the flock()s was called if there are more than one in the code, something that is strace
solves by showing the sequence of the system calls that are being executed, so using the sequence we can locate the corresponding lines in the code.
(META: the above is wrong - you can ask to display the previous command! What is it?)
For your information, when you attach to a running process with debugger, the program stops its executing and the control over the program is being passed to a debugger, so you can continue the normal program run with continue
command or to execute it step by step with next
and step
commands you type at the gdb
prompt. (next
steps over any function calls in the line, while step
steps into them).
C/C++ debuggers is a very large topic and I wouldn't discuss it in the scope of this document, but a gdb man page is quite a good document to start with. You might want also to check the ddd
(Data Display Debbuger) which provides a visual interface to gdb
and other debuggers. It even knows to debug perl programs!!!
For a completeness let's see the gdb trace of the httpd process that still hangs in the while(1)
loop of the first example in this section.
% gdb /usr/local/apache/bin/httpd 1005
(gdb) where
#0 0x4014a861 in __libc_nanosleep ()
#1 0x4014a7ed in __sleep (seconds=1) at ../sysdeps/unix/sysv/linux/sleep.c:78
#2 0x8122c01 in Perl_pp_sleep ()
#3 0x812b25d in Perl_runops_standard ()
#4 0x80d3721 in perl_call_sv ()
#5 0x807a46b in perl_call_handler ()
#6 0x8079e35 in perl_run_stacked_handlers ()
#7 0x8078d6d in perl_handler ()
#8 0x8091e43 in ap_invoke_handler ()
#9 0x80a5109 in ap_some_auth_required ()
#10 0x80a516c in ap_process_request ()
#11 0x809cb2e in ap_child_terminate ()
#12 0x809cd6c in ap_child_terminate ()
#13 0x809ce19 in ap_child_terminate ()
#14 0x809d446 in ap_child_terminate ()
#15 0x809dbc3 in main ()
#16 0x400d3cb3 in __libc_start_main (main=0x809d88c <main>, argc=1,
argv=0xbffff7e4, init=0x80606f8 <_init>, fini=0x812b33c <_fini>,
rtld_fini=0x4000a350 <_dl_fini>, stack_end=0xbffff7dc)
at ../sysdeps/generic/libc-start.c:78
Just as before we can see a complete trace of the last executed call.
As you noticed I still didn't provide the promised explanation of the reason, the hanging in while(1)
loop request processing wasn't aborted by Apache. The next section covers the case.
#=head1 Examples of strace (or truss) usage
#(META: below are some snippets of strace outputs from list's emails)
#[there was a talk about Streaming LWP through mod_perl and the topic #was suggested optimal buffer size]
#Optimal buffer size depends on your system configuration, watch #apache with strace -p
(or truss
) when its sending a static file, here #perlfunc.pod on my laptop (linux 2.2.7):
# writev(4, [{"HTTP/1.1 200 OK\r\nDate: Wed, 02"..., 289}, {"=head1 # NAME\n\nperlfunc - Perl b"..., 32768}], 2) = 33057 # alarm(300) = 300 # write(4, "m. In older versions of Perl, i"..., 32768) = 32768 # alarm(300) = 300 # write(4, "hout waiting for the user to hit"..., 32768) = 32768 # alarm(300) = 300 # write(4, ">&STDOUT") || die "Can't dup "..., 32768) = 32768 # alarm(300) = 300 # write(4, "LEHANDLE is supplied. This has "..., 32768) = 32768 # alarm(300) = 300 # write(4, "ite,\nseek, tell, or eo"..., 25657) = 25657
Handling the 'User pressed Stop button' case
When a user presses STOP or RELOAD buttons, Apache detects this event via a SIGPIPE
signal (Broken pipe) and ceases the script execution and performs all the cleanup stuff it has to do. It's important to stress the point that SIGPIPE
will be triggered only when a process, that handles the connection that went broken, will attempt to send some data to the client (browser). If the script is doing some lengthy operation, without writing a thing to the client, it wouldn't be stopped until before the operation is completed and at least one character was sent back to the client.
This will work for apache >= 1.3.6, where it will not catch SIGPIPE anymore and modperl will do it much better. Here is a snippet from a Apache 1.3.6 CHANGES file.
*) SIGPIPE is now ignored by the server core. The request write
routines (ap_rputc, ap_rputs, ap_rvputs, ap_rwrite, ap_rprintf,
ap_rflush) now correctly check for output errors and mark the
connection as aborted. Replaced many direct (unchecked) calls to
ap_b* routines with the analogous ap_r* calls. [Roy Fielding]
Since Apache version 1.3.6:
$r->print
returns true on success, false on failure (broken connection).If you want the old
SIGPIPE
semanics, simply configure:PerlFixupHandler Apache::SIG
Detecting Aborted Connections
Let's use the knowledge we have acquired before to trace the execution of the code and see all the events as they are happening.
Let's take a little script that obviously "hangs" the server:
my $r = shift;
$r->send_http_header('text/plain');
print "PID = $$\n";
$r->rflush;
while(1){
$i++;
sleep 1;
}
The script gets a request object $r
by shift()ing it from the @_
argument list passed by the handler() subroutine. (The magic is being done by Apache::Registry
of course). Then the script sends a Content-type header, saying to the client that we are going to send a plain text.
We print out a single line telling us the number of the process that handles this request, which we need to know in order to run the tracing utility. Then we flush Apache's buffer, since if we don't we would never see the line printed. That's because the length of the output we print is very small and the buffer wouldn't be flushed before it becomes full or the request is over. Since our script intentionally hangs, we have to enforce the buffer to get flushed.
Then we enter a never ending while(1)
loop, which all it does is incrementing a dummy $i
variable and sleeping for a second, before returning on the two operations again and again.
Running strace -p PID
, where PID is the process ID as printed to the browser, we see the following output printed every second:
SYS_175(0, 0xbffff41c, 0xbffff39c, 0x8, 0) = 0
SYS_174(0x11, 0, 0xbffff1a0, 0x8, 0x11) = 0
SYS_175(0x2, 0xbffff39c, 0, 0x8, 0x2) = 0
nanosleep(0xbffff308, 0xbffff308, 0x401a61b4, 0xbffff308, 0xbffff41c) = 0
time([941281947]) = 941281947
time([941281947]) = 941281947
Let's leave the strace
running and press the STOP button now. Anything was changed? No, the same trace printed every second. Which means that Apache didn't detect the broken connection, which verifies the statement that the script has to write something to trigger the SIGPIPE
event.
Let's try to write that will write a NULL \0
character to the client so the detection would be possible as soon the Stop button was pressed:
while(1){
$r->print("\0");
last if $r->connection->aborted;
$i++;
sleep 1;
}
We add a print() statement to print a NULL character and then we check whether the connection was aborted. If it was, we break from the loop.
But if we run this script and strace on it as before, we see that it still doesn't work. What's missing is a flushing of the buffer, when we add it:
my $r = shift;
$r->send_http_header('text/plain');
print "PID = $$\n";
$r->rflush;
while(1){
$r->print("\0");
$r->rflush;
last if $r->connection->aborted;
$i++;
sleep 1;
}
Watch the strace's output on the running process and press the Stop button, we see:
SYS_175(0, 0xbffff41c, 0xbffff39c, 0x8, 0) = 0
SYS_174(0x11, 0, 0xbffff1a0, 0x8, 0x11) = 0
SYS_175(0x2, 0xbffff39c, 0, 0x8, 0x2) = 0
nanosleep(0xbffff308, 0xbffff308, 0x401a61b4, 0xbffff308, 0xbffff41c) = 0
time([941284358]) = 941284358
write(4, "\0", 1) = -1 EPIPE (Broken pipe)
--- SIGPIPE (Broken pipe) ---
select(5, [4], NULL, NULL, {0, 0}) = 1 (in [4], left {0, 0})
time(NULL) = 941284358
write(17, "127.0.0.1 - - [30/Oct/1999:13:52"..., 81) = 81
gettimeofday({941284359, 39113}, NULL) = 0
times({tms_utime=9, tms_stime=8, tms_cutime=0, tms_cstime=0}) = 41551400
close(4) = 0
SYS_174(0xa, 0xbffff4e0, 0xbffff454, 0x8, 0xa) = 0
SYS_174(0xe, 0xbffff46c, 0xbffff3e0, 0x8, 0xe) = 0
fcntl(18, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}
Apache detects the broken pipe as we see from this snippet:
write(4, "\0", 1) = -1 EPIPE (Broken pipe)
--- SIGPIPE (Broken pipe) ---
Then stops the script, and does all the cleanup work, like access logging:
write(17, "127.0.0.1 - - [30/Oct/1999:13:52"..., 81) = 81
That's what we see in a access_log
file, 17 is a file descriptor of this file in this process. We will immediately talk about cleanups, since it's a very critical issue, with aborted scripts. But first let's see how can we make the code more generic.
Apache::SIG
comes to help us, the following script doesn't need to check for aborted connections.
use Apache::SIG ();
Apache::SIG->set;
my $r = shift;
$r->send_http_header('text/plain');
print "PID = $$\n";
$r->rflush;
while(1){
$r->print("\0");
$r->rflush;
$i++;
sleep 1;
}
META: it kills the server!!! ???
Apache::SIG
installs the SIGPIPE
handler, that stops the script's execution for us.
If you would like to log when a request was canceled by a SIGPIPE
in your Apache access_log, you can declare Apache::SIG
as a handler (any Perl*Handler
will do, as long as it is run before PerlHandler
, e.g. PerlFixupHandler
), and you must also define a custom LogFormat
in your httpd.conf, like this:
PerlFixupHandler Apache::SIG LogFormat "%h %l %u %t \"%r\" %s %b %{SIGPIPE}e"
If the server has noticed that the request was canceled via a SIGPIPE
, then the log line will end with 1, otherwise it will just be a dash.
The Importance of Cleanup Code
Now the question is what happen to the locked resources if there are any? Will they be freed or not? Since if there are not, any script using these resources and the same advisory locking scheme, will be unable to run and will hang, waiting for this resource to get free, something that would never happen.
Under mod_cgi this was a problem only if you happened to use external lock files for lock indication, instead of using flock(). (there are systems where flock(2) unavailable, and you can use Perl's emulation of this function). If the script was aborted in the between lock and unlock code and you didn't worry to write a cleanup code to remove locks or otherwise write a code to break locks that are too old and suspected to be dead, you are in a big trouble.
With mod_cgi you can create an END
block, and put the cleanup code there:
END{
# some code that ensures that locks are removed
}
When the script is aborted, Apache will run the END
blocks. But if you use flock()
things are much simpler, since all opened files will be closed and all the internally locked resources will be freed, because when the file is being closed, the lock is being removed as well.
Things are more complex with mod_perl. Unless you explicitly close() the files, they wouldn't be automatically closed, since the processes don't exit upon a single request processing completion. Let's see what problems we might encounter and possible solutions for them.
Critical Section
I want to make a little step aside and discuss a "critical section" issue before we continue.
Let's start with resource locking scheme. A schematic representation of a proper locking technique is as follows:
1. lock a resource
<critical section starts>
2. do something with the resource
<critical section ends>
3. unlock the resource
If the locking is exclusive, only one process can hold the resource at any given time, which means that all the other processes will have to wait, and this code snippet becomes a so called bottleneck. That's why the section of the code where the resource is locked is called critical and you must make it as short as possible.
In a shared locking scheme, where many processes can concurrently access the resource, it's important to keep the critical section as short as possible as well, if there are processes that sometimes want to get an exclusive lock. This code uses a shared lock, but has a non-optimized critical section:
use Fcntl qw(:flock);
use Symbol;
my $fh = gensym;
open $fh, "filename" or die "$!";
flock $fh, LOCK_SH;
# start critical section
seek $fh, 0, 0;
my @lines = <$fh>;
for(@lines){
print if /foo/;
}
# end critical section
close $fh; # close unlocks the file
It opens the file for reading, locks and rewinds to the start, reads all the lines in and prints out the lines that include a foo string in them. Since once the file was read, we don't need it opened and locked anymore, we might close it earlier, since the loop might take some time to complete so we move it after the resource was freed:
use Fcntl qw(:flock);
use Symbol;
my $fh = gensym;
open $fh, "filename" or die "$!";
flock $fh, LOCK_SH;
# start critical section
seek $fh, 0, 0;
my @lines = <$fh>;
# end critical section
close $fh; # close unlocks the file
for(@lines){
print if /foo/;
}
This is another very similar script, but now using a shared lock. It reads in a file and writes it back prepending a number of new text lines to a head of the file.
use Fcntl qw(:flock);
use Symbol;
my $fh = gensym;
open $fh, "+>>filename" or die "$!";
flock $fh, LOCK_EX;
# start critical section
seek $fh, 0, 0;
my @add_lines =
(
qq{Complete documentation for Perl, including FAQ lists,\n},
qq{should be found on this system using `man perl' or\n},
qq{`perldoc perl'. If you have access to the Internet, point\n},
qq{your browser at http://www.perl.com/, the Perl Home Page.\n},
);
my @lines = (@add_lines, <$fh>);
seek $fh, 0, 0;
truncate $fh, 0;
print $fh @lines;
# end critical section
close $fh; # close unlocks the file
First let's explain how the code works. I will discuss in a minute why did I use Symbol
module to generate the file handler variables.
Since we want to read the file, modify and write it back, without anyone changing it on the way, we open it for read and write with help of +>> (you could get away with +< as well, see perldoc -f open or perlfunc manpage for more information about open() function) and lock it with exclusive lock. You cannot safely accomplish this task with opening the file first for read and then reopening for write, since another process might change the file, between the stages.
Next the code prepares the lines of text it wants to prepend to the head of the file, and assigns them and the content of the file to @lines
array. Now when we have a data that ready to be written back to the file, the file is being rewinded to the start with help of seek() and truncate()d to a zero size, which is useless in our case, but a must thing if there is a chance that the file will shrink. In our example the file only grows. But it's better to always use truncate(), as you never know what changes your code might undergo in the future. This operation is not the one that you will blame for a performance overhead.
Finally we write the data to the file and close it, which unlocks it as well. Did you notice that we created the text lines to be prepended, as close to the place of usage as possible, according to a locality of code style, which is good but it makes the critical section longer. In such a places you should sacrifice any style rules you've got used to, in order to make the critical section as short as possible. A corrected version of this script with the shorter critical section looks like:
use Fcntl qw(:flock);
use Symbol;
my @lines =
(
qq{Complete documentation for Perl, including FAQ lists,\n},
qq{should be found on this system using `man perl' or\n},
qq{`perldoc perl'. If you have access to the Internet, point\n},
qq{your browser at http://www.perl.com/, the Perl Home Page.\n},
);
my $fh = gensym;
open $fh, "+>>filename" or die "$!";
flock $fh, LOCK_EX;
# start critical section
seek $fh, 0, 0;
push @lines, <$fh>;
seek $fh, 0, 0;
truncate $fh, 0;
print $fh @lines;
# end critical section
close $fh; # close unlocks the file
The difference is in preparing the text lines before the file is locked and appending the rest of the file to the @lines
array, instead of creating a new array and copying the lines that were available before the locking time after it as in the original example.
Safe Resource Locking
Let's get back to the main issue of this section, which is a safe locking.
If didn't make a habit of closing all the files that you open, you will encounter lots of troubles, unless you use the Apache::PerlRun
handler that does the cleanup for you. If you open the file but doesn't close it, you will have a file descriptor leakage. Since a number of file descriptors available is final, at some point you will run out of them and your service will cease its operations.
This is bad, but you can live with this till before you run out of file descriptors, of course this will happen much faster on a heavily used server. But this is nothing relative to the trouble you enter yourself into if you lock the files and forget to unlock or close them. Since close() always unlocks the file, you don't have to explicitly unlock files. Unlocked file will stay locked after your code finished, and all the other scripts requesting to lock the same resource (file) will wait indefinitely for it to become unlocked. Since it would never happen, until the server restart time, all these processes would hang. This is the offending code:
open IN, "+>>filename" or die "$!";
flock IN, LOCK_EX;
# do something
# quit without closing and unlocking the file
OK, so let's add the close():
open IN, "+>>filename" or die "$!";
flock IN, LOCK_EX;
# start critical section
# do something
# end critical section
# close and unlock the file
close IN;
Is it a safe code now? Unfortunately it is not. If user aborts the request by pressing Stop
or Reload
buttons in the middle of the critical section, there is a chance that script will be aborted in before it had a chance to close() the file, which returns us back to the situation where we were forgetting to close the files in first place.
What is the remedy for this poison? There are few approaches to solve this problem. If you are running under Apache::Registry
and friends handlers, the END
block will perform the cleanup work for you, the same way you might use it in the scripts running under mod_cgi or in the plain perl scripts. Just add the cleanup code to this block and you are all safe. If you are writing your own handlers you register_cleanup() allows you to register code similar to the END
blocks, since END
blocks will be executed only when a process exits, and not after a request completion.
We will see a few examples later. Now I want to show a much easier safe locking solution. The problem we have encountered, is actually lays in the fact that file handlers like IN
are global variables. If we could make them lexically scoped all our worries would go away. You know that lexically scoped (with my() operand) variables are being automatically destroyed when they go out of scope, so when the program quits all the lexical variables will be destroyed, since they leave the file scope. When the variable holding an opened file descriptor is being destroyed, the file will be automatically closed.
So if you use this technique to work with files, you even don't have to close the files! You still want to make sure that you close them as soon as possible if you recall the critical section discussion. In addition to this safe file handling having the file handlers lexically scoped, protect you from names collisions, e.g when you have to open more than one file, you always have to make sure you didn't use the same name somewhere else in the code and that file is might still be open. To emphasize the risk of collisions think of subroutine that opens a file for you:
sub open_file{
my $filename = shift;
open FILE, ">$filename" or die "$!";
return \*FILE;
}
my $fh1 = open_file("/tmp/x");
my $fh2 = open_file("/tmp/y");
print $fh1 "X";
print $fh2 "Y";
Obviously this code doesn't do what you think it should do. Instead of writing a character X
to /tmp/x file and Y
to /tmp/y, what you see is that /tmp/x is empty and /tmp/y contains a XY
string. Why is that? Because you have used the same global variable, and when you have called open_file() for a second time, it opened a different file using the same variable. Since open_file() returns a reference to a file handler and it's the same global variable all the time -- both $fh1
and $fh2
point to it.
However, as you just saw we can generate unique file handlers that can be lexically scoped with Symbol
module. Symbol::gensym()
creates an anonymous glob and returns a reference to it. Such a glob reference can be used as a file or directory handle.
use Symbol;
my $fh = gensym;
open $fh, "+>>filename" or die "$!";
flock $fh, LOCK_EX;
# do something
Now the file will be always unlocked a the end of the request's processing. Instead of close() you might use a block:
use Symbol;
{
my $fh = gensym;
open $fh, "+>>filename" or die "$!";
flock $fh, LOCK_EX;
# do something
}
# the file will be automatically closed and unlocked at this point
But this is not so obvious to the reader of the code so you might want to avoid the last technique.
You can use the IO::*
modules as well, such as IO::File
or IO::Dir
, but these are much bigger than <Symbol> module, and worth using for file or directory opening only if you are already using them for other features they provide. As a matter of fact, these modules use <Symbol> module themselves. The examples of their usage:
use IO::File;
my $fh = new IO::File "> filename";
# the rest is as before
and:
use IO::Dir;
my $dh = new IO::Dir "dirname";
Cleanup Code
Finally, let's see when do we need a special clean up code. As you just saw we have solved the problem of file handers by lexically scoping them. There are situation, you must write a cleanup code. A good example for this is a tied dbm file.
A reminder: dbm file is a simple database, which allows you to store pairs of keys and values in it. As of this writing Berkeley DB is the most advanced dbm implementation, and allows you to store key/values using the HASH, BTREE and RECNO algorithms. (refer to a DB_File
man page for more info.) DB_File
module provides a Perl interface to 1.x versions of Berkeley DB. (BerkeleyDB
module should handle more recent Berkeley DB versions 2 and 3)
Working with dbm files is very simple, because they are represented in Perl as a simple hash variables, with help of TIE interface, and they behave exactly like hashes. In order to access a dbm file you have to tie it first:
use Fcntl qw(O_RDWR O_CREAT);
use DB_File;
my $filename = "/tmp/mydb";
my %hash;
tie %hash, 'DB_File', $filename, O_RDWR|O_CREAT, 0660, $DB_HASH
or die "Can't tie %hash : $!";
A first argument to tie() is a hash variable, we want the dbm file to be tied to. Following arguments are a name of the module that provides an interface to a dbm implementation we want to use, DB_File
in our case, then a filename the dbm resides in, Fcntl flags, file permissions and finally the interface method (DB_HASH, DB_BTREE or DB_RECNO) to be used.
From now on we use %hash
to read from and write to a dbm file, like:
my $name = $hash{foo};
$hash{foo} = "Larry Wall";
The only nuance is that when we modify the hash by assigning some values, it doesn't write the changes immediately to a file, but caches them to improve a performance. It flushes its cache buffers when either they become full, a sync() method is being called on its database handler or the hash is being untied (closed). So if the program quits abnormally, a dbm file might get corrupted.
To untie the dbm file, you simply call:
untie %hash;
To get the access to sync() method, you should retrieve the database handler which is being returned by tie() method:
my $dbh = tie %hash, 'DB_File', $filename, O_RDWR|O_CREAT, 0660, $DB_HASH
or die "Can't tie %hash : $!";
Now you can flush the cache with:
$hash{foo} = "Larry Wall";
$dbh->sync;
Important: If you have saved a copy of the object returned from tie(), the underlying database file will not be closed until both the tied variable is untied and all copies of the saved object are destroyed. We do it as follows
undef $dbh;
untie %hash;
Of course, you have to lock the dbm file exactly like any other resource if some script modifies its contents. Refer to Locking dbm handlers for more info.
Ok, enough with introduction, let's get to the point. Since both %hash
and $dbh
are lexically scoped variables, they always will be destroyed, no matter whether you forgot to untie() or the request was aborted before the untie() part.
Suppose that you want to take the benefit of mod_perl's persistent global variables in each process and to use this feature to create persistent dbm hashes. So you tie them only once per process, and save the time to tie() and untie() per request. The idea is good, assuming that you remember that you have to flush the cache buffers when you modify the hash that represents the dbm file with sync() method.
Let's code the idea:
use strict;
use vars qw($dbh %hash);
use Fcntl qw(:flock O_RDWR O_CREAT);
use DB_File;
use Symbol;
We declare $dbh
and %hash
as global variables, then pull in the Fcntl
module and import the symbols we are going to use. Actually we need only LOCK_EX
from the tags provided by :flock
. We pull in DB_File
and Symbol
modules.
my $r = shift;
$r->send_http_header('text/plain');
$r->print("PID $$\n");
Send the Content-type header of plain text type and tell the user the PID of the process that serves the request.
my $filename = "/tmp/mydb";
my $lockfile = "$filename.lock";
Configure the location of the dbm file and its lock file.
my $fh = gensym;
open $fh, ">$lockfile" or die "Cannot open $lockfile: $!";
flock $fh, LOCK_EX;
Generate a unique anonymous glob, store it in a lexically scoped variable $fh
and lock the file, which in turn advisory locks the dbm file which will be safely tied now, because for the other copies of this script to access the following code they have to acquire the lock file first, and since it's an exclusive lock, only one replication of the script will be able to tie the dbm file.
$dbh ||= tie %hash, 'DB_File', $filename, O_RDWR|O_CREAT, 0660, $DB_HASH
or die "Can't tie %hash : $!";
This code snippet demands some deeper explanation.
$a ||= $b;
is the same as:
$a = $a || $b;
The ||
check is a boolean one (testing for truth) and it doesn't care about undefined values, since undef
is false
in Perl. So what it does is: leave $a
unmodified if it's a true value, otherwise test $b
and assign its value to $a
if it's true. If it's false as well, $a
stays undefined. (note that 0 and ""
(empty string) are both defined but false values!) (refer to perlop(1) manpage for more info about ||
operator)
Back to our tie() snippet. For each mod_perl process when this code will be executed for the first time, $dbh
variable is undefined, therefore a right part of the statement will be executed, which will tie() the dbm file. On every consequent code execution in the same process, $dbh
will contain a database handler which is a true value, so the tie() call will be saved.
$hash{int rand 10} = (qw(a b c d))[int rand 4];
Fill the dbm file with random keys and values. Each invocation of the code would either generate a new key/value pair or override an old one, if an existing key will be chosen by rand().
$dbh->sync();
The most important part of the code is to flush the modifications.
# unlock the db
close $fh;
Now it's safe to unlock the dbm file. Please refer to Locking dbm handlers to learn why you should use a dbm's file descriptor to lock itself. To make long explanations short -- it may get your dbm file corrupted.
# printout the contents of the the dbm file
print map {"$_ => $hash{$_}\n"} sort keys %hash;
After we leave the critical section, we can take our time and print out the current contents of the dbm file.
Here is the same code in one piece:
use strict;
use vars qw($dbh %hash);
use Fcntl qw(:flock O_RDWR O_CREAT);
use DB_File;
use Symbol;
my $r = shift;
$r->send_http_header('text/plain');
$r->print("PID $$\n");
my $filename = "/tmp/mydb";
my $lockfile = "$filename.lock";
my $fh = gensym;
open $fh, ">$lockfile" or die "Cannot open $lockfile: $!";
# must lock the db file before opening it
flock $fh, LOCK_EX;
$dbh ||= tie %hash, 'DB_File', $filename, O_RDWR|O_CREAT, 0660, $DB_HASH
or die "Can't tie %hash : $!";
# fill the dbm file with random keys/values
$hash{int rand 10} = (qw(a b c d))[int rand 4];
# sync the DB
$dbh->sync();
# unlock the db
close $fh;
# print out the contents of the dbm file
print map {"$_ => $hash{$_}\n"} sort keys %hash;
Well, if you run this code, you will pretty soon figure out that it doesn't do what we thought it would. Each process keeps its own copy of %hash and modifies it. When a process calls the sync() method, the dbm file is updated to equal the contents of that process' %hash. If the next request is served by a process that hasn't tie()d %hash yet, its hash is initialized from the dbm file to whatever the last process that called sync() left there; but if it is served by a process that has already tied %hash, that process won't read the contents of the dbm file -- it will use its private copy of %hash instead.
In reality things are even more complicated. The above scenario holds only while the hash is smaller than the dbm file's buffer size; once it grows bigger than the buffer, its contents are flushed. So when you do keys %hash, all the keys have to be fetched from the dbm file, which causes the process to read the values saved by previous sync() calls and by automatic buffer-overflow flushes. This makes a big mess of the data and renders the whole idea unworkable.
But while we are at it, let's see what else is flawed in this code: the sync() call. If the script is stopped before sync() is called, the dbm file will be unlocked, since $fh is lexically scoped, but it won't be properly sync()ed, which at some point will corrupt the dbm file.
The solution is quite simple -- write an END
block to sync the file:
END{
# make sure that the DB is flushed
$dbh->sync();
}
The above will work only for Apache::Registry scripts; otherwise the END block will be postponed until process termination time. If you write a handler using the Perl API, use the register_cleanup() method instead. It accepts a reference to a subroutine as an argument:
$r->register_cleanup(sub { $dbh->sync() });
Even more correct code would check whether the connection was aborted, because if you don't check, the cleanup code will always be executed, which may be unwanted for scripts that finished normally:
$r->register_cleanup
(sub {
$dbh->sync() if Apache->request->connection->aborted();
});
So in the case of END
block usage you would use:
END{
# make sure that the DB is flushed
$dbh->sync() if Apache->request->connection->aborted();
}
Note that register_cleanup() should be called at the beginning of the script, or as soon as the variables you want to use in the cleanup code become available. If you call it at the end of the script and the script is aborted before that point is reached, no cleanup will be performed.
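For example, in the dbm script above you would register the cleanup handler right after the dbm file is tied, since that is the moment $dbh becomes available (a sketch based on the earlier example):

$dbh ||= tie %hash, 'DB_File', $filename, O_RDWR|O_CREAT, 0660, $DB_HASH
    or die "Can't tie %hash: $!";
# register as early as possible, not at the end of the script
$r->register_cleanup(sub { $dbh->sync() if $dbh });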
For example CGI.pm
registers the cleanup subroutine in its new() method:
sub new {
# code snipped
if ($MOD_PERL) {
Apache->request->register_cleanup(\&CGI::_reset_globals);
undef $NPH;
}
# more code snipped
}
There is also another way to register cleanup code for Perl API handlers: you can use a PerlCleanupHandler in the configuration file, like:
<Location /foo>
SetHandler perl-script
PerlHandler Apache::MyModule
PerlCleanupHandler Apache::MyModule::cleanup()
Options ExecCGI
</Location>
where Apache::MyModule::cleanup()
is supposed to perform a cleanup.
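A minimal sketch of such a module might look like this (Apache::MyModule and its $dbh global are hypothetical stand-ins, not code from the original text):

package Apache::MyModule;
use Apache::Constants qw(OK);
use vars qw($dbh);

sub cleanup {
    my $r = shift;
    # flush anything that must not be lost if the request was aborted
    $dbh->sync() if $dbh;
    return OK;
}
1;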
Handling the server timeout cases and working with $SIG{ALRM}
A situation similar to the Pressed Stop button disease arises when the client (browser) times out the connection (is it about 2 minutes?). There are cases where your script is about to perform a very long operation whose duration may exceed the client's timeout. One case I can think of is database interaction, where the DB engine either hangs or needs a long time to return results. If this is the case, use $SIG{ALRM} to prevent the timeouts:
my $timeout = 10; # seconds
eval {
local $SIG{ALRM} =
sub { die "Sorry timed out. Please try again\n" };
alarm $timeout;
... db stuff ...
alarm 0;
};
die $@ if $@;
But, as was discovered recently, local $SIG{'ALRM'} does not restore the original underlying C handler. This was fixed in mod_perl 1.19_01 (the CVS version). As a matter of fact none of the local $SIG{FOO} handlers restore the original C handler -- read Debugging Signal Handlers ($SIG{FOO}) for a debugging technique and a possible workaround.
Looking inside the server
Your server is up and running. But something appears to be wrong. You want to see the numbers to tune your code or server configuration. Finally you just want to know what's really going on inside the server. How do you do it?
There are a few tools that allow you to look inside the server.
Apache::Status -- Embedded interpreter status information
This is a very useful module. It lets you watch what happens to the Perl parts of the server: you can see the size of all subroutines and variables, variable dumps, lexical information, opcode trees and more.
Minimal Configuration
This configuration enables the Apache::Status module without all the additional features. Add this to httpd.conf:
<Location /perl-status>
SetHandler perl-script
PerlHandler Apache::Status
order deny,allow
#deny from all
#allow from
</Location>
If you are going to use Apache::Status, it's important to load it as the first module in the start-up file, or in httpd.conf:
# startup.pl
use Apache::Status ();
use Apache::Registry ();
use Apache::DBI ();
If you don't put Apache::Status before Apache::DBI, you won't get Apache::DBI's menu entry in the status display. For more about Apache::DBI see Persistent DB Connections.
Extended Configuration
PerlSetVar StatusOptionsAll On

This single directive will enable all of the options described below.

PerlSetVar StatusDumper On

When browsing symbol tables, the values of arrays, hashes and scalars can be viewed via Data::Dumper.

PerlSetVar StatusPeek On

With this option On and the Apache::Peek module installed, functions and variables can be viewed in Devel::Peek style.

PerlSetVar StatusLexInfo On

With this option On and the B::LexInfo module installed, subroutine lexical variable information can be viewed.

PerlSetVar StatusDeparse On

With this option On and B::Deparse version 0.59 or higher (included in Perl 5.005_59+), subroutines can be "deparsed". Options can be passed to B::Deparse::new like so:

PerlSetVar StatusDeparseOptions "-p -sC"

See the B::Deparse manpage for details.

PerlSetVar StatusTerse On

With this option On, text-based op tree graphs of subroutines can be displayed, thanks to B::Terse.

PerlSetVar StatusTerseSize On

With this option On and the B::TerseSize module installed, text-based op tree graphs of subroutines and their size can be displayed. See the B::TerseSize docs for more info.

PerlSetVar StatusTerseSizeMainSummary On

With this option On and the B::TerseSize module installed, a "Memory Usage" item will be added to the Apache::Status main menu. This option is disabled by default, as it can be rather CPU intensive to summarize memory usage for the entire server. It is strongly suggested that this option only be used with a development server running in -X mode, as the results will be cached.

PerlSetVar StatusGraph On

When StatusDumper (see above) is enabled, another link "OP Tree Graph" will be present with the dump if this configuration variable is set to On. This requires the B module (part of the Perl compiler kit) and the B::Graph module (version 0.03 or higher) to be installed along with the dot program. Dot is part of the graph visualization toolkit from AT&T: http://www.research.att.com/sw/tools/graphviz/

WARNING: Some graphs may produce very large images, and some graphs may produce no image if B::Graph's output is incorrect.
You will find more information about this module in its manpage.
Usage
Assuming that your mod_perl server listens on port 81, fetch http://www.myserver.com:81/perl-status
Embedded Perl version 5.00502 for Apache/1.3.2 (Unix) mod_perl/1.16
process 187138, running since Thu Nov 19 09:50:33 1998
Below that, all the following sections should appear as links:
Signal Handlers
Enabled mod_perl Hooks
PerlRequire'd Files
Environment
Perl Section Configuration
Loaded Modules
Perl Configuration
ISA Tree
Inheritance Tree
Compiled Registry Scripts
Symbol Table Dump
Let's follow, for example, PerlRequire'd Files. We see:
PerlRequire Location
/home/perl/apache-startup.pl /home/perl/apache-startup.pl
From some menus you can continue deeper to peek into the internals of the server, to see the values of the global variables in the packages, to the cached scripts and modules, and much more. Just click around...
Compiled Registry Scripts section seems to be empty.
Sometimes when you fetch /perl-status and follow the Compiled Registry Scripts link, you see no listing of scripts at all. This is absolutely correct: Apache::Status shows the registry scripts compiled in the httpd child which is serving your request for /perl-status. If that child has not yet compiled the script you are asking about, /perl-status will just show you the main menu.
mod_status
The Status module allows a server administrator to find out how well the server is performing. An HTML page is presented that gives the current server statistics in an easily readable form. If required, this page can be made to refresh automatically (given a compatible browser). Another page gives a simple machine-readable list of the current server state.
This is the Apache module written in C, and it gets compiled by default, so no special build instructions are required.
The details given are:
The number of children serving requests
The number of idle children
The status of each child, the number of requests that child has performed and the total number of bytes served by the child
A total number of accesses and byte count served
The time the server was started/restarted and the time it has been running for
Averages giving the number of requests per second, the number of bytes served per second and the average number of bytes per request
The current percentage CPU used by each child and in total by Apache
The current hosts and requests being processed
To enable it, just uncomment the already prepared configuration section in httpd.conf:
ExtendedStatus On
<Location /server-status>
SetHandler server-status
order deny,allow
deny from all
allow from .foo.com
</Location>
You can now access server statistics by using a Web browser to access the page http://localhost/server-status
Apache::VMonitor -- Visual System and Apache Server Monitor
Apache::VMonitor
is the next generation of mod_status. It provides all the information mod_status provides and much more.
This module emulates the reporting functionality of the top(), mount(), df() and ifconfig() utilities. There is a special mode for mod_perl processes. It has visual alert capabilities and a configurable automatic refresh mode. All the sections can be shown or hidden dynamically through the web interface.
There are two main modes:
o Multi processes mode -- All system processes and information are shown.
o Single process mode -- In-depth information about a single process is shown.
The main advantage of this module is that it reduces the need to telnet to the machine in order to monitor it, and it provides in-depth information about mod_perl processes that cannot be obtained by telneting to the machine.
Configuration
# Configuration in httpd.conf
<Location /sys-monitor>
SetHandler perl-script
PerlHandler Apache::VMonitor
</Location>
# startup file or <Perl> section:
use Apache::VMonitor();
$Apache::VMonitor::Config{BLINKING} = 1;
$Apache::VMonitor::Config{REFRESH} = 0;
$Apache::VMonitor::Config{VERBOSE} = 0;
$Apache::VMonitor::Config{SYSTEM} = 1;
$Apache::VMonitor::Config{APACHE} = 1;
$Apache::VMonitor::Config{PROCS} = 1;
$Apache::VMonitor::Config{MOUNT} = 1;
$Apache::VMonitor::Config{FS_USAGE} = 1;
$Apache::VMonitor::Config{NETLOAD} = 1;
@Apache::VMonitor::NETDEVS = qw(lo eth0);
$Apache::VMonitor::PROC_REGEX = join "\|", qw(httpd mysql squid);
More information is available in the module's extensive manpage.
It requires Apache::Scoreboard and GTop to work. GTop in turn requires the libgtop library, which is not available for all platforms. Visit http://www.home-of-linux.org/gnome/libgtop/ to check whether your platform/flavor is supported.
Sometimes script works, sometimes does not
See Sometimes it Works Sometimes it does Not
Code Debug
When the code doesn't do what it is expected to do, whether always or only sometimes, we say that the code needs debugging. There are a few levels of debugging complexity.
The basic level is when perl terminates the program during the compilation stage, before it has started to run. This usually happens when there are syntax errors or some module is missing. Sometimes it takes an effort to solve, since code that uses Apache CORE modules generally won't compile when executed from the shell. We will learn how to solve syntax problems in mod_perl code quite easily.
Once the program compiles and begins to run, there might be logical (algorithmic) problems, when the program doesn't do what you programmed it to do. These are somewhat harder to solve, especially when there is a lot of code to be reviewed, but it's just a matter of time. Perl helps a lot to locate typos when warnings are enabled. For example, it warns you about places where you wanted to compare two numbers but omitted the second '=' character, ending up with something like if $yes = 1 instead of if $yes == 1.
The next level is when the program does what it's expected to do most of the time, but occasionally misbehaves. Examining the code generally doesn't help, so either print() statements or the perl debugger come to the rescue. Often it's quite easy to debug with print(), but sometimes typing the debug messages becomes very tedious, especially when you haven't yet spotted the lines where the bug hides. That's where the perl debugger comes in.
While print() statements always work, running the perl debugger for CGI scripts can be quite a challenge. But with the right knowledge and tools in hand, the debugging process becomes much easier. Unfortunately there is no way to make debugging itself easy, since it depends on the code you wrote, and debugging really complex code can be quite a nightmare.
The worst thing you can imagine is a process that terminates in the middle of processing a request and dumps core. The operating system dumps core (read: creates a file called core in the directory the process was running in) when a program tries to access a memory area that doesn't belong to it, which generally indicates a bug. This is something you would almost never see with plain perl scripts, but it can easily happen if you use modules whose guts are written in C or C++ and something goes wrong inside them. Occasionally there is a bug in the underlying C code of mod_perl itself, which was in a deep slumber before your code woke it up.
In the following sections we will go through each of these problems in detail, discuss them thoroughly and present a few techniques to solve them.
Locating and correcting Syntax Errors
While developing code, we often make syntax mistakes, like forgetting to put a semicolon at the end of a statement (unless it's the end of a block, where it's not required -- though it's better to use one anyway, since there is a chance that you will later add more code at the end and forget to add the missing semicolon), or forgetting a comma in a list (for the same reason: more items might be added to the list later, and unlike some other languages Perl has no problem with a trailing comma).
One approach to locating syntactically incorrect code is to execute the script from the shell with the -c flag, which validates the syntax but doesn't run the code. (Actually, it will execute BEGIN and END blocks and use() calls, because these are considered as occurring outside the execution of your program.) It's also a good idea to add the -w switch to enable warnings:
perl -cw test.pl
If there are errors in the code, perl will report the errors and the appropriate line numbers in the script.
The next step is to execute the script, since besides syntax errors there are run-time errors. These are the errors that cause the "Internal Server Error" page when the script is executed from the browser. With plain CGI scripts it's the same as with plain perl scripts -- just execute them and see that they work.
However, things are quite different with scripts that use Apache::* modules. These can be used only from within the mod_perl server, since they rely on code and circumstances that aren't available when you attempt to execute the script from the shell -- for instance, there is no Apache request object available to the code.
If you have problems with such code, you can either watch the errors and warnings as they are logged to the error_log file when you issue a request from the browser, or use the Apache::FakeRequest module written by Doug MacEachern and Andrew Ford.
Using Apache::FakeRequest to Debug Apache Perl Modules
Apache::FakeRequest
is used to set up an empty Apache request object that can be used for debugging. The Apache::FakeRequest
methods just set internal variables of the same name as the method and return the value of the internal variables. Initial values for methods can be specified when the object is created. The print method prints to STDOUT.
Subroutines for Apache constants are also defined so that using Apache::Constants
while debugging works, although the values of the constants are hard-coded rather than extracted from the Apache source code.
Let's write a very simple module, which prints "OK" to the client's browser:
package Apache::Example;
use Apache::Constants;
sub handler{
my $r = shift;
$r->send_http_header('text/plain');
print "You are OK ", $r->get_remote_host, "\n";
return OK;
}
1;
You cannot debug this module unless you configure the server to call its handler from some location. But with the help of Apache::FakeRequest you can write a little script that emulates a request and returns the expected output:
#!/usr/bin/perl
use Apache::FakeRequest ();
use Apache::Example ();
my $r = Apache::FakeRequest->new('get_remote_host'=>'www.foo.com');
Apache::Example::handler($r);
When you execute the script from the command line, you will see the following output:
You are OK www.foo.com
Finding the Line Number the Error/Warning has been Triggered at
Apache::Registry, Apache::PerlRun and other modules that compile code via eval() confuse the line numbering. Modules that are read from disk in the normal way have no problem with file name/line number reporting.
If you compile with the experimental PERL_MARK_WHERE=1, it shows you almost the exact line number where the problem happens. Otherwise the compiler shifts its line counter. You can always stuff your code with special compiler directives that reset the counter to a value you choose. At the beginning of a line you write (the '#' must be in column 1):
#line 298 myscript.pl
or
#line 890 some_label_to_be_used_in_the_error_message
The label is optional -- the filename of the script will be used by default. This directive specifies the line number of the following line, not the line the directive is on. You can use a little script to stuff every N lines of your code with these directives, but then you will have to rerun it every time you add or remove code lines. The script:
#!/usr/bin/perl
# Puts Perl line markers in a Perl program for debugging purposes.
# Also takes out old line markers.
die "No filename to process.\n" unless @ARGV;
my $filename = $ARGV[0];
my $lines = 100;
open IN, $filename or die "Cannot open file: $filename: $!\n";
open OUT, ">$filename.marked"
or die "Cannot open file: $filename.marked: $!\n";
my $counter = 1;
while (<IN>) {
    # skip old line markers, so they don't shift the line count
    next if /^#line /;
    print OUT "#line $counter\n" unless $counter++ % $lines;
    print OUT $_;
}
close OUT;
close IN;
chmod 0755, "$filename.marked";
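Assuming you saved the script as, say, markline.pl (the name is arbitrary), you would run it on your script and then execute the generated file instead of the original:

% markline.pl myscript.pl
% perl myscript.pl.marked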
Note that another solution is to move most of the code into separate modules, which ensures that the line numbers will be reported correctly.
To have a complete trace of calls add:
use Carp ();
local $SIG{__WARN__} = \&Carp::cluck;
Using print() Function for Debugging
The universal debugging tool across nearly all platforms and programming languages is printf() or an equivalent output function, which can send data to the console, a file, an application window and so on. In perl we generally use the print() function. With an idea of where and when the bug is triggered, a developer can insert print() statements into the source code to examine the values of data at certain points of execution.
However, it is rather difficult to anticipate all the possible directions a program might take and which data to suspect of causing trouble. In addition, inline debugging code tends to add bloat and degrade the performance of an application. So you have to comment out or remove the debugging code once you think you have solved the problem; and if you later discover that you need to debug the same code again, you have to uncomment the debugging lines or, in the worst case, write them from scratch.
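One common way to soften this tradeoff (a generic idiom, not from the original example) is to guard the debug prints with a constant, so all of them can be switched off in one place:

use constant DEBUG => 1;        # set to 0 to disable all debug prints

my $value = 42;                 # some variable we are tracking
print "debug: value=$value\n" if DEBUG;

When DEBUG is the constant 0, perl discards the guarded statements at compile time, so they cost nothing at run time.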
Let's see a few examples where we use print() to debug some problem. In one of my applications I wrote a function that returns the date that was a week ago. Here it is:
print "Content-type: text/plain\n\n";
print "A week ago date was ",date_a_week_ago(),"\n";
# return a date one week ago as a string in format: MM/DD/YYYY
####################
sub date_a_week_ago{
my @month_len = (31,28,31,30,31,30,31,31,30,31,30,31);
my ($day,$month,$year) = (localtime)[3..5];
for (my $j = 0; $j < 7; $j++) {
$day--;
if ($day == 0) {
$month--;
if ($month == 0) {
$year--;
$month = 12;
}
# there are 29 days in February in a leap year
$month_len[1] =
  (($year + 1900) % 4 == 0
   && (($year + 1900) % 100 != 0 || ($year + 1900) % 400 == 0))
  ? 29 : 28;
# set $day to be the last day of the previous month
$day = $month_len[$month - 1];
} # end of if ($day == 0)
} # end of for (my $j = 0; $j < 7; $j++)
return sprintf "%02d/%02d/%04d",$month,$day,$year+1900;
}
This code is pretty straightforward: get today's date and subtract one from the value of the day, updating the month and the year on the way if the boundaries are crossed (end of month, end of year). Do it seven times in a loop, and at the end you get the date of a week ago.
Note that localtime() returns the year as the current four-digit year minus 1900, which means that we don't have a century boundary to worry about: if we are in the middle of the first week of the year 2000, the value of year returned by localtime() is 100 and not 0, as you might mistakenly assume. So when the code does $year-- it becomes 99 and not -1. At the end we add 1900 and get back the correct four-digit year.
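You can verify this from the command line (the output shown assumes the current year is 1999, as in the rest of this example):

% perl -le 'print +(localtime)[5]'
99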
Also note that we have to cover the case of a leap year, when February has 29 days. For the rest of the months we have prepared an array containing the month lengths.
Now when we run this code and check the result, we see that something is wrong. For example, if today is 10/23/1999 we expect the above code to print 10/16/1999, but it prints 09/16/1999 instead. We have lost a month, so the code is buggy.
Let's stuff a few debug print() statements in the code near the $month
variable:
sub date_a_week_ago{
my @month_len = (31,28,31,30,31,30,31,31,30,31,30,31);
my ($day,$month,$year) = (localtime)[3..5];
print "[set] month : $month\n";
for (my $j = 0; $j < 7; $j++) {
$day--;
if ($day == 0) {
$month--;
if ($month == 0) {
$year--;
$month = 12;
}
print "[loop $i] month : $month\n";
# there are 29 days in February in a leap year
$month_len[1] =
  (($year + 1900) % 4 == 0
   && (($year + 1900) % 100 != 0 || ($year + 1900) % 400 == 0))
  ? 29 : 28;
# set $day to be the last day of the previous month
$day = $month_len[$month - 1];
} # end of if ($day == 0)
} # end of for (my $j = 0; $j < 7; $j++)
return sprintf "%02d/%02d/%04d",$month,$day,$year+1900;
}
When we run it we see:
[set] month : 9
This is supposed to be the number of the current month (10), but it is not. We have spotted a bug, since the only code that sets the $month variable is the call to localtime(). Did we find a bug in Perl? Let's look at the manpage of the localtime() function:
% perldoc -f localtime
Converts a time as returned by the time function to a 9-element
array with the time analyzed for the local time zone. Typically
used as follows:
# 0 1 2 3 4 5 6 7 8
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
localtime(time);
All array elements are numeric, and come straight out of a struct
tm. In particular this means that C<$mon> has the range C<0..11>
and C<$wday> has the range C<0..6> with Sunday as day C<0>. Also,
C<$year> is the number of years since 1900, that is, C<$year> is
C<123> in year 2023, and I<not> simply the last two digits of the
year. If you assume it is, then you create non-Y2K-compliant
programs--and you wouldn't want to do that, would you?
[more info snipped]
This reveals that we are supposed to increment the value of $month if we want to count months from 1 to 12 rather than 0 to 11. Among other interesting facts about localtime() we also see an explanation of $year, which, as mentioned before, is set to the number of years since 1900.
Thus we have found the bug in our code and learned new things about localtime(). To correct the above code we just add a month's increment after we call localtime():
my ($day,$month,$year) = (localtime)[3..5];
$month++;
META: continue (unfinished)!!!
Now let's see some code including conditional and loop statements.
for my $i (1..31) {
    if ($day > 20) {
    }
}
Using print() and Data::Dumper for Debugging
Sometimes you need to peek into complex data structures, and trying to print them out can be tricky. That's where Data::Dumper comes to the rescue. For example, if we create this complex data structure:
$data =
{
array => [qw(a b c d)],
hash => {
foo => "oof",
bar => "rab",
},
};
How do we print it out? Very easily:
use Data::Dumper;
print Dumper \$data;
What we get is a pretty printed $data
:
$VAR1 = \{
'hash' => {
'foo' => 'oof',
'bar' => 'rab'
},
'array' => [
'a',
'b',
'c',
'd'
]
};
While writing this example I made a mistake and wrote qw(a b c d) instead of [qw(a b c d)]. When I pretty-printed the contents of $data, I immediately saw my mistake:
$VAR1 = \{
'b' => 'c',
'd' => 'hash',
'HASH(0x80cd79c)' => undef,
'array' => 'a'
};
That's not what I wanted, of course. I spotted the bug and corrected it, as you saw in the original example above.
The Importance of Good Coding Style and Conciseness
META: rewrite: blabla about -- very hard to find bugs and even understand the code below because of its obscurity. The example from the previous section is hard to debug too, because there is too much redundancy in it, you should develop a good coding style by creating a concise code but which is easy to understand (See the example below)...
it's much easier to find bugs
A shrunk version of the main loop, which does not make the code any easier to understand, looks like this:
for (0..6) {
    next if --$day;
    $year--, $month = 12 unless --$month;
    $day = $month != 2 ? $month_len[$month-1] : $year % 4 ? 28 : 29;
}
Don't do that at home :)
Why did I present the latter version then? Because the shrunk version is too obfuscated, which makes it hard to understand and maintain; on the other hand, parts of this style do make the code easier to read.
Larry Wall, the author of Perl, is a linguist, and he tried to define the syntax in a way that makes coding in Perl much like writing English. So it can be a very good idea to learn Perl's coding idioms; they might seem inconvenient at the beginning, but once you get used to them you will wonder how you ever lived without them. Below are a few of the most common Perl coding-style idioms. It's a good idea to write code that is more readable and avoids redundancy. For instance, instead of writing:
if ($i == 0) ...
it's better to write:
unless ($i)
Use a much more concise perlish style:
for my $j (0..6) {
instead of a syntax you've got used from other languages:
for (my $j=0; $j<7; $j++) {
it's much simpler to write and comprehend the code like:
print "something" if $debug;
rather than:
if($debug){
print "something";
}
A good style that improves understanding and readability, and reduces the chance of introducing a bug, is shown below in the form of yet another rewrite of the original code:
for (0..6) {
$day--;
next if $day;
$month--;
unless ($month){
$year--;
$month=12
}
if($month == 2){
$day = $year % 4 ? 28 : 29;
} else {
$day = $month_len[$month-1];
}
}
This is a happy medium between the overly verbose style of the first example and the overly obfuscated second one.
And of course the two-liner, which is much faster and easier to understand, is:
sub date_a_week_ago{
my ($day,$month,$year) = (localtime(time-604800))[3..5];
return sprintf "%02d/%02d/%04d",$month+1,$day,$year+1900;
}
Just take the current date in seconds since the epoch, as returned by time(), subtract a week in seconds (7*24*60*60 = 604800) and feed the result to localtime() -- voila, we've got the date of a week ago!
Why is the last version important, when the first one works just fine? Not because of performance (although the last one is about twice as fast), but because there are far more chances of having a bug in the first version than in the last one.
Introduction into Perl Debugger
As we saw, it's almost always possible to debug code with the help of print(). However, it is rather difficult to anticipate all the possible directions a program might take and which data to suspect of causing trouble. In addition, inline debugging code tends to add bloat and degrade the performance of an application, although most applications offer inline debugging as a compile-time option to avoid these hits. In any case, this information tends to be useful only to the programmer who added the trace statements in the first place.
Sometimes you have to debug a Perl application that is tens of thousands of lines long. While you might be a very experienced Perl programmer who can understand Perl code quite well just by looking at it, no mere mortal can begin to understand what will actually happen in such a large application until the code is running. You just don't know where to start adding your trusty print() statements to see what is happening inside.
The most effective way to track down a bug is to run the program inside an interactive debugger. The majority of programming languages have such a tool available, allowing one to see what is happening inside an application while it is running. The basic features of any interactive debugger allow you to:
Stop at a certain point in the code, based on a routine name or specific source file and line number
Stop at a certain point in the code, based on specific conditions such as the value of a given variable
Perform an action without stopping, based on the same criteria above
View and modify the value of variables at any given point
Provide context information such as stack traces and source windows
It does take practice to learn the most effective ways of using an interactive debugger, but the time and effort will be paid back many-fold in the long run.
Most C and C++ programmers are familiar with the interactive GNU debugger (gdb
). gdb
is a stand-alone program that requires your code to be compiled with debugging symbols to be useful. While gdb
can be used to debug the perl interpreter program itself, it cannot be used to debug your own Perl programs. Not to worry, Perl provides its own interactive debugger, called perldb
. Giving control of your Perl program to the interactive debugger is simply a matter of specifying the -d
command line switch. When this switch is used, Perl will insert debugging hooks into the program syntax tree, but leaves the actual job of debugging to a Perl module outside of the perl binary program itself.
I will start by introducing a few basic concepts and commands of the Perl interactive debugger. These basic warm-up examples are all run from the command line, outside of mod_perl, but are all still relevant once we do go inside Apache.
You may want to keep the perldebug manpage handy for reference while reading this section, and for future debugging sessions on your own.
The interactive debugger will attach to the current terminal and present you with a prompt just before the first program statement is executed. For example:
% perl -d -le 'print "mod_perl rules the world"'
Loading DB routines from perl5db.pl version 1.0402
Emacs support available.
Enter h or `h h' for help.
main::(-e:1): print "mod_perl rules the world"
DB<1>
The source line shown is the one Perl is about to execute; the next command (or just n) will cause this line to be executed, stopping again right before the next line:
main::(-e:1): print "mod_perl rules the world"
DB<1> n
mod_perl rules the world
Debugged program terminated. Use q to quit or R to restart,
use O inhibit_exit to avoid stopping after program termination,
h q, h R or h O to get additional info.
DB<1>
In this case, our example code is only one line long, so we are done interacting after the first line of code is executed. Let's try again with a slightly longer example, the following script:
my $word = 'mod_perl';
my @array = qw(rules the world);
print "$word @array\n";
Save the script in a file named domination.pl and run with the -d
switch:
% perl -d domination.pl
main::(domination.pl:1): my $word = 'mod_perl';
DB<1> n
main::(domination.pl:2): my @array = qw(rules the world);
DB<1>
At this point, the first line of code has been executed and the variable $word has been assigned the value mod_perl. We can check that assumption by using the p command (a shortcut for print, the two are interchangeable):
main::(domination.pl:2): my @array = qw(rules the world);
DB<1> p $word
mod_perl
The print
command works just like the Perl's builtin print() function, but adds a trailing newline and outputs to the $DB::OUT
file handle, which is normally opened to the terminal where perl was launched from. Let's carry on:
DB<2> n
main::(domination.pl:4): print "$word @array\n";
DB<2> p @array
rulestheworld
DB<3> n
mod_perl rules the world
Debugged program terminated. Use q to quit or R to restart,
use O inhibit_exit to avoid stopping after program termination,
h q, h R or h O to get additional info.
Ouch, p @array printed rulestheworld and not rules the world, as you might have expected, but that's absolutely normal. If you print an array without first expanding it into a string, it is printed without spaces between its members (or rather without the contents of the $" variable -- otherwise known as $LIST_SEPARATOR if the English pragma is being used). If you do:
print "@array";
you get rules the world as output, since the default value of the $" variable is a single space.
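You can see the effect by changing $" locally (a small illustration, not part of the original session):

my @array = qw(rules the world);
{
    local $" = ', ';      # change the list separator temporarily
    print "@array\n";     # prints: rules, the, world
}
print "@array\n";         # back to the default: rules the world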
You should have noticed by now that there is some valuable information to the left of each executable statement:
main::(domination.pl:4): print "$word @array\n";
DB<2>
First is the current package name, in this case main::. Next come the current filename and statement line number, domination.pl and 4 in the example above. The number presented at the prompt is the command number, which can be used to recall commands from the session history with the help of the ! command followed by that number. For example, !1 repeats the first command:
% perl -d -e0
main::(-e:1): 0
DB<1> p $]
5.00503
DB<2> !1
p $]5.00503
DB<3>
$] holds perl's version number. As you can see, !1 printed the value of $] again, prepended by the command that was executed.
Things start to get more interesting as the code does. In the example script below (save it in a file named test.pl) we've increased the number of source files and packages by including the standard Symbol
module, along with invoking its gensym() function:
use Symbol ();
my $sym = Symbol::gensym();
print "$sym\n";
% perl -d test.pl
main::(test.pl:3): my $sym = Symbol::gensym();
DB<1> n
main::(test.pl:5): print "$sym\n";
DB<1> n
GLOB(0x80c7a44)
First, notice that the debugger did not stop at the first line of the file; this is because use ... is a compile-time statement, not a run-time statement. Also notice there was more work going on than the debugger revealed. That's because the next command does not enter subroutine calls. To step into subroutine code, use the step command (or s):
% perl -d test.pl
main::(test.pl:3): my $sym = Symbol::gensym();
DB<1> s
Symbol::gensym(/usr/lib/perl5/5.00503/Symbol.pm:86):
86: my $name = "GEN" . $genseq++;
DB<1>
Notice the source line information has changed to the Symbol::gensym
package and the Symbol.pm
file. We can carry on by hitting the return key at each prompt, which causes the debugger to repeat the last step
or next
command. It wouldn't repeat a print
command for example. The debugger will return out of the subroutine and back to our main program:
DB<1>
Symbol::gensym(/usr/lib/perl5/5.00503/Symbol.pm:87):
87: my $ref = \*{$genpkg . $name};
DB<1>
Symbol::gensym(/usr/lib/perl5/5.00503/Symbol.pm:88):
88: delete $$genpkg{$name};
DB<1>
Symbol::gensym(/usr/lib/perl5/5.00503/Symbol.pm:89):
89: $ref;
DB<1>
main::(test.pl:5): print "$sym\n";
DB<1>
GLOB(0x80c7a44)
Our line-by-line debugging approach has served us well for this small program, but imagine the time it would take to step through a large application at the same pace. There are several ways to speed up a debugging session, one of which is known as setting a breakpoint. The breakpoint command (b) can be used to instruct the debugger to stop at a named subroutine or at a line of a given file. In this example session, we set a breakpoint on the Symbol::gensym subroutine at the first prompt, telling the debugger to stop at its first line when the routine is called. Rather than move along with next or step, we enter the continue command (c), which tells the debugger to execute each line without stopping until it reaches a breakpoint:
% perl -d test.pl
main::(test.pl:3): my $sym = Symbol::gensym();
DB<1> b Symbol::gensym
DB<2> c
Symbol::gensym(/usr/lib/perl5/5.00503/Symbol.pm:86):
86: my $name = "GEN" . $genseq++;
Now let's pretend we are debugging a large application where Symbol::gensym
might be called in various places. When the subroutine breakpoint is reached, the debugger does not reveal where it was called from by default. One way to find out this information is with the Trace
command (T
):
DB<2> T
$ = Symbol::gensym() called from file `test.pl' line 3
In this example, the call stack is only one level deep, so only that line is printed; we'll look at an example with a deeper stack later. The left-most character reveals the context in which the subroutine was called: $ represents a scalar context; elsewhere you may see @, which represents a list context, or ., which represents a void context. In our case we have called:
my $sym = Symbol::gensym();
which calls Symbol::gensym() in a scalar context.
Below we've made our test.pl
example a little more complex. First, we've added a My::World
package declaration at the top of the script, so we are no longer working in the main::
package. Next, we've added a subroutine named do_work() which invokes the familiar Symbol::gensym
, along with another function called Symbol::qualify
and returns a hash reference of the results. The do_work() routine is invoked inside a for loop which will be run twice:
package My::World;
use Symbol ();
for (1,2) {
do_work("now");
}
sub do_work {
my($var) = @_;
return undef unless $var;
my $sym = Symbol::gensym();
my $qvar = Symbol::qualify($var);
my $retval = {
'sym' => $sym,
'var' => $qvar,
};
return $retval;
}
We'll start by setting a few breakpoints, and then use the List command (L) to display them:
% perl -d test.pl
My::World::(test.pl:5): for (1,2) {
DB<1> b Symbol::qualify
DB<2> b Symbol::gensym
DB<3> L
/usr/lib/perl5/5.00503/Symbol.pm:
86: my $name = "GEN" . $genseq++;
break if (1)
95: my ($name) = @_;
break if (1)
The filename and line number of each breakpoint are displayed just before the source line itself. Since both breakpoints are located in the same file, the filename is displayed only once. After the source line we see the condition under which to stop; in our case, as the constant value 1 indicates, we will always stop at these breakpoints. Later on you'll see how to specify a particular condition.
When the continue command is executed, the normal flow of the program stops at one of these breakpoints, either on line 86 or line 95 of the /usr/lib/perl5/5.00503/Symbol.pm file, whichever is reached first. The displayed code lines are the first lines of the two subroutines from Symbol.pm. Lines that qualify as breakpoints cannot be empty lines or comments; there must be actual code there.
In our example the List command shows the lines the breakpoints were set on, but we cannot tell which breakpoint belongs to which subroutine. There are two ways to find out. One is to run the continue command and, when it stops, execute the Trace command we saw before:
DB<3> c
Symbol::gensym(/usr/lib/perl5/5.00503/Symbol.pm:86):
86: my $name = "GEN" . $genseq++;
DB<3> T
$ = Symbol::gensym() called from file `test.pl' line 14
. = My::World::do_work('now') called from file `test.pl' line 6
So we see that it was Symbol::gensym. The other way is to ask for a listing of a range of lines. For example, let's check which subroutine line 86 is part of. We use the list (lowercase!) command (l), which displays parts of the code. Among the various arguments it accepts there is one we want here: a range of lines. Since the breakpoint is at line 86, let's print a few lines around it:
DB<3> l 85-87
85 sub gensym () {
86==>b my $name = "GEN" . $genseq++;
87: my $ref = \*{$genpkg . $name};
Now we know it's the gensym sub, and we also see the breakpoint displayed with the help of the ==>b markup. We could also use the name of the sub to display its code:
DB<4> l Symbol::gensym
85 sub gensym () {
86==>b my $name = "GEN" . $genseq++;
87: my $ref = \*{$genpkg . $name};
88: delete $$genpkg{$name};
89: $ref;
90 }
The delete command (d) removes a breakpoint, given the line number it was set at. Let's remove the first one:
DB<5> d 95
The Delete command (with a capital `D') removes all currently installed breakpoints.
Now let's look again at the trace produced at the breakpoint:
DB<3> c
Symbol::gensym(/usr/lib/perl5/5.00503/Symbol.pm:86):
86: my $name = "GEN" . $genseq++;
DB<3> T
$ = Symbol::gensym() called from file `test.pl' line 14
. = My::World::do_work('now') called from file `test.pl' line 6
As you can see, the stack trace prints the values which are passed into the subroutine. Ah, and perhaps we've found our first bug, as we can see do_work() was called in a void context, so the return value was lost into thin air. Let's change the for loop logic to check the return value of do_work():
for (1,2) {
my $stuff = do_work("now");
if ($stuff) {
print "work is done\n";
}
}
In this session we will set a breakpoint at line 7 of test.pl
where we check the return value of do_work():
% perl -d test.pl
My::World::(test.pl:5): for (1,2) {
DB<1> b 7
DB<2> c
My::World::(test.pl:7): if ($stuff) {
DB<2>
Our program is still small, but it is getting more difficult to understand the context of just one line of code. The window command (w) will list the few lines of code that surround the current line:
DB<2> w
4
5: for (1,2) {
6: my $stuff = do_work("now");
7==>b if ($stuff) {
8: print "work is done\n";
9 }
10 }
11
12 sub do_work {
13: my($var) = @_;
The arrow points to the line which is about to be executed and also contains a 'b'
indicating we have set a breakpoint at this line. The breakable lines of code include a `:'
just after the line number.
Now, let's take a look at the value of the $stuff
variable with the trusty old print
command:
DB<2> p $stuff
HASH(0x82b89b4)
That's not very useful information. Remember, the print
command works just as the built-in print() function does. The x
command evaluates a given expression and prints the results in a "pretty" fashion:
DB<3> x $stuff
0 HASH(0x82b89b4)
'sym' => GLOB(0x826a944)
-> *Symbol::GEN0
'var' => 'My::World::now'
Things seem to be okay there. Let's double-check by calling do_work() with a different value and printing the results:
DB<4> x do_work('later')
0 HASH(0x82bacc8)
'sym' => GLOB(0x818f16c)
-> *Symbol::GEN1
'var' => 'My::World::later'
We can see the symbol was incremented from GEN0
to GEN1
and the variable later was qualified, as expected.
Now let's change the test program a little to iterate over a list of arguments held in @args
and print a slightly different message:
package My::World;
use Symbol ();
my @args = qw(now later);
for my $arg (@args) {
my $stuff = do_work($arg);
if ($stuff) {
print "do your work $arg\n";
}
}
sub do_work {
my($var) = @_;
return undef unless $var;
my $sym = Symbol::gensym();
my $qvar = Symbol::qualify($var);
my $retval = {
'sym' => $sym,
'var' => $qvar,
};
return $retval;
}
There are only two arguments in the list, so stopping to look at each one isn't too time consuming, but consider the debugging pace with a list of 100 or so entries. It is possible to customize a breakpoint by specifying a condition: each time the breakpoint is reached, the condition is evaluated, and the debugger stops only if the condition is true. In the session below, the window command shows the breakable lines, and we set a breakpoint at line 7 with the condition $arg eq 'later'. As we continue, the breakpoint is skipped while $arg has the value now, and the debugger stops when it has the value later:
% perl -d test.pl
My::World::(test.pl:5): my @args = qw(now later);
DB<1> w
2
3: use Symbol ();
4
5==> my @args = qw(now later);
6: for my $arg (@args) {
7: my $stuff = do_work($arg);
8: if ($stuff) {
9: print "do your work $arg\n";
10 }
11 }
The ==> symbol shows us the line of code that is about to be executed.
DB<1> b 7 $arg eq 'later'
DB<2> c
do your work now
My::World::(test.pl:7): my $stuff = do_work($arg);
DB<2> n
My::World::(test.pl:8): if ($stuff) {
DB<2> x $stuff
0 HASH(0x82b90e4)
'sym' => GLOB(0x82b9138)
-> *Symbol::GEN1
'var' => 'My::World::later'
DB<5> c
do your work later
Debugged program terminated. Use q to quit or R to restart,
There are plenty more tricks left in the perldb bag, but you should now understand enough about the debugger to try them on your own, with the perldebug manpage by your side. Quick online help is available by typing the h command; it displays a list of the most useful commands with short explanations of what they do.
Interactive Perl Debugging under mod_cgi
Devel::ptkdb
is a visual Perl debugger that uses perlTk for a user interface.
To debug a plain perl script with it, invoke it as:
% perl -d:ptkdb myscript.pl
A Tk application will be loaded. Now you can do most of the debugging you did with the command line Perl debugger, but using a simple GUI to set and remove breakpoints, browse the code, step through it and more.
With the help of ptkdb you can debug your CGI scripts running under mod_cgi. Be sure that the perl installation used by your web server includes the Tk package. To enable the debugger, change your:
#! /usr/local/bin/perl -wT
to
#! /usr/local/bin/perl -wTd:ptkdb
You can debug scripts remotely if you're using a Unix based server and the machine where you are writing the script has an X server. The X server can be another Unix workstation, or a Macintosh or Win32 platform with an appropriate X Windows package. In your script, insert the following BEGIN subroutine:
sub BEGIN {
$ENV{'DISPLAY'} = "myHostname:0.0" ;
}
You can use either the IP (123.123.123.123:0.0) or the DNS (myhost.com:0.0) convention. Be sure that your web server has permission to open windows on your X server (see the xhost manpage for more info).
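For example, on the machine running the X server you might allow the web server host to connect like this (the hostname is hypothetical):

% xhost +webserver.example.com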
Access your web page with the browser and submit the script as usual. The ptkdb window should appear on your monitor if you have set the $ENV{'DISPLAY'} variable correctly. At this point you can start debugging your script. Be aware that your browser may time out while waiting for the script to run.
To expedite debugging you may want to set up your breakpoints in advance with a .ptkdbrc file and use the $DB::no_stop_at_start variable. Note: for debugging web scripts you may have to install the .ptkdbrc file in the home directory of the server's account (~www, or whatever username your web server is running under). Also try installing a .ptkdbrc file in the same directory as the target script.
META: insert snapshots of ptkdb screen
Non-Interactive Perl Debugging under mod_perl
To debug scripts running under mod_perl, either use Apache::DB (interactive Perl debugging) or the older non-interactive method described below.
The NonStop debugger option enables us to get some decent debugging information when running under mod_perl. For example, before starting the server:
% setenv PERL5OPT -d
% setenv PERLDB_OPTS "NonStop=1 LineInfo=db.out AutoTrace=1 frame=2"
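The setenv syntax above is for csh-style shells; with a Bourne-style shell the equivalent would be:

PERL5OPT=-d
PERLDB_OPTS="NonStop=1 LineInfo=db.out AutoTrace=1 frame=2"
export PERL5OPT PERLDB_OPTS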
Now watch db.out for line:filename information. This is most useful for tracking down core dumps that normally leave us guessing, even with a stack trace from gdb: db.out will show you what Perl code triggered the core. See 'man perldebug' for more PERLDB_OPTS. Note that Perl will ignore PERL5OPT if PerlTaintCheck is On.
Interactive mod_perl Debugging
Now we'll turn to looking at how the interactive debugger is used in a mod_perl environment. The Apache::DB
module available from CPAN provides a wrapper around perldb
for debugging Perl code running under mod_perl.
The server must be run in non-forking mode to use the interactive debugger; this mode is turned on by passing the -X flag to the httpd executable. It is convenient to use an IfDefine section around the Apache::DB configuration; the example below does this using the name PERLDB. With this setup, debugging is enabled only when the server is started with the httpd -D PERLDB command.
This section should be at the top of your perl configuration section of the configuration file, before any Perl code is pulled in, so debugging symbols will be inserted into the syntax tree, triggered by the call to Apache::DB->init
. The Apache::DB::handler
can be configured using any of the Perl*Handler
directives, in this case we use a PerlFixupHandler
so handlers in the response phase will bring up the debugger prompt:
<IfDefine PERLDB>
<Perl>
use Apache::DB ();
Apache::DB->init;
</Perl>
<Location />
PerlFixupHandler Apache::DB
</Location>
</IfDefine>
Since we have used / as the argument to the Location directive, the debugger will be invoked for any kind of request, even for static objects (images, static documents) -- though for those it will quit immediately, unless some perl module is registered to handle them.
In our first example, we will debug the standard Apache::Status
module, which is configured like so:
PerlModule Apache::Status
<Location /perl-status>
PerlHandler Apache::Status
SetHandler perl-script
</Location>
When the server is started with the debugging flag, a notice will be printed to the console:
% httpd -X -D PERLDB
[notice] Apache::DB initialized in child 950
The debugger prompt will not appear until the first request is made, in our case to http://localhost/perl-status. Once we are at the prompt, all the standard debugging commands are available. First we run window for some context on the code being debugged, then move to the next statement after $r has been assigned, and print the request URI. If no breakpoints are set, the continue command gives control back to Apache and the request finishes with the Apache::Status main menu showing up in the browser window:
Loading DB routines from perl5db.pl version 1.0402
Emacs support available.
Enter h or `h h' for help.
Apache::Status::handler(/usr/lib/perl5/site_perl/5.005/i386-linux/Apache/Status.pm:55):
55: my($r) = @_;
DB<1> w
52 }
53
54 sub handler {
55==> my($r) = @_;
56: Apache->request($r); #for Apache::CGI
57: my $qs = $r->args || "";
58: my $sub = "status_$qs";
59: no strict 'refs';
60
61: if($qs =~ s/^(noh_\w+).*/$1/) {
DB<1> n
Apache::Status::handler(/usr/lib/perl5/site_perl/5.005/i386-linux/Apache/Status.pm:56):
56: Apache->request($r); # for Apache::CGI
DB<1> p $r->uri
/perl-status
DB<2> c
All the techniques we saw while debugging plain perl scripts can be applied to this debugging session.
Debugging Apache::Registry scripts is somewhat different, because the handler routine does quite a bit of work before it reaches your script. In this example we make a request for /perl/test.pl, which consists of this code:
use strict;
my $r = shift;
$r->send_http_header('text/plain');
print "mod_perl rules";
When a request is issued, the debugger stops at line 28 of Apache/Registry.pm. We set a breakpoint at line 140, which is the line that actually calls the script wrapper subroutine. The continue command will bring us to that line, where we can step into the script handler:
Apache::Registry::handler(/usr/lib/perl5/site_perl/5.005/i386-linux/Apache/Registry.pm:28):
28: my $r = shift;
DB<1> b 140
DB<2> c
Apache::Registry::handler(/usr/lib/perl5/site_perl/5.005/i386-linux/Apache/Registry.pm:140):
140: eval { &{$cv}($r, @_) } if $r->seqno;
DB<2> s
Apache::ROOT::perl::test_2epl::handler((eval 87):3):
3: my $r = shift;
Notice the funny package name, that's generated from the URI of the request for namespace protection. The filename is not displayed, since the code was compiled via eval(), but the print
command can be used to show you $r->filename
:
DB<2> n
Apache::ROOT::perl::test_2epl::handler((eval 87):4):
4: $r->send_http_header('text/plain');
DB<2> p $r->filename
/home/httpd/perl/test.pl
The line number might seem off too, but the window command will give you a better idea where you are:
DB<4> w
1: package Apache::ROOT::perl::test_2epl;use Apache qw(exit);sub handler { use strict;
2
3: my $r = shift;
4==> $r->send_http_header('text/plain');
5
6: print "mod_perl rules";
7
8 }
9 ;
The code from the test.pl file is between lines 2 and 7, the rest is the Apache::Registry
magic to cache your code inside a handler subroutine.
It will always take some practice and patience when putting together debugging strategies that make effective use of the interactive debugger for various situations. Once you do have a good strategy in mind, bug squashing can actually be quite a bit of fun!
ptkdb and Interactive mod_perl Debugging
As we saw earlier, you can use the ptkdb visual debugger to debug CGI scripts running under mod_cgi. It won't work for mod_perl using the same configuration as mod_cgi, though. We have to tweak the Apache/DB.pm module to use Devel/ptkdb.pm instead of Apache/perl5db.pl.
Open the file in your favorite editor and replace:
require 'Apache/perl5db.pl';
with:
require 'Devel/ptkdb.pm';
Now when you use the interactive mod_perl debugger configuration from the previous section and issue a request, a ptkdb visual debugger will be loaded.
If you are debugging Apache::Registry scripts, then, exactly as in the terminal debugging mode example, you should go to line 140 (or wherever the eval { &{$cv}($r, @_) } if $r->seqno; statement is located) and step in to enter your script.
Note that you can work with ptkdb in plain multi-server mode, so you don't have to start the server with the -X option.
META: One caveat:
* When the request is completed, ptkdb hangs. Does anyone know what code should be registered for it to exit on completion? The original Apache::DB cleanup code to be replaced is:
if (ref $r) {
$SIG{INT} = \&DB::catch;
$r->register_cleanup(sub {
$SIG{INT} = \&DB::ApacheSIGINT();
});
}
Any Perl/Tk guru to assist???
Debugging when Server Crashes on Startup before Writing to Log File.
If your server crashes on startup, you need to start it under gdb and ask it to generate a stack trace.
I'll emulate a faulty server by putting a dump() call in the startup file:
startup.pl
----------
dump;
1;
and requiring this file from the httpd.conf:
PerlRequire /path/to/startup.pl
Make sure no server is running on port 80 or use an alternate config with an alternate port if you are on a production server.
% gdb /path/to/httpd
(gdb) set args -X
Use:
set args -X -f /path/to/alternate/serverconfig_ifneeded.conf
if you want the server to start from an alternative configuration file.
Now run the program:
(gdb) run
Starting program: /usr/local/apache/bin/httpd -X
Program received signal SIGABRT, Aborted.
0x400da4e1 in __kill () from /lib/libc.so.6
At this point the server dies because of the call to dump(), and when that happens we ask for a stack trace, using the bt or where command:
(gdb) where
#0 0x400da4e1 in __kill () from /lib/libc.so.6
#1 0x80d43bc in Perl_my_unexec ()
#2 0x8119544 in Perl_pp_goto ()
#3 0x8118990 in Perl_pp_dump ()
#4 0x812b2ad in Perl_runops_standard ()
#5 0x80d3a9c in perl_eval_sv ()
#6 0x807ef1c in perl_do_file ()
#7 0x807ef4f in perl_load_startup_script ()
#8 0x807b7ec in perl_cmd_require ()
#9 0x8092af7 in ap_clear_module_list ()
#10 0x8092f43 in ap_handle_command ()
#11 0x8092fd7 in ap_srm_command_loop ()
#12 0x80933e0 in ap_process_resource_config ()
#13 0x8093ca2 in ap_read_config ()
#14 0x809db63 in main ()
#15 0x400d41eb in __libc_start_main (main=0x809d8dc <main>, argc=2,
argv=0xbffffab4, init=0x80606f8 <_init>, fini=0x812b38c <_fini>,
rtld_fini=0x4000a610 <_dl_fini>, stack_end=0xbffffaac)
at ../sysdeps/generic/libc-start.c:90
If you have no clue what this trace says, send it to the mod_perl mailing list. Make sure to include the versions of Apache, mod_perl and Perl.
In our case we already know that the server is supposed to die while compiling the startup file, and we can clearly see that from the trace. We read it from the bottom up:
We are in the config file:
#13 0x8093ca2 in ap_read_config ()
We do require:
#8 0x807b7ec in perl_cmd_require ()
We load the file and compile it:
#6 0x807ef1c in perl_do_file ()
#5 0x80d3a9c in perl_eval_sv ()
dump() gets executed:
#3 0x8118990 in Perl_pp_dump ()
dump() calls __kill():
#0 0x400da4e1 in __kill () from /lib/libc.so.6
Debugging Hanging Processes (continued)
META: incomplete
mod_perl comes with a number of useful gdb macros to ease the debugging process. You will find the file with the macros in the mod_perl source distribution, in the .gdbinit file (mod_perl-x.xx/.gdbinit). You might want to modify the macro definitions.
In order to use the macros you need to compile mod_perl with PERL_DEBUG=1.
To debug the server, start it:
% httpd -X
Issue a request to the offending script that hangs, then find the PID of the hanging process.
Go to the root of the server:
% cd /usr/local/apache
Now attach to the process with gdb (replace PID with the actual process number) and load the macros from .gdbinit. Note that source is a gdb command, so it is issued at the gdb prompt:
% gdb /path/to/httpd PID
(gdb) source /usr/src/mod_perl-x.xx/.gdbinit
Now you can start the server (httpd below is a gdb macro):
(gdb) httpd
Now run the curinfo macro:
(gdb) curinfo
It should tell you the line/filename of the offending Perl code.
Add this to the .gdbinit:
define longmess
  set $sv = perl_eval_pv("Carp::longmess()", 1)
  printf "%s\n", ((XPV*) ($sv)->sv_any )->xpv_pv
end
and when you reload the macros, run:
(gdb) longmess
to produce a Perl stacktrace.
Debugging Core Dumping Code
Perl's dump() function aborts the running program and dumps core:
% perl -e dump
Abort(coredump)
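Once you have the core file, you can extract a stack trace from it with gdb, just as with a crashing server. The paths below are placeholders for the binary that actually dumped core (perl here, httpd in the mod_perl case):

% gdb /path/to/perl core
(gdb) where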
META: should I move the Apache::StatINC here? (I think not, since it relates to other topics like reloading config files, but you should mention it here with a pointer to it)
PERL_DESTRUCT_LEVEL Environment Variable
With Apache versions 1.3.0 and higher, mod_perl will call the perl_destruct() Perl API function during the child exit phase. This causes proper execution of END blocks found during server startup, along with invoking the DESTROY method on global objects that are still alive.
It is possible that this operation may take a long time to finish, causing problems during a restart. If your code does not contain any END blocks or DESTROY methods which need to be run during child server shutdown, this destruction can be avoided by setting the PERL_DESTRUCT_LEVEL environment variable to -1, which causes mod_perl to skip the call to perl_destruct() in perl_shutdown().
Of course, this is only usable if no significant cleanup has to be done by Perl END blocks and DESTROY methods when the child terminates. What constitutes significant cleanup? Any change of state outside the current process that would not be handled by the operating system itself. So committing database transactions is significant, but closing an ordinary file isn't.
Enabling PERL_DESTRUCT_LEVEL=-1 speeds up server restart and termination, and leads to more robust operation in the face of problems like running out of memory. You can also define PERL_DESTRUCT_LEVEL at compile time with -DPERL_DESTRUCT_LEVEL.
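How you set the variable depends on how you start the server. As a sketch, with a csh-style shell (the same style as the profiling example later in this document) you could set it before starting httpd:

% setenv PERL_DESTRUCT_LEVEL -1
% httpd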
PERL_DEBUG=1 Build Option
Building mod_perl with PERL_DEBUG=1:
perl Makefile.PL PERL_DEBUG=1
will:
* Add -g to EXTRA_CFLAGS
* Turn on PERL_TRACE
* Set PERL_DESTRUCT_LEVEL=2
* Link against libperld if -e $Config{archlibexp}/CORE/libperld$Config{lib_ext}
Apache::Debug
(META: to be written)
use Apache::Debug ();
Apache::Debug::dump($r, SERVER_ERROR, "Uh Oh!");
This module sends what may be helpful debugging info to the client rather than to the error log.
Also, you could try using a larger emergency memory pool. Try this instead of Apache::Debug (the eval at the end pre-loads Carp::confess's code, so that it won't need to allocate memory when a real error is caught; note that $^M requires a Perl compiled with -DPERL_EMERGENCY_SBRK):
$^M = 'a' x (1<<18); # 256K buffer
use Carp ();
$SIG{__DIE__} = \&Carp::confess;
eval { Carp::confess("init") };
Debug Tracing
To enable mod_perl debug tracing configure mod_perl with the PERL_TRACE option:
perl Makefile.PL PERL_TRACE=1
The trace levels can then be enabled via the MOD_PERL_TRACE environment variable, which can contain any combination of:
d - Trace directive handling during configuration read
s - Trace processing of perl sections
h - Trace Perl*Handler callbacks
g - Trace global variable handling, interpreter construction, END blocks, etc.
all - all of the above
To enable them all, add to httpd.conf:
PerlSetEnv MOD_PERL_TRACE all
For example, if you want to see a trace of the PerlRequire and PerlModule directives as they are loaded, use:
PerlSetEnv MOD_PERL_TRACE d
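Note that the d trace covers directive handling while httpd.conf itself is being read, so a PerlSetEnv in that very file may be processed too late to catch the earliest directives. To be safe, you can set the variable in the shell before starting the server (csh-style shown, matching the profiling example later on):

% setenv MOD_PERL_TRACE all
% httpd -X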
gdb says there are no debugging symbols
As you know, you need an unstripped executable to be able to debug it. While you can compile mod_perl with -g (or PERL_DEBUG=1), the Apache install strips the symbols.
Makefile.tmpl contains a line:
IFLAGS_PROGRAM = -m 755 -s
Removing the -s does the trick.
Debugging Signal Handlers ($SIG{FOO})
The current Perl implementation does not restore Apache's original C handler when you use the local $SIG{FOO} clause. While the save/restore of $SIG{ALRM} was fixed in mod_perl 1.19_01 (CVS version), other signals are not fixed yet. The real fix should probably be in Perl itself.
Until recently, local $SIG{ALRM} restored the SIGALRM handler to Perl's handler, not to the handler that was there in the first place (Apache's alrm_handler()). If you build mod_perl with PERL_TRACE=1 and set the MOD_PERL_TRACE environment variable to g, you will see this in the error_log file:
mod_perl: saving SIGALRM (14) handler 0x80b1ff0
mod_perl: restoring SIGALRM (14) handler from: 0x0 to: 0x80b1ff0
If nobody had touched $SIG{ALRM}, the 0x0 would be the same address as the others.
If you work with signal handlers, take a look at the Sys::Signal module, which solves the problem:
Sys::Signal - Set signal handlers with restoration of the existing C signal handler.
Get it from CPAN.
The usage is simple. If the original code was:
eval {
    local $SIG{ALRM} = sub { die "timeout\n" };
    alarm $timeout;
    ... db stuff ...
    alarm 0;
};
die $@ if $@;
If a timeout happens and SIGALRM is delivered, the alarm will be reset; otherwise alarm 0 is reached and the timer is reset as well.
Now you would write:
use Sys::Signal ();
eval {
    my $h = Sys::Signal->set(ALRM => sub { die "timeout\n" });
    alarm $timeout;
    ... do something that may timeout ...
    alarm 0;
};
die $@ if $@;
Note that this example was written before $SIG{ALRM} was fixed internally, so you no longer need to use Sys::Signal for $SIG{ALRM}.
mod_perl only tries to deal with those signals that conflict with Apache's. Currently that's only SIGALRM. If you have found another signal that gives you trouble, add it to the list in perl_config.c after "ALRM" and before NULL:
static char *sigsave[] = { "ALRM", NULL };
Code Profiling
(Meta: duplication??? I've started to write about profiling somewhere in this file)
It is possible to profile code running under mod_perl with the Devel::DProf module, available on CPAN. However, you must have Apache version 1.3b3 or higher and the PerlChildExitHandler enabled. When the server is started, Devel::DProf installs an END block to write the tmon.out file; the block will be run when the server is shut down. Here's how to start and stop a server with the profiler enabled:
% setenv PERL5OPT -d:DProf
% httpd -X -d `pwd` &
... make some requests to the server here ...
% kill `cat logs/httpd.pid`
% unsetenv PERL5OPT
% dprofpp
See also: Apache::DProf
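Apache::DProf packages this setup for mod_perl. Based on its documented synopsis, loading it from httpd.conf should be enough; each child then runs Devel::DProf and writes its own tmon.out under the server root, ready to be fed to dprofpp:

PerlModule Apache::DProf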
Devel::Peek
Devel::Peek - A data debugging tool for the XS programmer
Let's see an example of Perl allocating a buffer only once, regardless of my() scoping; it will realloc() only if the size is greater than SvLEN:
use Devel::Peek;

for (1..3) {
    foo();
}

sub foo {
    my $sv;
    Dump $sv;
    $sv = 'x' x 100_000;
    $sv = "";
}
The output:
SV = NULL(0x0) at 0x8138008
REFCNT = 1
FLAGS = (PADBUSY,PADMY)
SV = PV(0x80e5794) at 0x8138008
REFCNT = 1
FLAGS = (PADBUSY,PADMY)
PV = 0x815f808 ""\0
CUR = 0
LEN = 100001
SV = PV(0x80e5794) at 0x8138008
REFCNT = 1
FLAGS = (PADBUSY,PADMY)
PV = 0x815f808 ""\0
CUR = 0
We can see that on subsequent calls (after the first one) $sv already has preallocated memory. So, if you can afford the memory, a larger buffer means fewer brk() syscalls. If you watch that example under strace, you will see calls to brk() only during the first pass through the loop. This is a case where your module might want to pre-allocate the buffer, for example for LWP, as a file-scoped lexical like this:
package Your::Proxy;

my $buffer = ' ' x 100_000;
$buffer = "";
This way, only the parent process has to call brk(), at server startup. Each child will already have the buffer allocated and only needs to reset it to "" when done.
How can I find out if my mod_perl scripts have memory leaks
Apache::Leak (derived from Devel::Leak) should help you with this task. Example:
use Apache::Leak;

my $global = "FooAAA";

leak_test {
    $$global = 1;
    ++$global;
};
The argument to leak_test() is an anonymous sub, so you can just wrap it around any code you suspect might be leaking. Beware: it will run the code twice, because the first time in, new SVs are created, which doesn't necessarily mean you are leaking; the second pass will give better evidence. You do not need to be inside mod_perl to use it. Run from the command line, the above script outputs:
ENTER: 1482 SVs
new c28b8 : new c2918 :
LEAVE: 1484 SVs
ENTER: 1484 SVs
new db690 : new db6a8 :
LEAVE: 1486 SVs
!!! 2 SVs leaked !!!
Build a debuggable Perl to see dumps of the SVs. The simple way to have both a normal Perl and a debuggable Perl is to follow the hints in the SUPPORT document for building libperld.a. When that is built, copy the perl binary from that directory to your Perl bin directory, but name it dperl.
Leak explanation: $$global = 1; uses the value of $global ("FooAAA") as a symbolic reference, creating a new global variable $FooAAA with a value of 1, which will not be destroyed until this module is destroyed. And since ++$global increments the string to "FooAAB", each pass through the code creates yet another global variable.
Apache::Leak is not very user-friendly; have a look at B::LexInfo instead. You'll see that what might appear to be a leak is often just a Perl optimization. For example, consider this code:
sub foo {
    my $string = shift;
}

foo("a string");
B::LexInfo will show you that Perl does not release the value from $string unless you undef() it. This is because Perl anticipates that the memory will be needed for another string the next time the subroutine is entered. You'll see similar behaviour for the length of @array, the keys of %hash, and the scratch areas of the pad-list for OPs such as join(), '.', etc.
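Here is a minimal sketch of inspecting a subroutine's pad with B::LexInfo, based on its synopsis (double-check the module's documentation for the exact method names):

use B::LexInfo ();

my $lexi = B::LexInfo->new;
my $info = $lexi->cvlexinfo('main::foo');
print ${ $lexi->dumper($info) };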
Apache::Status now includes a new StatusLexInfo option.
Apache::Leak works better if you've built a libperld.a (see the SUPPORT document) and given PERL_DEBUG=1 to mod_perl's Makefile.PL.
Debugging your code in Single Server Mode
Running the server in httpd -X mode is good only for testing during the development phase.
You want to test that your application correctly handles global variables (if you have any; the fewer of them you have the better, but sometimes you just can't do without them). It's hard to test with multiple servers serving your CGI script, since each child has a different set of values for its global variables. Imagine that you have a random() sub that returns a random number, and you have the following script:
use vars qw($num);
$num ||= random();
print ++$num;
This script initializes the variable $num with a random value, then increments it on each request and prints it out. Running this script in a multiple-server environment will result in something like 1, 9, 4, 19 (a different number per reload), since each time your script will be served by a different child. (On some OSes, e.g. AIX, the parent httpd process will assign all of the requests to the same child process if all of the children are idle.) But if you run in httpd -X single-server mode, you will get 2, 3, 4, 5... (assuming random() returned 1 on the first call).
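For reference, here is a self-contained sketch of this test script as it might look under Apache::Registry; random() is a stand-in defined inline, and the header call follows the earlier examples:

use strict;
use vars qw($num);

# stand-in for the random() sub assumed above
sub random { 1 + int rand 10 }

my $r = shift;
$r->send_http_header('text/plain');

$num ||= random();   # initialized once per child process
print ++$num;        # incremented on each request this child serves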
But do not rely on this mode too heavily, since working only in single server mode sometimes hides problems that show up when you switch to normal (multi-server) mode. Consider an application that allows you to change the configuration at run time.
Let's say the script produces a form to change the background color of the page. It's not good design, but for the sake of demonstrating the potential problem we will assume that our script doesn't write the changed background color to disk, but simply changes it in memory, like this:
use vars qw($bgcolor);
# assign default value at first invocation
$bgcolor ||= "white";
# modify the color if requested to
$bgcolor = $q->param('bgcolor') || $bgcolor;
So you have typed in a new color, and in response your script prints back the HTML with the new color. You think that's it! It was so simple. And if you keep running in single server mode, you will never notice that you have a problem...
If you run the same code in normal server mode, after you submit the color change you will get the result as expected, but when you request the same URL again (not reload!), chances are that you will get back the original default color (white in our case), since apart from the child that processed the color change request, no other child knows about the change to the global variable. Just remember that children can't share information other than what they inherited from their parent when they were spawned. Of course you should use a hidden form variable if you want the color to be remembered, or store it on the server side (database, shared memory, etc.).
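As a sketch of the hidden form variable approach using CGI.pm (the field name and default here are illustrative; a real form would also offer a way to pick a new color):

use strict;
use CGI ();

my $q = CGI->new;

# the color travels with each request instead of
# living in a per-child global variable
my $bgcolor = $q->param('bgcolor') || 'white';

print $q->header,
      $q->start_html(-bgcolor => $bgcolor),
      $q->start_form,
      $q->hidden(-name => 'bgcolor', -value => $bgcolor, -override => 1),
      $q->submit(-value => 'Keep this color'),
      $q->end_form,
      $q->end_html;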
Also note that since the server is running in single-process mode, if the output returns HTML with <IMG> tags, loading these images will take a long time when you browse with Netscape, because its KeepAlive feature gets in the way. Netscape tries to open multiple connections and keep them open. Because there is only one server process listening, each connection has to time out before the next succeeds. Turn off KeepAlive in httpd.conf to avoid this effect while developing, or just press STOP after a few seconds (assuming you use the image size parameters, so that Netscape will be able to render the rest of the page).
In addition you should know that when running with -X you will not see any of the control messages that the parent server normally writes to the error_log ("server started", "server stopped", etc.). Since httpd -X causes the server to handle all requests itself, without forking any children, there is no controlling parent to write the status messages.