NAME
Proc::Watchdog - Perl extension to implement a (more) reliable watchdog for processes
SYNOPSIS
use Proc::Watchdog;
my $w = Proc::Watchdog->new({ -path => '/tmp' });
$w->alarm(30); # Kill me in 30 secs if I don't reset
# Your code goes here
$w->reset; # Reset the kill-clock
DESCRIPTION
This code implements a simple but effective mechanism to support watchdogs in your code. A watchdog is a timer that triggers a predetermined action after a timeout period has expired, and it can be used to recover hung processes. In our particular scenario, we found a number of possible failures that would leave Perl daemons that access database servers hanging forever. alarm() was not an option, as the client libraries supplied by the vendor already used the ALRM signal internally, so there was no way to quickly recover from these failures.
It works by creating a file in the path supplied by the `-path' argument, as seen in the synopsis. If the path is not specified, it defaults to '/tmp', which is convenient because this directory is usually cleaned up as part of the boot process.
The file is created each time the ->alarm($time) method is invoked, and the value of $time is stored in it. The call to ->reset unlink()s the file.
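The file handling described above can be illustrated with a minimal sketch. This is not the module's actual code: the helper names wd_file, wd_alarm and wd_reset, and the "watchdog.<pid>" file name, are assumptions made only for illustration.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: build the watchdog file name for this process.
# The "watchdog.<pid>" naming scheme is an assumption, not the module's.
sub wd_file {
    my ($dir) = @_;
    return File::Spec->catfile($dir, "watchdog.$$");
}

# Create (or overwrite) the watchdog file and store the timeout in it.
# Rewriting the file also refreshes its modification time.
sub wd_alarm {
    my ($dir, $secs) = @_;
    my $file = wd_file($dir);
    open(my $fh, '>', $file) or die "Cannot create $file: $!";
    print {$fh} "$secs\n";
    close($fh) or die "Cannot close $file: $!";
    return $file;
}

# Remove the watchdog file, so no timeout is pending for this process.
sub wd_reset {
    my ($dir) = @_;
    my $file = wd_file($dir);
    return 1 unless -e $file;
    unlink($file) or die "Cannot unlink $file: $!";
    return 1;
}
```

In this sketch the timeout is measured from the file's modification time, so no signal handler or timer is needed inside the watched process.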
A separate daemon (watchd), included with this module, is called from cron or a similar service to check the path. It scans the watchdog files there, looking for files older than the number of seconds stored in them. When files matching this criterion are found, indicating hung processes, a SIGTERM followed by a SIGKILL is sent to the pid and the watchdog file is unlinked. The amount of time between the TERM and the KILL is configurable on the command line.
Please do a
watchd -h
for more information about its usage.
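The scan that watchd is described as performing could look roughly like the sketch below. It is an illustration only, not the code shipped with the module: the function names expired_watchdogs and reap, the "watchdog.<pid>" file naming, and the $grace parameter are all assumptions.

```perl
use strict;
use warnings;
use File::Spec;

# Return one entry per watchdog file that is older than the number of
# seconds stored in it. Assumes hypothetical "watchdog.<pid>" file names.
sub expired_watchdogs {
    my ($dir, $now) = @_;
    $now = time unless defined $now;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @expired;
    for my $name (readdir $dh) {
        next unless $name =~ /^watchdog\.(\d+)$/;
        my $pid  = $1;
        my $file = File::Spec->catfile($dir, $name);
        open(my $fh, '<', $file) or next;    # file may vanish after a reset
        my $line = <$fh>;
        close $fh;
        next unless defined $line && $line =~ /^(\d+)/;
        my $timeout = $1;
        my $mtime   = (stat $file)[9];       # last modification time
        next unless defined $mtime;
        push @expired, { pid => $pid, file => $file }
            if $now - $mtime > $timeout;
    }
    closedir $dh;
    return @expired;
}

# Send TERM, wait $grace seconds, send KILL if the process is still
# there, then remove the watchdog file.
sub reap {
    my ($entry, $grace) = @_;
    kill('TERM', $entry->{pid});
    sleep $grace;
    kill('KILL', $entry->{pid}) if kill(0, $entry->{pid});
    unlink $entry->{file};
    return;
}
```

A cron-driven run would then amount to calling reap on each entry returned by expired_watchdogs for the configured path.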
EXPORT
None by default.
HISTORY
AUTHOR
Luis E. Munoz <lem@cantv.net>
SEE ALSO
perl(1).