NAME
Mail::Graph - draw graphical stats for mails/spams
SYNOPSIS
use Mail::Graph;

my $graph = Mail::Graph->new(
    items  => 'spam',
    output => 'spams/',
    input  => '~/Mail/spam/',
);
$graph->generate();
DESCRIPTION
This module parses mailbox files in either compressed or uncompressed form and then generates pretty statistics and graphs about them. Although originally developed for spam statistics, it works just as well for normal mail.
File Format
The module reads files in mbox format. These can be compressed with gzip or be plain text. Since the module reads all files in one directory, it can also handle maildir-style folders, i.e. a directory where each mail resides in a separate file.
The file format is quite simple and looks like this:
From sample_foo@example.com Tue Oct 27 18:38:52 1998
Received: from barfel by foo.example.com (8.9.1/8.6.12)
From: forged_bar@example.com
X-Envelope-To: <sample_foo@example.com>
Date: Tue, 27 Oct 1998 09:52:14 +0100 (CET)
Message-Id: <199810270852.12345567@example.com>
To: <none@example.com>
Subject: Sorry...
X-Loop-Detect: 1
X-Spamblock: caught by rule dummy@
This is a sample spam
Basically, each message is an email header plus an email body, and messages are separated by the From lines.
The following fields are examined:
X-Envelope-To          the target address/domain
From address@domain    the sender
From date              the receiving date
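As an illustration of how these fields can be read from a message, here is a minimal sketch. It is not the module's own parser; the helper name examine_header and the keys of the returned hash are made up for this example, and it assumes the header ends at the first blank line.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Mail::Graph: collect the sender and
# receiving date from the "From " separator line, and the target
# address/domain from the X-Envelope-To header.
sub examine_header {
    my (@lines) = @_;
    my %info;
    for my $line (@lines) {
        last if $line =~ /^\s*$/;               # a blank line ends the header
        if ($line =~ /^From (\S+)\s+(.+?)\s*$/) {
            # "From address date": note the space, so "From:" does not match
            @info{qw(sender received)} = ($1, $2);
        }
        elsif ($line =~ /^X-Envelope-To:\s*<?([^<>\s]+)>?/i) {
            my $target = $1;
            $info{target} = $target;
            ($info{domain}) = $target =~ /\@([^\@\s]+)$/;
        }
    }
    return \%info;
}

my $info = examine_header(
    'From sample_foo@example.com Tue Oct 27 18:38:52 1998',
    'From: forged_bar@example.com',
    'X-Envelope-To: <sample_foo@example.com>',
    'Subject: Sorry...',
    '',
    'This is a sample spam',
);
print "$_: $info->{$_}\n" for sort keys %$info;
```

The forged From: header is ignored on purpose, because only the separator line carries the envelope sender and the receiving date.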
METHODS
new()
Create a new Mail::Graph object.
The following options exist:
input Path to a directory containing (gzipped) mbox files
Alternatively, name of a (gzipped) mbox file
index Directory where to write (and read) the index files
output Directory where to write the output stats
items Try 'spams' or 'mails' (can be any string)
generate hash with names of stats to generate (1=on, 0=off):
month per each month of the year
day per each day of the month
hour per each hour of the day
dow per each day of the week
yearly per year
daily per each day (with average)
monthly per each month
toplevel per top-level domain
rule per filter rule that matched
target per target address
domain per target domain
last_x_days items for each of the last x days
set it to the number of days you want
score_histogram show histogram of SpamAssassin scores
set it to the step-width (like 5)
score_daily SA score for each of the last x days
set it to the number of days you want
score_scatter SA scatter score diagram, set it to
the limit of the score (a line will be
drawn there)
average set to 0 to disable, otherwise it gives the number
of days/weeks/months to average over
average_daily if not set, uses average, 0 to disable
number of days to average over in the daily graph
height base height of the generated images
template name of the template file (ending in .tpl) that is
used to generate the html output, e.g. 'index.tpl'
no_title set to 1 to disable graph titles, default 0
filter_domains array ref with list of domains to show as "unknown"
filter_target array ref with list of targets (regular expressions)
graph_ext extension of the generated graphs, default 'png'
last_date in yyyy-mm-dd format: specify the last used date, any
mail newer than that will be skipped. Defaults to today
first_date in yyyy-mm-dd format: specify the first used date, any
mail older than that will be skipped. Defaults to undef
meaning any old mail will be considered.
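To show how these options fit together, here is a sketch of a fuller constructor call. All paths, dates and numbers are placeholder values chosen for this example, and it assumes that the per-statistic switches (month, dow, last_x_days and so on) live inside the generate hash while the remaining options are passed at the top level. The call into Mail::Graph is only made when the module is installed.

```perl
use strict;
use warnings;

# Placeholder values throughout; adjust to your own setup.
my %options = (
    input    => 'archive/',        # directory with (gzipped) mbox files
    index    => 'index/',          # where index files are written and read
    output   => 'stats/',          # where the stats are written
    items    => 'spams',
    generate => {
        month           => 1,      # 1 = on
        dow             => 1,
        rule            => 0,      # 0 = off
        last_x_days     => 30,     # number of days to show
        score_histogram => 5,      # step width
    },
    average        => 7,           # 0 would disable averaging
    template       => 'index.tpl',
    graph_ext      => 'png',
    first_date     => '2002-01-01',          # yyyy-mm-dd
    filter_domains => [ 'example.com' ],     # shown as "unknown"
);

# Only construct the object when Mail::Graph is actually available.
if (eval { require Mail::Graph; 1 }) {
    my $graph = Mail::Graph->new(%options);
    my $error = $graph->error();
    print defined $error ? "error: $error\n" : "object created\n";
}
else {
    print "Mail::Graph is not installed\n";
}
```

Calling generate() on the resulting object would then write the statistics to the output directory.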
generate()
Generate the stats, fill in the template and write it out. Takes no options.
error()
Return an error message or undef for no error.
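A possible way to use error() is sketched below. The wrapper build_stats is hypothetical, and the sketch assumes that new() always returns an object and that problems in new() or generate() are reported through error() rather than by dying; neither is confirmed by the text above.

```perl
use strict;
use warnings;

# Hypothetical wrapper: returns undef on success, or a message
# describing what went wrong.
sub build_stats {
    my (%options) = @_;

    # Treat a missing module like any other error.
    return 'Mail::Graph is not installed'
        unless eval { require Mail::Graph; 1 };

    my $graph = Mail::Graph->new(%options);
    my $error = $graph->error();
    return $error if defined $error;    # assumed: set when new() failed

    $graph->generate();
    return $graph->error();             # undef means no error
}

my $error = build_stats(
    items  => 'spam',
    input  => 'archive/',               # placeholder paths
    output => 'spams/',
);
print defined $error ? "failed: $error\n" : "done\n";
```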
BUGS
There are a couple of known bugs; some of them are unfinished features or problems in GD::Graph:
- Divide by Zero
This is a bug in some versions of GD::Graph: generating a graph with only one bar crashes with this error. If you encounter this, please bug the author of GD::Graph and send me a copy.
- Argument "4, 0.7%" isn't numeric
You might get a lot of warnings like
Argument "4, 0.7%" isn't numeric in numeric lt (<) at /usr/lib/perl5/site_perl/5.8.2/GD/Graph/Data.pm line 231.
This is a problem with GD::Graph: Mail::Graph wants to use labels like
4, 0.7%
but GD::Graph uses the same string for both the label and the value of the point/bar, so Perl warns. This needs a small patch to GD::Graph that strips anything non-numeric out of the label before using it in numeric context. Please bug the author of GD::Graph and send me a copy.
- gzipped archives are not included in the stats
Some of the gzipped archives seem to trigger a bug in Compress::Zlib, at least up to version 1.32. For instance, on my system one of the sample archives in
/sample/archives/
is not read properly by Compress::Zlib. I have already notified the author of Compress::Zlib.
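Regarding the 'Argument "4, 0.7%" isn't numeric' item above, the kind of fix meant there could look roughly like the following sketch. It is not an actual GD::Graph patch; leading_number is a hypothetical helper. It takes the leading number of the label instead of literally deleting every non-numeric character, since deleting would turn "4, 0.7%" into "40.7".

```perl
use strict;
use warnings;

# Hypothetical helper, not part of GD::Graph or Mail::Graph:
# return the number a label starts with, or 0 if there is none.
sub leading_number {
    my ($label) = @_;
    return 0 unless defined $label;
    return $label =~ /^\s*([-+]?(?:\d+\.?\d*|\.\d+))/ ? $1 + 0 : 0;
}

# The value can now be compared numerically without a warning.
my $value = leading_number('4, 0.7%');
print "value is below 10\n" if $value < 10;
```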
LICENSE
This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.
AUTHOR
(c) Copyright by Tels http://bloodgate.com/ 2002.