NAME
MIME::Parser - split MIME mail into decoded components
SYNOPSIS
# Create a new parser object:
my $parser = new MIME::Parser;
# Set up output directory for files:
$parser->output_dir("$ENV{HOME}/mimemail");
# Set up the prefix for files with auto-generated names:
$parser->output_prefix("part");
# If content length is <= 20000 bytes, store each msg as in-core scalar;
# Else, write to a disk file (the default action):
$parser->output_to_core(20000);
# Parse an input stream:
$entity = $parser->read(\*STDIN) or die "couldn't parse MIME stream";
# Congratulations: you now have a (possibly multipart) MIME entity!
$entity->dump_skeleton; # for debugging
Shortcuts:
# Create a new parser object, and set some properties:
my $parser = new MIME::Parser output_dir => "$ENV{HOME}/mimemail",
output_prefix => "part",
output_to_core => 20000;
DESCRIPTION
A subclass of MIME::ParserBase, providing one useful way to parse MIME streams and obtain MIME::Entity objects. This particular parser class outputs the different parts as files on disk, in the directory of your choice.
If you don't like the way files are named... it's object-oriented and subclassable. If you want to do something really different, perhaps you want to subclass MIME::ParserBase instead.
PUBLIC INTERFACE
- init PARAMHASH
-
Initialize a new MIME::Parser object. This is automatically sent to a new object; the PARAMHASH can contain the following:
- output_dir
-
The value is passed to output_dir().
- output_prefix
-
The value is passed to output_prefix().
- output_to_core
-
The value is passed to output_to_core().
For example:
$p = new MIME::Parser output_dir => "/tmp/mime", output_to_core => 'ALL';
- evil_filename FILENAME
-
Instance method. Is this an evil filename? It is if it contains any "/" characters, or if it is ".", "..", or empty.
Override this method in a subclass if you just want to change which externally-provided filenames are allowed, and which are not. Like this:
package MIME::MyParser;
use MIME::Parser;
@ISA = qw(MIME::Parser);
sub evil_filename {
    my ($self, $name) = @_;
    return ($name !~ /^[a-z\d][a-z\d\._-]*$/i);   # only simple names ok
}
Note: This method used to be a lot stricter, but it unnecessarily inconvenienced users on non-ASCII systems. That has been changed in 4.x.
Thanks to Andrew Pimlott for finding a real dumb bug in the original version. Thanks to Nickolay Saukh for noting that evil is in the eye of the beholder.
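The default test described above comes down to a handful of checks. The following is a minimal stand-alone sketch of that rule, not the module's actual source; the function name looks_evil is hypothetical and exists only for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-alone version of the default test described above.
# The real method is invoked as $parser->evil_filename($name).
sub looks_evil {
    my ($name) = @_;
    return 1 if !defined($name) || $name eq '';   # empty
    return 1 if $name eq '.' || $name eq '..';    # directory references
    return 1 if index($name, '/') >= 0;           # contains a path separator
    return 0;
}

for my $name ('report.txt', '../etc/passwd', '..', '') {
    printf "%-16s %s\n", "'$name'", looks_evil($name) ? 'evil' : 'ok';
}
```

A subclass that wants a stricter or looser policy would put its own version of this logic in evil_filename, as in the example above.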
- new_body_for HEAD
-
Instance method. Based on the HEAD of a part we are parsing, return a new body object (any desirable subclass of MIME::Body) for receiving that part's data.
The default behavior is to examine the HEAD for a recommended filename (generating a random one if none is available), and create a new MIME::Body::File on that filename in the parser's current output_dir().
If you use the output_to_core method (q.v.) before parsing, you can force this method to output some or all of a message's parts to in-core data structures, based on their size.
If you want the parser to do something else entirely, you should override this method in a subclass.
- output_dir [DIRECTORY]
-
Instance method. Get/set the output directory for the parsing operation. This is the directory where the extracted and decoded body parts will go. The default is ".".
If DIRECTORY is not given, the current output directory is returned. If DIRECTORY is given, the output directory is set to the new value, and the previous value is returned.
Note: this is used by the output_path() method in this class. It should also be used by subclasses, but if a subclass decides to output parts in some completely different manner, this method may of course be completely ignored.
- output_path HEAD
-
Instance method. Given a MIME head for a file to be extracted, come up with a good output pathname for the extracted file.
The "directory" portion of the returned path will be the output_dir(), and the "filename" portion will be determined as follows:
If the MIME header contains a recommended filename, and it is not judged to be "evil" (evil filenames are the ones rejected by evil_filename(), such as names containing "/" or the name ".."), then that filename will be used.
If the MIME header contains a recommended filename, but it is judged to be "evil", then a warning is issued and we pretend that there was no recommended filename. In which case...
If the MIME header does not specify a recommended filename, then a simple temporary file name, starting with the output_prefix(), will be used.
Note: If you don't like the behavior of this function, you can define your own subclass of MIME::Parser and override it there:
package MIME::MyParser;
require 5.002;                # for SUPER
use MIME::Parser;
@MIME::MyParser::ISA = ('MIME::Parser');
sub output_path {
    my ($self, $head) = @_;
    # Your code here; FOR EXAMPLE...
    if (i_have_a_preference) {    # placeholder for your own test
        return my_custom_path;    # placeholder for your own path
    }
    else {
        # return the default path:
        return $self->SUPER::output_path($head);
    }
}
1;
Note: Nickolay Saukh pointed out that, given the subjective nature of what is "evil", this function really shouldn't warn about an evil filename, but maybe just issue a debug message. I considered that, but then I thought: if debugging were off, people wouldn't know why (or even if) a given filename had been ignored. In mail robots that depend on externally-provided filenames, this could cause hard-to-diagnose problems. So, the message is still a warning, but now it's only output if $^W is true.
Thanks to Laurent Amon for pointing out problems with the original implementation, and for making some good suggestions. Thanks also to Achim Bohnet for pointing out that there should be a hookless, OO way of overriding the output_path.
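The three naming rules for output_path() can be sketched as a stand-alone function. This is an illustration under stated assumptions, not the module's implementation: the helper name choose_output_path, the counter, and the format of the generated name are all hypothetical, and the "evil" test is passed in as a code reference.

```perl
use strict;
use warnings;

my $counter = 0;   # hypothetical counter used for generated names

# Hypothetical helper mirroring the three rules described for output_path().
# $recommended is the filename suggested by the MIME header (may be undef).
sub choose_output_path {
    my ($dir, $prefix, $recommended, $is_evil) = @_;

    if (defined $recommended && $is_evil->($recommended)) {
        # Evil name: warn only when $^W is true, then ignore the suggestion.
        warn "ignoring evil filename '$recommended'\n" if $^W;
        undef $recommended;
    }

    # A recommended, non-evil filename is used as-is.
    return "$dir/$recommended" if defined $recommended;

    # Otherwise fall back to a generated name starting with the prefix.
    # The exact format of the generated name is an assumption.
    $counter++;
    return sprintf("%s/%s-%d-%d.doc", $dir, $prefix, $$, $counter);
}

# Same default test as described for evil_filename().
my $is_evil = sub {
    my $n = shift;
    return $n eq '' || $n eq '.' || $n eq '..' || $n =~ m{/};
};

print choose_output_path('/tmp/mime', 'part', 'notes.txt',        $is_evil), "\n";
print choose_output_path('/tmp/mime', 'part', '../../etc/passwd', $is_evil), "\n";
print choose_output_path('/tmp/mime', 'part', undef,              $is_evil), "\n";
```

Only the first call keeps the suggested name; the other two fall through to a generated name under the output directory.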
- output_prefix [PREFIX]
-
Instance method. Get/set the output prefix for the parsing operation. This is a short string that all filenames for extracted and decoded body parts will begin with. The default is "msg".
If PREFIX is not given, the current output prefix is returned. If PREFIX is given, the output prefix is set to the new value, and the previous value is returned.
- output_to_core [CUTOFF]
-
Instance method. Normally, instances of this class output all their decoded body data to disk files (via MIME::Body::File). However, you can change this behaviour by invoking this method before parsing:
If CUTOFF is an integer, then we examine the Content-length of each entity being parsed. If the content-length is known to be CUTOFF or below, the body data will go to an in-core data structure; if the content-length is unknown or if it exceeds CUTOFF, then the body data will go to a disk file.
If the CUTOFF is the string "NONE", then all body data goes to disk files regardless of the content-length. This is the default behaviour.
If the CUTOFF is the string "ALL", then all body data goes to in-core data structures regardless of the content-length. This is very risky (what if someone emails you an MPEG or a tar file, hmmm?) but people seem to want this bit of noose-shaped rope, so I'm providing it.
Without argument, returns the current cutoff: "ALL", "NONE" (the default), or a number.
See the new_body_for() method for more details.
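The CUTOFF rules can be summarized as a small decision function. This is only a sketch of the documented behaviour, not the parser's code; the name body_destination and its return values 'core' and 'disk' are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 'core' or 'disk' for one body part.
# $cutoff is "ALL", "NONE", or an integer; $length is the part's
# Content-length, or undef when the header does not supply one.
sub body_destination {
    my ($cutoff, $length) = @_;
    return 'core' if $cutoff eq 'ALL';
    return 'disk' if $cutoff eq 'NONE';
    return 'disk' unless defined $length;        # unknown length goes to disk
    return $length <= $cutoff ? 'core' : 'disk';
}

print body_destination(20000,  1500),   "\n";   # small part
print body_destination(20000,  500000), "\n";   # large part
print body_destination(20000,  undef),  "\n";   # no Content-length
print body_destination('ALL',  undef),  "\n";
print body_destination('NONE', 10),     "\n";
```

Note that with an integer cutoff, a part whose length is unknown always goes to disk, which is the cautious choice.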
WRITING SUBCLASSES
Authors of subclasses can consider overriding the following methods. They are listed in approximate order of most-to-least impact.
- new_body_for
-
Override this if you want to change the entire mechanism for choosing the output destination. You may want to use information in the MIME header to determine how files are named, and whether or not their data goes to a disk file or to an in-core scalar. (You have the MIME header object at your disposal.)
- output_path
-
Override this if you want to completely change how the output path (containing both the directory and filename) is determined for those parts being output to disk files. (You have the MIME header object at your disposal.)
- evil_filename
-
Override this if you want to change the test that determines whether or not a filename obtained from the header is permissible.
- output_prefix
-
Override this if you want to change the mechanism for getting/setting the desired output prefix (used in naming files when no other names are suggested).
- output_dir
-
Override this if you want to change the mechanism for getting/setting the desired output directory (where extracted and decoded files are placed).
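An override of output_prefix or output_dir should probably keep the get/set contract documented for those methods: with no argument, return the current value; with an argument, store it and return the previous value. A minimal sketch of that contract, using a hypothetical stand-alone class My::Settings rather than a real MIME::Parser subclass:

```perl
use strict;
use warnings;

package My::Settings;   # hypothetical class, used only to show the contract

sub new { return bless { output_dir => '.' }, shift }

# With no argument: return the current value.
# With an argument: store it and return the previous value.
sub output_dir {
    my ($self, $dir) = @_;
    my $previous = $self->{output_dir};
    $self->{output_dir} = $dir if @_ > 1;
    return $previous;
}

package main;

my $s   = My::Settings->new;
my $old = $s->output_dir('/tmp/mime');   # sets the new value
print "previous: $old, current: ", $s->output_dir, "\n";
# prints "previous: ., current: /tmp/mime"
```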
AUTHOR
Copyright (c) 1997 by Eryq / eryq@zeegee.com
All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
VERSION
$Revision: 4.102 $ $Date: 1997/12/14 03:04:10 $