NAME

MIME::Parser - split MIME mail into decoded components

SYNOPSIS

# Create a new parser object:
my $parser = new MIME::Parser;
    
# Set up output directory for files:
$parser->output_dir("$ENV{HOME}/mimemail");

# Set up the prefix for files with auto-generated names:
$parser->output_prefix("part");

# If content length is <= 20000 bytes, store each msg as in-core scalar;
# Else, write to a disk file (the default action):
$parser->output_to_core(20000);
     
# Parse an input stream:
$entity = $parser->read(\*STDIN) or die "couldn't parse MIME stream";

# Congratulations: you now have a (possibly multipart) MIME entity!
$entity->dump_skeleton;          # for debugging 

Shortcuts:

# Create a new parser object, and set some properties:
my $parser = new MIME::Parser output_dir     => "$ENV{HOME}/mimemail",
                              output_prefix  => "part",
                              output_to_core => 20000;

DESCRIPTION

A subclass of MIME::ParserBase, providing one useful way to parse MIME streams and obtain MIME::Entity objects. This particular parser class outputs the different parts as files on disk, in the directory of your choice.

If you don't like the way files are named... it's object-oriented and subclassable. If you want to do something really different, perhaps you want to subclass MIME::ParserBase instead.

PUBLIC INTERFACE

init PARAMHASH

Initialize a new MIME::Parser object. This is invoked automatically on a new object; the PARAMHASH can contain the following:

output_dir

The value is passed to output_dir().

output_prefix

The value is passed to output_prefix().

output_to_core

The value is passed to output_to_core().

For example:

$p = new MIME::Parser output_dir => "/tmp/mime",
                      output_to_core => 'ALL';

evil_filename FILENAME

Instance method. Is this an evil filename? It is if it contains any "/" characters, or if it's ".", "..", or empty.
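The default test described above can be sketched as a stand-alone function. This is only an illustration of the stated rule, not the module's actual code; the name looks_evil() is hypothetical and does not exist in MIME::Parser.

```perl
use strict;
use warnings;

# Hypothetical stand-alone version of the documented rule:
# a name is "evil" if it is empty, is "." or "..", or contains a "/".
sub looks_evil {
    my ($name) = @_;
    return 1 if !defined($name) || $name eq '';
    return 1 if $name eq '.' || $name eq '..';
    return 1 if index($name, '/') >= 0;
    return 0;
}

# Try the rule on a few sample names:
for my $name ('report.txt', '../etc/passwd', '..', '') {
    printf "%-16s %s\n", "'$name'", looks_evil($name) ? 'evil' : 'ok';
}
```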

Override this method in a subclass if you just want to change which externally-provided filenames are allowed, and which are not. Like this:

package MIME::MyParser;

use MIME::Parser;
@ISA = qw(MIME::Parser);

sub evil_filename {
    my ($self, $name) = @_;
    return ($name !~ /^[a-z\d][a-z\d\._-]*$/i);   # only simple names ok
}

Note: This method used to be a lot stricter, but it unnecessarily inconvenienced users on non-ASCII systems. That has been changed in 4.x.

Thanks to Andrew Pimlott for finding a real dumb bug in the original version. Thanks to Nickolay Saukh for noting that evil is in the eye of the beholder.

new_body_for HEAD

Instance method. Based on the HEAD of a part we are parsing, return a new body object (any desirable subclass of MIME::Body) for receiving that part's data.

The default behavior is to examine the HEAD for a recommended filename (generating a random one if none is available), and create a new MIME::Body::File on that filename in the parser's current output_dir().

If you use the output_to_core method (q.v.) before parsing, you can force this method to output some or all of a message's parts to in-core data structures, based on their size.

If you want the parser to do something else entirely, you should override this method in a subclass.

output_dir [DIRECTORY]

Instance method. Get/set the output directory for the parsing operation. This is the directory where the extracted and decoded body parts will go. The default is ".".

If DIRECTORY is not given, the current output directory is returned. If DIRECTORY is given, the output directory is set to the new value, and the previous value is returned.
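Because setting returns the previous value, a caller can change the directory temporarily and restore it afterwards. The following is a minimal sketch of that get/set convention using a hypothetical stand-in class (My::Settings is not part of MIME-tools); it only mirrors the behavior described above.

```perl
use strict;
use warnings;

{
    package My::Settings;    # hypothetical stand-in, not MIME::Parser itself

    sub new {
        my ($class) = @_;
        return bless { dir => '.' }, $class;    # "." is the documented default
    }

    # Get/set accessor: always returns the value in effect before the call.
    sub output_dir {
        my ($self, $dir) = @_;
        my $previous = $self->{dir};
        $self->{dir} = $dir if @_ > 1;    # only set when an argument was given
        return $previous;
    }
}

my $s   = My::Settings->new;
my $old = $s->output_dir('/tmp/mime');    # set, remembering the old value
print "was: $old, now: ", $s->output_dir, "\n";
$s->output_dir($old);                     # restore the old value
```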

Note: this is used by the output_path() method in this class. It should also be used by subclasses, but if a subclass decides to output parts in some completely different manner, this method may of course be completely ignored.

output_path HEAD

Instance method. Given a MIME head for a file to be extracted, come up with a good output pathname for the extracted file.

The "directory" portion of the returned path will be the output_dir(), and the "filename" portion will be determined as follows:

  • If the MIME header contains a recommended filename, and it is not judged to be "evil" (by default, evil filenames are ones which contain "/", or which are ".", "..", or empty; see evil_filename()), then that filename will be used.

  • If the MIME header contains a recommended filename, but it is judged to be "evil", then a warning is issued and we pretend that there was no recommended filename. In which case...

  • If the MIME header does not specify a recommended filename, then a simple temporary file name, starting with the output_prefix(), will be used.
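The three cases above can be sketched as a stand-alone function. This is an illustration under assumptions, not the module's implementation: pick_output_path() is a hypothetical name, the evil test is inlined from the evil_filename() description, and the "prefix-PID-counter.dat" pattern for generated names is invented for the example.

```perl
use strict;
use warnings;
use File::Spec;

my $counter = 0;    # hypothetical counter used for generated names

# Arguments: output directory, output prefix, recommended filename (or undef).
sub pick_output_path {
    my ($dir, $prefix, $recommended) = @_;

    if (defined $recommended && $recommended ne '') {
        my $evil = $recommended eq '.'
                || $recommended eq '..'
                || index($recommended, '/') >= 0;

        # Case 1: a recommended, non-evil name is used as-is.
        return File::Spec->catfile($dir, $recommended) unless $evil;

        # Case 2: an evil name is ignored, with a warning only if $^W is true.
        warn "ignoring evil filename '$recommended'\n" if $^W;
    }

    # Case 3: no usable name, so generate one starting with the prefix.
    $counter++;
    return File::Spec->catfile($dir, "$prefix-$$-$counter.dat");
}

print pick_output_path('/tmp/mime', 'part', 'report.txt'), "\n";
print pick_output_path('/tmp/mime', 'part', '../../etc/passwd'), "\n";
print pick_output_path('/tmp/mime', 'part', undef), "\n";
```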

Note: If you don't like the behavior of this function, you can define your own subclass of MIME::Parser and override it there:

     package MIME::MyParser;

     require 5.002;                # for SUPER
     use MIME::Parser;

     @MIME::MyParser::ISA = ('MIME::Parser');

     sub output_path {
         my ($self, $head) = @_;

         # Your code here; FOR EXAMPLE...
         # (i_have_a_preference() and my_custom_path() are placeholders
         # for routines you would write yourself.)
         if (i_have_a_preference()) {
             return my_custom_path();
         }
         else {                      # return the default path:
             return $self->SUPER::output_path($head);
         }
     }
     1;

Note: Nickolay Saukh pointed out that, given the subjective nature of what is "evil", this function really shouldn't warn about an evil filename, but maybe just issue a debug message. I considered that, but then I thought: if debugging were off, people wouldn't know why (or even if) a given filename had been ignored. In mail robots that depend on externally-provided filenames, this could cause hard-to-diagnose problems. So, the message is still a warning, but now it's only output if $^W is true.

Thanks to Laurent Amon for pointing out problems with the original implementation, and for making some good suggestions. Thanks also to Achim Bohnet for pointing out that there should be a hookless, OO way of overriding the output_path.

output_prefix [PREFIX]

Instance method. Get/set the output prefix for the parsing operation. This is a short string that all filenames for extracted and decoded body parts will begin with. The default is "msg".

If PREFIX is not given, the current output prefix is returned. If PREFIX is given, the output prefix is set to the new value, and the previous value is returned.

output_to_core [CUTOFF]

Instance method. Normally, instances of this class output all their decoded body data to disk files (via MIME::Body::File). However, you can change this behaviour by invoking this method before parsing:

If CUTOFF is an integer, then we examine the Content-length of each entity being parsed. If the content-length is known to be CUTOFF or below, the body data will go to an in-core data structure; if the content-length is unknown or if it exceeds CUTOFF, then the body data will go to a disk file.

If the CUTOFF is the string "NONE", then all body data goes to disk files regardless of the content-length. This is the default behaviour.

If the CUTOFF is the string "ALL", then all body data goes to in-core data structures regardless of the content-length. This is very risky (what if someone emails you an MPEG or a tar file, hmmm?) but people seem to want this bit of noose-shaped rope, so I'm providing it.

Without argument, returns the current cutoff: "ALL", "NONE" (the default), or a number.

See the new_body_for() method for more details.
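The cutoff rules for output_to_core() can be summarized in a few lines of code. This is a hedged sketch of the decision as documented here, not the parser's own code; goes_to_core() is a hypothetical helper, and an undefined length stands for an unknown content-length.

```perl
use strict;
use warnings;

# Returns 1 if body data should be kept in core, 0 if it should go to disk.
# $cutoff is "ALL", "NONE", or an integer; $length may be undef (unknown).
sub goes_to_core {
    my ($cutoff, $length) = @_;
    return 1 if $cutoff eq 'ALL';         # everything in core
    return 0 if $cutoff eq 'NONE';        # everything on disk (the default)
    return 0 unless defined $length;      # unknown length: play safe, use disk
    return $length <= $cutoff ? 1 : 0;    # at or below the cutoff: in core
}

for my $case ([20000, 1500], [20000, 50000], [20000, undef], ['ALL', undef]) {
    my ($cutoff, $length) = @$case;
    printf "cutoff=%-6s length=%-8s -> %s\n",
        $cutoff,
        defined $length ? $length : 'unknown',
        goes_to_core($cutoff, $length) ? 'core' : 'disk';
}
```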

WRITING SUBCLASSES

Authors of subclasses can consider overriding the following methods. They are listed in approximate order of most-to-least impact.

new_body_for

Override this if you want to change the entire mechanism for choosing the output destination. You may want to use information in the MIME header to determine how files are named, and whether or not their data goes to a disk file or to an in-core scalar. (You have the MIME header object at your disposal.)

output_path

Override this if you want to completely change how the output path (containing both the directory and filename) is determined for those parts being output to disk files. (You have the MIME header object at your disposal.)

evil_filename

Override this if you want to change the test that determines whether or not a filename obtained from the header is permissible.

output_prefix

Override this if you want to change the mechanism for getting/setting the desired output prefix (used in naming files when no other names are suggested).

output_dir

Override this if you want to change the mechanism for getting/setting the desired output directory (where extracted and decoded files are placed).

AUTHOR

Copyright (c) 1997 by Eryq / eryq@zeegee.com

All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

VERSION

$Revision: 4.102 $ $Date: 1997/12/14 03:04:10 $