NAME
Mail::SpamAssassin::Message - decode, render, and hold an RFC-2822 message
DESCRIPTION
This module encapsulates an email message and allows access to the various MIME message parts and message metadata.
The message structure, after initiating a parse() cycle, looks like this:
  Message object, also top-level node in Message::Node tree
     |
     +---> Message::Node for other parts in MIME structure
     |       |---> [ more Message::Node parts ... ]
     |       [ others ... ]
     |
     +---> Message::Metadata object to hold metadata
PUBLIC METHODS
- new()
-
Creates a Mail::SpamAssassin::Message object. Takes a hash reference as a parameter. The recognized hash key/value pairs are as follows:
message
is either undef (which will use STDIN), a scalar containing the entire message, an array reference of the message with one line per array element, or a file glob which holds the entire contents of the message.
Note: The message is expected to generally be in RFC 2822 format, optionally including an mbox message separator line (the "From " line) as the first line.
parse_now
specifies whether to create the MIME tree at object-creation time or later as necessary.
The parse_now option is set to false (0) by default. This lets SpamAssassin avoid generating the tree of Mail::SpamAssassin::Message::Node objects and their related data if the tree is not going to be used. This is handy, for instance, when running spamassassin -d, which only needs the pristine header and body, both of which are always handled when the object is created.
subparse
specifies how many MIME recursion levels should be parsed. Defaults to 20.
- _do_parse()
-
Non-public function that initiates a MIME part parse (generating a tree) of the current message. Typically called by find_parts() as necessary.
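The deferred parse described above (parse_now left false, with _do_parse() triggered on first use) can be pictured with a small stand-alone sketch. The LazyMessage package, its fields, and its line-based "tree" are hypothetical and only illustrate the pattern; they are not the module's real internals.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; not Mail::SpamAssassin::Message itself.
package LazyMessage;

sub new {
    my ($class, $opts) = @_;
    my $self = bless {
        raw         => $opts->{message},
        parse_count => 0,       # how many times the "tree" was built
        tree        => undef,   # built on demand
    }, $class;
    $self->_do_parse() if $opts->{parse_now};
    return $self;
}

# Builds the stand-in "tree" once; later calls return immediately.
sub _do_parse {
    my ($self) = @_;
    return if defined $self->{tree};
    $self->{parse_count}++;
    $self->{tree} = [ split /\n/, $self->{raw} ];
    return;
}

# Anything that needs the tree triggers the parse first.
sub find_parts {
    my ($self, $re) = @_;
    $self->_do_parse();
    return grep { $_ =~ $re } @{ $self->{tree} };
}

package main;

my $msg  = LazyMessage->new({ message => "Subject: hi\n\nhello\nworld\n" });
my @hits = $msg->find_parts(qr/o/);   # first use builds the tree
$msg->find_parts(qr/x/);              # tree is reused, not rebuilt
print "parsed $msg->{parse_count} time(s), ", scalar(@hits), " match(es)\n";
```

Callers that never ask for parts never pay for the parse, which is the benefit the parse_now default is meant to provide.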
- find_parts()
-
Used to search the tree for specific MIME parts. See Mail::SpamAssassin::Message::Node for more details.
- get_pristine_header()
-
Returns pristine headers of the message. If no specific header name is given as a parameter, all headers are returned as a scalar, including the blank line at the end of the headers. Header names are matched case-insensitively.
If called in an array context, an array is returned with each matching header in a separate element. In a scalar context, the last matching header is returned.
For example, if 'Subject' is specified as the header and the message contains two Subject headers, the last (bottom) one is returned in scalar context, and both are returned in array context.
Note: the returned header will include the ending newline and any embedded whitespace folding.
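The context-dependent behaviour described above can be sketched with Perl's wantarray. The pristine_header() helper below is hypothetical and works on a raw header string rather than a message object. That it returns only the text after the colon is an assumption of this sketch, not something the text above states.

```perl
use strict;
use warnings;

# Hypothetical helper that mimics the documented behaviour on a raw header
# block; it is not the module's own code.
sub pristine_header {
    my ($headers, $name) = @_;
    return $headers if !defined $name || $name eq '';

    # Capture each matching header up to the first newline that is not
    # followed by folding whitespace, so folded lines stay attached.
    my @found = $headers =~ /^\Q$name\E[ \t]*:[ \t]*(.*?\n(?![ \t]))/smgi;

    return wantarray ? @found : $found[-1];
}

my $headers = "Subject: first\nTo: a\@example.com\nsubject: second\n\tfolded\n\n";

my @all  = pristine_header($headers, 'Subject');   # list context: both
my $last = pristine_header($headers, 'Subject');   # scalar context: last one
print scalar(@all), " Subject headers; last is: $last";
```

The second header keeps its trailing newline and its folded continuation line, in line with the note above.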
- get_mbox_separator()
-
Returns the mbox separator found in the message, or undef if there wasn't one.
- get_body()
-
Returns an array of the pristine message body, one line per array element.
- get_pristine()
-
Returns a scalar of the entire pristine message.
- get_pristine_body()
-
Returns a scalar of the pristine message body.
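As a rough illustration of how the separator, header, and body accessors relate to one raw message, the sketch below partitions a string into those pieces. The split_pristine() function and its hash keys are hypothetical. Treating any leading line that starts with "From " as the mbox separator is a simplifying assumption.

```perl
use strict;
use warnings;

# Hypothetical splitter, written only to show how the pieces fit together.
sub split_pristine {
    my ($raw) = @_;
    my %msg = (mbox_separator => undef);

    # An optional mbox "From " line may precede the headers.
    if ($raw =~ s/\A(From [^\n]*\n)//) {
        $msg{mbox_separator} = $1;
    }

    # Headers run up to and including the first blank line.
    my ($header, $body) = ($raw, '');
    if ($raw =~ /\A(.*?\n\n)(.*)\z/s) {
        ($header, $body) = ($1, $2);
    }

    $msg{pristine_header} = $header;                  # scalar, with blank line
    $msg{pristine_body}   = $body;                    # scalar
    $msg{body_lines}      = [ split /^/m, $body ];    # one line per element
    return \%msg;
}

my $parts = split_pristine(
    "From alice\@example.com Mon Jan  1 00:00:00 2024\n"
  . "Subject: test\n"
  . "\n"
  . "line one\nline two\n"
);

printf "%d header bytes, %d body lines\n",
    length($parts->{pristine_header}), scalar @{ $parts->{body_lines} };
```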
- extract_message_metadata($main)
- $str = get_metadata($hdr)
- put_metadata($hdr, $text)
- delete_metadata($hdr)
- $str = get_all_metadata()
- finish_metadata()
-
Destroys the metadata for this message. Once a message has been scanned fully, the metadata is no longer required. Destroying this will free up some memory.
- finish()
-
Clean up an object so that it can be destroyed.
- receive_date()
-
Returns a time_t value with the received date of the current message, or the current time if the received time could not be determined.
PARSING METHODS, NON-PUBLIC
These methods take an RFC 2822-esque formatted message and create a tree with all of the MIME body parts included. Those parts will be decoded as necessary, and text/html parts will be rendered into a standard text format, suitable for use in SpamAssassin.
- parse_body()
-
parse_body() hands the body part it receives to the appropriate part parser: _parse_multipart() for multipart/* parts, or _parse_normal() for everything else. Multipart sections become the root of sub-trees, while everything else becomes a leaf in the tree.
For multipart messages, the first call to parse_body() does not create a new sub-tree; it uses the parent node to contain the children. All other calls to parse_body() create a new sub-tree root, and the children exist underneath that root. (This is just so the tree doesn't have a root node which points at the actual root node.)
- _parse_multipart()
-
Generate a root node, and for each child part call parse_body() to generate the tree.
- _parse_normal()
-
Generate a leaf node and add it to the parent.
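The dispatch and tree-building rules above can be sketched as a small recursive routine. Everything here is hypothetical and simplified: nodes are plain hashes, the input is a pre-split description of parts rather than raw MIME text, and the $is_top flag stands in for "the first call to parse_body()".

```perl
use strict;
use warnings;

# Hypothetical, simplified node: { type => ..., children => [...] }.
sub new_node { my ($type) = @_; return { type => $type, children => [] }; }

# Dispatch on content type, as described for parse_body().
sub parse_body {
    my ($parent, $part, $is_top) = @_;
    if ($part->{type} =~ m{^multipart/}i) {
        parse_multipart($parent, $part, $is_top);
    } else {
        parse_normal($parent, $part);
    }
    return;
}

# The top-level multipart reuses the parent; nested ones get a new root.
sub parse_multipart {
    my ($parent, $part, $is_top) = @_;
    my $root = $parent;
    if ($is_top) {
        $parent->{type} = $part->{type};
    } else {
        $root = new_node($part->{type});
        push @{ $parent->{children} }, $root;
    }
    parse_body($root, $_, 0) for @{ $part->{parts} };
    return;
}

# Everything else becomes a leaf under the parent.
sub parse_normal {
    my ($parent, $part) = @_;
    push @{ $parent->{children} }, new_node($part->{type});
    return;
}

# Render the tree as indented text, one node per line.
sub dump_tree {
    my ($node, $depth) = @_;
    my $out = ('  ' x $depth) . $node->{type} . "\n";
    $out .= dump_tree($_, $depth + 1) for @{ $node->{children} };
    return $out;
}

my $message = new_node('message');
parse_body($message, {
    type  => 'multipart/mixed',
    parts => [
        { type => 'text/plain' },
        { type  => 'multipart/alternative',
          parts => [ { type => 'text/plain' }, { type => 'text/html' } ] },
    ],
}, 1);

print dump_tree($message, 0);
```

In this sketch the outer multipart/mixed is absorbed into the message node itself, while the nested multipart/alternative becomes a separate sub-tree root with its two leaves beneath it.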