NAME

DateTime::Format::Builder - create DateTime parser objects.

SYNOPSIS

use DateTime::Format::Builder;

my $parser = DateTime::Format::Builder->parser(
    params => [ qw( year month day hour minute second ) ],
    regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
);

my $dt = $parser->parse_datetime( "19790716153300" );

DESCRIPTION

DateTime::Format::Builder creates DateTime parser objects. Many string formats of dates and times are simple and just require a basic regular expression to extract the relevant information.

Such formats don't warrant a full-blown module of their own, which is why this module was written. It allows you to create parser objects and classes with a minimum of fuss.

FORMATTING vs PARSING

The name of this module is DateTime::Format::Builder. This is, perhaps, somewhat misleading. Note that the word Format is being used as a noun, not a verb: the module builds parsers for date formats and does not itself format dates.

CONSTRUCTORS

new

Creates a new DateTime::Format::Builder object. If called as an object method, then it clones the object.

No arguments.

parser

If called as a class method, it creates a new DateTime::Format::Builder object with a specified parser. Parameters are as for create_parser.

If called as an object method, it creates a new parser for that object. (Essentially a shortcut for create_parser and set_parser.)

# Class
my $new_parser = DateTime::Format::Builder->parser( ... );

# Object
$new_parser->parser( ... );

As a side note, when called as an object method (e.g. $new_parser->parser(...)), the object itself is returned (e.g. $new_parser).

clone

For those who prefer to explicitly clone via a method called clone(). If called as a class method it will die.

my $clone = $original->clone();

create_class

create_class is different from the other constructors. It creates a full class for the parser, not just an instance of DateTime::Format::Builder.

It takes two optional parameters and one required one.

OPTIONAL PARAMETERS

  • class is the name of the class to create. If not specified then it is inferred to be the current package. Generally best left unspecified.

  • version is the version of the class. Generally best left unspecified unless class is also specified (that is, you're not just preparing the current context). Why? Because CPAN won't pick up a version unless the module declares it with a $VERSION in the form CPAN expects, and neither will ExtUtils::MakeMaker.

REQUIRED PARAMETER

parsers is the important parameter. It takes a hashref as an argument. This hashref maps method names to parser specifications, given either as an arrayref of specifications or as a single specification.

For example (since the code is often clearer than my writing):

package DateTime::Format::Brief;
use DateTime::Format::Builder;
DateTime::Format::Builder->create_class(
    parsers => {
        parse_datetime => [
        {
            regex => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
            params => [qw( year month day hour minute second )],
        },
        {
            regex => qr/^(\d{4})(\d\d)(\d\d)$/,
            params => [qw( year month day )],
        },
        ],
    }
);

If you have only one specification, you can supply it directly, without the enclosing arrayref:

parse_datetime => {
    regex => qr/^(\d{4})(\d\d)(\d\d)$/,
    params => [qw( year month day )],
},

CLASS METHODS

These methods work either on our objects or as class methods.

create_parser

Creates a function to parse datetime strings and return DateTime objects.

# Parse a 15 character ICal string
my $parser_fn = DateTime::Format::Builder->create_parser({
    regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)T(\d\d)(\d\d)(\d\d)$/,
    params => [qw( year month day hour minute second )],
    extra  => {},
});

# Parse an 8 character ICal string
my $short_ical_parser = DateTime::Format::Builder->create_parser(
    {
        params => [ qw( year month day ) ],
        regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)$/,
    }
);

I call the arguments seen above 'specifications', or specs. A spec is passed by reference, as a hashref, and I call this a specref. Pardon the introduction of terminology, but it does make things simpler later on.

I specify the layout of a spec below.

create_parser (and, because of this, most of the other routines) can create a few different sorts of parser. For each type I'll have a bit in parens that indicates the call style.

SPECIFICATIONS

A specification is typically a hashref (except for simple, single, parser creations where they can be just a hash).

For example, here we have two specifications:

my $inefficient_ical_parser = DateTime::Format::Builder->create_parser(
    {
        regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)T(\d\d)(\d\d)(\d\d)$/,
        params => [qw( year month day hour minute second )]
    },
    {
        params => [ qw( year month day ) ],
        regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)$/,
    },
);

Right. And for further fun and games, any of these specrefs can also be a coderef. The routine will be given the $self object (or it may just be a class string) and a date string as input, and is expected to return undef on failure, or a DateTime object on success.

  • regex will be applied to the input of the created function. This argument is required.

  • params is an arrayref that maps the results of regex to parameters of DateTime->new(). The first element is $1, the second $2, etc. This argument is required.

  • extra is a hashref that lists what any extra arguments should be set to. You can use it to specify parameters to DateTime->new(), such as time_zone.

  • on_fail is a reference to a subroutine (anonymous or otherwise) that will be called in the event of a parse failing. It will be passed a hash looking like:

    • input, being the input on which the parser failed

    • label, being the label of the parser, if there is one

  • on_match is just like on_fail, only it's called in the event of success.

  • label provides a name for the parser and is passed to on_fail and on_match. If you specified a set of parsers with some form of X => Y hash style, then by default, the label is the X. That will be overridden if you use this label tag.

  • preprocess is another callback. Its arguments are a hash consisting of the keys input (the datetime string given to the parser) and parsed (a hashref that is initially empty [unless your group of parser specifications had a preprocessor that put something in it]).

    You may put what you like in the hashref, and it will be kept.

    This callback is called after length determination.

  • postprocess is yet another callback. Its arguments are the same as for preprocess, except the parsed hashref has been filled out with how the parse went. If parsing failed, it is not called. It is free to modify the hashref. Any changes will be reflected back. If the callback returns false, then the parse is regarded as a failure. Note: ensure you return some true value if you don't want things to fail mysteriously.
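
To make the regex, params, extra, label, on_fail and on_match keys above a little more concrete, here is a toy emulation in plain Perl of what a single specification is described as doing. It is only a sketch, not the module's implementation: apply_spec is a hypothetical helper, the sample inputs are made up, and the rule that extra wins over a captured value on a key clash is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: NOT part of DateTime::Format::Builder.
# Returns a hashref of would-be DateTime->new() arguments, or undef.
sub apply_spec {
    my ( $spec, $input ) = @_;

    # on_fail and on_match receive a hash with input and (if set) label.
    my %callback_args = ( input => $input );
    $callback_args{label} = $spec->{label} if defined $spec->{label};

    # regex is applied to the input; in list context the match
    # returns the captures ($1, $2, ...) or an empty list on failure.
    my @captures = $input =~ $spec->{regex};
    unless (@captures) {
        $spec->{on_fail}->(%callback_args) if $spec->{on_fail};
        return;
    }

    # params names the captures in order: first element is $1, etc.
    my %args;
    @args{ @{ $spec->{params} } } = @captures;

    # extra supplies fixed arguments such as time_zone.
    # Assumption: extra overrides a captured value with the same key.
    %args = ( %args, %{ $spec->{extra} || {} } );

    $spec->{on_match}->(%callback_args) if $spec->{on_match};
    return \%args;
}

my @log;
my $spec = {
    regex    => qr/^(\d{4})(\d\d)(\d\d)$/,
    params   => [qw( year month day )],
    extra    => { time_zone => 'UTC' },
    label    => 'short_date',
    on_fail  => sub { my %a = @_; push @log, "fail $a{label}: $a{input}" },
    on_match => sub { my %a = @_; push @log, "match $a{label}: $a{input}" },
};

# year => '2003', month => '07', day => '16', time_zone => 'UTC'
my $ok = apply_spec( $spec, '20030716' );

# undef, and on_fail has been called
my $bad = apply_spec( $spec, 'not a date' );
```

The real module goes on to hand the collected arguments to DateTime->new(); the sketch stops at the hashref so that it runs with nothing but core Perl.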

If you have a series of specifications and want a common preprocessor, it can be specified like this:

my $brief_parser = DateTime::Format::Builder->create_parser(
    [
        preprocess => sub { ... },    # your preprocessing code goes here
    ],
    {
        regex => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
        params => [qw( year month day hour minute second )],
    },
    {
        regex => qr/^(\d{4})(\d\d)(\d\d)$/,
        params => [qw( year month day )],
    },
);

Note that this works with the arrays of specs in create_class too.

Note also that the arrayref must be the first argument.

The preprocess sub receives a hash with two keys: input, the date string to be parsed, and parsed, a hashref in which to place any pre-calculated values. The return value should be the date string that the parsers will then go on to process.

A sample preprocessor (taken from DateTime::Format::ICal) looks like this:

my $add_tz = sub {
    my %args = @_;
    my ($date, $p) = @args{qw( input parsed )};
    if ( $date =~ s/^TZID=([^:]+):// )
    {
        $p->{time_zone} = $1;
    }
    # Z at end means UTC
    elsif ( $date =~ s/Z$// )
    {
        $p->{time_zone} = 'UTC';
    }
    else
    {
        $p->{time_zone} = 'floating';
    }
    return $date;
};

Any length calculations (for length parsers) are done after this preprocessing.
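
As a rough illustration of that calling convention, the sample preprocessor can be invoked by hand with the input and parsed keys. The block repeats the $add_tz sub so that it runs on its own; the date strings and the %parsed and $rest names are made up for the example.

```perl
use strict;
use warnings;

# Same preprocessor as above, repeated so this block is self-contained.
my $add_tz = sub {
    my %args = @_;
    my ( $date, $p ) = @args{qw( input parsed )};
    if ( $date =~ s/^TZID=([^:]+):// ) {
        $p->{time_zone} = $1;
    }
    elsif ( $date =~ s/Z$// ) {    # Z at end means UTC
        $p->{time_zone} = 'UTC';
    }
    else {
        $p->{time_zone} = 'floating';
    }
    return $date;
};

# The caller owns the parsed hashref; the sub fills it in and
# returns the date string with the time zone marker stripped.
my %parsed;
my $rest = $add_tz->(
    input  => 'TZID=Europe/London:20030716T153300',
    parsed => \%parsed,
);
# $rest is '20030716T153300', $parsed{time_zone} is 'Europe/London'

my %parsed_utc;
my $rest_utc = $add_tz->( input => '20030716T153300Z', parsed => \%parsed_utc );
# $rest_utc is '20030716T153300', $parsed_utc{time_zone} is 'UTC'
```

Because the time zone marker is removed before the string is returned, the specifications that follow only need to match the bare date and time.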

OBJECT METHODS

If you actually create a DateTime::Format::Builder object, then you get the following methods on that object.

set_parser / get_parser

Set and get the object's parser function. Fairly straightforward and of minimal use, except for subclasses.

parse_datetime

Given a date and time string in a format the parser understands, return a DateTime object representing that date and time.

# Having created our parser, somehow, we can:

my $dt = $parser->parse_datetime( "1998-04-01 15:16:24" );

If you receive errors about things being undefined, then there was a parse failure.

format_datetime

Ok. We don't actually implement this. It's just here to make sure you know we don't. It's implemented like an abstract method: it will die if invoked.

It will be available at some point.

THANKS

Dave Rolsky (DROLSKY) for kickstarting the DateTime project and some much needed review.

Joshua Hoblitt (JHOBLITT) for the concept, some of the API, and more much needed review.

Kellan Elliott-McCrea (KELLAN) for even more review!

Simon Cozens (SIMON) for saying it was cool.

SUPPORT

Support for this module is provided via the datetime@perl.org email list. See http://lists.perl.org/ for more details.

Alternatively, report bugs through the CPAN RT system, by web or email:

http://perl.dellah.org/rt/dtbuilder
bug-datetime-format-builder@rt.cpan.org

This makes it much easier for me to track things and thus means your problem is less likely to be neglected.

LICENSE AND COPYRIGHT

Copyright © Iain Truskett, 2003. All rights reserved.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The full text of the licenses can be found in the Artistic and COPYING files included with this module.

AUTHOR

Iain Truskett <spoon@cpan.org>

TODO

  • More tests.

  • strptime compatible parsing

  • strftime compatible formatting

SEE ALSO

datetime@perl.org mailing list.

http://datetime.perl.org/

perl, DateTime