NAME
DateTime::Format::Builder - create DateTime parser objects.
SYNOPSIS
use DateTime::Format::Builder;
my $parser = DateTime::Format::Builder->parser(
    params => [ qw( year month day hour minute second ) ],
    regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
);
# The regex expects 14 digits: YYYYMMDDhhmmss
my $dt = $parser->parse_datetime( "19790716153300" );
DESCRIPTION
DateTime::Format::Builder creates DateTime parser objects. Many string formats for dates and times are simple and need only a basic regular expression to extract the relevant information.
As such, they don't require a full-blown module to be implemented. Hence this module: it lets you create parser objects and classes with a minimum of fuss.
FORMATTING vs PARSING
The name of this module is DateTime::Format::Builder. This is, perhaps, somewhat misleading. It should be noted that the word Format is being used as a noun, not a verb.
CONSTRUCTORS
new
Creates a new DateTime::Format::Builder object. If called as an object method, it clones the object.
No arguments.
parser
If called as a class method, it creates a new DateTime::Format::Builder object with a specified parser. Parameters are as for create_parser.
If called as an object method, it creates a new parser for that object. (Essentially a shortcut for create_parser and set_parser.)
    # Class
    my $new_parser = DateTime::Format::Builder->parser( ... );
    # Object
    $new_parser->parser( ... );
As a side note, when called as an object method (e.g. $new_parser->parser(...)), the object itself is returned (e.g. $new_parser).
clone
For those who prefer to explicitly clone via a method called clone(). If called as a class method, it will die.
my $clone = $original->clone();
create_class
create_class is different from the other constructors. It creates a full class for the parser, not just an instance of DateTime::Format::Builder.
It takes two optional parameters and one required one.
OPTIONAL PARAMETERS
class is the name of the class to create. If not specified, it is inferred to be the current package. Generally best left unspecified.
version is the version of the class. Generally best left unspecified unless class is also specified (that is, you're not just preparing the current context). Why? Because CPAN only picks up a module's version when it is declared with a $VERSION in the form CPAN expects; otherwise it won't behave properly. Ditto ExtUtils::MakeMaker.
REQUIRED PARAMETER
parsers is the important parameter. It takes a hashref as an argument. This hashref maps method names to arrayrefs of parser specifications.
For example (since the code is often clearer than my writing):
    package DateTime::Format::Brief;
    use DateTime::Format::Builder;
    DateTime::Format::Builder->create_class(
        parsers => {
            parse_datetime => [
                {
                    regex  => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                    params => [qw( year month day hour minute second )],
                },
                {
                    regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                    params => [qw( year month day )],
                },
            ],
        }
    );
If you have just one specification, you can give it directly, without the enclosing list:
    parse_datetime => {
        regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
        params => [qw( year month day )],
    },
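As a usage sketch, the class built above could be exercised as follows. This assumes the generated parse_datetime method may be called as a class method; that calling style is an assumption of this example, not something stated above. The class definition is repeated so the snippet runs on its own.

```perl
package DateTime::Format::Brief;
use strict;
use warnings;
use DateTime::Format::Builder;

# Same two specifications as in the example above.
DateTime::Format::Builder->create_class(
    parsers => {
        parse_datetime => [
            {
                regex  => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
                params => [qw( year month day hour minute second )],
            },
            {
                regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
                params => [qw( year month day )],
            },
        ],
    },
);

package main;

# Assumption: the generated method accepts a class-method call.
my $long  = DateTime::Format::Brief->parse_datetime('20030716163245');
my $short = DateTime::Format::Brief->parse_datetime('20030716');

print $long->iso8601, "\n";    # date and time from the 14-digit form
print $short->ymd,    "\n";    # date only from the 8-digit form
```

Both inputs go through the same method; which specification applies depends on which regex matches.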
CLASS METHODS
These methods work on either our objects or as class methods.
create_parser
Creates a function to parse datetime strings and return DateTime objects.
    # Parse a 15 character ICal string
    my $parser_fn = DateTime::Format::Builder->create_parser({
        regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)T(\d\d)(\d\d)(\d\d)$/,
        params => [qw( year month day hour minute second )],
        extra  => {},
    });
    # Parse an 8 character ICal string
    my $short_ical_parser = DateTime::Format::Builder->create_parser(
        {
            params => [ qw( year month day ) ],
            regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)$/,
        }
    );
I call the arguments seen above 'specifications', or spec. A reference to such a spec is done in a hashref and I call this a specref. Pardon the introduction of terminology, but it does make things simpler later on.
I specify the layout of a spec below.
create_parser (and most of the other routines because of this) can create a few different sorts of parser. For each type I'll have a bit in parens that indicates the call style.
SPECIFICATIONS
A specification is typically a hashref (except for simple, single parser creations, where it can be just a hash).
For example, here we have two specifications:
    my $inefficient_ical_parser = DateTime::Format::Builder->create_parser(
        {
            regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)T(\d\d)(\d\d)(\d\d)$/,
            params => [qw( year month day hour minute second )]
        },
        {
            params => [ qw( year month day ) ],
            regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)$/,
        },
    );
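As a rough mental model only (this is not the module's implementation, and it ignores details such as length handling), a list of specifications can be thought of as "try each regex in turn and map its captures onto the named params". The helper args_for below is hypothetical and uses core Perl only; it returns the arguments that would be handed to DateTime->new() rather than building a DateTime.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the name/value pairs the first matching
# spec would produce for $date, or an empty list if no spec matches.
sub args_for {
    my ( $date, @specs ) = @_;
    for my $spec (@specs) {
        my @captures = $date =~ $spec->{regex} or next;
        my %args;
        @args{ @{ $spec->{params} } } = @captures;    # $1 => first param, ...
        return ( %args, %{ $spec->{extra} || {} } );  # extra fills in the rest
    }
    return;
}

my @specs = (
    {
        regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)T(\d\d)(\d\d)(\d\d)$/,
        params => [qw( year month day hour minute second )],
    },
    {
        regex  => qr/^(\d\d\d\d)(\d\d)(\d\d)$/,
        params => [qw( year month day )],
        extra  => { time_zone => 'UTC' },
    },
);

my %long  = args_for( '20030716T163245', @specs );
my %short = args_for( '20030716',        @specs );

print join( ', ', map {"$_=$short{$_}"} sort keys %short ), "\n";
# prints: day=16, month=07, time_zone=UTC, year=2003
```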
Right. And for further fun and games, any of these specrefs can also be a coderef. The routine will be given the $self object (or it may just be a class string) and a date string on input, and is expected to return undef on failure, or a DateTime object on success.
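A minimal sketch of such a coderef, following the contract just described: it receives $self and a date string, and returns a DateTime object or undef. The "epoch:N" input format and the name $epoch_spec are invented for this illustration. The routine is called directly here; in practice it would be passed in the list of specifications.

```perl
use strict;
use warnings;
use DateTime;

# Hypothetical coderef spec: accepts strings such as "epoch:0".
my $epoch_spec = sub {
    my ( $self, $date ) = @_;    # $self may be an object or a class name
    return unless defined $date && $date =~ /^epoch:(\d+)$/;
    return DateTime->from_epoch( epoch => $1 );
};

# Direct calls, passing a class string in place of $self.
my $dt   = $epoch_spec->( 'DateTime::Format::Builder', 'epoch:0' );
my $none = $epoch_spec->( 'DateTime::Format::Builder', '20030716' );

print $dt->ymd, "\n";
print defined $none ? "parsed\n" : "no match\n";
```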
regex is a regular expression that will be applied to the input of the created function. This argument is required.
params is an arrayref that maps the results of regex to parameters of DateTime->new(). The first element is $1, the second $2, etc. This argument is required.
extra is a hashref that lists what any extra arguments should be set to. You can use it to specify parameters to DateTime->new(), such as time_zone.
on_fail is a reference to a subroutine (anonymous or otherwise) that will be called in the event of a parse failing. It will be passed a hash looking like:
    input, being the input on which the parser failed
    label, being the label of the parser, if there is one
on_match is just like on_fail, only it's called in the event of success.
label provides a name for the parser and is passed to on_fail and on_match. If you specified a set of parsers with some form of X => Y hash style, then by default, the label is the X. That will be overridden if you use this label tag.
preprocess is another callback. Its arguments are a hash consisting of the keys input (the datetime string given to the parser) and parsed (a hashref that is initially empty [unless your group of parser specifications had a preprocessor that put something in it]). You may put what you like in the hashref, and it will be kept.
This callback is called after length determination.
postprocess is yet another callback. Its arguments are the same as for preprocess, except the parsed hashref has been filled out with how the parse went. If parsing failed, it is not called. It is free to modify the hashref. Any changes will be reflected back. If the callback returns false, then the parse is regarded as a failure. Note: ensure you return some true value if you don't want things to fail mysteriously.
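To make the postprocess contract concrete, here is a small sketch of a callback that expands two-digit years in the parsed hashref. The name $fix_year and the pivot of 70 are assumptions made for this illustration. The callback is exercised directly with a hand-built hash; in a real specification it would be supplied as postprocess => $fix_year.

```perl
use strict;
use warnings;

# Hypothetical postprocess callback: expand two-digit years in place.
my $fix_year = sub {
    my %args = @_;                 # keys: input, parsed
    my $p    = $args{parsed};
    return 0 unless defined $p->{year};    # false: treat parse as failed
    if ( $p->{year} < 100 ) {
        $p->{year} += $p->{year} >= 70 ? 1900 : 2000;
    }
    return 1;                      # true: let the parse succeed
};

# Simulate what a parser would hand over after a successful match.
my %parsed = ( year => '03', month => '07', day => '16' );
my $ok = $fix_year->( input => '030716', parsed => \%parsed );

print "$ok $parsed{year}\n";    # prints: 1 2003
```

The explicit return 1 matters: a callback that ends on a false value would make an otherwise good parse count as a failure.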
If you have a series of specifications and want a common preprocessor, it can be specified like this:
    my $brief_parser = DateTime::Format::Builder->create_parser(
        [
            # Placeholder preprocessor: returns the input unchanged
            preprocess => sub { my %args = @_; return $args{input} },
        ],
        {
            regex  => qr/^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)$/,
            params => [qw( year month day hour minute second )],
        },
        {
            regex  => qr/^(\d{4})(\d\d)(\d\d)$/,
            params => [qw( year month day )],
        },
    );
Note that this works with the arrays of specs in create_class too.
Note also that the arrayref must be the first argument.
The preprocess sub is given a hash on input of the date to be parsed and a hashref in which to place any pre-calculated values. The hash keys are input and parsed respectively. The return value should be the date string that the parsers will then go on to process.
A sample preprocessor (taken from DateTime::Format::ICal) looks like this:
    my $add_tz = sub {
        my %args = @_;
        my ($date, $p) = @args{qw( input parsed )};
        if ( $date =~ s/^TZID=([^:]+):// )
        {
            $p->{time_zone} = $1;
        }
        # Z at end means UTC
        elsif ( $date =~ s/Z$// )
        {
            $p->{time_zone} = 'UTC';
        }
        else
        {
            $p->{time_zone} = 'floating';
        }
        return $date;
    };
Any length calculations (for length parsers) are done after this preprocessing.
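To see why that ordering matters, the sample preprocessor can be run by hand on a few inputs. The sketch below repeats $add_tz in compact form so that it runs on its own; the sample strings are made up for illustration. In every case the string handed on to the parsers is the bare 15 character form, whatever the length of the original input.

```perl
use strict;
use warnings;

# Compact copy of the sample preprocessor shown above.
my $add_tz = sub {
    my %args = @_;
    my ( $date, $p ) = @args{qw( input parsed )};
    if    ( $date =~ s/^TZID=([^:]+):// ) { $p->{time_zone} = $1 }
    elsif ( $date =~ s/Z$// )             { $p->{time_zone} = 'UTC' }
    else                                  { $p->{time_zone} = 'floating' }
    return $date;
};

my %result;    # input string => [ cleaned date, time zone ]
for my $input (
    'TZID=America/Chicago:20030716T163245',
    '20030716T163245Z',
    '20030716T163245',
  )
{
    my %parsed;
    my $date = $add_tz->( input => $input, parsed => \%parsed );
    $result{$input} = [ $date, $parsed{time_zone} ];
    printf "%-38s => %s (%d chars), time_zone=%s\n",
        $input, $date, length($date), $parsed{time_zone};
}
```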
OBJECT METHODS
If you actually create a DateTime::Format::Builder object, then you get the following methods on that object.
set_parser / get_parser
Set and get the object's parser function. Fairly straightforward and of minimal use, except for subclasses.
parse_datetime
Given a date string in a format the parser understands, return a DateTime object representing that date and time.
# Having created our parser, somehow, we can:
my $dt = $parser->parse_datetime( "1998-04-01 15:16:24" );
If you receive errors about things being undefined, then there was a parse failure.
format_datetime
Ok. We don't actually implement this. It's just here to make sure you know we don't. It's implemented like an abstract method: it will die if invoked.
It will be available at some point.
THANKS
Dave Rolsky (DROLSKY) for kickstarting the DateTime project and some much needed review.
Joshua Hoblitt (JHOBLITT) for the concept, some of the API, and more much needed review.
Kellan Elliott-McCrea (KELLAN) for even more review!
Simon Cozens (SIMON) for saying it was cool.
SUPPORT
Support for this module is provided via the datetime@perl.org email list. See http://lists.perl.org/ for more details.
Alternatively, log bugs via the CPAN RT system, using the web or email:
http://perl.dellah.org/rt/dtbuilder
bug-datetime-format-builder@rt.cpan.org
This makes it much easier for me to track things and thus means your problem is less likely to be neglected.
LICENSE AND COPYRIGHT
Copyright © Iain Truskett, 2003. All rights reserved.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
The full text of the licenses can be found in the Artistic and COPYING files included with this module.
AUTHOR
Iain Truskett <spoon@cpan.org>
TODO
More tests.
strptime compatible parsing
strftime compatible formatting
SEE ALSO
The datetime@perl.org mailing list.