NAME
Zoidberg::StringParser - simple string parser
SYNOPSIS
    my $base_gram = {
        esc    => '\\',
        quotes => {
            q{"} => q{"},
            q{'} => q{'},
        },
    };
    my $parser = Zoidberg::StringParser->new($base_gram);
    my @blocks = $parser->split(
        qr/\|/,
        qq{ls -al | cat > "somefile with a pipe | in it"} );
    # @blocks now is:
    # ('ls -al ', ' cat > "somefile with a pipe | in it"');
    # So it worked like split, but it respected quotes
DESCRIPTION
This module is a simple syntax parser. It was originally designed to work like the built-in split function, but to respect quotes. The current version is a little more advanced: it uses user-defined grammars to deal with delimiters, an escape char, quotes and braces. These grammars can also contain hooks that add meta information to each split block of text. The parser has a 'pull' mechanism to allow line-by-line parsing, or to define callbacks for cases such as an unmatched bracket being encountered.
All grammars and collections of grammars should be considered PRIVATE when used by a Z::SP object.
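To make the idea of a quote-respecting split concrete, here is a minimal sketch in core Perl. It is not how Zoidberg::StringParser is implemented: the helper name quote_aware_split is hypothetical, and the single-character delimiter, the two quote characters and the backslash escape are hard-coded assumptions rather than grammar-driven.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Zoidberg::StringParser.
# Splits $string on a single-character delimiter, but ignores delimiters
# that sit inside quotes or that are preceded by a backslash.
sub quote_aware_split {
    my ($delim, $string) = @_;
    my @blocks = ('');
    my $quote;                      # the quote char currently open, if any
    my @chars = split //, $string;
    while (@chars) {
        my $c = shift @chars;
        if ($c eq '\\' and @chars) {
            # keep the escape and the escaped char together, untouched
            $blocks[-1] .= $c . shift @chars;
        }
        elsif (defined $quote) {
            # inside quotes: only the matching quote char is special
            undef $quote if $c eq $quote;
            $blocks[-1] .= $c;
        }
        elsif ($c eq '"' or $c eq "'") {
            $quote = $c;
            $blocks[-1] .= $c;
        }
        elsif ($c eq $delim) {
            push @blocks, '';       # unquoted delimiter starts a new block
        }
        else {
            $blocks[-1] .= $c;
        }
    }
    return @blocks;
}

my @blocks = quote_aware_split(
    '|', q{ls -al | cat > "somefile with a pipe | in it"} );
```

For the input from the SYNOPSIS this sketch yields the same two blocks shown there. Unlike the real parser, it never removes escape chars and knows nothing about braces, meta hooks or pulling more input.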
EXPORT
None by default.
GRAMMARS
TODO
- esc
FIXME
If this is a Regexp ref, no double-escape removal is done. If you use a Regexp ref as the escape, you probably also want to set "no_esc_rm".
- no_esc_rm
Boolean that tells the parser not to remove the escape char when an escaped token is encountered. Double escapes won't be replaced either. Useful when a string needs to go through a chain of parsers.
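The effect of "no_esc_rm" can be illustrated with a small stand-alone sketch. The helper strip_escapes below is hypothetical and only mimics the documented outcome (escape removed before an escaped token, double escapes collapsed, nothing touched when the flag is set). It assumes a backslash as the escape char, and the real parser does this work while tokenising rather than as a separate pass.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Zoidberg::StringParser.
# $token is the token that may appear escaped; $no_esc_rm mimics the
# grammar value of the same name.
sub strip_escapes {
    my ($text, $token, $no_esc_rm) = @_;
    return $text if $no_esc_rm;     # leave every escape char in place
    # a double escape becomes a single one, an escaped token loses its escape
    $text =~ s/\\(\\|\Q$token\E)/$1/g;
    return $text;
}

my $input   = 'a \| b \\\\ c';                    # the text: a \| b \\ c
my $cleaned = strip_escapes($input, '|');         # escapes removed
my $chained = strip_escapes($input, '|', 1);      # identical to $input
```

Keeping the escapes, as in the last call, is what makes it possible to hand the same string to a second parser that still needs to see them.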
Collection
The collection hash is simply a hash of grammars with the grammar names as keys. When a collection is given, all methods can take a grammar name instead of a grammar.
Base grammar
This can be seen as the default grammar; to use it, leave the grammar undefined when calling a method. If a base grammar is defined and you specify a grammar in a method call, the specified grammar will overload the base grammar.
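One way to picture this overloading is as a hash merge in which the specified grammar wins for every key it defines. The sketch below assumes a shallow merge; whether the module merges nested values such as "quotes" more deeply is not stated here, and the variable names are illustrative only.

```perl
use strict;
use warnings;

my $base_gram = {
    esc    => '\\',
    quotes => { q{"} => q{"}, q{'} => q{'} },
};

# Grammar given at a method call; it only uses keys named in this document.
my $specified = {
    no_esc_rm => 1,
    quotes    => { q{'} => q{'} },
};

# Assumption: a shallow merge, so later (specified) keys replace base keys.
my %effective = ( %$base_gram, %$specified );
```

Under this assumption the effective grammar keeps "esc" from the base grammar, gains "no_esc_rm", and uses only the single-quote pair from the specified grammar.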
METHODS
new(\%base_grammar, \%collection, \%settings)
Simple constructor. See "Collection", "Base grammar" and "settings" for explanation of the arguments.
set($grammar, @input_methods)
Sets the begin state for the parser. $grammar can either be a hash ref containing a grammar or be the name (key) of a grammar in %collection. See "input methods" for possible values of @input_methods.
reset()
Removes all state information from the parser. Also removes any error messages.
more()
Test for more input. Can trigger the pull mechanism.
Intended usage:
    $p->set($grammar, @input);
    while ($p->more) {
        ($block, $token) = $p->get();
    }
get()
Get the next block from input. Intended for atomic use; for most situations either split or getline will do.
next_line()
Loads the next line of input from "input methods". This method is called internally by the pull mechanism. Intended for atomic use.
split($grammar, @input_methods)
Get all blocks until input returns undef. Arguments are passed directly to set(). By default, blocks are passed as scalar refs (unless the grammar's meta function altered them) and tokens as scalars. To be somewhat compatible with CORE::split, all items (blocks and tokens) are passed as plain scalars if $grammar is or was a Regexp reference. (This behaviour can be faked by giving your grammar a value called 'was_regexp'.) This behaviour is turned off by the "no_split_intel" setting.
getline($grammar, @input_methods)
Like split, but gets only one line from input and lacks the "intelligent" behaviour. Will try to get more input when the syntax is incomplete, unless "allow_broken" is set.
error()
Returns the parser error, if any. Returns undef if all is well.
input methods
FIXME
settings
The %settings hash contains options that control the general behaviour of the parser. Supported settings are:
- allow_broken
If this value is set, the parser will not automatically pull from input when broken syntax is encountered. Very useful in combination with the getline() method to make sure just one line is read and parsed, even if this leaves us with broken syntax.
- raise_error
Boolean that controls whether the parser dies when an error is encountered; see "DIAGNOSTICS".
- no_split_intel
Boolean; disables the "intelligent" behaviour of split() when set.
DIAGNOSTICS
By default this module will croak only for fatal errors such as wrong argument types. For less-fatal errors it sets an error message that can be retrieved with error(). Note that some of these "less-fatal" errors may turn out to be fatal after all. If the raise_error setting is set, all errors will raise an exception.
FIXME splain error messages
AUTHOR
Jaap Karssenberg || Pardus [Larus] <pardus@cpan.org>
Copyright (c) 2003 Jaap G Karssenberg. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
Contains some code derived from Tie-Hash-Stack-0.09 by Michael K. Neylon.
SEE ALSO