NAME

Text::Glob::Expand - permute and expand glob-like text patterns

VERSION

version 0.1.1

SYNOPSIS

The original use case was to specify hostname aliases and expansions thereof. For example, it supports basic expansion of the glob expression into its permutations like this:

use Text::Glob::Expand;

my $hosts = "{www{1,2,3},mail{1,2},ftp{1,2}}";
my $glob = Text::Glob::Expand->parse($hosts);

my $permutations = $glob->explode;
# result is: [qw(www1 www2 www3 mail1 mail2 ftp1 ftp2)]

But additionally, to generate full hostnames, it supports a method to expand these permutations using a format string:

my $permutations = $glob->explode_format("%0.somewhere.co.uk");

# result is:
# {
#     www1 => 'www1.somewhere.co.uk',
#     www2 => 'www2.somewhere.co.uk',
#     www3 => 'www3.somewhere.co.uk',
#     mail1 => 'mail1.somewhere.co.uk',
#     mail2 => 'mail2.somewhere.co.uk',
#     ftp1 => 'ftp1.somewhere.co.uk',
#     ftp2 => 'ftp2.somewhere.co.uk',
# }

INTERFACE

$obj = $class->parse($string)

This is the constructor. It implements a simple state-machine to parse the expression in $string, and returns a Text::Glob::Expand object.

You don't really need to understand the structure of this object, just invoke methods on it. However, see "PARSING RULES" for more details of the expression and the internal structure of the object returned.

$arrayref = $obj->explode

This returns an arrayref containing all the expanded permutations generated from the string parsed by the constructor.

(The result is cached, and returned again if this is called more than once. See $MAX_CACHING.)

$hashref = $obj->explode_format($format)

This returns a hashref mapping each expanded permutation to a string generated from the $format parameter.

(The return value is not cached, since the result depends on $format.)

PARSING RULES

Using a notation based on a subset of the Backus Naur Form described by the HTTP 1.1 RFC (with the notable exception that white-space is significant here) the expression syntax expected by the ->parse method can be defined like this:

expression =
   segment *( brace-expression segment )

A segment is a sequence of zero or more characters or escaped-characters (i.e. braces and commas must be escaped with a preceding backslash).

segment =
   *( escaped-character | <any character except glob-characters> )

Where:

escaped-character =
   "\" <any char>

glob-character =
   "{" | "}" | ","

A brace-expression is a sequence of one or more expressions (which in this context I call 'alternatives'), delimited by commas, and enclosed in braces.

brace-expression =
   "{" expression ( "," expression )* "}"

OBJECT STRUCTURE

An expression such as described in the previous above is parsed into an arrayref of text segments (represented with Text::Glob::Expand::Segment instances) and brace-expressions (represented by arrayrefs).

An expression is represented at the top level by a Text::Glob::Expand instance, which is a blessed arrayref containing only Text::Glob::Expand::Segment instances and brace-expression arrayrefs.

Each brace-expression array contains a list of the brace-expression's 'alternatives' (the comma-delimited sub-expressions within the braces). These are represented by arrayrefs. Apart from being unblessed, they otherwise have the same structure as the top-level expression.

Text::Glob::Expand::Segment instances are blessed arrayrefs, composed of a string plus an integer (>= 0) indicating the number of brace-pairs enclosing the segment. The depth is used internally to preserve the expression structure, and may be ignored by the user. (See also Text::Glob::Expand::Segment.)

For example, an expression such as:

"a{b,{c,d}e,}g"

Will be parsed into something analogous to this structure (for better readability I use a simplified Perl data-structure in which segments are represented by simple strings instead of blessed arrays, and use comments to denote types):

[ # expression
  'a', # segment depth 0
  [ # brace
    [ # expression
      'b' # segment, depth 1
    ],
    [ # expression
      '', # segment, depth 1
      [ # brace
        [ # expression
          'c' # segment, depth 2
        ],
        [ # expression
          'd' # segment, depth 2
        ],
      ],
      'e' # segment, depth 1
    ],
    [ # expression
      '' # segment, depth 1
    ]
  ],
  'g', # segment, depth 0
]

DIAGNOSTICS

The following parsing diagnostics should never actually occur. If they do it means the internal data structure or code design is inconsistent. In this case, please file a bug report with details of how to replicate the error.

"unexpected scalar..."
"no such state..."
"no handler for state '...' looking at '...' pos '...'"

CONFIGURATION AND ENVIRONMENT

Text::Glob::Expand requires no configuration files or environment variables.

There is one configurable option in the form of a package variable, as follows.

$MAX_CACHING

The package variable $Text::Glob::Expand::MAX_CACHING can be used to control or disable the caching done by the ->explode method. It should be a positive integer, or zero.

The default value is 100, which means that up to 100 Text::Glob::Expand objects' ->explode results will be cached, but no more. You can disable caching by setting this to zero or less.

DEPENDENCIES

The dependencies should be minimal - I aim to have none.

For a definitive answer, see the Build.PL file included in the distribution, or use the dependencies tool on http://search.cpan.org

BUGS AND LIMITATIONS

Currently the parser will infer closing brackets at the end of an expression if they are omitted. Probably a syntax error should be thrown instead.

Also, extra closing brackets with no matching opening bracket will generate an error. This is a bug which will be addressed in future versions.

Please report any bugs or feature requests to bug-Text-Glob-Expand@rt.cpan.org, or through the web interface at http://rt.cpan.org.

SEE ALSO

Similar libraries I am aware of are:

Text::Glob

Wildcard matching against strings, which includes alternation (brace expansion).

String::Glob::Permute

A permutation generator similar to this one. Supports numbered ranges, but not format string expansion.

Plus there is of course Perl's own glob function, which supports brace expansions. That however can be sensitive to unusually-named files in the current director - and more importantly, like String::Glob::Permute it does not implement format string expansions.

AUTHOR

Nick Stokoe <wulee@cpan.org>

LICENCE AND COPYRIGHT

Copyright (c) 2011, Nick Stokoe <wulee@cpan.org>. All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.

DISCLAIMER OF WARRANTY

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.