NAME

Paranoid::Args - Command-line argument parsing functions

VERSION

$Id: Args.pm,v 0.21 2009/03/05 00:06:01 acorliss Exp $

SYNOPSIS

use Paranoid::Args;

$rv = parseArgs(\@templates, \%opts);
$rv = parseArgs(\@templates, \%opts, \@args);

@errors = Paranoid::Args::listErrors();
Paranoid::Args::clearMemory();

DESCRIPTION

The purpose of this module is to provide simplified but validated parsing and extraction of command-line arguments (otherwise known as the contents of @ARGV). It is meant to be used in lieu of modules like Getopt::Std and Getopt::Long, but that does not mean that this module is functionally equivalent -- it isn't. There are things that those modules do that this doesn't, but that's primarily by design. My priorities are a bit different when it comes to this particular task.

The primary focus of this module is validation, with the secondary focus being preservation of context.

VALIDATION

When validating the use of options and arguments we concern ourselves primarily with the following things:

1)

Is the option accompanied by the requisite arguments?

2)

Was the option called with the other requisite options?

3)

Was the option called without options meant only for mutually exclusive use?

4)

Were any unrecognized options used?

This module also does basic sanity validation of all option templates to ensure correct usage of this module.

PRESERVATION OF CONTEXT

Simply put, preservation of context means remembering the order and grouping of associated arguments. A demonstrative example would perhaps serve better than one of my poor explanations.

Take the hypothetical case of "tagging" files. The traditional approach is to define an option that takes a single string argument and apply it to the remaining contents of @ARGV:

./foo.pl -t "tag1" file1 file2

This module supports that model, with the option argument template being '$' for that single string. But what if you wanted to apply different tags to different files with one command execution?

./foo.pl -t "tag1" file1 file2 -t "tag2" file3

In this case it is important to keep each group of payloads that you want to operate on separate. With this module you could instead use an argument template of '$@', which would return each set independently:

%opt = (
  't' => [
          [ "tag1", [ "file1", "file2" ] ],
          [ "tag2", [ "file3" ] ],
         ],
        );

Notice that we also preserve the context between the '$' and the '@' by putting the '@' arguments in a sublist. With this example that could possibly be considered pointless, but we also support templates like '$$@$', where this becomes very useful. Instead of having to shift or pop off the encapsulating arguments, each one now has a permanent ordinal index. You can also just grab the array reference for the '@' portion and iterate over a complete and separate list rather than having to take a slice of the complete argument array.

It's probably just me, but I find that a little easier to track.
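As a minimal sketch of what that buys you, the following walks the '$@' structure shown above. The hash is written out by hand to match the documented result rather than produced by running parseArgs, and the @applied array is only an illustrative stand-in for whatever tagging work a real script would do.

```perl
use strict;
use warnings;

# Hand-written copy of the documented result for:
#   ./foo.pl -t "tag1" file1 file2 -t "tag2" file3
my %opt = (
    't' => [
        [ 'tag1', [ 'file1', 'file2' ] ],
        [ 'tag2', [ 'file3' ] ],
    ],
);

# Each element of $opt{'t'} is one invocation of -t: the '$' argument
# sits at index 0 and the '@' arguments are a separate list at index 1.
my @applied;
foreach my $group ( @{ $opt{'t'} } ) {
    my ( $tag, $files ) = @$group;
    foreach my $file (@$files) {
        push @applied, "$file=$tag";    # stand-in for the real tagging work
    }
}

print "$_\n" foreach @applied;
```

Each tag stays paired with its own file list, so no bookkeeping is needed to work out where one group ends and the next begins.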

SUPPORTED COMMAND-LINE SYNTAX

In keeping with my established tradition of discarding everything I have no use for, this module does not support the same range of expressiveness that the Getopt::* modules do. Nor does it offer configurable modes that change how the command line is interpreted. What we do support we support unconditionally.

The following syntactical options are supported:

o

Short option bundling (e.g., "rm -rf")

o

Short option counting (e.g., "ssh -vvv")

o

Short option argument concatenation (e.g., "cut -d' '")

o

Long option "equals" argument concatenation (e.g., "./configure --prefix=/usr")

o

The use of '--' to designate that all following arguments are strictly arguments, even if they look like options.

We don't support hash key/value pairs (e.g., -s foo=one bar=two) or argument type validation (Getopt::* can validate string, integer, and floating point argument types). And while each option may have one short and one long name, additional aliases are not supported. In short, if it isn't explicitly documented it isn't supported, though it probably is in Getopt::*.

There are a few restrictions meant to eliminate confusion:

1)

Long and short argument concatenation is only allowed if the argument template is '$' (expecting only a single argument).

2)

Short argument concatenation is furthermore only allowed on arguments that aren't allowed to be bundled with other short options.

3)

Short options supporting bundling can require associated arguments as long as '@' is not part of the argument template.

SUBROUTINES/METHODS

parseArgs

$rv = parseArgs(\@templates, \%opts);
$rv = parseArgs(\@templates, \%opts, \@args);

Using the option templates passed as the first reference this function populates the options hash with all of the parsed options found in the passed arguments. The args list reference can be omitted if you wish the function to work off of @ARGV. Please note that this function makes a working copy of the array, so no alterations will be made to it.

If any options and/or arguments fail to match the option template, or if an option is found with no template, a text message is pushed into an errors array and the function will return a boolean false.

When the options hash is populated, the extracted arguments are stored under both the long and the short option names as keys, assuming both were defined in the template. Otherwise only whichever form was defined is used.

Any arguments not associated with an option are stored in the options hash in a list associated with the key PAYLOAD.
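To make the shape of the populated hash concrete, here is a small sketch. The hash is written by hand and is only an assumption about what a successful call would leave behind for the arguments "-vvvfh host1 file1 file2 file3", given a counted v/verbose option, a boolean f/force option, and an h/host option with a '$' template and Multiple set. It is not captured output from the module.

```perl
use strict;
use warnings;

# Assumed contents of the options hash (hand-written for illustration,
# not produced by running parseArgs) for:
#   -vvvfh host1 file1 file2 file3
my %opts = (
    'v'       => 3,
    'verbose' => 3,
    'f'       => 1,
    'force'   => 1,
    'h'       => ['host1'],
    'host'    => ['host1'],
    'PAYLOAD' => [ 'file1', 'file2', 'file3' ],
);

# Short and long keys carry the same value, so either can be used.
my $same = $opts{'v'} == $opts{'verbose'} ? 'yes' : 'no';

# Arguments that belong to no option are collected under PAYLOAD.
my @files = @{ $opts{'PAYLOAD'} };

print "verbosity: $opts{'verbose'}\n";
print "short and long agree: $same\n";
print "payload: @files\n";
```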

Paranoid::Args::listErrors

@errors = Paranoid::Args::listErrors();

If you need a list of everything that was found wrong during a parseArgs run, from template errors to command-line argument validation failures, you can get all of the messages from listErrors. Please note that we show it fully qualified above because it is not exported under any circumstances. If you need these extended diagnostics, you'll need to call it as shown.

Each time parseArgs is invoked this array is reset.

Paranoid::Args::clearMemory

Paranoid::Args::clearMemory();

If the existence of a (most likely) lightly populated array bothers you, you may use this function to empty all internal data structures of their contents. Like listErrors, this function is not exported under any circumstances.

OPTION TEMPLATES

The function provided by this module depends on templates to extract and validate the options and arguments. Each option template looks similar to the following:

{
  Short         => 'v',
  Long          => 'verbose',
  Template      => '$',
  CountShort    => 1,
  Multiple      => 1,
  CanBundle     => 1,
  ExclusiveOf   => [],
  AccompaniedBy => [],
}

This template provides extraction of verbose options in the following (and similar) forms:

-vvvvv
--verbose 5
--verbose=5

If CountShort was instead false you'd have to say '-v5' or '-v 5' instead of '-vvvvv'.

When the parseArgs function is called the options hash passed to it would be populated with:

%opts = (
  'v'        => 5,
  'verbose'  => 5,
  );

The redundancy is intentional. Regardless of whether you look up the short or the long name you will be able to retrieve the cumulative value.

The particulars of all key/value pairs in a template are documented below.

Short

Short refers to the form of the short option style (minus the normal preceding '-'). If this is left undefined then no short option is supported.

This parameter is set to undef by default.

NOTE: All short option names must be exactly one character in length and consist only of alphanumeric characters.

Long

Long refers to the form of the long option style (minus the normal preceding '--'). If this is left undefined then no long option is supported.

This parameter is set to undef by default.

NOTE: All long option names must be more than one character in length and consist only of alphanumeric characters and hyphens.

Template

Template refers to the argument template which informs us how many, if any, arguments are required for this option. A template can consist of zero or more of the following characters:

Char  Description
========================================================
$     The option will be followed by a mandatory argument
@     The option will be followed by one or more arguments
''    No additional arguments are expected

For simple boolean options (like '-f') you'd use a zero-length string as the template. The associated value of the option will be either a scalar or a list reference, depending on various parameters in the option template.

If the option has a template of '' then it is assumed that it is a boolean option. The associated value in the options hash would then be a scalar:

# Template: ''
# @ARGV:  -vvv
'v' => 3

with the scalar denoting the number of times it was used in the arguments. It is the same if the template is '$' but CountShort is true. In that case, the template really only applies to the long option (whose argument would set the initial scalar value), while the short options operate purely as an incrementer. However, since everything is processed serially, you get the following results:

# Template '$', CountShort is true
# @ARGV: -vvv --verbose=7 -v --verbose=1 -v
'v' => 2
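The final value of 2 follows from processing the arguments strictly left to right. The sketch below replays that arithmetic by hand. It is a simulation of the documented behaviour, not the module's implementation, and the event list is a simplified, hypothetical encoding of the command line.

```perl
use strict;
use warnings;

# Simplified encoding of:  -vvv --verbose=7 -v --verbose=1 -v
# 'incr' stands for counted short options, 'set' for a long option
# whose argument replaces the current value.
my @events = (
    [ incr => 3 ],
    [ set  => 7 ],
    [ incr => 1 ],
    [ set  => 1 ],
    [ incr => 1 ],
);

my $v = 0;
my @trace;
foreach my $event (@events) {
    my ( $op, $n ) = @$event;
    $v = $op eq 'set' ? $n : $v + $n;
    push @trace, $v;    # value after each step
}

print "steps: @trace\n";
print "final: $v\n";
```

The running value goes 3, 7, 8, 1, 2, which is why the last long option followed by a single short option leaves 2 in the hash.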

If the template is '$', but Multiple is false (mandating that the option be used only once) the associated value is again scalar:

# Template: '$'
# @ARGV: -v3
'v' => 3

If the template is '$' and Multiple is true then the associated value is an array reference, with the contents of the array being every argument associated with each option invocation:

# Template: '$'
# @ARGV:  --file foo  --file bar
'file' => [ 'foo', 'bar' ]

If the template is two or more '$' or contains '@' anywhere in the template then the associated value is an array reference. The element where '@' would occur would be an array reference to the list containing everything globbed up by the '@':

# Template:  '$@'
# @ARGV: --chmod 0755 foo bar
'chmod'   => [ '0755', [ 'foo', 'bar' ] ]

If Multiple is true, each element would be a reference to each invocation of the option, with the element organized internally as in the previous example:

# Template: '@'
# @ARGV:  --add 5 7 2 --add 4 9
'add'   => [ [ 5, 7, 2 ], [ 4, 9 ] ]

# Template: '$@$'
# @ARGV: --perform one two three four --perform five six seven
'perform' => [ [ 'one', [ 'two', 'three' ], 'four'],
               [ 'five', [ 'six' ], 'seven' ] ]

NOTE: You cannot use the '@' character if the short option is allowed to be bundled with other options.

This parameter defaults to '' (boolean options).
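For nested results such as the '$@$' example with Multiple set, a consumer might unpack each invocation as sketched below. The hash is copied by hand from the example above rather than produced by parseArgs, and the @summary array is purely illustrative.

```perl
use strict;
use warnings;

# Hand-written copy of the documented result for:
#   --perform one two three four --perform five six seven
my %opts = (
    'perform' => [
        [ 'one',  [ 'two', 'three' ], 'four' ],
        [ 'five', [ 'six' ],          'seven' ],
    ],
);

my @summary;
foreach my $call ( @{ $opts{'perform'} } ) {
    # The '$' arguments keep fixed positions; the '@' portion is its own list.
    my ( $first, $middle, $last ) = @$call;
    push @summary,
        sprintf( '%s (%d in list) %s', $first, scalar(@$middle), $last );
}

print "$_\n" foreach @summary;
```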

Multiple

Multiple is a boolean parameter which, if set, allows an option to be used more than once on the command-line.

This parameter defaults to false.

ExclusiveOf

ExclusiveOf is an array of options that this option cannot be used in conjunction with. If the options in this list contain both short and long names you do not have to list them both. Listing only one of the names will suffice.

This parameter defaults to an empty list.

AccompaniedBy

AccompaniedBy is an array of options that this option must be accompanied by. If the options in this list contain both short and long names you do not have to list them both. Listing only one of the names will suffice.

This parameter defaults to an empty list.
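As a sketch of how the two relationship parameters might be declared, consider the templates below. The option names (verbose, quiet, host, port) are hypothetical, and writing the referenced names without leading dashes is an assumption based on how Short and Long are specified.

```perl
use strict;
use warnings;

# Hypothetical option set, chosen only to illustrate the two parameters.
my @templates = (
    { Short => 'v', Long => 'verbose', Template => '' },

    # --quiet may not be combined with --verbose; naming only the long
    # form is assumed to be enough, as described above.
    { Short => 'q', Long => 'quiet', Template => '',
      ExclusiveOf => ['verbose'] },

    { Short => 'h', Long => 'host', Template => '$' },

    # --port makes no sense without --host, so it must be accompanied by it.
    { Short => 'p', Long => 'port', Template => '$',
      AccompaniedBy => ['host'] },
);

# List which templates declare a relationship to another option.
my @constrained =
    map  { $_->{Long} }
    grep { $_->{ExclusiveOf} || $_->{AccompaniedBy} } @templates;

print "constrained options: @constrained\n";
```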

CanBundle

CanBundle is a boolean parameter which, if set, allows short options to be bundled as part of a single argument (i.e., combining '-r' and '-f' as '-rf').

This parameter defaults to false.

NOTE: if you wish to be able to concatenate a short option and its requisite argument then CanBundle must be set to false.

NOTE: if CanBundle is true and each short option requires a mandatory argument those arguments will be associated with each option in the order in which the options were specified. For example, if 'v' and 'S' each expected a mandatory single argument:

-vuS foo bar

v would be associated with foo, and S with bar. Bundling of short options that use '@' as part of their template is not allowed due to the obvious guaranteed problems which will result.
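The ordering rule can be illustrated with a short simulation. This is not the module's code. The %needs_arg table is a hypothetical stand-in for the option templates, with 'u' assumed to be a plain boolean option.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the templates: which bundled options
# expect one mandatory argument ('u' is assumed to take none).
my %needs_arg = ( 'v' => 1, 'u' => 0, 'S' => 1 );

my @argv   = ( '-vuS', 'foo', 'bar' );
my $bundle = shift @argv;

# Walk the bundled letters in order; each option that needs an
# argument takes the next unused one from the command line.
my %got;
foreach my $opt ( split //, substr( $bundle, 1 ) ) {
    $got{$opt} = $needs_arg{$opt} ? shift @argv : 1;
}

print "$_ => $got{$_}\n" foreach sort keys %got;
```

With an '@' template in the mix there would be no way to tell where one option's arguments stop and the next option's begin, which is why that combination is refused.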

DEPENDENCIES

o

Paranoid

o

Paranoid::Debug

EXAMPLE

@otemplates = (
    {
      Short       => 'v',
      Long        => 'verbose',
      Multiple    => 1,
      CountShort  => 1,
      CanBundle   => 1,
      Template    => '$',
    },
    {
      Short       => 'f',
      Long        => 'force',
      CanBundle   => 1,
      Template    => '',
    },
    {
      Short       => 'h',
      Long        => 'host',
      Multiple    => 1,
      CanBundle   => 1,
      Template    => '$',
    },
  );

# Process @ARGV:  -vvvfh host1 file1 file2 file3
if (parseArgs(\@otemplates, \%opts)) {
  setVerbosity($opts{'verbose'});

  if ($opts{'force'}) {
    foreach (@{ $opts{'host'} }) {
      if (connectToHost($_)) {
        transferFiles(@{ $opts{'PAYLOAD'} });
      }
    }
  }
} else {
  # Error messages come from listErrors, which is never exported
  foreach (Paranoid::Args::listErrors()) { warn "$_\n" }
}

BUGS AND LIMITATIONS

It is not advisable for you to call parseArgs multiple times in a program to process a list of arguments in sections. parseArgs uses an internal flag to note whether or not it has seen the '--' argument, which disables all further recognition of arguments as options. That flag is set to false with every invocation, possibly causing problems for later sections if that flag had been used in a prior section.

This doesn't offer the same range of functionality or flexibility as Getopt::Long.

AUTHOR

Arthur Corliss (corliss@digitalmages.com)

LICENSE AND COPYRIGHT

This software is licensed under the same terms as Perl itself. Please see http://dev.perl.org/licenses/ for more information.

(c) 2005, Arthur Corliss (corliss@digitalmages.com)