NAME
Params::Clean (Parse A Routine Allowing Modest Syntax--Casually List Explicit Arg Names): Process @_ as positional/named/flag/list/typed arguments
SYNOPSIS
Instead of starting your sub with my ($x, $y, $z) = @_;
#Get positional args, named args, and flags
my ( $x, $y, $z, $blue, $man, $group, $semaphore, $six_over_texas )
= args POSN 0, 1, 2, NAME fu, man, chu, FLAG pennant, banner;
#Any of the three types of argument is optional
my ($tom, $dick, $harry) = args NAME tom, randal, larry;
#...or repeatable -- order doesn't matter
my ($p5, $s, @others) = args NAME pearl, FLAG white, NAME ruby, POSN 0;
#If no types specified, ints are taken to mean positional args, text as named
my ($fee, $fo, $fum) = args 0, -1, jack;
#Can also retrieve any args left over after pulling out NAMEs/FLAGs/POSNs/etc.
my ($gilligan, $skipper, $thurston, $lovey, $ginger, @prof_mary_ann)
= args first_mate, skipper, millionaire, wife, star, REST;
#Or collect args that qualify as matching a certain type
my ($objects, @rest) = args TYPE "Class::Name", REST; # ref() string
my ($files, @rest) = args TYPE \&is_filehandle, REST; # code-ref
#Specify a LIST by giving starting and (optional) ending points
# <=> includes end-point in the returned list; <= excludes it
my ($fields, $tables, $conditions)
= args LIST Select<=From, LIST From<=Where, LIST Where<=>-1;
#Or by giving a list of positions relative to the LIST's starting point
my ($man, $machine) = args LIST vs = [-1, 1];
my ($tick, $santa) = args LIST vs & [-1, 1]; # include starting key
my ($kong, $godzilla)=args LIST vs ^ [-1, 1]; # exclude starting key
#Specify synonymous alternatives using brackets
my ($either_end, $tint) = args [0, -1], [Colour, Color];
VERSION
Version 0.9.2 (August 2007)
INTRODUCTION
Params::Clean is intended to provide a relatively simple and clean way to parse an argument list. Perl subroutines typically assign the values of @_ to a list of variables, which is even simpler and cleaner, but has the disadvantage that all the parameters are thus determined by position. If you have optional parameters, or are worried about the order in which they might be passed (it can be a pain to have to know the order when there are more than a couple of arguments), it's much nicer to be able to use named arguments.
The traditional way to pass a bunch of named arguments is to interpret @_ as a hash (a series of paired parameter names and values). Easy, but you have to refer to your arguments via the hash, and you can't have multiple parameters with the same name or any parameters that aren't named. There are many modules that provide nifty mechanisms for much fancier arg processing; however, they entail a certain amount of overhead to work their magic. (Even in simple cases, they usually at least require extra punctuation or brackets.)
Params::Clean lacks various advanced features in favour of a minimal interface. It's meant to be easy to learn and easier to use, covering the most common cases in a way that keeps your code simple and obvious. If you need something more powerful (or just think code should be as hard to read as it was to write (and real programmers know that it should!)), then this module may not be for you.
(Params::Clean does have a few semi-advanced features, but you may need extra punctuation to use them. (In some cases, even extra brackets.))
DESCRIPTION
Basics
In its simplest form, the args function provided by Params::Clean takes a series of names or positions and returns the arguments that correspond to those positions in @_, or that are identified by those names. The values are returned in the same order that you ask for them in the call to args. @_ itself is never changed. (Thus you could call args several times, if you wanted to for some reason. You can also manipulate @_ before calling args.)
marine("begin", bond=>007, "middle", smart=>86, "end");
sub marine
{
my ($first, $last, $between, $maxwell, $james)=args 0,-1, 3, 'smart','bond';
#==>"begin" "end" "middle" 86 007
my ($last, $max, $between, $first, $jim) = args(6, 'smart', -4, 0, 'bond');
#same thing in a different order
}
By default, integers passed to args are taken to refer to positions in @_, and anything else is taken to be a name, or key, that returns the element following it if it is found in @_. (Note that you can use negative values to count backwards from the end of @_. If some values are too big or too small for the number of elements in @_, undef is returned for those positions.)
There is nothing special about the names as far as Perl is concerned: calling a function passes a list via @_ as always. Then args loops through @_ and looks for matching elements; if it finds a match, the element of @_ following the key is returned. If no match is found, undef is returned, and if multiple matches are found, a reference is returned to an array containing all the appropriate values (in the order in which they occurred in @_).
human(darryl=>$brother, darryl=>$other_brother);
sub human
{
my ($larry, $darryls) = args Larry, Darryl;
#==> undef [$brother, $other_brother]
}
Keys are insensitive to case by default, but this is controlled by whether $Params::Clean::CaseSensitive is true or not when args is called.
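The lookup rule described above can be illustrated with a small pure-Perl sketch that does not use Params::Clean at all. The helper name lookup_name and its argument order are invented for this illustration; skipping over a value once it has been taken, and ignoring a key in the very last position, are assumptions rather than documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Params::Clean: scan a list for a key
# and collect the element that follows each match.
sub lookup_name {
    my ($key, $case_sensitive, @list) = @_;
    my @found;
    my $i = 0;
    while ($i < $#list) {              # a key needs an element after it
        my $item = $list[$i];
        my $hit  = defined($item) && !ref($item)
                && ($case_sensitive ? $item eq $key : lc($item) eq lc($key));
        if ($hit) {
            push @found, $list[$i + 1];
            $i += 2;                   # assumption: skip the value just taken
        }
        else {
            $i++;
        }
    }
    return undef     if !@found;       # no match
    return $found[0] if @found == 1;   # single match: plain scalar
    return [@found];                   # several matches: array-ref
}

my @call    = (darryl => 'brother', darryl => 'other brother');
my $larry   = lookup_name('Larry',  0, @call);   # undef
my $darryls = lookup_name('Darryl', 0, @call);   # ['brother', 'other brother']
my $exact   = lookup_name('Darryl', 1, @call);   # undef when case matters
```

The three return shapes (undef, scalar, array-ref) are why callers of a function like this need to check ref() before treating a result as a single value.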
Note that although Params::Clean will let you mix named and positional arguments indiscriminately, that doesn't mean it's a good idea, of course. It's not uncommon to have one or a few positional args required at the beginning of a parameter list, followed by various (optional) named args. In particular, methods always have the object passed as the argument in position 0. It also might be reasonable sometimes to use fixed positions at the end of an arg list (since we can refer to them with negative positions). Trying to mix named and positional params in the middle of your args, though, is asking for confusion. (But many of the examples here do that for the sake of demonstrating how things work!)
POSN/NAME/FLAG identifiers
You can also explicitly identify the kind of parameter using the keywords POSN or NAME. This can be useful when you have, for example, keys that look like integers but that you want to treat as named keys.
tract(1=>money, 2=>show, 3=>'get ready', Four, go);
sub tract
{
my ($one, $two, $three, $four) = args NAME 1, 2, 3, four;
#==> money show get ready go
#Without the NAME identifier, the 1/2/3 would be interpreted as positions:
# $two would end up as "2" (the third element of @_), $three as "show", etc.
}
Conversely, you could use the POSN keyword to force parameters to be interpreted positionally. (Of course, most strings reduce to a numeric value of zero, which refers to the first position.)
Besides named parameters, you can also pass FLAGs to a function -- flags work like names, except that they do not take their value from the following element of @_; they simply become true if they are found. More exactly, flags are counted; a flag returns undef if it does not occur in @_, or returns the count of the number of times it was matched. (This allows you to handle flags such as a "verbose" switch that can have a differing effect depending on how many times it was used.)
scribe(black, white, red_all_over, black, jack, black);
sub scribe
{
my ($raid, $surrender, $rule, $britannia)=args FLAG qw/black white union jack/;
#==> 3 1 undef 1
}
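The counting behaviour of flags can be mimicked in plain Perl. The sketch below is an illustration only: count_flag is a made-up helper, not something Params::Clean provides, and it assumes case-insensitive matching as in the module's default.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Params::Clean.
sub count_flag {
    my ($flag, @list) = @_;
    my $count = grep { defined($_) && !ref($_) && lc($_) eq lc($flag) } @list;
    return $count || undef;    # undef, not 0, when the flag never appears
}

my @call   = qw(black white red_all_over black jack black);
my @counts = map { count_flag($_, @call) } qw(black white union jack);
# @counts is now (3, 1, undef, 1)
```

Because a missing flag comes back as undef rather than 0, code that does arithmetic on the count may want to supply a fallback value first.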
The identifiers (POSN, NAME, FLAG) can be mixed and repeated in any order, as desired. The default integer/string distinction applies only until the first identifier is encountered; once an identifier is used, it remains in effect until another identifier is found. (Well, except in the case of alternatives, as explained in the next section.)
Alternative parameter names
There may be situations where you want to mix different parameters together; that is, return all the args named "foo" and all the args named "bar" in one set, as though they were all named "foo" (or all named "bar"). You can specify alternatives that should be treated as synonymous by putting them in square brackets (i.e., using an array-ref). If a single match is found, it is grabbed; if there are more, they are all returned as an array-ref (or in the case of a flag, it will be incremented as many times as there are matches).
text(hey=>there, colour=>123, over=>here, color=>321);
sub text
{
my ($horses, $hues, $others)
=args [hey, hay], [colour, color], [4, 5];
#===> there [123, 321] [over, here]
}
As the example shows, this also works for positional parameters, so you can return multiple positions as a single arg too. Like any parameters, synonyms are by default positional (if numeric) or named (if not); they are also affected normally by any identifier (POSN/NAME/FLAG) that precedes them. If you specify an identifier inside the alternatives, the brackets provide a limited scope, so the identifier does not extend to any parameters outside the list of alternatives.
lime(alpha, Jack=>"B. Nimble", verbosity, verbosity);
sub lime
{
my ($start, $verb, $water_bearer, $pomp)
=args [0, FIRST], FLAG verbosity, [NAME Jack, Jill], pomposity;
#===> alpha 2 B. Nimble undef
}
Without the NAME identifier, "Jack" and "Jill" would be parsed as flags; if the NAME came in front of the opening bracket instead of inside it, "pomposity" would also be considered a NAME instead of a FLAG. (There's nothing to say a list of synonyms can't contain only one item; so you might say [FLAG foo] to identify that single parameter as a flag without affecting the parameters that follow it.)
The order of the synonyms is irrelevant; once keys are declared as alternatives for each other, Params::Clean sees no difference between them. All the args that match a given key or keys are returned in the order in which they occur in @_.
The REST
Another keyword args understands is REST, to return any elements of @_ that are left over after all the other kinds of parameters have been parsed. The leftovers are not grouped into an array-ref; they are simply returned as a list of items coming after the other args.
$I->conscious(earth, sky, plants, sun, fish, animals, holiday);
sub conscious
{
($self, @days[1..6], @sabbath) = args 0, 1..6, REST;
}
Although the REST identifier can appear anywhere in the call to args, the remaining arguments are always returned last. (If warnings are turned on, args will complain about REST not being specified last. (There wouldn't be any point to returning the leftover values in the middle of the other arguments anyway, since you don't know how many there are. (And if you really do know, then just use positionals instead.)))
Identifying args by type
As well as by name or position, args can also gather parameters by type. For instance, you can collect any array-refs passed to your function by asking for TYPE "ARRAY". TYPE checks the ref of each argument, so you can select any built-in reference (SCALAR, ARRAY, HASH, CODE, GLOB, REF), or the name of a class to grab all objects of a certain type.
#Assume we have created some filehandle objects with a module like IO::All
version($INPUT, $OUTPUT, some, random, stuff, $LOGFILE);
sub version
{
my ($files, @leftovers) = args TYPE "IO::All", REST;
#===> [$INPUT, $OUTPUT, $LOGFILE], some, random, stuff
}
TYPE can also take a code-ref for more complex conditions. Each argument will be passed to the code block, and it must return true or false according to whether the arg qualifies.
stance(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, oops, 13, 2048);
sub Even { my $n=shift; return $n && $n=~/^\d+$/ && $n%2==0 }
# check whether the given value looks like an int and is even
sub stance
{
my ($odds, $evens, @others)
= args TYPE sub {shift()%2}, TYPE \&Even, REST;
# one inline code-ref and one ref to a sub
#===> [1,3,5,7,9,13], [2,4,6,8,10,2048], oops
}
Note that since all the args are passed to our TYPE functions, that "oops" is going to cause a warning about not being numeric when the odd-number coderef simply attempts to % 2 it. The Even sub is better behaved: it first checks (with the regex) whether it's got something that looks like a number. Since you never know what kind of arguments might get passed in, TYPE blocks should always take appropriate precautions.
Also note that TYPE functions do not validate the arguments. Although the code block can be quite complex, it doesn't reject anything; args that don't pass the test are simply not collected for that parameter.
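To make the two forms of type test concrete, here is a stand-alone sketch in plain Perl. It does not call Params::Clean; collect_type and $is_odd are hypothetical names, and comparing ref() against the given string is simply the behaviour the text above describes, re-implemented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Params::Clean: gather the values that
# satisfy a type, given as a ref() string or as a code-ref predicate.
sub collect_type {
    my ($type, @list) = @_;
    my $test = ref($type) eq 'CODE'
        ? $type
        : sub { ref($_[0]) eq $type };
    return [ grep { $test->($_) } @list ];
}

# A defensive predicate: reject anything that is not a plain non-negative
# integer before doing arithmetic on it, so "oops" never reaches the %.
my $is_odd = sub {
    my ($value) = @_;
    return defined($value) && !ref($value)
        && $value =~ /^\d+$/ && $value % 2 == 1;
};

my $arrays = collect_type('ARRAY', [1, 2], 'text', { a => 1 }, [3]);
my $odds   = collect_type($is_odd, 1, 2, 3, 'oops', 13);   # [1, 3, 13]
```

Values that fail the test are left alone rather than rejected, which matches the point that a type test selects arguments but does not validate them.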
Lists
Absolute lists
It is possible to collect a LIST of arguments starting from a certain name or position, and grabbing all the args that follow it up to an ending name or position. If the end point cannot be found (e.g., we run out of args because there aren't any more, or because we've reached an arg that was already grabbed by some previous parameter), the list stops. If the end point is found, you can choose to include it in the list of args, or to exclude it (in which case, the list will consist of the args from the starting point to the one just before the end point).
dominant(some, stuff, Start=> C, G, A, E, F, C, End, something, else);
sub dominant
{
my ($notes, @rest) = args LIST Start<=>End, REST; # including end point
#===> [Start,C,G,A,E,F,C,End], some, stuff, something, else
my ($notes, @rest) = args LIST Start<=End, REST; # excluding end point
#===> [Start,C,G,A,E,F,C], some, stuff, End, something, else
}
The LIST keyword is followed by a parameter name or position to start from. An ending parameter is not required (the list will go until the end of the arg list, or until hitting an argument that was already collected). Use <=> after the starting parameter key to indicate that the following end-point should be included in the resulting list; use <= to indicate that it should not. (The starting argument is always included -- if you don't want it, you can always shift it off the front of the list later.)
Excluding the end-points from a list can be useful when you want to indicate that a list should stop where something else begins. The following example has three LISTs, where the end of one is the start of the next; if each list included its end-point, then the starting-point for the next list would already be used up, and args wouldn't see it.
query(SELECT=>@fields, FROM=>$table, WHERE=>@conditions);
sub query
{
my ($select, $from, $where)
= args LIST SELECT<=FROM, LIST FROM<=WHERE, LIST WHERE; #explicit endings
#===> [SELECT, @fields], [FROM, $table], [WHERE, @conditions]
# But this is not what we want -- the first list grabs everything:
= args LIST SELECT, LIST FROM, LIST WHERE; #oops!
#===> [SELECT, @fields, FROM, $table, WHERE, @conditions], undef, undef
my ($where, $from, $select) # note the reversed order
= args LIST WHERE, LIST FROM, LIST SELECT; #this is OK
#===> [WHERE, @conditions], [FROM, $table], [SELECT, @fields]
}
The middle part of the example shows that even though it's not necessary to specify an ending for a list, without one the argument-gathering might run amok. The last part illustrates how lists stop when they run out of ungathered args, even if the end-point hasn't been reached. By collecting the WHERE list first, the FROM list is forced to stop when it reaches the last arg preceding the WHERE, and similarly the SELECT list stops with the last element of @fields, since the subsequent FROM has already been used. (See also "Using up arguments".)
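The inclusive and exclusive end-points can be pictured as ordinary array slices. The following plain-Perl sketch is an illustration under simplifying assumptions: absolute_list is an invented helper, keys are matched as exact strings, and the "already used" bookkeeping that stops a real list early is left out entirely.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Params::Clean. It ignores the "used
# argument" bookkeeping that the real module applies.
sub absolute_list {
    my ($start, $end, $include_end, @list) = @_;

    my ($from) = grep { defined($list[$_]) && $list[$_] eq $start } 0 .. $#list;
    return undef unless defined $from;     # no starting point: no list

    my $to;
    if (defined $end) {
        ($to) = grep { defined($list[$_]) && $list[$_] eq $end }
                $from + 1 .. $#list;
    }

    my $last = !defined($to) ? $#list      # no end found: run to the end
             : $include_end  ? $to         # like <=>
             :                 $to - 1;    # like <=
    return [ @list[$from .. $last] ];
}

my @call = qw(some stuff Start C G A E F C End something else);
my $with_end    = absolute_list('Start', 'End', 1, @call);
my $without_end = absolute_list('Start', 'End', 0, @call);
my $open_ended  = absolute_list('Start', undef, 0, @call);
```

Here $with_end runs from Start through End, $without_end stops at the final C, and $open_ended swallows everything after Start, which is the "run amok" case described above.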
Relative lists
Specifying the starting and ending points for a list gives absolute bounds for the list. Lists can also be relative; that is, specifying the desired positions surrounding the starting key. The starting point itself represents position zero, and you can choose args before or after it. You can specify just a single position to grab, but usually you will want to grab several positions, using the "alternatives" syntax [brackets/array-ref]. (However, you may not specify NAMEd params or FLAGs; a relative list can collect only args positionally relative to the starting parameter.)
merge(black =>vs=> white);
sub merge
{
my ($spys) = args LIST vs=[-1, 1];
#===> [black, white] # -1=posn before "vs", +1=posn after "vs"
}
Use = after the starting point to specify exactly what positions to collect (include position 0 to grab the starting parameter too); use & followed by the positions to collect them as well as the starting point itself (without having to include position 0 explicitly); use ^ to collect positions but exclude the starting point itself (even if 0 is included in the positions given). This lets you say things like LIST Start ^ [-3..+3] instead of spelling it out explicitly without the 0: LIST Start = [-3, -2, -1, 1, 2, 3]. (The symbol used for the exclusive case is the same character that Perl uses for exclusive-or.)
due(First=>$a, $b, $c, Second=>$d, $e, Third=>$f);
sub due
{
my ($first, $second, $third)
= args LIST First=[1,2,3], LIST Second & 2, LIST Third^[-1..+1];
#===> [$a, $b, $c], [Second, $e], [$e, $f]
}
As shown, a relative list can take just a single position, in which case the brackets are optional: LIST Foo=2 or LIST Foo=[2].
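Relative positions are simply offsets added to the index of the starting key. The sketch below shows that arithmetic in plain Perl for the = and ^ forms only; relative_list is a hypothetical helper, and skipping offsets that fall outside the list is an assumption, since the behaviour in that case is not stated here.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Params::Clean. Only the "=" and "^"
# forms are sketched.
sub relative_list {
    my ($key, $mode, $offsets, @list) = @_;

    my ($at) = grep { defined($list[$_]) && $list[$_] eq $key } 0 .. $#list;
    return undef unless defined $at;

    my @offsets = ref($offsets) ? @$offsets : ($offsets);   # brackets optional
    @offsets = grep { $_ != 0 } @offsets if $mode eq '^';   # drop the key itself

    # Assumption: offsets that fall outside the list are skipped.
    my @indices = grep { $_ >= 0 && $_ <= $#list } map { $at + $_ } @offsets;
    return [ @list[@indices] ];
}

my @call   = qw(First a b c Second d e Third f);
my $first  = relative_list('First',  '=', [1, 2, 3], @call);   # [a, b, c]
my $third  = relative_list('Third',  '^', [-1 .. 1], @call);   # [e, f]
my $single = relative_list('Second', '=', 2,         @call);   # [e]
```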
General notes about lists
You can mix positionals and named parameters in the starting point for any list, or for the ending point of an absolute LIST in the expected way (using brackets/array-refs for alternatives):
let(foo, Color=> $red, $green, $blue, Begin=>@scrabble=>Stop, bar);
sub let
{
my ($rgb, $tiles, @rest)
= args LIST [Colour,Color]=[1,2,3], LIST [Start,Begin]<=>[Stop,-1], REST;
#===> [$red,$green,$blue], [Begin,@scrabble,Stop], foo, bar
}
(In this example, the second list will end when it finds the string Stop or reaches the last (-1) position; the first element of the list will be whichever parameter was found -- in this case, "Begin").
If the starting key for a list appears more than once, the first occurrence (that has not already been used) will match. So calling some_func(FOO=>a,b,c, FOO=>x,y,z) could produce two lists with, e.g., args LIST FOO=[1,2,3], LIST FOO<=>[-1].
Unlike the other kinds of parameter (which return a single scalar or an array-ref if multiple matches are found), lists always return an array-ref, even though it might contain only one arg. (Calling it a "list" implies you're expecting more than one result -- if you're not, you can simply use a NAME or POSN instead.) The exception is that if the list runs into a problem (e.g. cannot find a legitimate starting point), it will return undef.
Using up arguments
Every time an argument is found, Params::Clean marks it as used. Used arguments are not checked again, regardless of whether they could match other parameters or not.
side(left=>right);
sub side
{
my ($dextrous, $sinister, @others) = args NAME left, FLAG left, REST;
#===> right undef ()
#"left" was not found as a FLAG because it was already used as a NAME
# But...
my ($sinister, $dextrous, @others) = args FLAG left, NAME left, REST;
#===> 1 undef right
#now "left" was not found as a NAME because it was found first as a FLAG
}
Note that in the second case, the argument "right" was found as a leftover (REST), because it did not get collected by the other parameters. Since the "left" argument was found and used as a FLAG, it was no longer available to be used as a NAME, and so nothing happened to the arg (right) that it was meant to be a name for.
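The effect of marking arguments as used can be reproduced with a small amount of bookkeeping. The plain-Perl sketch below is only a model of the idea: take_name, take_flag and take_rest are invented helpers that share a hash of claimed indices, and exact string matching is assumed for brevity.

```perl
use strict;
use warnings;

# Hypothetical helpers, not part of Params::Clean. %$used records the
# indices that have already been claimed.
sub take_name {
    my ($key, $list, $used) = @_;
    for my $i (0 .. $#$list - 1) {
        next if $used->{$i} || $list->[$i] ne $key;
        $used->{$i} = $used->{$i + 1} = 1;   # claim the key and its value
        return $list->[$i + 1];
    }
    return undef;
}

sub take_flag {
    my ($key, $list, $used) = @_;
    my $count = 0;
    for my $i (0 .. $#$list) {
        next if $used->{$i} || $list->[$i] ne $key;
        $used->{$i} = 1;                     # claim only the flag itself
        $count++;
    }
    return $count || undef;
}

sub take_rest {
    my ($list, $used) = @_;
    return map { $list->[$_] } grep { !$used->{$_} } 0 .. $#$list;
}

my @call = (left => 'right');

my %used_a;                                             # NAME first, then FLAG
my $name_first = take_name('left', \@call, \%used_a);   # 'right'
my $flag_after = take_flag('left', \@call, \%used_a);   # undef
my @rest_a     = take_rest(\@call, \%used_a);           # ()

my %used_b;                                             # FLAG first, then NAME
my $flag_first = take_flag('left', \@call, \%used_b);   # 1
my $name_after = take_name('left', \@call, \%used_b);   # undef
my @rest_b     = take_rest(\@call, \%used_b);           # ('right')
```

Whichever lookup runs first claims the "left" element, so the order of the requests decides the outcome, just as in the side() example.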
It is possible to collect the same value more than once, however. This can happen when the parameter that args is searching for has not been used yet, even though an arg that parameter points to already has. For instance, the next example gets the $fh argument from all three parameters:
#Assume that $fh is a filehandle,
# and &handle() returns true when it identifies a filehandle
tend(Input=>$fh, Pipe "/dev/null");
sub tend
{
my ($file, $input, $pipe)=args TYPE \&handle, NAME Input, LIST Pipe=[-1, 1];
#===> $fh, $fh, [$fh, /dev/null]
}
First, args searches by type for any args that satisfy the handle() function, so it grabs $fh for the first parameter, $file. Next, args looks for an argument identified by the name Input; the first element of @_ is indeed "Input", so it gets the following element of @_. (That second element has already been used to get the $file, but the name has not yet been used, so it still qualifies. Once the name has been found, the collected arg is always what comes immediately after it -- for example, args will not grab the second element after the name just because the first value after was already used.) Finally, the relative list successfully identifies the Pipe label, so it takes the preceding and succeeding elements of @_ (relative positions -1 and +1). Again, once Pipe is found, it does not matter whether the values identified by the positions have been used already or not. (However, recall that for an absolute list, a used argument will stop processing the list, even if that means the list consists of nothing but the starting point.)
UIDs
Perl cannot tell a parameter name (or flag or list boundary) from any other argument passed to a subroutine. If someone passes an arg with a value of "date" to your sub (e.g., lunch(fruit=>"date", date=>"tomorrow")), and it is looking for a parameter called "date" (e.g., my ($when, $snack)=args 'date', 'fruit'), it will match the first occurrence (e.g., $when will find the first date string and get as its value what comes next, which is the second date). Plain strings are therefore safe as keys only when you can be sure that there will be no confusion; for example, because that arg will be caught as one of the positional params and thus ignored by any subsequent FLAG or NAME or LIST parts of the process.
Of course, it is difficult to guarantee that no such confusion will arise; even if the values that could be ambiguous don't make sense, you can't stop somebody from calling your function with nonsensical arguments! What is possible, though, is to avoid using ordinary strings for parameter names (or flags, etc.). The UID module is useful in this respect: it creates unique identifier objects that cannot be duplicated accidentally. (You can deliberately copy one, of course; but you cannot create separate UIDs that would match each other.) Thus if you use UIDs for your parameter flags, you do not have to worry about your caller (accidentally!) passing a value that could be a false positive.
use UID Stop; # create a unique ID
way(Delimiter=>"Stop", Stop "Morningside Crescent");
sub way
{
my ($tube, $telegram) = args Stop, Delimiter;
#===>"Morningside Crescent", "Stop"
}
When args is looking for the parameter name Stop, it will not find the plain string "Stop" -- only a UID object (in fact, the same UID object) will do. Note also that a UID doesn't (usually) require a comma between it and the following value.
Of course, if you are exporting a function for other packages to use, you will probably want to export any UIDs that go along with it (otherwise the UIDs will have to be fully-qualified to use them from another package, e.g., do_stuff(Some::Module::FOO $value)). The same considerations apply as for exporting any other subroutine -- allow the user control over what gets exported to avoid conflicts from different modules trying to export UIDs of the same name.
Params::Clean exports UIDs for its identifiers (NAME, POSN, FLAG, TYPE, REST, LIST) so that you can use them with the args function in your subroutines.
DIAGNOSTICS
- WARNING: attempt to use REST before last parameter
The REST keyword was not the last item passed to args. The leftover values are always returned after everything else, so REST should appear last to avoid confusion.
- WHOA: can't use other LISTs inside a LIST! Ignoring starting [or ending] param key: $key
- WHOA: can't use FLAGs or TYPEs inside a LIST! Ignoring starting [or ending] param key: $key
A LIST can take only named or positional parameters as the starting (or ending) point. Something like LIST [FLAG Foo] <=> [TYPE \&foo] will trigger a warning for either the starting or ending point (or both). An invalid starting point means nothing will be returned for the list (undef); an invalid ending point means that only the starting key will be returned; no other args will be collected.
- ERROR: couldn't find beginning of LIST starting with '$key'
- ERROR: couldn't find ending of LIST from $start to $end
The starting or ending parameter specified for a LIST could not be found. If the given parameter does appear somewhere in @_, the message will also say, "(probably already used up by another param!)" (meaning a previously-collected arg already marked that parameter as "used" -- see "Using up arguments"). If the starting point cannot be found, then nothing (undef) is returned for the list (surprisingly enough). If the ending point cannot be found, then everything else (not already collected) until the end of @_ will be grabbed by the list. To deliberately allow a list to run off the end of @_, make -1 (one of) the ending keys, or else do not specify an ending point at all.
- WARNING: attempt to use invalid TYPE
TYPE parameters must be the name of a class (a ref value), or a code-ref that can check each arg. Trying to use anything else as a TYPE (e.g. a plain number or string) will result in this error.
- WARNING: non-integral number $param will be interpreted as a named parameter
A number that's not an integer was found as a parameter key. Since positional params must be integers, the value will be interpreted as a NAMEd parameter. To avoid the error, explicitly mark the key using the NAME keyword.
- WARNING: Orphaned TYPE
A TYPE keyword was encountered without a following string or coderef, e.g., args 1,2, [TYPE];.
BUGS & OTHER ANNOYANCES
There are no known bugs at the moment. (That's what they all say!) Please report any problems you may find, or any other feedback, to <bug-params-clean at rt.cpan.org>, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Params-Clean.
Using args, variables are not right next to the parameter identifiers they are assigned from. It probably helps to line up the variables and the call to args if you have more than a few parameters, so that you can see what matches up with what:
my ($foo, $bar, $baz)
= args(foo, POSN -1, FLAG on);
Defaults must be set in a separate step after parsing the parameters with args (e.g., $foo||=$default;).
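One caveat when setting defaults this way is that ||= replaces any false value, including a legitimate 0 or empty string. The short sketch below contrasts it with the defined-or assignment //=, which is standard Perl from version 5.10 onward; using //= is a suggestion made here, not something this module's documentation prescribes, and the variable names are illustrative.

```perl
use strict;
use warnings;
use 5.010;    # the // and //= operators need Perl 5.10 or later

my $default = 10;

my $or_assigned = 0;
$or_assigned ||= $default;    # 0 is false, so it is replaced: now 10

my $dor_assigned = 0;
$dor_assigned //= $default;   # 0 is defined, so it is kept: still 0

my $missing;
$missing //= $default;        # undef is replaced: now 10
```

Since a parameter that was not passed comes back as undef, the defined-or form fills in only the values that are actually missing.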
@_ is aliased to the actual calling parameters, that is, changing @_ will change the original variables passed to the function. Variables assigned from a call to args are of course copies rather than aliases. @_ can be used directly, although if you're making the effort to use named parameters, you can require the caller to pass in references to the original variables where appropriate.
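The difference between an alias and a copy is ordinary Perl behaviour and can be seen without the module. In this minimal sketch the sub and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Assigning to an element of @_ writes through to the caller's variable.
sub via_alias { $_[0] = 'changed'; return }

# Copying @_ into a lexical first leaves the caller's variable untouched.
sub via_copy  { my ($copy) = @_; $copy = 'changed'; return }

my $aliased = 'original';
my $copied  = 'original';

via_alias($aliased);   # $aliased is now 'changed'
via_copy($copied);     # $copied is still 'original'
```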
The special identifiers (NAME, POSN, etc.) are UID objects, and UID objects are really functions, so NAME=>foo will not work; the => auto-quotes the preceding bareword, even when the "bareword" is really meant to call a sub. Fortunately, you can usually simply say NAME foo instead. See the documentation for UID for further details and caveats.
If a named parameter (or position) does not appear in the argument list, then args will return undef for it -- just as if someone had explicitly specified a parameter with that name and passed it a value of undef. Thus there is no way to tell the difference between a deliberate value of undef and a parameter that is simply missing altogether. However, you could force an extra argument of that name into @_ before parsing it with args; if the parameter was missing altogether, your dummy value will be the only one returned; if you get back multiple values, you know that others were explicitly passed for that parameter.
The examples given here use lots of barewords. Omitting all those quotation marks makes them look cleaner, but any real program, with use strict and use warnings in effect, will need to quote everything, even if it does add slightly to the clutter. Judicious use of => to quote the preceding word can help, as can defining UIDs.
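As a small illustration of that quoting advice, the following strict-safe call uses => to quote the keys and ordinary quotes for a stand-alone flag. The sub show_args is a hypothetical stand-in that just returns its arguments; it is not part of this module.

```perl
use strict;
use warnings;

# Under strict, bare identifiers such as colour or verbose are errors unless
# they sit on the left of =>, which quotes them automatically.
sub show_args { return @_ }

my @got = show_args(colour => 123, 'verbose', size => 'large');
# @got is ('colour', 123, 'verbose', 'size', 'large')
```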
LISTs cannot identify starting (or ending) points by TYPE. They probably should be able to.
Additional or more helpful diagnostics would be nice, and users should have more control over them.
To paraphrase Damian Conway: It shouldn't take hundreds and hundreds of lines to explain a package that was designed for intuitive ease of use!
RELATED MODULES
This module requires UID.pm and Devel::Caller::Perl.
METADATA
Copyright 2007 David Green, <plato at cpan.org>.
This module is free software; you may redistribute it or modify it under the same terms as Perl itself. See perlartistic.