NAME
Config::ReadAndCheck - Perl module for parsing generic config files that conform to a predefined line-by-line format.
Version 0.01
SYNOPSIS
# This code can be used to parse
# Windows-style INI files.
use strict;
use Config::ReadAndCheck;
my $FileName = shift or die "Usage: $0 <FileName>\n";
my %ParsINI = ();
# Lines starting with ';' or '#' and empty lines are ignored
$ParsINI{'Comment'} = {'Pattern' => '(?:\s*(?:(?:[\;\#]).*)*)',
                       'Type'    => 'ignore',
                      };
# Sections have to have a '[SectionName]' form.
# SectionName cannot be empty
# SectionName has to be unique
# The first line which is not a parameter definition ends the section
# Comments are allowed inside the section (!)
# At least one section has to be defined because there is no 'Default' definition
$ParsINI{'Section'} = {'Pattern'    => '\s*\[(.+)\]'.$ParsINI{'Comment'}->{'Pattern'},
                       'Type'       => 'UniqList',
                       'SubSection' => {'Params'  => {}, # Defined later
                                        'Comment' => $ParsINI{'Comment'},
                                       },
                      };
# Parameters have to have a 'ParamName=Value' form.
# All leading or trailing spaces are ignored.
# All spaces around the '=' sign are ignored
# ParamName can not contain '=' sign and can not be empty
# ParamName has to be unique in the section
# The default 'Process' function is used.
# Empty (no parameters) sections are allowed by the 'Default' definition
$ParsINI{'Section'}->{'SubSection'}->{'Params'} =
    {'Pattern' => '\s*([^\=]+?)\s*=\s*([^\s](?:.*[^\s])?)'.$ParsINI{'Comment'}->{'Pattern'},
     'Type'    => 'UniqList',
     'Default' => {},
    };
# Create the parser object.
# '%ParsINI' will be automatically checked for consistency
my $Parser = Config::ReadAndCheck->new('Params' => \%ParsINI)
    or die "Can not create the parser: $@\n";
# Parse the INI file. Parsing is case-insensitive by default
my $Result = $Parser->ParseFile($FileName)
    or die "Error parsing file \"$FileName\": $@";
# $Result will be a reference to a hash with the following structure:
#
# {'SectionName1' => {'ParamName1' => 'Value1',
#                     'ParamName2' => 'Value2',
#                     ...
#                    },
#  'SectionName2' => {'ParamName1' => 'Value1',
#                     'ParamName2' => 'Value2',
#                     ...
#                    },
#  ...
# }
print Config::ReadAndCheck::PrintList($Result, '', "\t");
DESCRIPTION
This module provides a way to easily create a parser for your own file format and check the parsed values on the fly.
The Config::ReadAndCheck methods
new(%Config)
-
Returns a reference to the Config::ReadAndCheck object. %Config is a hash containing configuration parameters. Configuration parameters are:
CaseSens
-
Optional parameter. If the value is 'true', input lines are matched case-sensitively. By default matching is case-insensitive.
Params
-
The value has to be a reference to a "section definition hash".
- Section definition
-
The structure is:
my $Params = {'ParamName1'   => $ParamDefinition1,
              'ParamName2'   => $ParamDefinition2,
              ...
              'EndOfSection' => $ParamDefinition3,
             };
'ParamName1', 'ParamName2' are the names of parameters. $ParamDefinition1, $ParamDefinition2 are references to "parameter definition hashes". 'EndOfSection' is a reserved parameter name (see 'SubSection').
Each parameter is represented in the result hash as a value whose key is the parameter name. The type of the value depends on the 'Type' field in the parameter definition (see below).
- Parameter definition hash
-
The structure is:
my $ParamDefinition = {'Pattern'    => 'The pattern string',
                       'Process'    => $ProcessSubroutine,
                       'Default'    => 'Value',
                       'Type'       => $ParamType,
                       'SubSection' => $RefToSection,
                      };
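As a rough illustration of how these fields fit together, the sketch below builds one parameter definition and applies it to a few lines by hand, in plain Perl. The parameter 'Port', the helper MatchLine() and the range check are assumptions made up for this example. MatchLine() only imitates the documented behaviour (newline stripping, '\A' and '\Z' anchoring, case-insensitive matching by default, the default 'Process' subroutine) and is not the module's code.

```perl
use strict;
use warnings;

# Hypothetical definition of a 'port = <number>' parameter.
my $PortDefinition = {
    'Pattern' => '\s*port\s*=\s*(\d+)\s*',
    # Custom check: accept only 1..65535. An empty list means
    # 'the line did not pass the check'.
    'Process' => sub {
        my $Port = $1;
        return () if $Port < 1 || $Port > 65535;
        return ($Port + 0);
    },
    'Type'    => 'UNIQ',
};

# Imitation of the documented matching rules (not the module's code).
sub MatchLine {
    my ($Definition, $Line, $CaseSens) = @_;
    $Line =~ s/\n//g;                                     # strip "\n"
    my $Pattern = '\A' . $Definition->{'Pattern'} . '\Z'; # anchor both ends
    my $Matched = $CaseSens ? ($Line =~ /$Pattern/) : ($Line =~ /$Pattern/i);
    return () unless $Matched;
    # $1, $2, ... are still set by the match above when 'Process' runs.
    my $Process = $Definition->{'Process'} || sub { return ($1, $2); };
    return $Process->();
}

my @Ok      = MatchLine($PortDefinition, "PORT = 8080\n");       # accepted
my @TooBig  = MatchLine($PortDefinition, "port = 70000\n");      # fails the check
my @NoMatch = MatchLine($PortDefinition, "port = 8080 extra\n"); # '\Z' rejects it
print scalar(@Ok), " ", scalar(@TooBig), " ", scalar(@NoMatch), "\n"; # prints "1 0 0"
```

The subroutine stored under 'Process' takes no arguments because the capture variables of the successful match are still visible when it is called.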
'Pattern'
-
The Perl regexp used to recognise an input line as belonging to this parameter. The '\A' escape sequence is added to the beginning of the pattern and '\Z' is added to the end automatically. '\n' symbols are stripped from the line before evaluation. Matching is case-sensitive or case-insensitive according to the 'CaseSens' parameter of the new() method.
'Process'
-
A reference to your own parameter check and preparation subroutine. The subroutine is called without parameters; $1, $2 and so on are set according to your pattern. The Process subroutine has to return a list of one or two elements. The number and type of the elements depend on $ParamDefinition->{'Type'} and $ParamDefinition->{'SubSection'}. An empty list means 'the line did not pass the check'. If $ParamDefinition->{'Process'} is not defined, the simple sub{return ($1,$2);} subroutine is used.
'Default'
-
The default value for this parameter. The type of this property depends on 'Type' and 'SubSection'. If $ParamDefinition->{'Default'} does not exist, the parameter is treated as 'required' (see CheckRequired()).
'Type'
-
The type of the parameter. Can be 'UNIQ', 'UNIQLIST', 'LIST', or 'IGNORE'.
UNIQ
-
Only one line matching the pattern may appear in the input.
The UNIQ value is represented as a single value in the result hash: the first value in the list returned by the process subroutine.
UNIQLIST
-
Multiple lines matching the pattern may appear in the input. The process subroutine for this type has to return a list of two values.
The UNIQLIST parameter is represented in the result hash as a reference to a hash. The first value returned by the process subroutine is used as the hash key and the second as the value. So the first value returned by the process subroutine has to be unique for each line matching the pattern.
LIST
-
Multiple lines matching the pattern may appear in the input.
The LIST parameter is represented in the result hash as a reference to an array. The first value returned by the process subroutine is pushed onto this array for each line matching the pattern, so no uniqueness is required.
IGNORE
-
Multiple lines matching the pattern may appear in the input. None of them appear in the result hash.
-
Type names are case-insensitive.
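To make the four types concrete, the sketch below stores values in a result hash the way the descriptions above suggest. StoreValue() and the parameter names are hypothetical, and returning a false value on a uniqueness violation is an assumption of this sketch. Pattern matching and error reporting are left out.

```perl
use strict;
use warnings;

# Hypothetical helper: store the list returned by a 'Process' subroutine
# in %$Result according to the parameter type. Returns false on a
# uniqueness violation or an unknown type (an assumption of this sketch).
sub StoreValue {
    my ($Result, $Name, $Type, @Values) = @_;
    $Type = uc($Type);                          # type names are case-insensitive
    if ($Type eq 'IGNORE') {
        return 1;                               # nothing is stored
    }
    elsif ($Type eq 'UNIQ') {
        return 0 if exists $Result->{$Name};    # only one line allowed
        $Result->{$Name} = $Values[0];
    }
    elsif ($Type eq 'UNIQLIST') {
        return 0 if exists $Result->{$Name}{$Values[0]};   # key must be unique
        $Result->{$Name}{$Values[0]} = $Values[1];
    }
    elsif ($Type eq 'LIST') {
        push @{ $Result->{$Name} }, $Values[0]; # duplicates are fine
    }
    else {
        return 0;
    }
    return 1;
}

my %Result;
StoreValue(\%Result, 'HostName', 'Uniq',     'alpha');
StoreValue(\%Result, 'Option',   'UniqList', 'color', 'red');
StoreValue(\%Result, 'Option',   'UniqList', 'size',  'big');
StoreValue(\%Result, 'Include',  'List',     'a.conf');
StoreValue(\%Result, 'Include',  'List',     'a.conf');
StoreValue(\%Result, 'Comment',  'Ignore',   'whatever');

# %Result now holds:
#   HostName => 'alpha',
#   Option   => { color => 'red', size => 'big' },
#   Include  => [ 'a.conf', 'a.conf' ],
```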
'SubSection'
-
A reference to a "section definition".
If 'SubSection' is defined, the 'Process' subroutine has to return a reference to a hash (possibly empty) as the first list element for types UNIQ and LIST, and as the second element for type UNIQLIST.
A parameter with 'SubSection' defined is represented in the result hash as a reference to a hash. The parameters defined in the 'SubSection' are represented in this hash under their own names.
The level of recursion is not limited, but loops are prohibited.
If 'SubSection' is defined, the line matching 'Pattern' is treated as the first line of the enclosed section.
The line matching the 'EndOfSection' parameter of the enclosed section is treated as the last line of the subsection. The next line is checked against the parameters of the parent section.
If no 'EndOfSection' parameter is defined in the subsection, the first line which does not correspond to any of the subsection parameters is treated as the end of the subsection. This line is also passed to the parent section for verification.
Note: the root section can also have an 'EndOfSection' parameter. It is treated as an 'EOF'.
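The rule for leaving a subsection can be sketched in a few lines of plain Perl. ParseSection() below is a hypothetical, heavily simplified walk that honours only 'Pattern', 'SubSection' and 'EndOfSection'. It stores every plain parameter as a list and every subsection under the first captured value, which is a simplification and not the module's behaviour.

```perl
use strict;
use warnings;

# Hypothetical sketch: consume lines from @$Lines while they match a
# parameter of %$Section. Returns 0 when a line matches nothing here
# (the line is left for the parent section), 1 otherwise.
sub ParseSection {
    my ($Section, $Lines, $Result) = @_;
    LINE: while (@$Lines) {
        my $Line = $Lines->[0];
        foreach my $Name (sort keys %$Section) {
            my $Pattern = '\A' . $Section->{$Name}{'Pattern'} . '\Z';
            next unless $Line =~ /$Pattern/i;
            my $Key = $1;
            shift @$Lines;                           # the line is consumed
            return 1 if $Name eq 'EndOfSection';     # explicit last line
            if (my $Sub = $Section->{$Name}{'SubSection'}) {
                # The matching line opens the enclosed section.
                $Result->{$Name}{$Key} = {};
                ParseSection($Sub, $Lines, $Result->{$Name}{$Key});
            }
            else {
                push @{ $Result->{$Name} }, $Key;
            }
            next LINE;
        }
        # Nothing matched: this section ends and the line stays in
        # @$Lines so that the parent section can examine it.
        return 0;
    }
    return 1;                                        # ran out of input
}

my %Definition = (
    'Host' => {
        'Pattern'    => 'host\s+(\S+)',
        'SubSection' => { 'Alias' => { 'Pattern' => '\s+alias\s+(\S+)' } },
    },
    'Include' => { 'Pattern' => 'include\s+(\S+)' },
);
my @Lines = ('host alpha', '  alias a1', '  alias a2', 'include x.conf', 'host beta');
my %Result;
ParseSection(\%Definition, \@Lines, \%Result);
# %Result: Host    => { alpha => { Alias => ['a1', 'a2'] }, beta => {} },
#          Include => ['x.conf']
```

Here 'include x.conf' matches nothing in the 'Host' subsection, so it closes that subsection and is then recognised by the root section.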
The new() method returns a reference to the Config::ReadAndCheck object or an 'undef' value.
Result()
-
Returns a copy of the current parsing result as a hash, or as a reference to a hash in scalar context.
Reset()
-
Removes all data related to the previous parsing from memory and makes the parser ready for the next parsing. Returns 'undef'.
Params()
-
Returns a copy of the 'Params' hash currently in use, or a reference to a hash in scalar context.
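Result() and Params() are described as returning either a hash or a hash reference depending on the calling context. In Perl such behaviour is commonly built on wantarray. The sketch below shows that idiom with a hypothetical CopyOf() function and made-up data; it is not taken from the module's source, and the copy here is only a shallow one.

```perl
use strict;
use warnings;

my %Stored = ('HostName' => 'alpha', 'Port' => 8080);

# Hypothetical accessor: a shallow copy as a hash in list context,
# or a reference to that copy in scalar context.
sub CopyOf {
    my %Copy = %Stored;
    return wantarray ? %Copy : \%Copy;
}

my %AsHash = CopyOf();      # list context
my $AsRef  = CopyOf();      # scalar context

$AsRef->{'Port'} = 1;       # changes only the copy
print "$AsHash{'HostName'} $Stored{'Port'}\n";   # prints "alpha 8080"
```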
Parse(ARRAYREF)
-
ARRAYREF is a reference to an array of strings to be parsed.
When the EOF or 'EndOfSection' is reached, Parse() calls the CheckRequired() function to check whether all required parameters were defined.
ParseFile($FileName)
-
$FileName is the name of the file to be parsed.
When the EOF or 'EndOfSection' is reached, Parse() calls the CheckRequired() function to check whether all required parameters were defined.
Parse(CODEREF)
-
CODEREF is a reference to a subroutine which returns the next string. &{CODEREF}() is called without any parameters and has to return a string. It has to return an undef value as an 'EOF' indication.
Parse($String)
-
$String is just a string. The tokens (.*\n) are extracted from this string and parsed one by one.
ParseIncremental($Str)
-
$Str is a string to be parsed.
ParseIncremental() returns the name of the parameter which $Str corresponds to, or undef if the string is unrecognised; $@ will then contain an error message.
CheckRequired()
-
CheckRequired() checks whether all the parameters which do not have a 'Default' value exist in the $Result hash. It stops at the first one which does not and returns a false value; the $@ variable then contains the string "Required parameter PARAMETER_NAME is not defined".
If no 'problematic' parameters are found, CheckRequired() returns a true value.
In addition to this check, CheckRequired() sets all undefined parameters to their 'Default' value.
PrintList($List, $Prefix, $Shift)
-
$List is a hash or array reference. $Prefix is a prefix substring. $Shift is a 'shift' substring (see below).
PrintList() produces a string which contains a human-readable representation of a hash or array.
It descends into any hash or array references in the list. Embedded records are shifted by one or more $Shift substrings, according to the nesting level.
All records are preceded by the $Prefix substring.
For example
my @Tst = ('p 0.0',
           'p 0.1',
           {'p 1.0' => 'here',
            'p 1.1' => 'here too',
            'p 1.2' => ['p 2.0', 'p 2.1']},
           'p 0.3');
print PrintList(\@Tst, '>', "\t");
will print
>[0] = "p 0.0"
>[1] = "p 0.1"
>[2] hash
>	'p 1.1' => "here too"
>	'p 1.2' array
>		[0] = "p 2.0"
>		[1] = "p 2.1"
>	'p 1.0' => "here"
>[3] = "p 0.3"
All methods, including new(), return an 'undef' value in case of error. The $@ variable will contain an error explanation.
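The Parse(CODEREF) form described above needs a subroutine that hands out one string per call and undef at the end. A closure over an array is one simple way to build such a source. MakeLineSource() is a hypothetical helper written for this sketch, and the closing comment only repeats the documented call form together with the error convention.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a subroutine that yields the given lines
# one by one and then undef, the documented 'EOF' indication.
sub MakeLineSource {
    my @Lines = @_;
    return sub { return shift @Lines; };
}

my $NextLine = MakeLineSource("[General]\n", "Name = Value\n");

# Drain the source the way a consumer of the CODEREF would.
my @Seen;
while (defined(my $Line = $NextLine->())) {
    push @Seen, $Line;
}
print scalar(@Seen), " lines read\n";   # prints "2 lines read"

# With a parser object, a source of this kind would be passed as:
#   my $Result = $Parser->Parse(MakeLineSource(@Text))
#       or die "Parse error: $@\n";
```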
EXPORT
None by default.
AUTHOR
Daniel Podolsky, <tpaba@cpan.org>