NAME
X500::DN::Marpa - Parse X.500 DNs
Synopsis
#!/usr/bin/env perl
use strict;
use warnings;
use X500::DN::Marpa ':constants';
# -----------
my(%count) = (fail => 0, success => 0, total => 0);
my($parser) = X500::DN::Marpa -> new
(
options => long_descriptors,
);
my(@text) =
(
q||,
q|1.4.9=2001|,
q|cn=Nemo,c=US|,
q|cn=Nemo, c=US|,
q|cn = Nemo, c = US|,
q|cn=John Doe, o=Acme, c=US|,
q|cn=John Doe, o=Acme\\, Inc., c=US|,
q|x= |,
q|x=\\ |,
q|x = \\ |,
q|x=\\ \\ |,
q|x=\\#\"\\41|,
q|x=#616263|,
q|SN=Lu\C4\8Di\C4\87|, # 'Lučić'.
q|foo=FOO + bar=BAR + frob=FROB, baz=BAZ|,
q|UID=jsmith,DC=example,DC=net|,
q|OU=Sales+CN=J. Smith,DC=example,DC=net|,
q|CN=James \"Jim\" Smith\, III,DC=example,DC=net|,
q|CN=Before\0dAfter,DC=example,DC=net|,
q|1.3.6.1.4.1.1466.0=#04024869|,
q|UID=nobody@example.com,DC=example,DC=com|,
q|CN=John Smith,OU=Sales,O=ACME Limited,L=Moab,ST=Utah,C=US|,
);
my($result);
for my $text (@text)
{
    $count{total}++;
    print "# $count{total}. Parsing |$text|. \n";
    $result = $parser -> parse($text);
    print "Parse result: $result (0 is success)\n";
    if ($result == 0)
    {
        $count{success}++;
        for my $item ($parser -> stack -> print)
        {
            print "$$item{type} = $$item{value}. count = $$item{count}. \n";
        }
        print 'DN: ', $parser -> dn, ". \n";
        print 'OpenSSL DN: ', $parser -> openssl_dn, ". \n";
    }
    print '-' x 50, "\n";
}
$count{fail} = $count{total} - $count{success};
print "\n";
print 'Statistics: ', join(', ', map{"$_ => $count{$_}"} sort keys %count), ". \n";
See scripts/synopsis.pl.
This is part of the printout of synopsis.pl:
# 3. Parsing |cn=Nemo,c=US|.
Parse result: 0 (0 is success)
commonName = Nemo. count = 1.
countryName = US. count = 1.
DN: countryName=US,commonName=Nemo.
OpenSSL DN: commonName=Nemo+countryName=US.
--------------------------------------------------
...
--------------------------------------------------
# 13. Parsing |x=#616263|.
Parse result: 0 (0 is success)
x = #616263. count = 1.
DN: x=#616263.
OpenSSL DN: x=#616263.
--------------------------------------------------
...
--------------------------------------------------
# 15. Parsing |foo=FOO + bar=BAR + frob=FROB, baz=BAZ|.
Parse result: 0 (0 is success)
foo = FOO+bar=BAR+frob=FROB. count = 3.
baz = BAZ. count = 1.
DN: baz=BAZ,foo=FOO+bar=BAR+frob=FROB.
OpenSSL DN: foo=FOO+bar=BAR+frob=FROB+baz=BAZ.
If you set the option return_hex_as_chars, as discussed in the "FAQ", then case 13 will print:
# 13. Parsing |x=#616263|.
Parse result: 0 (0 is success)
x = abc. count = 1.
DN: x=abc.
OpenSSL DN: x=abc.
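The conversion behind case 13 can be illustrated with core Perl alone. This is a minimal sketch, not the module's own code: hex_value_to_chars() is a hypothetical helper, and the rule that malformed values are returned unchanged is an assumption made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of X500::DN::Marpa: mimic the documented
# effect of return_hex_as_chars on a single attribute value.
sub hex_value_to_chars
{
    my($value) = @_;

    # Only a '#' followed by an even number of hex digits is converted.
    # Assumption: anything else is returned unchanged.
    if ($value =~ /^#((?:[0-9A-Fa-f]{2})+)$/)
    {
        # pack('H*', ...) turns each pair of hex digits into one byte.
        return pack('H*', $1);
    }

    return $value;
}

print hex_value_to_chars('#616263'), "\n"; # Prints abc.
print hex_value_to_chars('Nemo'), "\n";    # Prints Nemo.
```

The hex digits 61, 62 and 63 are the ASCII codes for 'a', 'b' and 'c', which is why 'x=#616263' is reported as 'x=abc'.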
Description
X500::DN::Marpa provides a Marpa::R2-based parser for X.500 Distinguished Names.
It is based on RFC4514: Lightweight Directory Access Protocol (LDAP): String Representation of Distinguished Names.
Distributions
This module is available as a Unix-style distro (*.tgz).
See http://savage.net.au/Perl-modules/html/installing-a-module.html for help on unpacking and installing distros.
Installation
Install X500::DN::Marpa as you would any Perl module:
Run:
cpanm X500::DN::Marpa
or run:
sudo cpan X500::DN::Marpa
or unpack the distro, and then either:
perl Build.PL
./Build
./Build test
sudo ./Build install
or:
perl Makefile.PL
make (or dmake or nmake)
make test
make install
Constructor and Initialization
new() is called as my($parser) = X500::DN::Marpa -> new(k1 => v1, k2 => v2, ...).
It returns a new object of type X500::DN::Marpa.
Key-value pairs accepted in the parameter list (see corresponding methods for details [e.g. "options([$bit_string])"]):
- o options => $bit_string
-
This allows you to turn on various options.
Default: 0 (nothing is fatal).
See the "FAQ" for details.
- o text => $a_string_to_be_parsed
-
Default: ''.
Methods
bnf()
Returns a string containing the grammar used by this module.
dn()
Returns the RDNs, separated by commas, as a single string, in the reverse of the order in which the RDNs appear in the input text.
The order reversal is discussed in section 2.1 of RFC4514.
Hence 'cn=Nemo, c=US' is returned as 'countryName=US,commonName=Nemo' (when the long_descriptors option is used), and as 'c=US,cn=Nemo' by default.
See also "openssl_dn()".
error_message()
Returns the last error or warning message set.
Error messages always start with 'Error: '. Messages never end with "\n".
Parsing error strings is not a good idea, even though this module's format for them is fixed.
See "error_number()".
error_number()
Returns the last error or warning number set.
Warnings have values < 0, and errors have values > 0.
If the value is > 0, the message has the prefix 'Error: ', and if the value is < 0, it has the prefix 'Warning: '. If this is not the case, it's a reportable bug.
Possible values for error_number() and error_message():
- o 0 => ""
-
This is the default value.
- o 1/-1 => "Parse exhausted"
-
If "error_number()" returns 1, it's an error, and if it returns -1 it's a warning.
You can set the option exhaustion_is_fatal to make it fatal.
- o 2/-2 => "Ambiguous parse. Status: $status. Terminals expected: a, b, ..."
-
This message is only produced when the parse is ambiguous.
If "error_number()" returns 2, it's an error, and if it returns -2 it's a warning.
You can set the option ambiguity_is_fatal to make it fatal.
See "error_message()".
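The sign convention above can be captured in a few lines. This is only a sketch in core Perl: describe_error_number() is a hypothetical helper, and the %meaning table simply restates the two messages listed above.

```perl
use strict;
use warnings;

# Messages keyed by the absolute value of the number, as listed above.
my(%meaning) = (1 => 'Parse exhausted', 2 => 'Ambiguous parse');

# Hypothetical helper: interpret a value as returned by error_number().
sub describe_error_number
{
    my($number) = @_;

    return 'OK' if ($number == 0);

    # Negative values are warnings, positive values are errors.
    my($kind) = $number < 0 ? 'Warning' : 'Error';
    my($text) = $meaning{abs $number} || 'Unknown';

    return "$kind: $text";
}

for my $number (0, -1, 1, -2, 2)
{
    printf "%2d => %s\n", $number, describe_error_number($number);
}
```

In a real script the number would come from $parser -> error_number after a call to parse(). As noted under "error_message()", test the number rather than the message text.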
new()
See "Constructor and Initialization" for details on the parameters accepted by "new()".
openssl_dn()
Returns the RDNs, separated by pluses, as a single string, in the same order as the RDNs appear in the input text.
Hence 'cn=Nemo, c=US' is returned as 'commonName=Nemo+countryName=US' (when the long_descriptors option is used), and as 'cn=Nemo+c=US' by default.
See also "dn()".
options([$bit_string])
Here, the [] indicate an optional parameter.
Get or set the option flags.
For typical usage, see scripts/synopsis.pl.
See the "FAQ" for details.
'options' is a parameter to "new()". See "Constructor and Initialization" for details.
parse([$string])
Here, the [] indicate an optional parameter.
This is the only method the user needs to call. All data can be supplied when calling "new()".
You can of course call other methods (e.g. "text([$string])") after calling "new()" but before calling parse().
Note: If a string is passed to parse(), it takes precedence over any string passed to new(text => $string), and over any string passed to "text([$string])". Further, the string passed to parse() is passed to "text([$string])", meaning any subsequent call to text() returns the string passed to parse().
See scripts/synopsis.pl.
Returns 0 for success and 1 for failure.
If the value is 1, you should call "error_number()" to find out what happened.
rdn($n)
Returns a string containing the $n-th RDN, or returns '' if $n is out of range.
$n counts from 1.
If the input is 'UID=nobody@example.com,DC=example,DC=com', rdn(1) returns 'uid=nobody@example.com'. Note the lower-case 'uid'.
See t/dn.t.
rdn_count($n)
Returns a string containing the $n-th RDN's count (multivalue indicator), or returns 0 if $n is out of range.
$n counts from 1.
If the input is 'UID=nobody@example.com,DC=example,DC=com', rdn_count(1) returns 1.
If the input is 'foo=FOO+bar=BAR+frob=FROB, baz=BAZ', rdn_count(1) returns 3.
Not to be confused with "rdn_number()".
See t/dn.t.
rdn_number()
Returns the number of RDNs, which may be 0.
If the input is 'UID=nobody@example.com,DC=example,DC=com', rdn_number() returns 3.
Not to be confused with "rdn_count($n)".
See t/dn.t.
rdn_type($n)
Returns a string containing the $n-th RDN's attribute type, or returns '' if $n is out of range.
$n counts from 1.
If the input is 'UID=nobody@example.com,DC=example,DC=com', rdn_type(1) returns 'uid'.
See t/dn.t.
rdn_types($n)
Returns an array containing all the attribute types in the $n-th (possibly multivalued) RDN, or returns () if $n is out of range.
$n counts from 1.
If the DN is 'foo=FOO+bar=BAR+frob=FROB, baz=BAZ', rdn_types(1) returns ('foo', 'bar', 'frob').
See t/dn.t.
rdn_value($n)
Returns a string containing the $n-th RDN's attribute value, or returns '' if $n is out of range.
$n counts from 1.
If the input is 'UID=nobody@example.com,DC=example,DC=com', rdn_value(1) returns 'nobody@example.com'.
See t/dn.t.
rdn_values($type)
Returns an array containing the RDN attribute values for the attribute type $type, or ().
If the input is 'UID=nobody@example.com,DC=example,DC=com', rdn_values('DC') returns ('example', 'com').
See t/dn.t.
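To make the lookup by type concrete, here is a minimal sketch that works on plain data shaped like the stack elements described in the "FAQ". It is not the module's implementation: values_for_type() is a hypothetical stand-in for rdn_values($type), and lower-casing the requested type is an assumption, made because short names are stored in lower-case by default.

```perl
use strict;
use warnings;

# Data shaped like the documented stack for
# 'UID=nobody@example.com,DC=example,DC=com'.
my(@stack) =
(
    {count => 1, type => 'uid', value => 'nobody@example.com'},
    {count => 1, type => 'dc',  value => 'example'},
    {count => 1, type => 'dc',  value => 'com'},
);

# Hypothetical stand-in for rdn_values($type).
sub values_for_type
{
    my($stack, $type) = @_;

    # Assumption: compare in lower-case, matching the default stored form.
    $type = lc $type;

    return map { $_->{value} } grep { $_->{type} eq $type } @$stack;
}

print join(', ', values_for_type(\@stack, 'DC') ), "\n"; # Prints example, com.
```

Values come back in input order, and a type which does not occur yields the empty list.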
stack()
Returns an object of type Set::Array, which holds the parsed data.
Obviously, it only makes sense to call stack() after calling "parse([$string])".
The structure of elements in this stack is documented in the "FAQ".
See scripts/tiny.pl for sample code.
text([$string])
Here, the [] indicate an optional parameter.
Get or set a string to be parsed.
'text' is a parameter to "new()". See "Constructor and Initialization" for details.
FAQ
Where are the error messages and numbers described?
See "error_message()" and "error_number()".
See also "What are the possible values for the 'options' parameter to new()?" below.
What is the structure in RAM of the parsed data?
The module outputs a stack, which is an object of type Set::Array. See "stack()".
Elements in this stack are in the same order as the RDNs are in the input string.
The "dn()" method returns the RDNs, separated by commas, as a single string in the reverse order, whereas "openssl_dn()" separates them by pluses and uses the original order.
Each element of this stack is a hashref, with these (key => value) pairs:
- o count => $number
-
The number of attribute types and values in a (possibly multivalued) RDN.
$number counts from 1.
- o type => $type
-
The attribute type.
- o value => $value
-
The attribute value.
Sample DNs:
Note: These examples assume the default case of the option long_descriptors (discussed below) not being used.
If the input is 'UID=nobody@example.com,DC=example,DC=com', the stack will contain:
- o [0]: {count => 1, type => 'uid', value => 'nobody@example.com'}
- o [1]: {count => 1, type => 'dc', value => 'example'}
- o [2]: {count => 1, type => 'dc', value => 'com'}
If the input is 'foo=FOO+bar=BAR+frob=FROB, baz=BAZ', the stack will contain:
- o [0]: {count => 3, type => 'foo', value => 'FOO+bar=BAR+frob=FROB'}
- o [1]: {count => 1, type => 'baz', value => 'BAZ'}
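The relationship between the stack and the strings returned by "dn()" and "openssl_dn()" can be sketched with plain data. This is an illustration only, not the module's code: rdn_strings() is a hypothetical helper, and the @stack below is typed in by hand to match the second sample above.

```perl
use strict;
use warnings;

# Data shaped like the documented stack for
# 'foo=FOO+bar=BAR+frob=FROB, baz=BAZ'.
my(@stack) =
(
    {count => 3, type => 'foo', value => 'FOO+bar=BAR+frob=FROB'},
    {count => 1, type => 'baz', value => 'BAZ'},
);

# Hypothetical helper: render each stack element as 'type=value'.
sub rdn_strings
{
    my($stack) = @_;

    return map { $_->{type} . '=' . $_->{value} } @$stack;
}

# Commas and reversed order, the shape documented for dn().
my($dn) = join(',', reverse rdn_strings(\@stack) );

# Pluses and original order, the shape documented for openssl_dn().
my($openssl_dn) = join('+', rdn_strings(\@stack) );

print "DN: $dn\n";
print "OpenSSL DN: $openssl_dn\n";
```

The two strings match the output shown for case 15 in the "Synopsis".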
Sample Code:
A typical script uses code like this (copied from scripts/tiny.pl):
$result = $parser -> parse($text);
print "Parse result: $result (0 is success)\n";
if ($result == 0)
{
    for my $item ($parser -> stack -> print)
    {
        print "$$item{type} = $$item{value}. count = $$item{count}. \n";
    }
}
If the option long_descriptors is not used in the call to "new()", then $$item{type} defaults to lower-case. RFC4512 says 'Short names are case insensitive....'. I've chosen to use lower-case as the canonical form output by my code.
If that option is used, then some types are output in mixed case. The list of such types is given in section 3 (at the top of page 6) in RFC4514. This document is one of those listed in "References", below.
For a discussion of the mixed-case descriptors, see "What are the possible values for the 'options' parameter to new()?" below.
An extended list of such long descriptors is given in section 4 (page 25) in RFC4519. Note that 'streetAddress' is missing from this list.
What are the possible values for the 'options' parameter to new()?
Firstly, to make these constants available, you must say:
use X500::DN::Marpa ':constants';
Secondly, more detail on errors and warnings can be found at "error_number()".
Thirdly, for usage of these option flags, see scripts/synopsis.pl and scripts/tiny.pl.
Now the flags themselves:
- o nothing_is_fatal
-
This is the default.
nothing_is_fatal has the value of 0.
- o print_errors
-
Print error messages if this flag is set.
print_errors has the value of 1.
- o print_warnings
-
Print various warnings if this flag is set:
- o The ambiguity status and terminals expected, if the parse is ambiguous
- o See "error_number()" for other warnings which might be printed
-
Ambiguity is not, in and of itself, an error. But see the ambiguity_is_fatal option, below.
It's tempting to call this option warnings, but Perl already has use warnings, so I didn't.
print_warnings has the value of 2.
- o print_debugs
-
Print extra stuff if this flag is set.
print_debugs has the value of 4.
- o ambiguity_is_fatal
-
This makes "error_number()" return 2 rather than -2.
ambiguity_is_fatal has the value of 8.
- o exhaustion_is_fatal
-
This makes "error_number()" return 1 rather than -1.
exhaustion_is_fatal has the value of 16.
- o long_descriptors
-
This makes the type key in the output stack's elements contain long descriptor names rather than abbreviations.
For example, if the input was 'cn=Nemo,c=US', the output stack would contain, by default, i.e. without setting this option:
- o [0]: {count => 1, type => 'cn', value => 'Nemo'}
- o [1]: {count => 1, type => 'c', value => 'US'}
However, if this option is set, the output will contain:
- o [0]: {count => 1, type => 'commonName', value => 'Nemo'}
- o [1]: {count => 1, type => 'countryName', value => 'US'}
long_descriptors has the value of 32.
- o return_hex_as_chars
-
This triggers extra processing of attribute values which start with '#':
- o The value is assumed to consist entirely of hex digits (after the '#' is discarded)
- o The digits are converted 2 at-a-time into a string of (presumably ASCII) characters
- o These characters are concatenated into a single string, which becomes the new value
So, if this option is not used, 'x=#616263' is parsed as {type => 'x', value => '#616263'}, but if the option is used, you get {type => 'x', value => 'abc'}.
return_hex_as_chars has the value of 64.
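Since each flag is a distinct power of 2, several flags can presumably be combined with bitwise OR before being passed as the options parameter. The sketch below assumes exactly that and uses core Perl only: the %flag hash is a local copy of the values listed above (real code would import the constants instead), and flag_names() is a hypothetical helper for decoding a combined value.

```perl
use strict;
use warnings;

# Local copy of the documented flag values, so the sketch runs without
# the module. Real code would say: use X500::DN::Marpa ':constants';
my(%flag) =
(
    nothing_is_fatal    =>  0,
    print_errors        =>  1,
    print_warnings      =>  2,
    print_debugs        =>  4,
    ambiguity_is_fatal  =>  8,
    exhaustion_is_fatal => 16,
    long_descriptors    => 32,
    return_hex_as_chars => 64,
);

# Assumption: flags are combined with bitwise OR.
my($options) = $flag{long_descriptors} | $flag{return_hex_as_chars};

# Hypothetical helper: list the names of the flags set in $bits.
sub flag_names
{
    my($bits) = @_;

    return grep { $flag{$_} && ($bits & $flag{$_}) }
        sort { $flag{$a} <=> $flag{$b} } keys %flag;
}

print "options = $options\n";                     # Prints options = 96.
print join(', ', flag_names($options) ), "\n";
```

With the real constants imported, the equivalent call would be new(options => long_descriptors | return_hex_as_chars), again assuming that OR-combination is supported.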
Does this package support Unicode/UTF8?
Handling of UTF8 is discussed in one of the RFCs listed in "References", below.
What is the homepage of Marpa?
http://savage.net.au/Marpa.html.
That page has a long list of links.
How do I run author tests?
This runs both standard and author tests:
shell> perl Build.PL; ./Build; ./Build authortest
References
I found RFCs 4514 and 4512 to be the most directly relevant ones.
RFC Index: The Index. Just search for 'LDAP'.
RFC4514: Lightweight Directory Access Protocol (LDAP): String Representation of Distinguished Names.
RFC4512: Lightweight Directory Access Protocol (LDAP): Directory Information Models.
RFC4517: Lightweight Directory Access Protocol (LDAP): Syntaxes and Matching Rules.
RFC5234: Augmented BNF for Syntax Specifications: ABNF.
RFC3629: UTF-8, a transformation format of ISO 10646.
RFC4514 also discusses UTF8. Search it using the string 'UTF-8'.
See Also
X500::DN. Note: This module is based on the obsolete RFC2253.
Machine-Readable Change Log
The file Changes was converted into Changelog.ini by Module::Metadata::Changes.
Version Numbers
Version numbers < 1.00 represent development versions. From 1.00 up, they are production versions.
Repository
https://github.com/ronsavage/X500-DN-Marpa
Support
Email the author, or log a bug on RT:
https://rt.cpan.org/Public/Dist/Display.html?Name=X500::DN::Marpa.
Author
X500::DN::Marpa was written by Ron Savage <ron@savage.net.au> in 2015.
Marpa's homepage: http://savage.net.au/Marpa.html.
My homepage: http://savage.net.au/.
Copyright
Australian copyright (c) 2015, Ron Savage.
All Programs of mine are 'OSI Certified Open Source Software';
you can redistribute them and/or modify them under the terms of
The Artistic License 2.0, a copy of which is available at:
http://opensource.org/licenses/alphabetical.