NAME

X500::DN::Marpa - Parse X.500 DNs

Synopsis

#!/usr/bin/env perl

use strict;
use warnings;

use X500::DN::Marpa ':constants';

# -----------

my(%count)  = (fail => 0, success => 0, total => 0);
my($parser) = X500::DN::Marpa -> new
(
	options => long_descriptors,
);
my(@text) =
(
	q||,
	q|1.4.9=2001|,
	q|cn=Nemo,c=US|,
	q|cn=Nemo, c=US|,
	q|cn = Nemo, c = US|,
	q|cn=John Doe, o=Acme, c=US|,
	q|cn=John Doe, o=Acme\\, Inc., c=US|,
	q|x= |,
	q|x=\\ |,
	q|x = \\ |,
	q|x=\\ \\ |,
	q|x=\\#\"\\41|,
	q|x=#616263|,
	q|SN=Lu\C4\8Di\C4\87|,		# 'Lučić'.
	q|foo=FOO + bar=BAR + frob=FROB, baz=BAZ|,
	q|UID=jsmith,DC=example,DC=net|,
	q|OU=Sales+CN=J.  Smith,DC=example,DC=net|,
	q|CN=James \"Jim\" Smith\, III,DC=example,DC=net|,
	q|CN=Before\0dAfter,DC=example,DC=net|,
	q|1.3.6.1.4.1.1466.0=#04024869|,
	q|UID=nobody@example.com,DC=example,DC=com|,
	q|CN=John Smith,OU=Sales,O=ACME Limited,L=Moab,ST=Utah,C=US|,
);

my($result);

for my $text (@text)
{
	$count{total}++;

	print "# $count{total}. Parsing |$text|. \n";

	$result = $parser -> parse($text);

	print "Parse result: $result (0 is success)\n";

	if ($result == 0)
	{
		$count{success}++;

		for my $item ($parser -> stack -> print)
		{
			print "$$item{type} = $$item{value}. count = $$item{count}. \n";
		}

		print 'DN:         ', $parser -> dn, ". \n";
		print 'OpenSSL DN: ', $parser -> openssl_dn, ". \n";
	}

	print '-' x 50, "\n";
}

$count{fail} = $count{total} - $count{success};

print "\n";
print 'Statistics: ', join(', ', map{"$_ => $count{$_}"} sort keys %count), ". \n";

See scripts/synopsis.pl.

This is part of the printout of synopsis.pl:

# 3. Parsing |cn=Nemo,c=US|.
Parse result: 0 (0 is success)
commonName = Nemo. count = 1.
countryName = US. count = 1.
DN:         countryName=US,commonName=Nemo.
OpenSSL DN: commonName=Nemo+countryName=US.
--------------------------------------------------
...
--------------------------------------------------
# 13. Parsing |x=#616263|.
Parse result: 0 (0 is success)
x = #616263. count = 1.
DN:         x=#616263.
OpenSSL DN: x=#616263.
--------------------------------------------------
...
--------------------------------------------------
# 15. Parsing |foo=FOO + bar=BAR + frob=FROB, baz=BAZ|.
Parse result: 0 (0 is success)
foo = FOO+bar=BAR+frob=FROB. count = 3.
baz = BAZ. count = 1.
DN:         baz=BAZ,foo=FOO+bar=BAR+frob=FROB.
OpenSSL DN: foo=FOO+bar=BAR+frob=FROB+baz=BAZ.

If you set the option return_hex_as_chars, as discussed in the "FAQ", then case 13 will print:

# 13. Parsing |x=#616263|.
Parse result: 0 (0 is success)
x = abc. count = 1.
DN:         x=abc.
OpenSSL DN: x=abc.

Description

X500::DN::Marpa provides a Marpa::R2-based parser for parsing X.500 Distinguished Names.

It is based on RFC4514: Lightweight Directory Access Protocol (LDAP): String Representation of Distinguished Names.

Distributions

This module is available as a Unix-style distro (*.tgz).

See http://savage.net.au/Perl-modules/html/installing-a-module.html for help on unpacking and installing distros.

Installation

Install X500::DN::Marpa as you would any Perl module:

Run:

cpanm X500::DN::Marpa

or run:

sudo cpan X500::DN::Marpa

or unpack the distro, and then either:

perl Build.PL
./Build
./Build test
sudo ./Build install

or:

perl Makefile.PL
make (or dmake or nmake)
make test
make install

Constructor and Initialization

new() is called as my($parser) = X500::DN::Marpa -> new(k1 => v1, k2 => v2, ...).

It returns a new object of type X500::DN::Marpa.

Key-value pairs accepted in the parameter list (see corresponding methods for details [e.g. "options([$bit_string])"]):

o options => $bit_string

This allows you to turn on various options.

Default: 0 (nothing is fatal).

See the "FAQ" for details.

o text => $a_string_to_be_parsed

Default: ''.

Methods

bnf()

Returns a string containing the grammar used by this module.

dn()

Returns the RDNs, separated by commas, as a single string, in the reverse of the order in which the RDNs appear in the input text.

The order reversal is discussed in section 2.1 of RFC4514.

Hence 'cn=Nemo, c=US' is returned as 'countryName=US,commonName=Nemo' (when the long_descriptors option is used), and as 'c=US,cn=Nemo' by default.

See also "openssl_dn()".

error_message()

Returns the last error or warning message set.

Error messages always start with 'Error: '. Messages never end with "\n".

Parsing error strings is not a good idea, even though this module's format for them is fixed.

See "error_number()".

error_number()

Returns the last error or warning number set.

Warnings have values < 0, and errors have values > 0.

If the value is > 0, the message has the prefix 'Error: ', and if the value is < 0, it has the prefix 'Warning: '. If this is not the case, it's a reportable bug.

Possible values for error_number() and error_message():

o 0 => ""

This is the default value.

o 1/-1 => "Parse exhausted"

If "error_number()" returns 1, it's an error, and if it returns -1 it's a warning.

You can set the option exhaustion_is_fatal to make it fatal.

o 2/-2 => "Ambiguous parse. Status: $status. Terminals expected: a, b, ..."

This message is only produced when the parse is ambiguous.

If "error_number()" returns 2, it's an error, and if it returns -2 it's a warning.

You can set the option ambiguity_is_fatal to make it fatal.

See "error_message()".
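
The sign convention above means a caller never needs to parse the message text. A minimal sketch in plain Perl, where classify_error_number() is a hypothetical helper (not part of X500::DN::Marpa) which looks only at the sign of a value as returned by "error_number()":

```perl
use strict;
use warnings;

# Hypothetical helper, not part of X500::DN::Marpa.
# Maps a value, as returned by error_number(), to a label,
# using only the documented sign convention.
sub classify_error_number
{
	my($number) = @_;

	return 'ok'      if ($number == 0);
	return 'warning' if ($number < 0);
	return 'error';
}

for my $number (0, -1, 1, -2, 2)
{
	print "$number => ", classify_error_number($number), "\n";
}
```

In a real script the argument would be $parser -> error_number, called after "parse([$string])".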

new()

See "Constructor and Initialization" for details on the parameters accepted by "new()".

openssl_dn()

Returns the RDNs, separated by pluses, as a single string, in the same order as the RDNs appear in the input text.

Hence 'cn=Nemo, c=US' is returned as 'commonName=Nemo+countryName=US' (when the long_descriptors option is used), and as 'cn=Nemo+c=US' by default.

See also "dn()".
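
The two methods differ only in the order of the RDNs and the separator. A minimal sketch in plain Perl, assuming the RDNs are already available as a list of 'type=value' strings in input order (the @rdn array is illustrative data, not something the module exposes under that name):

```perl
use strict;
use warnings;

# RDNs in input order, as for 'cn=Nemo, c=US' with long_descriptors set.
my(@rdn) = ('commonName=Nemo', 'countryName=US');

# dn() style: Reverse order, comma-separated.
my($dn) = join(',', reverse @rdn);

# openssl_dn() style: Input order, plus-separated.
my($openssl_dn) = join('+', @rdn);

print "DN:         $dn\n";
print "OpenSSL DN: $openssl_dn\n";
```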

options([$bit_string])

Here, the [] indicate an optional parameter.

Get or set the option flags.

For typical usage, see scripts/synopsis.pl.

See the "FAQ" for details.

'options' is a parameter to "new()". See "Constructor and Initialization" for details.
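
The flag values listed in the "FAQ" are distinct powers of 2, so presumably several flags can be combined with bitwise OR and tested with bitwise AND. A minimal sketch, using stand-in constants which mirror the documented values; with the module installed you would import the real ones via use X500::DN::Marpa ':constants' instead of defining them:

```perl
use strict;
use warnings;

# Stand-ins mirroring the documented flag values.
# Do not define these if you import ':constants' from the module.
use constant
{
	print_errors        => 1,
	print_warnings      => 2,
	long_descriptors    => 32,
	return_hex_as_chars => 64,
};

# Combine flags with bitwise OR.
my($options) = long_descriptors | return_hex_as_chars;

print "options = $options\n";

# Test for a flag with bitwise AND.
print "long_descriptors is set\n" if ($options & long_descriptors);
print "print_errors is not set\n" if (! ($options & print_errors) );
```

The resulting value is what would be passed as new(options => $options) or options($options).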

parse([$string])

Here, the [] indicate an optional parameter.

This is the only method the user needs to call. All data can be supplied when calling "new()".

You can of course call other methods (e.g. "text([$string])" ) after calling "new()" but before calling parse().

Note: If a string is passed to parse(), it takes precedence over any string passed to new(text => $string), and over any string passed to "text([$string])". Further, the string passed to parse() is passed to "text([$string])", meaning any subsequent call to text() returns the string passed to parse().

See scripts/synopsis.pl.

Returns 0 for success and 1 for failure.

If the value is 1, you should call "error_number()" to find out what happened.

rdn($n)

Returns a string containing the $n-th RDN, or returns '' if $n is out of range.

$n counts from 1.

If the input is 'UID=nobody@example.com,DC=example,DC=com', rdn(1) returns 'uid=nobody@example.com'. Note the lower-case 'uid'.

See t/dn.t.

rdn_count($n)

Returns a string containing the $n-th RDN's count (multivalue indicator), or returns 0 if $n is out of range.

$n counts from 1.

If the input is 'UID=nobody@example.com,DC=example,DC=com', rdn_count(1) returns 1.

If the input is 'foo=FOO+bar=BAR+frob=FROB, baz=BAZ', rdn_count(1) returns 3.

Not to be confused with "rdn_number()".

See t/dn.t.

rdn_number()

Returns the number of RDNs, which may be 0.

If the input is 'UID=nobody@example.com,DC=example,DC=com', rdn_number() returns 3.

Not to be confused with "rdn_count($n)".

See t/dn.t.

rdn_type($n)

Returns a string containing the $n-th RDN's attribute type, or returns '' if $n is out of range.

$n counts from 1.

If the input is 'UID=nobody@example.com,DC=example,DC=com', rdn_type(1) returns 'uid'.

See t/dn.t.

rdn_types($n)

Returns an array containing all the attribute types in the $n-th (possibly multivalued) RDN, or returns () if $n is out of range.

$n counts from 1.

If the DN is 'foo=FOO+bar=BAR+frob=FROB, baz=BAZ', rdn_types(1) returns ('foo', 'bar', 'frob').

See t/dn.t.
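
To illustrate what a multivalued RDN contains, here is a naive sketch in plain Perl which recovers the types and the count from the text of one RDN. It is not the module's implementation: split_rdn() is a hypothetical helper which splits on every '+' and on the first '=' of each part, so it ignores the escaping rules of RFC4514 which the real grammar handles.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of X500::DN::Marpa.
# Naively splits one RDN into [type, value] pairs.
# Does not handle escaped '+' or '=' characters.
sub split_rdn
{
	my($rdn) = @_;

	return map{[split(/=/, $_, 2)]} split(/\+/, $rdn);
}

my(@pair)  = split_rdn('foo=FOO+bar=BAR+frob=FROB');
my(@type)  = map{$$_[0]} @pair;
my($count) = scalar @pair;

print 'Types: ', join(', ', @type), ". Count: $count. \n";
```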

rdn_value($n)

Returns a string containing the $n-th RDN's attribute value, or returns '' if $n is out of range.

$n counts from 1.

If the input is 'UID=nobody@example.com,DC=example,DC=com', rdn_value(1) returns 'nobody@example.com'.

See t/dn.t.

rdn_values($type)

Returns an array containing the RDN attribute values for the attribute type $type, or returns () if no RDN has that type.

If the input is 'UID=nobody@example.com,DC=example,DC=com', rdn_values('DC') returns ('example', 'com').

See t/dn.t.

stack()

Returns an object of type Set::Array, which holds the parsed data.

Obviously, it only makes sense to call stack() after calling "parse([$string])".

The structure of elements in this stack is documented in the "FAQ".

See scripts/tiny.pl for sample code.

text([$string])

Here, the [] indicate an optional parameter.

Get or set a string to be parsed.

'text' is a parameter to "new()". See "Constructor and Initialization" for details.

FAQ

Where are the error messages and numbers described?

See "error_message()" and "error_number()".

See also "What are the possible values for the 'options' parameter to new()?" below.

What is the structure in RAM of the parsed data?

The module outputs a stack, which is an object of type Set::Array. See "stack()".

Elements in this stack are in the same order as the RDNs are in the input string.

The "dn()" method returns the RDNs, separated by commas, as a single string in the reverse order, whereas "openssl_dn()" separates them by pluses and uses the original order.

Each element of this stack is a hashref, with these (key => value) pairs:

o count => $number

The number of attribute types and values in a (possibly multivalued) RDN.

$number counts from 1.

o type => $type

The attribute type.

o value => $value

The attribute value.

Sample DNs:

Note: These examples assume the default case of the option long_descriptors (discussed below) not being used.

If the input is 'UID=nobody@example.com,DC=example,DC=com', the stack will contain:

o [0]: {count => 1, type => 'uid', value => 'nobody@example.com'}
o [1]: {count => 1, type => 'dc', value => 'example'}
o [2]: {count => 1, type => 'dc', value => 'com'}

If the input is 'foo=FOO+bar=BAR+frob=FROB, baz=BAZ', the stack will contain:

o [0]: {count => 3, type => 'foo', value => 'FOO+bar=BAR+frob=FROB'}
o [1]: {count => 1, type => 'baz', value => 'BAZ'}
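
Given that structure, lookups such as those described under "Methods" amount to simple walks over the elements. A sketch in plain Perl over a hand-built copy of the first sample above; the @stack array and values_for_type() are illustrative only, since the real stack is a Set::Array object and the module's own code may differ:

```perl
use strict;
use warnings;

# Hand-built copy of the stack for 'UID=nobody@example.com,DC=example,DC=com'.
my(@stack) =
(
	{count => 1, type => 'uid', value => 'nobody@example.com'},
	{count => 1, type => 'dc',  value => 'example'},
	{count => 1, type => 'dc',  value => 'com'},
);

# Hypothetical helper: Collect the values whose type matches.
# The requested type is lower-cased, since by default the stack
# stores types in lower-case.
sub values_for_type
{
	my($type, @element) = @_;

	return map{$$_{value} } grep{$$_{type} eq lc $type} @element;
}

my(@dc) = values_for_type('DC', @stack);

print 'Number of RDNs: ', scalar @stack, ". \n";
print 'DC values: ', join(', ', @dc), ". \n";
```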

Sample Code:

A typical script uses code like this (copied from scripts/tiny.pl):

$result = $parser -> parse($text);

print "Parse result: $result (0 is success)\n";

if ($result == 0)
{
	for my $item ($parser -> stack -> print)
	{
		print "$$item{type} = $$item{value}. count = $$item{count}. \n";
	}
}

If the option long_descriptors is not used in the call to "new()", then $$item{type} defaults to lower-case. RFC4512 says 'Short names are case insensitive....'. I've chosen to use lower-case as the canonical form output by my code.

If that option is used, then some types are output in mixed case. The list of such types is given in section 3 (at the top of page 6) in RFC4514. This document is one of those listed in "References", below.

For a discussion of the mixed-case descriptors, see "What are the possible values for the 'options' parameter to new()?" below.

An extended list of such long descriptors is given in section 4 (page 25) in RFC4519. Note that 'streetAddress' is missing from this list.

What are the possible values for the 'options' parameter to new()?

Firstly, to make these constants available, you must say:

use X500::DN::Marpa ':constants';

Secondly, more detail on errors and warnings can be found at "error_number()".

Thirdly, for usage of these option flags, see scripts/synopsis.pl and scripts/tiny.pl.

Now the flags themselves:

o nothing_is_fatal

This is the default.

nothing_is_fatal has the value of 0.

o print_errors

Print error messages if this flag is set.

print_errors has the value of 1.

o print_warnings

Print various warnings if this flag is set:

o The ambiguity status and terminals expected, if the parse is ambiguous
o See "error_number()" for other warnings which might be printed

Ambiguity is not, in and of itself, an error. But see the ambiguity_is_fatal option, below.

It's tempting to call this option warnings, but Perl already has use warnings, so I didn't.

print_warnings has the value of 2.

o print_debugs

Print extra stuff if this flag is set.

print_debugs has the value of 4.

o ambiguity_is_fatal

This makes "error_number()" return 2 rather than -2.

ambiguity_is_fatal has the value of 8.

o exhaustion_is_fatal

This makes "error_number()" return 1 rather than -1.

exhaustion_is_fatal has the value of 16.

o long_descriptors

This makes the type key in the output stack's elements contain long descriptor names rather than abbreviations.

For example, if the input was 'cn=Nemo,c=US', the output stack would contain, by default, i.e. without setting this option:

o [0]: {count => 1, type => 'cn', value => 'Nemo'}
o [1]: {count => 1, type => 'c', value => 'US'}

However, if this option is set, the output will contain:

o [0]: {count => 1, type => 'commonName', value => 'Nemo'}
o [1]: {count => 1, type => 'countryName', value => 'US'}

long_descriptors has the value of 32.

o return_hex_as_chars

This triggers extra processing of attribute values which start with '#':

o The value is assumed to consist entirely of hex digits (after the '#' is discarded)
o The digits are converted 2 at-a-time into a string of (presumably ASCII) characters
o These characters are concatenated into a single string, which becomes the new value

So, if this option is not used, 'x=#616263' is parsed as {type => 'x', value => '#616263'}, but if the option is used, you get {type => 'x', value => 'abc'}.

return_hex_as_chars has the value of 64.
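
The conversion steps listed under return_hex_as_chars can be sketched with Perl's built-in pack(). This illustrates the conversion only and is not the module's code: hex_to_chars() is a hypothetical name, and leaving untouched any value which is not '#' followed by an even number of hex digits is an assumption of this sketch, not documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of X500::DN::Marpa.
sub hex_to_chars
{
	my($value) = @_;

	# Discard the '#', then convert the hex digits 2 at-a-time.
	if ($value =~ /^#((?:[0-9A-Fa-f]{2})+)$/)
	{
		return pack('H*', $1);
	}

	# Assumption: Anything else is returned unchanged.
	return $value;
}

print hex_to_chars('#616263'), "\n"; # Prints abc.
```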

Does this package support Unicode/UTF8?

Handling of UTF-8 is discussed in RFC4514 and RFC3629, both listed in "References", below.

What is the homepage of Marpa?

http://savage.net.au/Marpa.html.

That page has a long list of links.

How do I run author tests?

This runs both standard and author tests:

shell> perl Build.PL; ./Build; ./Build authortest

References

I found RFCs 4514 and 4512 to be the most directly relevant ones.

RFC Index: The Index. Just search for 'LDAP'.

RFC4514: Lightweight Directory Access Protocol (LDAP): String Representation of Distinguished Names.

RFC4512: Lightweight Directory Access Protocol (LDAP): Directory Information Models.

RFC4517: Lightweight Directory Access Protocol (LDAP): Syntaxes and Matching Rules.

RFC5234: Augmented BNF for Syntax Specifications: ABNF.

RFC3629: UTF-8, a transformation format of ISO 10646.

RFC4514 also discusses UTF8. Search it using the string 'UTF-8'.

See Also

X500::DN. Note: This module is based on the obsolete RFC2253.

Machine-Readable Change Log

The file Changes was converted into Changelog.ini by Module::Metadata::Changes.

Version Numbers

Version numbers < 1.00 represent development versions. From 1.00 up, they are production versions.

Repository

https://github.com/ronsavage/X500-DN-Marpa

Support

Email the author, or log a bug on RT:

https://rt.cpan.org/Public/Dist/Display.html?Name=X500::DN::Marpa.

Author

X500::DN::Marpa was written by Ron Savage <ron@savage.net.au> in 2015.

Marpa's homepage: http://savage.net.au/Marpa.html.

My homepage: http://savage.net.au/.

Copyright

Australian copyright (c) 2015, Ron Savage.

All Programs of mine are 'OSI Certified Open Source Software';
you can redistribute them and/or modify them under the terms of
The Artistic License 2.0, a copy of which is available at:
http://opensource.org/licenses/alphabetical.