NAME
XML::PP - A simple XML parser
VERSION
Version 0.04
SYNOPSIS
use XML::PP;
my $parser = XML::PP->new();
my $xml = '<note id="1"><to priority="high">Tove</to><from>Jani</from><heading>Reminder</heading><body importance="high">Don\'t forget me this weekend!</body></note>';
my $tree = $parser->parse($xml);
print $tree->{name}; # 'note'
print $tree->{children}[0]->{name}; # 'to'
DESCRIPTION
You almost certainly do not need this module. For most tasks, use XML::Simple or XML::LibXML. XML::PP exists only for the most lightweight of scenarios where you can't get one of those modules to install, for example, CI/CD machines running Windows that get stuck with https://stackoverflow.com/questions/11468141/cant-load-c-strawberry-perl-site-lib-auto-xml-libxml-libxml-dll-for-module-x.
XML::PP is a simple, lightweight XML parser written in pure Perl. It does not rely on external libraries like XML::LibXML and is suitable for small XML parsing tasks. This module supports basic XML document parsing, including namespace handling, attributes, and text nodes.
METHODS
new
my $parser = XML::PP->new();
my $parser = XML::PP->new(strict => 1);
my $parser = XML::PP->new(warn_on_error => 1);
Creates a new XML::PP object. It can take several optional arguments:
strict - If set to true, the parser dies when it encounters unknown entities or unescaped ampersands.
warn_on_error - If true, the parser emits warnings for unknown or malformed XML entities. This is enabled automatically if strict is enabled.
logger - Used for warnings and traces. It can be an object that understands warn() and trace() messages, such as a Log::Log4perl or Log::Any object, a reference to code, a reference to an array, or a filename.
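As a rough sketch of how strict might be combined with error handling: since the parser is documented to die on unescaped ampersands when strict is set, the call to parse can be wrapped in eval. The sample XML string and the variable names here are illustrative only, and the wording of the error message is not specified by this document.

```perl
use strict;
use warnings;
use XML::PP;

# strict => 1 asks the parser to die on unknown entities or unescaped ampersands
my $parser = XML::PP->new(strict => 1);

# Illustrative input containing an unescaped ampersand
my $xml = '<note><body>Fish & chips</body></note>';

# Wrap parse() in eval so a strict-mode failure can be handled by the caller
my $tree = eval { $parser->parse($xml) };
my $outcome = defined $tree ? 'parsed' : 'rejected';

print "Input was $outcome\n";
print "Reason: $@" if $outcome eq 'rejected' && $@;
```

Without strict, the same input would presumably be parsed on a best-effort basis, with warn_on_error controlling whether a warning is emitted.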
parse
my $tree = $parser->parse($xml_string);
Parses the XML string and returns a tree structure representing the XML content. The returned structure is a hash reference with the following fields:
name - The tag name of the node.
ns - The namespace prefix (if any).
ns_uri - The namespace URI (if any).
attributes - A hash reference of attributes.
children - An array reference of child nodes (either text nodes or further elements).
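The fields above lend themselves to a simple recursive walk. The following is a minimal sketch that works on a hand-built tree rather than calling the parser; the outline helper is hypothetical (not part of XML::PP), and representing text nodes as { text => ... } is an assumption taken from the collapse_structure examples in this document.

```perl
use strict;
use warnings;

# Hand-built tree in the documented shape; in real use this would come
# from $parser->parse($xml). Text nodes as { text => ... } is an assumption.
my $tree = {
    name       => 'note',
    attributes => { id => '1' },
    children   => [
        { name => 'to',   attributes => { priority => 'high' }, children => [ { text => 'Tove' } ] },
        { name => 'from', attributes => {},                     children => [ { text => 'Jani' } ] },
    ],
};

# Hypothetical helper: returns one indented line per node in the tree
sub outline {
    my ($node, $depth) = @_;
    $depth //= 0;
    my $indent = '  ' x $depth;

    # A node without a name is treated as a text node
    return ("$indent\"$node->{text}\"") if !defined $node->{name};

    # Element node: tag name followed by its attributes in sorted order
    my $attrs = $node->{attributes} || {};
    my $label = join ' ', $node->{name}, map { "$_=$attrs->{$_}" } sort keys %$attrs;

    my @lines = ("$indent$label");
    push @lines, outline($_, $depth + 1) for @{ $node->{children} || [] };
    return @lines;
}

my @lines = outline($tree);
print "$_\n" for @lines;
```

Sorting the attribute names keeps the output stable, since Perl hashes have no guaranteed key order.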
collapse_structure
Collapse an XML-like structure into a simplified hash (like XML::Simple).
use XML::PP;
my $input = {
    name => 'note',
    children => [
        { name => 'to', children => [ { text => 'Tove' } ] },
        { name => 'from', children => [ { text => 'Jani' } ] },
        { name => 'heading', children => [ { text => 'Reminder' } ] },
        { name => 'body', children => [ { text => 'Don\'t forget me this weekend!' } ] },
    ],
    attributes => { id => 'n1' },
};
my $result = collapse_structure($input);
# Output:
# {
#     note => {
#         to => 'Tove',
#         from => 'Jani',
#         heading => 'Reminder',
#         body => 'Don\'t forget me this weekend!',
#     }
# }
The collapse_structure subroutine takes a nested hash structure (representing an XML-like data structure) and collapses it into a simplified hash in which each child element's name is a key and its text content is the corresponding value. In the example above, the final result is wrapped in a note key, which contains a hash of all child elements.
This subroutine is particularly useful for flattening XML-like data into a more manageable hash format, suitable for further processing or display.
collapse_structure accepts a single argument:
$node (Required) - A hash reference representing a node with the following structure:
{
    name => 'element_name',    # Name of the element (e.g., 'note', 'to', etc.)
    children => [              # List of child elements
        { name => 'child_name', children => [{ text => 'value' }] },
        ...
    ],
    attributes => { ... },     # Optional attributes for the element
    ns_uri => ... ,            # Optional namespace URI
    ns => ... ,                # Optional namespace
}
The children key holds an array of child elements. Each child element may have its own name and text, and the function will collapse all text values into key-value pairs.
The subroutine returns a hash reference that represents the collapsed structure, where the top-level key is note and its value is another hash containing the child elements' names as keys and their corresponding text values as values.
For example:
{
    note => {
        to => 'Tove',
        from => 'Jani',
        heading => 'Reminder',
        body => 'Don\'t forget me this weekend!',
    }
}
- Basic Example:
Given the following input structure:
my $input = {
    name => 'note',
    children => [
        { name => 'to', children => [ { text => 'Tove' } ] },
        { name => 'from', children => [ { text => 'Jani' } ] },
        { name => 'heading', children => [ { text => 'Reminder' } ] },
        { name => 'body', children => [ { text => 'Don\'t forget me this weekend!' } ] },
    ],
};
Calling collapse_structure will return:
{
    note => {
        to => 'Tove',
        from => 'Jani',
        heading => 'Reminder',
        body => 'Don\'t forget me this weekend!',
    }
}
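To make the collapsing step concrete, here is a minimal stand-alone sketch of the idea. It is not XML::PP's implementation: collapse_sketch is a hypothetical helper, it assumes the result is keyed on the root node's name, and it ignores attributes, repeated element names, and deeper nesting, all of which the module may handle differently.

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration only; XML::PP's own
# collapse_structure may treat attributes, repeats and nesting differently.
sub collapse_sketch {
    my ($node) = @_;
    my %collapsed;
    for my $child (@{ $node->{children} || [] }) {
        next unless defined $child->{name};    # skip bare text nodes
        # Join the text of the child's text nodes into a single value
        $collapsed{ $child->{name} } =
            join '', map { $_->{text} // '' } @{ $child->{children} || [] };
    }
    # Assumption: the outer key is the name of the node passed in
    return { $node->{name} => \%collapsed };
}

my $input = {
    name     => 'note',
    children => [
        { name => 'to',   children => [ { text => 'Tove' } ] },
        { name => 'from', children => [ { text => 'Jani' } ] },
    ],
};

my $result = collapse_sketch($input);
print "$result->{note}{to}\n";
```

A fuller version would need a policy for elements that appear more than once, for example collecting their values into an array reference.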
_parse_node
my $node = $self->_parse_node($xml_ref, $nsmap);
Recursively parses an individual XML node. This method is used internally by the parse method. It handles the parsing of tags, attributes, text nodes, and child elements. It also manages namespaces and handles self-closing tags.
AUTHOR
Nigel Horne, <njh at nigelhorne.com>
SEE ALSO
SUPPORT
This module is provided as-is without any warranty.
LICENSE AND COPYRIGHT
Copyright 2025 Nigel Horne.
Usage is subject to licence terms.
The licence terms of this software are as follows:
Personal single user, single computer use: GPL2
All other users (including Commercial, Charity, Educational, Government) must apply in writing for a licence for use from Nigel Horne at the above e-mail.