NAME

XML::Handler::Composer - Another XML printer/writer/generator

SYNOPSIS

use XML::Handler::Composer;

my $composer = new XML::Handler::Composer ( [OPTIONS] );

DESCRIPTION

XML::Handler::Composer is similar to XML::Writer, XML::Handler::XMLWriter, XML::Handler::YAWriter etc. in that it generates XML output.

This implementation may not be fast and it may not be the best solution for your particular problem, but it has some features that may be missing in the other implementations:

  • Supports every output encoding that XML::UM supports

    XML::UM supports every encoding for which there is a mapping file in the XML::Encoding distribution.

  • Pretty printing

    When used with XML::Filter::Reindent.

  • Fine control over which kind of quotes are used

    See options below.

  • Supports PerlSAX interface

Constructor Options

  • EndWithNewline (Default: 1)

    Whether to print a newline at the end of the file (i.e. after the root element).

  • Newline (Default: undef)

    If defined, the newline sequence to use when printing. (Note that XML::Parser etc. convert newlines into "\x0A".)

    If undef, newlines are not converted and XML::Handler::Composer uses "\x0A" when printing.

    A value of "\n" will convert the internal newlines into the platform specific line separator.

    See the PreserveWS option in the characters event (below) for finer control over when newline conversion is active.

  • DocTypeIndent (Default: a Newline and 2 spaces)

    Newline plus indent that is used to separate lines inside the DTD.

  • IndentAttList (Default: 8 spaces)

    Indent used when printing an <!ATTLIST> declaration that has more than one attribute definition, e.g.

    <!ATTLIST my_elem
           attr1 CDATA "foo"
           attr2 CDATA "bar"
    >

  • Quote (Default: { XMLDecl => '"', Attr => '"', Entity => '"', SystemLiteral => '"' })

    Quote contains a reference to a hash that defines which quoting characters to use when printing XML declarations (XMLDecl), attribute values (Attr), <!ENTITY> values (Entity) and system/public literals (SystemLiteral) as found in <!DOCTYPE>, <!ENTITY> declarations etc.

  • PrintDefaultAttr (Default: 0)

    If 1, attribute values are printed regardless of whether they are default attribute values (as defined in <!ATTLIST> declarations). Normally, default attributes are not printed.

  • Encoding (Default: undef)

    Defines the output encoding, if specified. Note that later calls to the xml_decl() handler may override this setting if they contain an Encoding definition.

  • EncodeUnmapped (Default: \&XML::UM::encode_unmapped_dec)

    Defines how Unicode characters not found in the mapping file (of the specified encoding) are printed. By default, they are converted to decimal character references, like '&#123;'.

    Use \&XML::UM::encode_unmapped_hex for hexadecimal character references, like '&#xAB;'.

  • Print (Default: sub { print @_ }, which prints to stdout)

    The subroutine that is used to print the encoded XML output. The default prints the string to stdout.
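
As an illustration of how the options above fit together, here is a minimal sketch that collects the output in a string and uses single quotes in the XML declaration. It assumes the constructor accepts the options as a list of name/value pairs; the variable names are made up for this example, and the composer is only built if the module is installed.

```perl
use strict;
use warnings;

# Collect the composed XML in a string instead of printing to stdout.
my $output = '';

my %options = (
    EndWithNewline   => 1,
    Newline          => "\n",    # convert to the platform line separator
    PrintDefaultAttr => 0,
    Quote            => {        # all four documented keys are spelled out
        XMLDecl       => "'",
        Attr          => '"',
        Entity        => '"',
        SystemLiteral => '"',
    },
    Print => sub { $output .= join '', @_ },
);

# Build the composer only if the module is available, so that this
# sketch still runs on a machine where it is not installed.
my $composer;
if ( eval { require XML::Handler::Composer; 1 } ) {
    $composer = XML::Handler::Composer->new(%options);
}
```

The resulting object would then be handed to a PerlSAX parser or filter as its handler; whatever that handler prints ends up in $output.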

Method: get_compressed_element_suffix ($event)

Override this method to support the different styles for printing empty elements in compressed notation, e.g. <p/>, <p></p>, <p />, <p>.

The default returns "/>", which results in <p/>. Use " />" for XHTML style elements or ">" for certain HTML style elements.

The $event parameter is the hash reference that was passed to the start_element() handler.
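
A possible override is sketched below. The package name My::XHTMLComposer and the list of HTML style elements are hypothetical, and the sketch assumes that the event carries the element name under the Name key, as PerlSAX start_element events usually do.

```perl
use strict;
use warnings;

package My::XHTMLComposer;    # hypothetical subclass name

# -norequire: inherit without loading the parent class here. Load
# XML::Handler::Composer yourself before calling new() on this class.
use parent -norequire, 'XML::Handler::Composer';

# Hypothetical set of elements to print HTML style, e.g. <br>.
my %html_void = map { $_ => 1 } qw(br hr img);

sub get_compressed_element_suffix {
    my ( $self, $event ) = @_;

    # Assumption: $event->{Name} holds the element name.
    return '>' if $html_void{ $event->{Name} };    # prints <br>
    return ' />';                                  # prints <p />
}

package main;
```

Only elements that are printed in compressed form are affected; all other elements keep their separate start and end tags.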

Extra PerlSAX event information

XML::Handler::Composer relies on hints from previous SAX filters to format certain parts of the XML. These SAX filters (e.g. XML::Filter::Reindent) pass extra information by adding name/value pairs to the appropriate PerlSAX events. (The events themselves are hash references.)
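
To make the mechanism concrete, the following sketch shows a small filter that adds one such hint, PreserveWS (one of the hints described in this section), to selected start_element events and forwards every other event unchanged. The package name My::PreserveWSFilter and its Names option are hypothetical; only the Handler convention and the Name key are taken from PerlSAX.

```perl
use strict;
use warnings;

package My::PreserveWSFilter;    # hypothetical filter, not part of any distribution

our $AUTOLOAD;

sub new {
    my ( $class, %args ) = @_;
    my %names = map { $_ => 1 } @{ $args{Names} || [] };
    return bless { Handler => $args{Handler}, Names => \%names }, $class;
}

# Add the hint to the event hash, then pass the event on.
sub start_element {
    my ( $self, $event ) = @_;
    $event->{PreserveWS} = 1 if $self->{Names}{ $event->{Name} };
    return $self->{Handler}->start_element($event);
}

# Forward all other events to the next handler if it implements them.
sub AUTOLOAD {
    my ( $self, @args ) = @_;
    ( my $method = $AUTOLOAD ) =~ s/.*:://;
    return if $method eq 'DESTROY';
    my $handler = $self->{Handler};
    return $handler->can($method) ? $handler->$method(@args) : undef;
}

package main;
```

Such a filter would sit between the parser and the composer, with the composer passed as its Handler.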

  • entity_reference: Parameter => 1

    If Parameter is 1, the event is a parameter entity reference. A parameter entity is referenced with %ent; instead of &ent; and the entity declaration starts with <!ENTITY % ent ...> instead of <!ENTITY ent ...>

    NOTE: This should be added to the PerlSAX interface!

  • start_element/end_element: Compress => 1

    If Compress is 1 in both the start_element and end_element events, the element will be printed in compressed form, e.g. <a/> instead of <a></a>.

  • start_element: PreserveWS => 1

    If newline conversion is active (i.e. Newline was defined in the constructor), then newlines will *NOT* be converted in text (character events) within this element.

  • attlist_decl: First, MoreFollow

    The First and MoreFollow options can be used to force successive <!ATTLIST> declarations for the same element to be merged, e.g.

    <!ATTLIST my_elem
           attr1 CDATA "foo"
           attr2 CDATA "bar"
           attr3 CDATA "quux"
    >

    In this example, the attlist_decl event for attr1 (default "foo") should contain (First => 1, MoreFollow => 1) and the event for attr2 (default "bar") should contain (MoreFollow => 1). The event for attr3 (default "quux") should carry no extra information.

    'First' indicates that the event is the first of a sequence. 'MoreFollow' indicates that more events will follow in this sequence.

    If neither option is set by the preceding PerlSAX filter, each attribute definition will be printed as a separate <!ATTLIST> line.
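
The First/MoreFollow rules above can be sketched as a small helper that a preceding filter might apply to a run of attlist_decl events for the same element. The function name mark_attlist_sequence is hypothetical, and the event keys ElementName, AttributeName, Type and Default are assumed from the usual PerlSAX attlist_decl event. A run of a single event is left unmarked, so it is printed as a separate <!ATTLIST> line.

```perl
use strict;
use warnings;

# Hypothetical helper: mark consecutive attlist_decl events that belong
# to one element so that they are merged into a single <!ATTLIST>.
sub mark_attlist_sequence {
    my @events = @_;
    for my $i ( 0 .. $#events ) {
        # First: only on the first event of a run of two or more.
        $events[$i]{First} = 1 if $i == 0 && @events > 1;

        # MoreFollow: on every event except the last one.
        $events[$i]{MoreFollow} = 1 if $i < $#events;
    }
    return @events;
}

# The three attribute definitions from the example above.
my @attlist_events = mark_attlist_sequence(
    { ElementName => 'my_elem', AttributeName => 'attr1', Type => 'CDATA', Default => '"foo"' },
    { ElementName => 'my_elem', AttributeName => 'attr2', Type => 'CDATA', Default => '"bar"' },
    { ElementName => 'my_elem', AttributeName => 'attr3', Type => 'CDATA', Default => '"quux"' },
);
```

Each marked hash would then be passed to the composer's attlist_decl() handler in order.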

CAVEATS

This code is highly experimental! It has not been well tested and the API may change.

AUTHOR

Send bug reports, hints, tips and suggestions to Enno Derksen at <enno@att.com>.