NAME

XML::Genx - A simple, correct XML writer

SYNOPSIS

use XML::Genx;
my $w = XML::Genx->new;
eval {
    # <foo>bar</foo>
    $w->StartDocFile( *STDOUT );
    $w->StartElementLiteral( 'foo' );
    $w->AddText( 'bar' );
    $w->EndElement;
    $w->EndDocument;
};
die "Writing XML failed: $@" if $@;

DESCRIPTION

This class is used for generating XML. The underlying library (genx) ensures that the output is well-formed, canonical XML: all characters are correctly encoded, namespaces are handled properly, and so on. If you manage to generate non-well-formed XML using XML::Genx, please submit a bug report.

The API is mostly a wrapper over the original C library. Consult the genx documentation for the fine detail. This code is based on genx beta5.

For more detail on how to use this class, see "EXAMPLES".

METHODS

All methods die() when they encounter an error. Unless documented otherwise below, they return zero on success.

new ( )

Constructor. Returns a new XML::Genx object.

StartDocFile ( FILEHANDLE )

Starts writing output to FILEHANDLE. FILEHANDLE must already be open for writing; XML::Genx will not open it for you.

This method will not accept a filename.

StartDocSender ( CALLBACK )

Takes a coderef ( sub {} ), which gets called each time that genx needs to output something. CALLBACK will be called with two arguments: the text to output and the name of the function that called it (one of write, write_bounded, or flush).

$w->StartDocSender( sub { print $_[0] } );

In the case of flush, the first argument will always be an empty string.

The string passed to CALLBACK will always be UTF-8.

NB: If you just want to append to a string, have a look at "StartDocString" in XML::Genx::Simple.
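
As a minimal sketch of the callback interface described above, the following collects the generated XML in a scalar instead of printing it. The variable names ($out, $text, $caller) are illustrative only, and the expected result is the one given in the SYNOPSIS.

```perl
use strict;
use warnings;
use XML::Genx;

my $w   = XML::Genx->new;
my $out = '';

# The callback receives the text to output and the name of the genx
# function that triggered it (write, write_bounded or flush).
$w->StartDocSender( sub {
    my ( $text, $caller ) = @_;
    $out .= $text;    # for flush, $text is an empty string
} );

$w->StartElementLiteral( 'foo' );
$w->AddText( 'bar' );
$w->EndElement;
$w->EndDocument;

print "$out\n";    # the SYNOPSIS documents this document as <foo>bar</foo>
```

The text arrives as UTF-8, so encode or decode it as needed before mixing it with other strings.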

EndDocument ( )

Finishes writing to the output stream.

StartElementLiteral ( [NAMESPACE], LOCALNAME )

Starts an element LOCALNAME, in NAMESPACE. If NAMESPACE is omitted, undef, or an empty string, no namespace is used. NAMESPACE can be either a string or an XML::Genx::Namespace object.

AddAttributeLiteral ( [NAMESPACE], LOCALNAME, VALUE )

Adds an attribute LOCALNAME, with contents VALUE. If NAMESPACE is omitted, undef, or an empty string, no namespace is used. NAMESPACE can be either a string or an XML::Genx::Namespace object.

EndElement ( )

Output a closing tag for the currently open element.

LastErrorMessage ( )

Returns the string value of the last error.

LastErrorCode ( )

Returns the integer status code of the last error. This can be compared to one of the values in XML::Genx::Constants.

This will return zero if no error condition is present.

This value cannot be relied upon to stay the same after further method calls to the same object.

GetErrorMessage ( CODE )

Given a genxStatus code, return the equivalent string.
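
The three error methods above can be combined as in the following sketch. It provokes an error on purpose with a comment containing "--" (see Comment() below) and assumes that the status is still available on the writer after the failed call has died. The output is discarded through an empty callback.

```perl
use strict;
use warnings;
use XML::Genx;

my $w = XML::Genx->new;
$w->StartDocSender( sub { } );    # discard output; only the error matters here

# Comment() is documented to reject "--", so this call should die.
my $ok = eval { $w->Comment( 'not -- allowed' ); 1 };

my ( $code, $message ) = ( 0, '' );
if ( !$ok ) {
    # Read the status straight away: later method calls may change it.
    $code    = $w->LastErrorCode;
    $message = $w->LastErrorMessage;
    print "genx error $code: $message\n";

    # Look the same code up again as a string.
    print "GetErrorMessage says: ", $w->GetErrorMessage( $code ), "\n";
}
```

The numeric code can then be compared against the values in XML::Genx::Constants rather than matching on the message text.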

ScrubText ( STRING )

Returns a new version of STRING with prohibited characters removed. Prohibited characters include non-UTF-8 byte sequences and characters which are not allowed in XML 1.0.

AddText ( STRING )

Output STRING. STRING must be valid UTF-8.

AddCharacter ( C )

Output the Unicode character with codepoint C (an integer). This is normally obtained from ord().
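
A short sketch of how ScrubText(), AddText() and AddCharacter() might be used together. The element name and the sample string are made up for illustration; the sample contains a BEL control character, which XML 1.0 does not allow.

```perl
use strict;
use warnings;
use XML::Genx;

my $w   = XML::Genx->new;
my $out = '';
$w->StartDocSender( sub { $out .= $_[0] } );

# U+0007 (BEL) is not a legal XML 1.0 character; the tab is.
my $dirty = "tab\there, bell\x07gone";
my $clean = $w->ScrubText( $dirty );

$w->StartElementLiteral( 'note' );
$w->AddText( $clean );       # $dirty itself should be rejected by AddText()
$w->AddCharacter( 0xE9 );    # U+00E9, the same value as ord( "\xE9" )
$w->EndElement;
$w->EndDocument;

print "$out\n";
```

Scrubbing silently drops data, so it is best reserved for input you do not control.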

Comment ( STRING )

Output STRING as an XML comment. Genx will complain if STRING contains "--".

PI ( TARGET, STRING )

Output a processing instruction, with target TARGET and STRING as the body. Genx will complain if STRING contains "?>" or if TARGET is the string "xml" (in any combination of upper and lower case).
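
The sketch below writes a comment and a processing instruction ahead of the root element, then shows one of the rejected cases on a second writer. The PI target 'app-hint' and its body are invented for the example.

```perl
use strict;
use warnings;
use XML::Genx;

my $w   = XML::Genx->new;
my $out = '';
$w->StartDocSender( sub { $out .= $_[0] } );

$w->Comment( ' generated file, do not edit ' );
$w->PI( 'app-hint', 'cache="no"' );    # hypothetical target and body
$w->StartElementLiteral( 'doc' );
$w->EndElement;
$w->EndDocument;
print "$out\n";

# A target of "xml" in any letter case is refused, so this call should die.
# A separate writer keeps the failure away from the document above.
my $w2 = XML::Genx->new;
$w2->StartDocSender( sub { } );
my $rejected = !eval { $w2->PI( 'XML', 'version="1.0"' ); 1 };
print "PI with target XML was rejected\n" if $rejected;
```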

UnsetDefaultNamespace ( )

Insert an xmlns="" attribute, which unsets the default namespace. Has no effect if no default namespace is currently in effect.

GetVersion ( )

Return the version number of the Genx library in use.

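DeclareNamespace ( URI, [PREFIX] )

Returns a new namespace object for URI. If PREFIX is omitted, a prefix is generated automatically; pass an empty string to make URI the default namespace. The resulting object has two methods defined on it.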

GetNamespacePrefix ( )

Returns the current prefix in scope for this namespace.

AddNamespace ( [PREFIX] )

Adds the namespace into the document, optionally with PREFIX.

NB: This object is only valid as long as the original XML::Genx object that created it is still alive.
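
To illustrate the two namespace methods, this sketch declares a namespace explicitly on the root element and then reads back the prefix while that declaration is in scope. The URI, prefix and element names are placeholders, and it assumes AddNamespace() is called directly after the start tag it should apply to.

```perl
use strict;
use warnings;
use XML::Genx;

my $w   = XML::Genx->new;
my $out = '';
$w->StartDocSender( sub { $out .= $_[0] } );

# Placeholder URI and prefix.
my $ns = $w->DeclareNamespace( 'http://example.com/ns', 'ex' );

$w->StartElementLiteral( 'root' );
$ns->AddNamespace;                       # declare the namespace on <root>
my $prefix = $ns->GetNamespacePrefix;    # prefix in scope at this point

$w->StartElementLiteral( $ns, 'child' );
$w->AddText( 'hello' );
$w->EndElement;
$w->EndElement;
$w->EndDocument;

print "prefix in scope: $prefix\n";
print "$out\n";
```

Declaring the namespace once on an outer element avoids repeating the xmlns attribute on every element that uses it.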

DeclareElement ( [NS], NAME )

Returns a new element object. NS must be an object returned by DeclareNamespace(), or undef (or omitted entirely) to indicate no namespace.

The resulting object has one method available to call.

StartElement ( )

Outputs a start tag.

NB: This object is only valid as long as the original XML::Genx object that created it is still alive.

DeclareAttribute ( [NS], NAME )

Returns a new attribute object. NS must be an object returned by DeclareNamespace(), or undef (or omitted entirely) to indicate no namespace.

There is one method defined for this object.

AddAttribute ( VALUE )

Adds an attribute to the current element with VALUE as the contents.

NB: This object is only valid as long as the original XML::Genx object that created it is still alive.
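
Predeclared elements and attributes are most useful in loops. The following sketch, with made-up element and attribute names, declares each name once and reuses the objects for every item.

```perl
use strict;
use warnings;
use XML::Genx;

my $w   = XML::Genx->new;
my $out = '';
$w->StartDocSender( sub { $out .= $_[0] } );

# Declare the names once, outside the loop, with no namespace.
my $item = $w->DeclareElement( 'item' );
my $id   = $w->DeclareAttribute( 'id' );

$w->StartElementLiteral( 'list' );
for my $n ( 1 .. 3 ) {
    $item->StartElement;
    $id->AddAttribute( "$n" );    # attributes go right after the start tag
    $w->AddText( "entry $n" );
    $w->EndElement;
}
$w->EndElement;
$w->EndDocument;

print "$out\n";
```

Keep the XML::Genx object alive for as long as $item and $id are in use, as noted above.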

LIMITATIONS

According to the Genx manual, the things that Genx can't do include:

  • Generating output in anything but UTF-8.

  • Writing namespace-oblivious XML. That is to say, you can't have an element or attribute named foo:bar unless foo is a prefix associated with some namespace.

  • Empty-element tags.

  • Writing XML or <!DOCTYPE> declarations. Of course, you could squeeze these into the output stream yourself before any Genx calls that generate output.

  • Pretty-printing. Of course, you can pretty-print yourself by putting the linebreaks in the right places and indenting appropriately, but Genx won't do it for you. Someone might want to write a pretty-printer that sits on top of Genx.
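
Two of these limitations can be worked around by hand, as the list suggests. The sketch below is one possible approach rather than a recommended recipe: it seeds the output string with an XML declaration before any Genx call produces output, and adds line breaks and indentation as ordinary text. The element names are illustrative.

```perl
use strict;
use warnings;
use XML::Genx;

my $w = XML::Genx->new;

# Write the XML declaration ourselves, before Genx outputs anything.
my $out = qq{<?xml version="1.0" encoding="UTF-8"?>\n};
$w->StartDocSender( sub { $out .= $_[0] } );

$w->StartElementLiteral( 'list' );
$w->AddText( "\n  " );    # manual line break and indent
$w->StartElementLiteral( 'item' );
$w->AddText( 'first' );
$w->EndElement;
$w->AddText( "\n" );      # line break before the closing tag
$w->EndElement;
$w->EndDocument;

print "$out\n";
```

Whitespace added this way is part of the document's content, so only use it where the consumer ignores whitespace between elements.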

EXAMPLES

  • Simple XML, with no namespaces or attributes.

    $w->StartDocFile( *STDOUT );
    $w->StartElementLiteral( 'strong' );
    $w->AddText( 'bad' );
    $w->EndElement();
    $w->EndDocument();

    This produces:

    <strong>bad</strong>

  • XML with attributes.

    $w->StartDocFile( *STDOUT );
    $w->StartElementLiteral( 'a' );
    $w->AddAttributeLiteral( href => 'http://www.cpan.org/' );
    $w->AddText( 'CPAN' );
    $w->EndElement();
    $w->EndDocument();

    This produces:

    <a href="http://www.cpan.org/">CPAN</a>

  • XML with a default namespace. Note that you have to explicitly pass in an empty string to specify the default namespace. Just leaving out the second argument will result in an autogenerated prefix instead.

    $w->StartDocFile( *STDOUT );
    my $ns = $w->DeclareNamespace( "http://www.w3.org/1999/xhtml", "" );
    $w->StartElementLiteral( $ns, 'strong' );
    $w->AddText( 'bad' );
    $w->EndElement();
    $w->EndDocument();

    This produces:

    <strong xmlns="http://www.w3.org/1999/xhtml">bad</strong>

  • XML with prefixed namespaces.

    $w->StartDocFile( *STDOUT );
    my $ns = $w->DeclareNamespace( "http://www.w3.org/1999/xhtml", "xh" );
    $w->StartElementLiteral( $ns, 'strong' );
    $w->AddText( 'bad' );
    $w->EndElement();
    $w->EndDocument();

    This produces:

    <xh:strong xmlns:xh="http://www.w3.org/1999/xhtml">bad</xh:strong>

  • XML with attributes in a namespace.

    $w->StartDocFile( *STDOUT );
    my $ns = $w->DeclareNamespace( "http://www.w3.org/1999/xlink", "x" );
    $w->StartElementLiteral( 'user' );
    $w->AddAttributeLiteral( $ns, href => '/user/42' );
    $w->AddText( 'Fred' );
    $w->EndElement();
    $w->EndDocument();

    This produces:

    <user xmlns:x="http://www.w3.org/1999/xlink" x:href="/user/42">Fred</user>

  • Declaring elements. If you are going to be using the same element many times over, it's worthwhile to predeclare it, since genx doesn't have to check the validity of the element name on each call.

    $w->StartDocFile( *STDOUT );
    my $ns = $w->DeclareNamespace( 'http://www.w3.org/1999/xhtml', "" );
    my $li = $w->DeclareElement( $ns, 'li' );
    $w->StartElementLiteral( $ns, 'ul' );
    
    $li->StartElement();
    $w->AddText( 'Fred' );
    $w->EndElement();
    
    $li->StartElement();
    $w->AddText( 'Barney' );
    $w->EndElement();
    
    $w->EndElement();
    $w->EndDocument();

    This produces:

    <ul xmlns="http://www.w3.org/1999/xhtml"><li>Fred</li><li>Barney</li></ul>

    You might also want to look at "Element" in XML::Genx::Simple, which does this for you (when there aren't any namespaces involved).

SEE ALSO

XML::Genx::Constants, XML::Genx::Simple.

http://www.tbray.org/ongoing/When/200x/2004/02/20/GenxStatus

AUTHOR

Dominic Mitchell, <cpan (at) happygiraffe.net>

The genx library was created by Tim Bray http://www.tbray.org/.

COPYRIGHT AND LICENSE

Copyright (C) 2004 by Dominic Mitchell. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

The genx library is:

Copyright (c) 2004 by Tim Bray and Sun Microsystems. For copying permission, see http://www.tbray.org/ongoing/genx/COPYING.

VERSION

@(#) $Id: Genx.pm 1270 2006-10-08 17:29:33Z dom $