NAME

XML::Xalan::Transformer - Perl interface to XalanTransformer class

SYNOPSIS

  use XML::Xalan;

  my $tr = new XML::Xalan::Transformer;

  # compile a stylesheet file:
  my $compiled = $tr->compile_stylesheet_file("foo.xsl");

  # compile a stylesheet string:
  my $compiled = $tr->compile_stylesheet_string(<<"XSLT");
  <?xml version="1.0"?> 
  <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:template match="doc">
      <out><xsl:value-of select="."/></out>
    </xsl:template>
  </xsl:stylesheet>
  XSLT

  # parse an XML file:
  my $parsed = $tr->parse_file("foo.xml");

  # parse an XML string:
  my $parsed = $tr->parse_string(<<"XML");
  <?xml version="1.0"?>
  <doc>Hello</doc>
  XML

  # perform a transformation and store the result into a destination file:
  my $res = $tr->transform_to_file($src_file, $xsl_file, $dest_file);
  my $res = $tr->transform_to_file($parsed, $xsl_file, $dest_file);
  my $res = $tr->transform_to_file($parsed, $compiled, $dest_file);

  # perform a transformation and return the result:
  my $res = $tr->transform_to_data($src_file, $xsl_file);
  my $res = $tr->transform_to_data($parsed, $xsl_file);
  my $res = $tr->transform_to_data($parsed, $compiled);

  # error checking
  die $tr->errstr unless defined $res;      

DESCRIPTION

Interface to XalanTransformer class.

Methods

new()

Constructor, with no argument. Returns an XML::Xalan::Transformer object.

 my $tr = new XML::Xalan::Transformer;

$tr->compile_stylesheet_file($xsl_file)

Compiles a stylesheet file and returns an XML::Xalan::CompiledStylesheet object.

 my $compiled = $tr->compile_stylesheet_file("foo.xsl");

$tr->compile_stylesheet_string($xsl_string)

Compiles a stylesheet string and returns an XML::Xalan::CompiledStylesheet object.

 my $compiled = $tr->compile_stylesheet_string(<<"XSLT");
 <?xml version="1.0"?> 
 <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
   <xsl:template match="doc">
     <out><xsl:value-of select="."/></out>
   </xsl:template>
 </xsl:stylesheet>
 XSLT

$tr->parse_file($xml_file)

Parses an XML file and returns an XML::Xalan::ParsedSource object.

 my $parsed = $tr->parse_file("foo.xml");

$tr->parse_string($xml_string)

Parses an XML string and returns an XML::Xalan::ParsedSource object.

 my $parsed = $tr->parse_string(<<"XML");
 <?xml version="1.0"?>
 <doc>Hello</doc>
 XML

$tr->transform_to_file($source, $xsl, $dest)

Transforms a source and writes the result to the specified file. Returns undef on failure. $source can be an XML::Xalan::ParsedSource object or the name of an XML file. $xsl can be an XML::Xalan::CompiledStylesheet object or the name of an XSL file.

 $tr->transform_to_file("foo.xml", "foo.xsl", "bar.xml");

To process an XML source that contains an xml-stylesheet processing instruction, pass undef as the second argument.

$tr->transform_to_data($source, $xsl)

Transforms a source and returns the result. $source can be an XML::Xalan::ParsedSource object or the name of an XML file. $xsl can be an XML::Xalan::CompiledStylesheet object or the name of an XSL file.

Example:

 my $result = $tr->transform_to_data("foo.xml", "foo.xsl");

To process an XML source that contains an xml-stylesheet processing instruction, pass undef as the second argument.
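
The undef convention applies to both transform_to_file() and transform_to_data(). The sketch below wraps it in one place; transform_with_embedded_stylesheet() is a hypothetical helper, not part of XML::Xalan, and it assumes only the two methods documented above.

```perl
use strict;
use warnings;

# Hypothetical wrapper (not part of XML::Xalan): transform a source that
# names its own stylesheet through a processing instruction such as
#   <?xml-stylesheet type="text/xsl" href="foo.xsl"?>
# Passing undef in the stylesheet position asks the transformer to use
# that instruction instead of an explicit stylesheet.
sub transform_with_embedded_stylesheet {
    my ($tr, $source, $dest) = @_;
    return defined $dest
        ? $tr->transform_to_file($source, undef, $dest)   # write to a file
        : $tr->transform_to_data($source, undef);         # return the result
}

# Usage, assuming $tr is an XML::Xalan::Transformer:
#   my $out = transform_with_embedded_stylesheet($tr, "foo.xml");
#   die $tr->errstr unless defined $out;
```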

$tr->transform_to_handler($source, $xsl, *FH, $handler)

Transforms a source and passes the result to a callback handler. $xsl can be an XML::Xalan::CompiledStylesheet object or the name of an XSL file.

If $xsl is an XML::Xalan::CompiledStylesheet object, then $source must be an XML::Xalan::ParsedSource object.

Example:

 my $out_handler = sub {
     my ($ctx, $mesg) = @_;   # the filehandle and a chunk of output
     print $ctx $mesg;
 };
 $tr->transform_to_handler(
     $xmlfile, $xslfile, 
     *STDERR, $out_handler);

To process an XML source that contains an xml-stylesheet processing instruction, pass undef as the second argument.
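
A handler does not have to print. As a rough sketch, the closure below appends each chunk to a string instead. make_collector() is a hypothetical helper, and the sketch assumes the handler is called with the same two arguments as in the example above. Whether the handler's return value is inspected is not stated here, so it returns a true value, as print normally does.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Xalan): build a handler that
# appends every chunk it receives to the string behind $buffer_ref.
sub make_collector {
    my ($buffer_ref) = @_;
    return sub {
        my ($ctx, $mesg) = @_;   # $ctx is ignored by this handler
        $$buffer_ref .= $mesg;
        return 1;                # true, like a successful print
    };
}

# Usage, assuming $tr is an XML::Xalan::Transformer; a filehandle is still
# passed as the third argument because the method signature expects one:
#   my $output = '';
#   $tr->transform_to_handler($xmlfile, $xslfile, *STDOUT,
#                             make_collector(\$output));
```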

$tr->destroy_stylesheet($compiled_stylesheet)

Removes $compiled_stylesheet from memory.

$tr->destroy_parsed_source($parsed_source)

Removes $parsed_source from memory.

$tr->set_stylesheet_param($key, $val)

Sets an XSLT parameter; $key is the parameter name and $val is the assigned value. Returns nothing.

 $tr->set_stylesheet_param("id", 777);
 $tr->set_stylesheet_param("user", "'johndoe'");
 my $res = $tr->transform_to_file($source, $xsl, $dest);
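
In the example above the string value is wrapped in single quotes while the number is not, which suggests that values are handed over as XPath expressions. Under that assumption, the sketch below sets several parameters from a hash ref and quotes string values automatically. Both xpath_literal() and set_stylesheet_params() are hypothetical helpers built only on set_stylesheet_param().

```perl
use strict;
use warnings;

# Hypothetical helper: quote a Perl string as an XPath 1.0 string literal.
# XPath 1.0 has no escape syntax, so a value containing both quote
# characters cannot be written as a single literal.
sub xpath_literal {
    my ($s) = @_;
    return qq{'$s'} unless $s =~ /'/;
    return qq{"$s"} unless $s =~ /"/;
    die "cannot quote a value containing both quote characters: $s\n";
}

# Hypothetical helper: set several parameters at once from a hash ref.
# Values that look like plain numbers are passed through unchanged;
# everything else is quoted as a string literal.
sub set_stylesheet_params {
    my ($tr, $params) = @_;
    for my $key (sort keys %$params) {
        my $val = $params->{$key};
        $val = xpath_literal($val) unless $val =~ /^-?\d+(?:\.\d+)?$/;
        $tr->set_stylesheet_param($key, $val);
    }
    return;
}

# Usage, assuming $tr is an XML::Xalan::Transformer:
#   set_stylesheet_params($tr, { id => 777, user => 'johndoe' });
```

Note that a numeric-looking string such as a postal code is passed as a number by this sketch; quote it yourself with xpath_literal() and call set_stylesheet_param() directly if it must stay a string.
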

$tr->install_function($namespace, $function_name, $function, \%opts)

Installs a user-defined function as an extension. Returns nothing. \%opts is an optional hashref specifying one or more options. Recognized options are Context and AutoCast. See "Writing Your Own Extension Function" for details.

$tr->uninstall_function($namespace, $function_name)

Uninstalls a previously installed user-defined function. Returns nothing. Example:

 my $namespace = "http://ExternalFunction.xalan-c++.xml.apache.org";
 my $func_name = "square-root";

 $tr->uninstall_function($namespace, $func_name);

$tr->create_document_builder()

Returns an XML::Xalan::DocumentBuilder object. This object allows an XML::Xalan::Transformer object to transform input sources parsed by any Perl SAX2-conformant parser.

See the XML::Xalan::DocumentBuilder pod documentation for usage info.

$tr->destroy_document_builder($document_builder)

Destroys an XML::Xalan::DocumentBuilder object.

$tr->errstr()

Returns the current error string.
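
Checking every call by hand gets repetitive. A minimal sketch of a wrapper follows; checked() is a hypothetical helper, and it assumes that the wrapped method reports failure by returning undef, as transform_to_file() is documented to do.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Xalan): call one transformer
# method and die with the transformer's error string if it returns undef.
sub checked {
    my ($tr, $method, @args) = @_;
    my $res = $tr->$method(@args);
    die "$method failed: " . $tr->errstr . "\n" unless defined $res;
    return $res;
}

# Usage, assuming $tr is an XML::Xalan::Transformer:
#   my $parsed   = checked($tr, parse_file => "foo.xml");
#   my $compiled = checked($tr, compile_stylesheet_file => "foo.xsl");
#   my $result   = checked($tr, transform_to_data => $parsed, $compiled);
```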

Writing Your Own Extension Function

You can write an extension function in two different ways: as a context-aware function, or as a context-indifferent function that always returns a value of XSLT string type. The latter is very simple. Below is an example of a function that removes HTML tags, which should get you started:

 use HTML::Parse;
 use HTML::FormatText;
 
 ...
 
 my $namespace = "http://ExternalFunction.xalan-c++.xml.apache.org";
 my $func = sub {
         my $html_text = shift;
         return HTML::FormatText->new->format(parse_html($html_text));
     };

The function $func is to be installed using install_function without the optional hashref parameter, which means that it will always accept stringified arguments and the value it returns will always be treated as string.

 $tr->install_function($namespace, 'plain-text', $func);

 my $parsed = $tr->parse_string(<<"XML");
 <?xml version="1.0"?>
 <doc>
  <value><![CDATA[<B>Something bold</B><p>and a new paragraph..</p>]]></value>
 </doc>
 XML

 my $compiled = $tr->compile_stylesheet_string(<<'XSLT');
 <?xml version="1.0"?> 
 <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
                xmlns:external="http://ExternalFunction.xalan-c++.xml.apache.org"
        exclude-result-prefixes="external">
 <xsl:output method="text"/>
  <xsl:template match="doc">
  <xsl:choose>
    <xsl:when test="function-available('external:plain-text')">
      <xsl:value-of select="external:plain-text(value)"/>
    </xsl:when>
    <xsl:otherwise>
      Function external:plain-text() is not available!
    </xsl:otherwise>
  </xsl:choose>
  </xsl:template>
 </xsl:stylesheet>  
 XSLT

 die $tr->errstr unless $compiled;
 my $res = $tr->transform_to_data($parsed, $compiled) or die $tr->errstr;
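
Because a context-indifferent function only receives stringified arguments and returns a string, it is ordinary Perl and can be tried out before it is installed. The sketch below is a second, smaller example; the name square_root and the 'NaN' fallback are illustrative choices, not something XML::Xalan prescribes.

```perl
use strict;
use warnings;

# Illustrative extension function: receives the stringified XPath
# argument and returns a value that will be treated as an XSLT string.
sub square_root {
    my ($arg) = @_;
    # Accept only a non-negative decimal number, so sqrt() never sees
    # negative or non-numeric input.
    return 'NaN' unless defined $arg && $arg =~ /^\s*(\d+(?:\.\d+)?)\s*$/;
    return sqrt($1);
}

# Installing it, assuming $tr is an XML::Xalan::Transformer and $namespace
# is the namespace URI bound to a prefix in the stylesheet:
#   $tr->install_function($namespace, 'square-root', \&square_root);
```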

If a function is to be installed with Context option, then it must be written to accept the following ordered arguments: an execution context, a context node, and XML::Xalan::XObject-derived objects. An execution context is an XML::Xalan::ExecutionContext::XPath object, and a context node is an XML::Xalan::DOM::Node-derived object, traversable via DOM API.

The function then must get an XML::Xalan::XObjectFactory object to create the return value of type XML::Xalan::Boolean, or XML::Xalan::Number, or XML::Xalan::String, or XML::Xalan::NodeSet, or XML::Xalan::Scalar.

 my $func = sub {
         my ($exec_context, $context_node, $xobj) = @_;
         my $html_text = $xobj->string; # convert nodeset to string
         return $exec_context->get_xobject_factory->create_string(
            HTML::FormatText->new->format(parse_html($html_text))
         );
     };

 $tr->install_function($namespace, 'plain-text', $func, {Context => 1});

The AutoCast option is similar to Context, but differs in that it automatically typecasts certain XML::Xalan::XObject-derived objects into their equivalent Perl scalar types where possible. The following rules are used:

     XML::Xalan::Boolean -> Scalar contains integer
     XML::Xalan::Number  -> Scalar contains double
     XML::Xalan::String  -> Scalar contains string

Note that no typecasting is applied to the returned value, which means that you still have to return an XML::Xalan::XObject-derived object.

 my $func = sub {
         my ($exec_context, $context_node, $arg) = @_;
         # $arg is a plain scalar if it was typecast, otherwise an object
         my $html_text = ref $arg ? $arg->string : $arg;
         return $exec_context->get_xobject_factory->create_string(
            HTML::FormatText->new->format(parse_html($html_text))
         );
     };

 $tr->install_function($namespace, 'plain-text', $func, {AutoCast => 1});

A Note on Objects Cleaning Up

XML::Xalan::Transformer is an interface to the XalanTransformer class, a C++ class which internally keeps a list of compiled stylesheet objects and another list of parsed source objects. When an XalanTransformer object is destroyed, those lists are iterated and each element is deleted. Deleting an element which no longer exists causes a segfault, so I do not provide destructors for XML::Xalan::CompiledStylesheet and XML::Xalan::ParsedSource, since these would conflict with the one from XML::Xalan::Transformer.

As a consequence, if you write code which keeps an XML::Xalan::Transformer object alive for a long time and uses either compiled stylesheets or parsed sources, be careful to call destroy_stylesheet() or destroy_parsed_source() as appropriate to remove each one from the internal list (and thus from memory) once it is no longer used. Otherwise, memory use will keep accumulating even after the Perl objects have gone out of scope, and the wasted memory will be freed only when the XML::Xalan::Transformer object itself goes out of scope.

For example:

 my $tr = new XML::Xalan::Transformer;
 my $compiled = $tr->compile_stylesheet_file($stylesheet);
 my $res = $tr->transform_to_file($source, $compiled, $dest);

 # $compiled will be used for another stylesheet, then 
 # it's necessary to destroy it explicitly first:
 $tr->destroy_stylesheet($compiled);

 # now it's safe to use for another stylesheet
 $compiled = $tr->compile_stylesheet_file($another_stylesheet);
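
One way to make the explicit call hard to forget is to tie it to a block of code. The following is a sketch only: with_compiled_stylesheet() is a hypothetical helper that compiles a stylesheet, hands it to a callback, and calls destroy_stylesheet() afterwards even if the callback dies.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Xalan): compile a stylesheet, run
# $code with it, and always release it from the transformer afterwards.
sub with_compiled_stylesheet {
    my ($tr, $xsl_file, $code) = @_;
    my $compiled = $tr->compile_stylesheet_file($xsl_file)
        or die $tr->errstr;

    my @result;
    my $ok  = eval { @result = $code->($compiled); 1 };
    my $err = $@;

    # Release the stylesheet before propagating any error.
    $tr->destroy_stylesheet($compiled);
    die $err unless $ok;
    return wantarray ? @result : $result[0];
}

# Usage, assuming $tr is a long-lived XML::Xalan::Transformer:
#   my $res = with_compiled_stylesheet($tr, $stylesheet, sub {
#       my ($compiled) = @_;
#       return $tr->transform_to_file($source, $compiled, $dest);
#   });
```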

Also, there's no destructor for XML::Xalan::ContentHandler; its instance is removed when the XML::Xalan::DocumentBuilder that created it is destroyed.

TODO

  • set_stylesheet_param() should accept a hash ref instead, so several parameters can be passed at once.

  • Validation option on parsing.

    At the moment, if you need validation, you can use XML::Xalan::DocumentBuilder to make use of any validating XML parser which supports Perl SAX2, such as XML::LibXML.

AUTHOR

Edwin Pratomo, edpratomo@cpan.org

SEE ALSO

XML::Xalan::DocumentBuilder(3), XML::Xalan::ParsedSource(3), XML::Xalan::DOM(3), XML::Xalan::XObject(3).