NAME

DTA::CAB::Format::XmlPerl - Datum parser/formatter: XML (Perl-like)

SYNOPSIS

use DTA::CAB::Format::XmlPerl;

##========================================================================
## Constructors etc.

$fmt = DTA::CAB::Format::XmlPerl->new(%args);

##========================================================================
## Methods: Input

$obj = $fmt->parseNode($nod);
$doc = $fmt->parseDocument();

##========================================================================
## Methods: Output

$xmlnod = $fmt->tokenNode($tok);
$xmlnod = $fmt->sentenceNode($sent);
$xmlnod = $fmt->documentNode($doc);
$body_array_node = $fmt->xmlBodyNode();
$sentence_array_node = $fmt->xmlSentenceNode();
$fmt = $fmt->putToken($tok);
$fmt = $fmt->putSentence($sent);
$fmt = $fmt->putDocument($doc);

DESCRIPTION

Globals

Variable: @ISA

DTA::CAB::Format::XmlPerl inherits from DTA::CAB::Format::XmlCommon.

Filenames

DTA::CAB::Format::XmlPerl registers the filename regex:

/\.(?i:xml-perl|perl[\-\.]xml)$/

with DTA::CAB::Format.
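
As a quick illustration of which names this pattern accepts, the following sketch applies the regex above to a handful of filenames. The filenames are hypothetical and chosen only for illustration; the check uses plain Perl and does not involve DTA::CAB itself.

```perl
use strict;
use warnings;

# Pattern copied verbatim from the documentation above.
my $re = qr/\.(?i:xml-perl|perl[\-\.]xml)$/;

# Hypothetical filenames, for illustration only.
my @names = qw(doc.xml-perl doc.perl.xml doc.perl-xml DOC.PERL.XML doc.xml doc.perlxml);

# Record 1 for names the pattern accepts, 0 otherwise.
my %matches = map { $_ => ($_ =~ $re ? 1 : 0) } @names;

printf "%-14s %s\n", $_, ($matches{$_} ? 'match' : 'no match') for @names;
```

The suffix is matched case-insensitively, and a plain ".xml" extension is not claimed by this pattern.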

Constructors etc.

new
$fmt = CLASS_OR_OBJ->new(%args);

Constructor.

%args, %$fmt:

##-- input
xdoc => $xdoc,                          ##-- XML::LibXML::Document
xprs => $xprs,                          ##-- XML::LibXML parser
##
##-- output
encoding => $encoding,                  ##-- default: UTF-8; applies to output only!
level => $level,                        ##-- output formatting level (default=0)
##
##-- common
#(nothing here)

Methods: Persistence

noSaveKeys
@keys = $class_or_obj->noSaveKeys();

Override: returns list of keys not to be saved. Here, returns qw(xdoc xprs).

Methods: Input

parseNode
$obj = $fmt->parseNode($nod);

Returns the Perl object represented by the XML::LibXML::Node $nod.
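
To give a feel for this kind of node-to-object mapping, here is a minimal sketch. It is not the module's parseNode() implementation. It assumes, judging only from the file shown in the EXAMPLE section, that <m> elements stand for hashes, <l> elements for arrays, and <a> elements for plain scalars, with hash keys carried in the "key" attribute. The helper name node_to_perl is hypothetical, and the "ref" attribute seen in the example is ignored here.

```perl
use strict;
use warnings;
use XML::LibXML;

# node_to_perl($node): hypothetical helper, NOT the module's parseNode().
# Assumption (inferred from the example file): <m> = hash, <l> = array, <a> = scalar.
sub node_to_perl {
    my ($node) = @_;
    my $name = $node->nodeName;
    if ($name eq 'm') {
        # Each child element becomes one hash entry, stored under its "key" attribute.
        my %h;
        $h{ $_->getAttribute('key') } = node_to_perl($_) for $node->findnodes('./*');
        return \%h;
    }
    if ($name eq 'l') {
        # Child elements become array items, in document order.
        return [ map { node_to_perl($_) } $node->findnodes('./*') ];
    }
    # Anything else is treated like <a>: its text content is the scalar value.
    return $node->textContent;
}

my $xml = <<'XML';
<m>
  <a key="lang">de</a>
  <l key="tokens">
    <m><a key="text">wie</a><m key="moot"><a key="tag">PWAV</a></m></m>
  </l>
</m>
XML

my $dom  = XML::LibXML->load_xml(string => $xml);
my $data = node_to_perl($dom->documentElement);
print "$data->{lang} $data->{tokens}[0]{text} $data->{tokens}[0]{moot}{tag}\n";
```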

parseDocument
$doc = $fmt->parseDocument();

Override: parses the buffered XML::LibXML::Document in $fmt->{xdoc} and returns the resulting document.

Methods: Output

tokenNode
$xmlnod = $fmt->tokenNode($tok);

Returns an XML::LibXML::Node representing the token $tok.

sentenceNode
$xmlnod = $fmt->sentenceNode($sent);

Returns an XML::LibXML::Node representing the sentence $sent.

documentNode
$xmlnod = $fmt->documentNode($doc);

Returns an XML::LibXML::Node representing the document $doc.

xmlBodyNode
$body_array_node = $fmt->xmlBodyNode();

Gets or creates the buffered array node representing the document body.

xmlSentenceNode
$sentence_array_node = $fmt->xmlSentenceNode();

Gets or creates the buffered array node representing the current document sentence.

putToken
$fmt = $fmt->putToken($tok);

Override: writes token $tok to the output buffer.

putSentence
$fmt = $fmt->putSentence($sent);

Override: writes sentence $sent to the output buffer.

putDocument
$fmt = $fmt->putDocument($doc);

Override: writes document $doc to the output buffer.

EXAMPLE

An example file in the format accepted/generated by this module is:

<?xml version="1.0" encoding="UTF-8"?>
<m ref="DTA::CAB::Document">
  <l key="body">
    <m>
      <a key="lang">de</a>
      <l key="tokens">
        <m>
          <m key="moot">
            <a key="lemma">wie</a>
            <a key="word">wie</a>
            <a key="tag">PWAV</a>
          </m>
          <l key="lang">
            <a>de</a>
          </l>
          <a key="hasmorph">1</a>
          <a key="msafe">1</a>
          <a key="text">wie</a>
          <a key="exlex">wie</a>
          <a key="errid">ec</a>
          <m key="xlit">
            <a key="latin1Text">wie</a>
            <a key="isLatinExt">1</a>
            <a key="isLatin1">1</a>
          </m>
        </m>
        <m>
          <a key="text">oede</a>
          <a key="msafe">0</a>
          <m key="moot">
            <a key="word">öde</a>
            <a key="tag">ADJD</a>
            <a key="lemma">öde</a>
          </m>
          <m key="xlit">
            <a key="latin1Text">oede</a>
            <a key="isLatin1">1</a>
            <a key="isLatinExt">1</a>
          </m>
        </m>
        <m>
          <m key="moot">
            <a key="lemma">!</a>
            <a key="tag">$.</a>
            <a key="word">!</a>
          </m>
          <a key="msafe">1</a>
          <a key="text">!</a>
          <a key="exlex">!</a>
          <a key="errid">ec</a>
          <m key="xlit">
            <a key="latin1Text">!</a>
            <a key="isLatin1">1</a>
            <a key="isLatinExt">1</a>
          </m>
        </m>
      </l>
    </m>
  </l>
</m>
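
The example above suggests a simple correspondence between Perl data and XML elements: hashes as <m>, arrays as <l>, and scalars as <a>, with hash keys stored in a "key" attribute. The following sketch produces output in that style from a plain Perl structure. It is only an illustration inferred from the example, not the module's own serializer: the helpers xml_escape and to_xml are hypothetical, hash keys are sorted here for reproducibility (the example file is not sorted), and blessed objects and the "ref" attribute are not handled.

```perl
use strict;
use warnings;

# Escape characters that are special in XML text and attribute values.
sub xml_escape {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# to_xml($value, $key, $indent): hypothetical helper, not part of DTA::CAB.
# Assumption (inferred from the example): hash -> <m>, array -> <l>, scalar -> <a>.
sub to_xml {
    my ($val, $key, $indent) = @_;
    $indent //= '';
    my $attr = defined($key) ? ' key="' . xml_escape($key) . '"' : '';
    if (ref($val) eq 'HASH') {
        my $out = "$indent<m$attr>\n";
        # Keys are sorted only to make the output deterministic.
        $out .= to_xml($val->{$_}, $_, "$indent  ") for sort keys %$val;
        return $out . $indent . "</m>\n";
    }
    if (ref($val) eq 'ARRAY') {
        my $out = "$indent<l$attr>\n";
        # List items carry no "key" attribute.
        $out .= to_xml($_, undef, "$indent  ") for @$val;
        return $out . $indent . "</l>\n";
    }
    return "$indent<a$attr>" . xml_escape($val // '') . "</a>\n";
}

# A cut-down version of the first token from the example above.
my $token = { text => 'wie', msafe => 1, moot => { word => 'wie', tag => 'PWAV' } };
my $xml   = to_xml({ lang => 'de', tokens => [$token] });
print $xml;
```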

AUTHOR

Bryan Jurish <moocow@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2009-2019 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.24.1 or, at your option, any later version of Perl 5 you may have available.
