Name

Dita::Extend - In situ validation of XML parse trees representing OASIS DITA documents.

Synopsis

Parse some XML representing a document that should conform to the Dita standard:

use Data::Edit::Xml;
use Dita::Extend;

my $x = Data::Edit::Xml::new(<<END);
<ol>
<li><p>ppp</p></li>
<li><q>qqq</q></li>
<li><conbody>ccc</conbody></li>
</ol>
END

Diagnose errors in situ

Validate the XML parse tree structure against the Dita standard and diagnose any errors in situ:

if (my $r = Dita::Extend::allChildren($x))
 {
  ok $r->last->tag eq q(li);
  ok $r->fail->tag eq q(conbody);
  ok $r->reason    eq q(Tag: "conbody" cannot appear first under tag: "li");
  ok join(" ", @{$r->next}) =~ m(b boolean cite cmdname codeblock);
 }

Overriding node names

Validate the XML parse tree by overriding the names of some nodes:

my $x = Data::Edit::Xml::new(<<END);
<list>
<item/>
<item/>
</list>
END

ok !Dita::Extend::directChildren($x, q(ol), item=>q(li));

Benefits

This approach avoids the need to construct a complete Dita topic, write the topic to a file, apply xmllint to the file, then manually connect the resulting error messages back to the failing nodes in the parse tree.

See also

To apply xmllint to a large number of files see: Data::Edit::Xml::Lint.

The deterministic finite state automatons used internally to validate the XML representing Dita conforming documents were obtained by parsing the Normative Form of the Dita specification with Data::Edit::Xml and then applying Data::NFA and Data::DFA.
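The idea of checking the order of child tags with a deterministic finite state automaton can be sketched in plain Perl. The transition table, the state names and the sub validateChildren below are hypothetical and exist only for illustration. The content model is a toy (an ol holding one or more li), not the automaton this module derives from the Dita specification.

```perl
use strict;
use warnings;

# Hypothetical toy DFA: an <ol> must contain one or more <li> children.
# Illustration only: the real automata are derived from the Dita specification.
my %dfa =
 (start => q(empty),
  final => {items => 1},
  delta => {empty => {li => q(items)},
            items => {li => q(items)}});

# Walk the child tags through the DFA. Return undef if the sequence is valid,
# else a hash naming the last good tag, the failing tag and the tags that
# would have been accepted at that point.
sub validateChildren
 {my ($dfa, @tags) = @_;
  my $state = $dfa->{start};
  my $last;
  for my $tag (@tags)
   {my $next = $dfa->{delta}{$state}{$tag};
    return {last => $last, fail => $tag,
            next => [sort keys %{$dfa->{delta}{$state}}]}
      unless defined $next;
    ($state, $last) = ($next, $tag);
   }
  return undef if $dfa->{final}{$state};                 # Complete and in order
  return {last => $last, fail => undef,                  # Ran out of children too soon
          next => [sort keys %{$dfa->{delta}{$state}}]};
 }

my $r = validateChildren(\%dfa, qw(li p));
print "Failed at: $r->{fail}, expected: @{$r->{next}}\n" if $r;
```

The keys last, fail and next in this sketch loosely mirror the fields of the failure description documented under Hash Definitions, but the data layout here is an assumption.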

Description

In situ validation of XML parse trees representing OASIS DITA documents.

Version q(20181215).

The following sections describe the methods in each functional area of this module. For an alphabetic listing of all methods by name see Index.

Validate

Validate the structure of an XML parse tree representing an OASIS DITA document without using the DITA Open Toolkit (DITA-OT).

directChildren($$%)

Validate the children immediately under a parent node of a Data::Edit::Xml parse tree. Return undef if the child nodes are complete and occur in a valid order as defined by DITA, else return a description of why the validation failed. Optionally, an $alternate name for the parent node tag can be supplied: this alternate name will be used to pick the regular expression with which to validate the children of the parent, rather than the existing tag of the parent. The existing names of the children can be transformed during validation by looking them up in the optional %renames hash.

   Parameter   Description
1  $parent     Node in parse tree
2  $alternate  Optional alternate name for parent
3  %renames    Optional alternate names for some or all of the children.

Example:

 {my $x = Data::Edit::Xml::new(<<END);
<p>aaa<p>
</p>
</p>
END

  my $r = Dita::Extend::directChildren($x);

  ok $r->last->text eq qq(aaa);

  ok $r->reason eq q(Tag: "p" cannot appear after tag: "CDATA" under tag: "p");
 }

This is a static method and so should be invoked as:

Dita::Extend::directChildren
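The effect of the optional %renames hash can be pictured with a small sketch in plain Perl. The sub effectiveName is hypothetical and not part of this module. It assumes that a child whose tag does not appear in %renames is validated under its existing name, which is how the description of %renames as covering "some or all of the children" reads.

```perl
use strict;
use warnings;

# Hypothetical helper: the name under which a child tag would be validated.
# Assumption: tags absent from %renames keep their existing name.
sub effectiveName
 {my ($tag, %renames) = @_;
  $renames{$tag} // $tag
 }

my @children = map {effectiveName($_, item => q(li))} qw(item item note);
print "@children\n";                                     # li li note
```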

allChildren($$)

Validate all the child nodes in the entire sub tree of a parent node in a Data::Edit::Xml parse tree representing a DITA document. Return undef if they are complete and occur in a valid order, else return a description of why the validation failed.

   Parameter  Description
1  $parent    Node in parse tree
2  $renamer   Optional sub for deriving the name of the nodes.

Example:

 {my $x = Data::Edit::Xml::new(<<END);
<ol>
<li><p>ppp</p></li>
<li><q>qqq</q></li>
<li><conbody>ccc</conbody></li>
</ol>
END

  if (my $r = Dita::Extend::allChildren($x)) {

    ok $r->last->tag eq q(li);

    ok $r->fail->tag eq q(conbody);

    ok $r->reason    eq q(Tag: "conbody" cannot appear first under tag: "li");

    ok join(" ", @{$r->next}) =~ m(b boolean cite cmdname codeblock);
   }
 }

This is a static method and so should be invoked as:

Dita::Extend::allChildren

Hash Definitions

Dita::Extend::Failed Definition

The reason why validation failed.

fail - The node in the XML parse tree at which validation failed.

last - The last valid node visited in the XML parse tree before validation failed.

next - A reference to an array of the tags that would have succeeded at the last valid node.

reason - A readable description of the error.
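As a sketch of how these fields might be combined into a single diagnostic line, the following plain Perl uses stand-in packages so that it runs without the module installed. Mock::Node, Mock::Failed and failureMessage are hypothetical names introduced for illustration. With the real module the value of $r would instead be whatever allChildren or directChildren returned.

```perl
use strict;
use warnings;

# Hypothetical stand-ins that mimic the accessors documented above.
package Mock::Node   {sub tag {$_[0]{tag}}}
package Mock::Failed
 {sub fail   {$_[0]{fail}}
  sub last   {$_[0]{last}}
  sub next   {$_[0]{next}}
  sub reason {$_[0]{reason}}
 }

# Hypothetical helper: turn a failure description into a one line message.
sub failureMessage
 {my ($r) = @_;
  sprintf "%s (failed at <%s>; expected one of: %s)",
    $r->reason, $r->fail->tag, join(" ", @{$r->next});
 }

my $r = bless {fail   => bless({tag => q(conbody)}, q(Mock::Node)),
               last   => bless({tag => q(li)},      q(Mock::Node)),
               next   => [qw(b boolean cite)],
               reason => q(Tag: "conbody" cannot appear first under tag: "li")},
              q(Mock::Failed);

print failureMessage($r), "\n";
```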

Private Methods

new()

Create a new set of Dita XML validation DFAs. Each Data::DFA below has been dumped, zipped, then converted to base 64 for convenient storage.

result($$$$)

Create a validation results description.

   Parameter     Description
1  $node         Last good node
2  $failingNode  Node at which validation failed
3  $transitions  Array of possible symbols
4  $reason       Readable reason for failure

This is a static method and so should be invoked as:

Dita::Extend::result

Index

1 allChildren - Validate all the child nodes in the entire sub tree of a parent node in a Data::Edit::Xml parse tree representing a DITA document.

2 directChildren - Validate the children immediately under a parent node of a Data::Edit::Xml parse tree.

3 new - Create a new set of Dita XML validation DFAs.

4 result - Create a validation results description.

Installation

This module is written in 100% Pure Perl and, thus, it is easy to read, comprehend, use, modify and install via cpan:

sudo cpan install Dita::Extend

Author

philiprbrenan@gmail.com

http://www.appaapps.com

Copyright

Copyright (c) 2016-2018 Philip R Brenan.

This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.
