NAME
LaTeXML::Core::Document - represents an XML document under construction.
DESCRIPTION
A LaTeXML::Core::Document represents an XML document being constructed by LaTeXML, and also provides the methods for constructing it. It extends LaTeXML::Common::Object.
LaTeXML will have digested the source material, resulting in a LaTeXML::Core::List (from a LaTeXML::Core::Stomach) of LaTeXML::Core::Box objects, LaTeXML::Core::Whatsit objects and sublists. At this stage, a document is created and it is responsible for `absorbing' the digested material. Generally, the Boxes and Lists create text nodes, whereas the Whatsits create XML document fragments, elements and attributes according to the defining LaTeXML::Core::Definition::Constructor.
Most document construction occurs at a current insertion point where material will be added, and which moves along with the inserted material. The LaTeXML::Common::Model, derived from various declarations and the document type, is consulted to determine whether an insertion is allowed and when elements may need to be automatically opened or closed in order to carry out a given insertion. For example, a subsection element will typically be closed automatically when an attempt is made to open a section element.
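The automatic closing described above can be illustrated with a toy model. This is only a sketch in plain Perl, not LaTeXML's implementation: the tables %can_contain and %auto_close and the function open_element are hypothetical stand-ins for what the document model and the construction methods provide.

```perl
use strict;
use warnings;

# Toy content model (hypothetical): the children each element may contain,
# and the elements that may be closed automatically.
my %can_contain = (
  document   => { section    => 1 },
  section    => { subsection => 1, para => 1 },
  subsection => { para       => 1 },
);
my %auto_close = (section => 1, subsection => 1, para => 1);

# Open $tag at the insertion point, closing open elements as needed.
# @$stack lists the open elements; its last entry is the insertion point.
# Returns the names of the elements that were closed automatically.
sub open_element {
  my ($stack, $tag) = @_;
  my @trial = @$stack;
  my @closed;
  while (@trial && !$can_contain{ $trial[-1] }{$tag}) {
    my $top = $trial[-1];
    die "Cannot insert <$tag> into <$top>\n" unless $auto_close{$top};
    push @closed, pop @trial;
  }
  @$stack = (@trial, $tag);    # the new element becomes the insertion point
  return @closed;
}

my @stack = ('document');
open_element(\@stack, 'section');
open_element(\@stack, 'subsection');
my @closed = open_element(\@stack, 'section');
print "auto-closed: @closed\n";
print "open now: @stack\n";
```

In this toy model, opening the second section closes both the open subsection and the section that contains it, because neither may contain a section.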
In the methods described here, the term $qname is used for XML qualified names. These are tag names with a namespace prefix. The prefix should be one registered with the current Model, for use within the code. This prefix is not necessarily the same as the one used in any DTD, but should be mapped to a namespace URI that was registered for the DTD.
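To make the prefix mapping concrete, here is a small stand-alone sketch. It assumes a hypothetical table %code_namespaces and helper resolve_qname; in LaTeXML itself the mapping is held by the document model, and this code does not call it. The ltx prefix and its URI are the ones conventionally used for LaTeXML's own schema.

```perl
use strict;
use warnings;

# Hypothetical code-side prefix table; the real one belongs to the Model.
my %code_namespaces = (ltx => 'http://dlmf.nist.gov/LaTeXML');

# Split "prefix:localname" and resolve the prefix to its namespace URI.
sub resolve_qname {
  my ($qname) = @_;
  my ($prefix, $local) = $qname =~ /^([^:]+):([^:]+)$/
    or die "Not a prefixed qualified name: $qname\n";
  my $uri = $code_namespaces{$prefix}
    // die "Unregistered namespace prefix: $prefix\n";
  return ($uri, $local);
}

my ($uri, $local) = resolve_qname('ltx:section');
print "namespace: $uri, local name: $local\n";
```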
The arguments named $node are XML::LibXML nodes.
The methods here are grouped into three sections covering basic access to the document, insertion methods at the current insertion point, and less commonly used, lower-level, document manipulation methods.
Accessors
$doc = $document->getDocument;
Returns the XML::LibXML::Document currently being constructed.
$model = $document->getModel;
Returns the LaTeXML::Common::Model that represents the document model used for this document.
$node = $document->getNode;
Returns the node at the current insertion point during construction. This node is considered still to be `open'; any insertions will go into it (if possible). The node will be an XML::LibXML::Element, XML::LibXML::Text or, initially, XML::LibXML::Document.
$node = $document->getElement;
Returns the closest ancestor to the current insertion point that is an Element.
@nodes = $document->getChildElement($node);
Returns a list of the child elements, if any, of the $node.
$node = $document->getLastChildElement($node);
Returns the last child element of the $node, if it has one, else undef.
$node = $document->getFirstChildElement($node);
Returns the first child element of the $node, if it has one, else undef.
@nodes = $document->findnodes($xpath,$node);
Returns a list of nodes matching the given $xpath expression. The context node for $xpath is $node, if given, otherwise it is the document element.
$node = $document->findnode($xpath,$node);
Returns the first node matching the given $xpath expression. The context node for $xpath is $node, if given, otherwise it is the document element.
$qname = $document->getNodeQName($node);
Returns the qualified name (localname with namespace prefix) of the given $node. The namespace prefix mapping is the code mapping of the current document model.
$boolean = $document->canContain($tag,$child);
Returns whether an element $tag can contain a child $child. $tag and $child can be nodes, qualified names of nodes (prefix:localname), or one of a set of special symbols #PCDATA, #Comment, #Document or #ProcessingInstruction.
$boolean = $document->canContainIndirect($tag,$child);
Returns whether an element $tag can contain a child $child either directly, or after automatically opening one or more autoOpen-able elements.
$boolean = $document->canContainSomehow($tag,$child);
Returns whether an element $tag can contain a child $child either directly, or after automatically opening one or more autoOpen-able elements.
$boolean = $document->canHaveAttribute($tag,$attrib);
Returns whether an element $tag can have an attribute named $attrib.
$boolean = $document->canAutoOpen($tag);
Returns whether an element $tag is able to be automatically opened.
$boolean = $document->canAutoClose($node);
Returns whether the node $node can be automatically closed.
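As an illustration of how the accessors combine, the following sketch summarizes the current insertion point using only methods described in this section. The helper name describe_insertion_point is hypothetical, and $document is assumed to be a LaTeXML::Core::Document obtained elsewhere (for example, the first argument passed to a constructor's code).

```perl
use strict;
use warnings;

# Hypothetical helper: summarize the current insertion point.
# $document is assumed to be a LaTeXML::Core::Document.
sub describe_insertion_point {
  my ($document) = @_;
  my $element = $document->getElement;    # nearest enclosing Element
  return unless $element;                 # e.g. nothing has been opened yet
  return {
    qname        => $document->getNodeQName($element),
    accepts_text => ($document->canContain($element, '#PCDATA') ? 1 : 0),
    auto_close   => ($document->canAutoClose($element) ? 1 : 0),
  };
}
```

A caller might use the result to decide whether text can be inserted directly or whether an intermediate element has to be opened first.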
Construction Methods
These methods are the most common ones used for construction of documents. They generally operate by creating new material at the current insertion point. That point initially is just the document itself, but it moves along to follow any new insertions. These methods also adapt to the document model so as to automatically open or close elements, when it is required for the pending insertion and allowed by the document model (See Tag).
$xmldoc = $document->finalize;
This method finalizes the document by cleaning up various temporary attributes, and returns the XML::LibXML::Document that was constructed.
@nodes = $document->absorb($digested);
Absorb the $digested object into the document at the current insertion point according to its type. Various of the other methods are invoked as needed, and document nodes may be automatically opened or closed according to the document model. This method returns the nodes that were constructed. Note that the nodes may include children of other nodes, and nodes that may already have been removed from the document (see filterChildren and filterDeletions). Also, text insertions are often merged with existing text nodes; in such cases, the whole text node is included in the result.
$document->insertElement($qname,$content,%attributes);
This is a shorthand for creating an element $qname (with given attributes), absorbing $content from within that new node, and then closing it. The $content must be digested material, either a single box, or an array of boxes, which will be absorbed into the element. This method returns the newly created node, although it will no longer be the current insertion point.
$document->insertMathToken($string,%attributes);
Insert a math token (XMTok) containing the string $string with the given attributes. Useful attributes would be name, role, font. Returns the newly inserted node.
$document->insertComment($text);
Insert, and return, a comment with the given $text into the current node.
$document->insertPI($op,%attributes);
Insert, and return, a ProcessingInstruction into the current node.
$document->openText($text,$font);
Open a text node in font $font, performing any required automatic opening and closing of intermediate nodes (including those needed for font changes) and inserting the string $text into it.
$document->openElement($qname,%attributes);
Open an element, named $qname and with the given attributes. This will be inserted into the current node while performing any required automatic opening and closing of intermediate nodes. The new element is returned, and also becomes the current insertion point. An error (fatal if in Strict mode) is signalled if there is no allowed way to insert such an element into the current node.
$document->closeElement($qname);
Close the closest open element named $qname, including any intermediate nodes that may be automatically closed. If that is not possible, signal an error. The closed node's parent becomes the current node. This method returns the closed node.
$node = $document->isOpenable($qname);
Check whether it is possible to open a $qname element at the current insertion point.
$node = $document->isCloseable($qname);
Check whether it is possible to close a $qname element, returning the node that would be closed if possible, otherwise undef.
$document->maybeCloseElement($qname);
Close a $qname element, if it is possible to do so. Returns the closed node if it was found, else undef.
$document->addAttribute($key=>$value);
Add the given attribute to the node nearest to the current insertion point that is allowed to have it. This does not change the current insertion point.
$document->closeToNode($node);
This method closes all children of $node until $node becomes the insertion point. Note that it closes any open nodes, not only autoCloseable ones.
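The construction methods are typically used in sequence: open an element, absorb content into it, then close it. The sketch below spells out those steps, which insertElement is described as combining. The helper name insert_wrapped is hypothetical; $document is assumed to be a LaTeXML::Core::Document and $content some digested material.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap digested $content in a new $qname element,
# using the open / absorb / close sequence described above.
sub insert_wrapped {
  my ($document, $qname, $content, %attributes) = @_;
  $document->openElement($qname, %attributes);    # new element becomes the insertion point
  $document->absorb($content);                    # content is built inside it
  return $document->closeElement($qname);         # returns the closed node
}
```

A call might look like insert_wrapped($document, 'ltx:note', $content, class => 'aside'), where the element name and attribute are only examples and must be permitted by the document model in use.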
Internal Insertion Methods
These are described as an aid to understanding the code; they rarely, if ever, should be used outside this module.
$document->setNode($node);
Sets the current insertion point to be $node. This should be rarely used, if at all; the construction methods of document generally maintain the notion of insertion point automatically. This may be useful to allow insertion into a different part of the document, but you probably want to set the insertion point back to the previous node afterwards.
$string = $document->getInsertionContext($levels);
For debugging, return a string showing the context of the current insertion point; that is, the string of the nodes leading up to it. If $levels is defined, show only that many nodes.
$node = $document->find_insertion_point($qname);
This internal method is used to find the appropriate point, relative to the current insertion point, at which an element with the specified $qname can be inserted. That position may require automatic opening or closing of elements, according to what is allowed by the document model.
@nodes = getInsertionCandidates($node);
Returns a list of elements where an arbitrary insertion might take place. Roughly this is a list starting with $node, followed by its parent and the parent's siblings (in reverse order), followed by the grandparent and siblings (in reverse order).
$node = $document->floatToElement($qname);
Finds the nearest element at or preceding the current insertion point (see getInsertionCandidates) that can accept an element $qname; it moves the insertion point to that point, and returns the previous insertion point. Generally, after doing whatever you need at the new insertion point, you should call $document->setNode($node); to restore the insertion point. If no such point is found, the insertion point is left unchanged, and undef is returned.
$node = $document->floatToAttribute($key);
This method works the same as floatToElement, but finds the nearest element that can accept the attribute $key.
$node = $document->openText_internal($text);
This is an internal method, used by openText, that assumes the insertion point has been appropriately adjusted.
$node = $document->openMathText_internal($text);
This internal method appends $text to the current insertion point, which is assumed to be a math node. It checks for math ligatures and carries out any combinations called for.
$node = $document->closeText_internal();
This internal method closes the current node, which should be a text node. It carries out any text ligatures on the content.
$node = $document->closeNode_internal($node);
This internal method closes any open text or element nodes starting at the current insertion point, up to and including $node. Afterwards, the parent of $node will be the current insertion point. It condenses the tree to avoid redundant font switching elements.
$document->afterOpen($node);
Carries out any afterOpen operations that have been recorded (using Tag) for the element name of $node.
$document->afterClose($node);
Carries out any afterClose operations that have been recorded (using Tag) for the element name of $node.
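The save-and-restore pattern recommended for floatToElement can be sketched as follows. The helper name insert_floated is hypothetical; $document is assumed to be a LaTeXML::Core::Document, and the sketch relies only on the behaviour described above (floatToElement returns the previous insertion point, or undef when no suitable point exists).

```perl
use strict;
use warnings;

# Hypothetical helper: insert an element at the nearest place that accepts
# it, then put the insertion point back where it was.
sub insert_floated {
  my ($document, $qname, $content, %attributes) = @_;
  my $saved = $document->floatToElement($qname);
  return unless defined $saved;    # nowhere to float to; nothing changed
  my $node = $document->insertElement($qname, $content, %attributes);
  $document->setNode($saved);      # restore the previous insertion point
  return $node;
}
```

Restoring with setNode keeps later insertions flowing at the original position, so the floated element does not disturb the normal construction order.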
Document Modification
The following methods are used to perform various sorts of modification and rearrangements of the document, after the normal flow of insertion has taken place. These may be needed after an environment (or perhaps the whole document) has been completed and one needs to analyze what it contains to decide on the appropriate representation.
$document->setAttribute($node,$key,$value);
Sets the attribute $key to $value on $node. This method is preferred over the direct LibXML one, since it takes care of decoding namespaces (if $key is a qname), and also manages recording of xml:id's.
$document->recordID($id,$node);
Records the association of the given $node with the $id, which should be the xml:id attribute of the $node. Usually this association will be maintained by the methods that create nodes or set attributes.
$document->unRecordID($id);
Removes the node associated with the given $id, if any. This might be needed if a node is deleted.
$document->modifyID($id);
Adjusts $id, if needed, so that it is unique. It does this by appending a letter and incrementing until it finds an id that is not yet associated with a node.
$node = $document->lookupID($id);
Returns the node, if any, that is associated with the given $id.
$document->setNodeBox($node,$box);
Records the $box (being a Box, Whatsit or List) that was (presumably) responsible for the creation of the element $node. This information is useful for determining source locations, original TeX strings, and so forth.
$box = $document->getNodeBox($node);
Returns the $box that was responsible for creating the element $node.
$document->setNodeFont($node,$font);
Records the font object that encodes the font that should be used to display any text within the element $node.
$font = $document->getNodeFont($node);
Returns the font object associated with the element $node.
$node = $document->openElementAt($point,$qname,%attributes);
Opens a new child element in $point with the qualified name $qname and with the given attributes. This method is not affected by, nor does it affect, the current insertion point. It does manage namespaces, xml:id's and associating a box, font and locator with the new element, as well as running any afterOpen operations.
$node = $document->closeElementAt($node);
Closes $node. This method is not affected by, nor does it affect, the current insertion point. However, it does run any afterClose operations, so any element that was created using the lower-level openElementAt should be closed using this method.
$node = $document->appendClone($node,@newchildren);
Appends clones of @newchildren to $node. This method modifies any ids found within @newchildren (using modifyID), and fixes up any references to those ids within the clones so that they refer to the modified id.
$node = $document->wrapNodes($qname,@nodes);
This method wraps the @nodes by a new element with qualified name $qname; that new node replaces the first of @nodes. The remaining nodes in @nodes must be following siblings of the first one.
NOTE: Does this need multiple nodes? If so, perhaps some kind of movenodes helper? Otherwise, what about attributes?
$node = $document->unwrapNodes($node);
Unwrap the children of $node, by replacing $node by its children.
$node = $document->replaceNode($node,@nodes);
Replace $node by @nodes; presumably they are some sort of descendant nodes.
$node = $document->renameNode($node,$newname);
Rename $node to the tagname $newname; equivalently replace $node by a new node with name $newname and copy the attributes and contents. It is assumed that $newname can contain those attributes and contents.
@nodes = $document->filterDeletions(@nodes);
This function is useful with $doc->absorb($box), when you want to filter out any nodes that have been deleted and no longer appear in the document.
@nodes = $document->filterChildren(@nodes);
This function is useful with $doc->absorb($box), when you want to filter out any nodes that are children of other nodes in @nodes.
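The two filters are naturally applied together to the result of absorb. The following sketch shows one way to do so; the helper name absorb_toplevel is hypothetical, and $document is assumed to be a LaTeXML::Core::Document and $box some digested material.

```perl
use strict;
use warnings;

# Hypothetical helper: absorb $box and keep only the nodes that are still
# in the document and are not children of other returned nodes.
sub absorb_toplevel {
  my ($document, $box) = @_;
  my @nodes = $document->absorb($box);
  @nodes = $document->filterDeletions(@nodes);    # drop nodes no longer in the document
  return $document->filterChildren(@nodes);       # drop nodes nested in other results
}
```

The remaining nodes are then suitable arguments for the modification methods above, such as wrapNodes, which expects sibling nodes.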
AUTHOR
Bruce Miller <bruce.miller@nist.gov>
COPYRIGHT
Public domain software, produced as part of work done by the United States Government & not subject to copyright in the US.