NAME

HTML::Object::DOM::TreeWalker - HTML Object DOM Tree Walker Class

SYNOPSIS

With just one argument, this defaults to showing everything (SHOW_ALL) and uses the default filter, which always returns FILTER_ACCEPT

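use HTML::Object::DOM::TreeWalker;
# Exports the SHOW_* and FILTER_* constants used in the examples below
use HTML::Object::DOM::NodeFilter qw( :all );
my $walker = HTML::Object::DOM::TreeWalker->new( $doc->body ) ||
    die( HTML::Object::DOM::TreeWalker->error, "\n" );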

Or, passing an anonymous subroutine as the filter

my $walker = HTML::Object::DOM::TreeWalker->new(
    $root_node,
    $what_to_show_bit,
    sub{ return( FILTER_ACCEPT ); }
) || die( HTML::Object::DOM::TreeWalker->error, "\n" );

Or, passing a hash reference with a property 'acceptNode' whose value is an anonymous subroutine, as the filter

my $walker = HTML::Object::DOM::TreeWalker->new(
    $root_node,
    $what_to_show_bit,
    {
        acceptNode => sub{ return( FILTER_ACCEPT ); }
    }
) || die( HTML::Object::DOM::TreeWalker->error, "\n" );

Or, passing an object that implements the method "acceptNode"

my $walker = HTML::Object::DOM::TreeWalker->new(
    $root_node,
    $what_to_show_bit,
    # This object must implement the acceptNode method
    My::Customer::NodeFilter->new
) || die( HTML::Object::DOM::TreeWalker->error, "\n" );

There is also HTML::Object::DOM::NodeIterator, which performs a somewhat similar function.

Choose HTML::Object::DOM::NodeIterator when you only need a simple iterator to filter and browse the selected nodes, and choose HTML::Object::DOM::TreeWalker when you need access to the node and its siblings.

VERSION

v0.2.0

DESCRIPTION

The TreeWalker object represents the nodes of a document subtree and a position within them.

PROPERTIES

currentNode

The Node at which the TreeWalker is currently pointing.

Example:

use HTML::Object::DOM::NodeFilter qw( :all );
my $treeWalker = $doc->createTreeWalker(
    $doc->body,
    SHOW_ELEMENT,
    sub{ return( FILTER_ACCEPT ); }
);
my $root = $treeWalker->currentNode; # the root element as it is the first element!

See also Mozilla documentation

expandEntityReferences

Normally this is read-only, but under Perl you can set whatever boolean value you want.

Under JavaScript, this is a boolean value indicating whether, when an EntityReference is discarded, its whole sub-tree must be discarded at the same time.

Example:

use HTML::Object::DOM::NodeFilter qw( :all );
my $treeWalker = $doc->createTreeWalker(
    $doc->body,
    SHOW_ELEMENT,
    sub{ return( FILTER_ACCEPT ); },
    # or
    # { acceptNode => sub{ return( FILTER_ACCEPT ); } },
);
my $expand = $treeWalker->expandEntityReferences;

See also Mozilla documentation

filter

Normally this is read-only, but under Perl you can set it to a new HTML::Object::DOM::NodeFilter object, even after object instantiation.

Returns the HTML::Object::DOM::NodeFilter used to select the relevant nodes.

Example:

use HTML::Object::DOM::NodeFilter qw( :all );
my $treeWalker = $doc->createTreeWalker(
    $doc->body,
    SHOW_ELEMENT,
    sub{ return( FILTER_ACCEPT ); },
    # or
    # { acceptNode => sub{ return( FILTER_ACCEPT ); } },
);
my $nodeFilter = $treeWalker->filter;

See also Mozilla documentation
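
As a rough illustration of a filter supplied as an object implementing acceptNode, here is a minimal, self-contained sketch. It makes several assumptions: the FILTER_* values are local stand-ins mirroring the DOM standard (real code would import them from HTML::Object::DOM::NodeFilter), the filter is assumed to be called as a method with the candidate node as its argument, and My::MockNode with its nodeName method is a hypothetical stand-in for a real node so that the sketch runs without a parsed document.

```perl
use strict;
use warnings;

# Hypothetical filter class; only the acceptNode method name comes from
# the documentation above.
package My::Customer::NodeFilter;

# Assumption: local stand-ins mirroring the DOM standard's values.
use constant { FILTER_ACCEPT => 1, FILTER_REJECT => 2, FILTER_SKIP => 3 };

sub new
{
    my( $class, %opts ) = @_;
    return( bless( { skip => ( $opts{skip} || '' ) }, $class ) );
}

# Returns FILTER_SKIP for nodes whose name matches the configured one,
# and FILTER_ACCEPT for everything else.
sub acceptNode
{
    my( $self, $node ) = @_;
    # Assumption: the node offers a DOM-style nodeName method
    return( lc( $node->nodeName ) eq $self->{skip} ? FILTER_SKIP : FILTER_ACCEPT );
}

# Stand-in for a real node, for illustration only
package My::MockNode;

sub new { my( $class, $name ) = @_; return( bless( { name => $name }, $class ) ); }
sub nodeName { return( $_[0]->{name} ); }

package main;

my $filter   = My::Customer::NodeFilter->new( skip => 'div' );
my @verdicts = map { $filter->acceptNode( My::MockNode->new( $_ ) ) } qw( div p );
print( "@verdicts\n" );
```

In the DOM standard, FILTER_SKIP hides the node itself but still lets a TreeWalker visit its children, whereas FILTER_REJECT hides the node together with its whole subtree.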

root

Normally this is read-only, but under Perl you can set it to whatever node you want.

Returns a Node representing the root node as specified when the TreeWalker was created.

Example:

use HTML::Object::DOM::NodeFilter qw( :all );
my $treeWalker = $doc->createTreeWalker(
    $doc->body,
    SHOW_ELEMENT,
    sub{ return( FILTER_ACCEPT ); },
    # or
    # { acceptNode => sub{ return( FILTER_ACCEPT ); } },
);
my $root = $treeWalker->root; # $doc->body in this case

See also Mozilla documentation

whatToShow

Normally this is read-only, but under Perl you can set whatever numeric value you want.

Returns an unsigned long that is a bitmask made of constants describing the types of Node that must be presented. Non-matching nodes are skipped, but their children may be included, if relevant.

Possible constant values (exported by HTML::Object::DOM::NodeFilter) are:

SHOW_ALL (4294967295)

Shows all nodes.

SHOW_ELEMENT (1)

Shows Element nodes.

SHOW_ATTRIBUTE (2)

Shows Attribute nodes. This is meaningful only when creating a TreeWalker with an Attribute node as its root; in this case, it means that the attribute node will appear in the first position of the iteration or traversal. Since attributes are never children of other nodes, they do not appear when traversing over the document tree.

SHOW_TEXT (4)

Shows Text nodes.

SHOW_CDATA_SECTION (8)

Always returns nothing, because there is no support for XML documents.

SHOW_ENTITY_REFERENCE (16)

Legacy, no longer used.

SHOW_ENTITY (32)

Legacy, no longer used.

SHOW_PROCESSING_INSTRUCTION (64)

Shows ProcessingInstruction nodes.

SHOW_COMMENT (128)

Shows Comment nodes.

SHOW_DOCUMENT (256)

Shows Document nodes.

SHOW_DOCUMENT_TYPE (512)

Shows DocumentType nodes.

SHOW_DOCUMENT_FRAGMENT (1024)

Shows HTML::Object::DOM::DocumentFragment nodes.

SHOW_NOTATION (2048)

Legacy, no longer used.

SHOW_SPACE (4096)

Shows Space nodes. This is a non-standard extension specific to this Perl framework.

Example:

use HTML::Object::DOM::NodeFilter qw( :all );
my $treeWalker = $doc->createTreeWalker(
    $doc->body,
    ( SHOW_ELEMENT | SHOW_COMMENT | SHOW_TEXT ),
    sub{ return( FILTER_ACCEPT ); },
    # or
    # { acceptNode => sub{ return( FILTER_ACCEPT ); } },
);
# Compare against SHOW_ALL with ==; a bitwise & with SHOW_ALL is true for any non-zero mask
if( ( $treeWalker->whatToShow == SHOW_ALL ) ||
    ( $treeWalker->whatToShow & SHOW_COMMENT ) )
{
    # $treeWalker will show comments
}

See also Mozilla documentation
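
To make the bitmask arithmetic concrete, the following self-contained sketch combines flags with a bitwise OR and tests them with a bitwise AND. The constants are local stand-ins using the values listed above, so that the sketch runs without the module; shows() is a hypothetical helper and not part of this class.

```perl
use strict;
use warnings;

# Local stand-ins for the constants exported by HTML::Object::DOM::NodeFilter,
# using the values listed above.
use constant {
    SHOW_ALL     => 4294967295,
    SHOW_ELEMENT => 1,
    SHOW_TEXT    => 4,
    SHOW_COMMENT => 128,
};

# Combine flags with a bitwise OR to build a whatToShow mask.
my $mask = SHOW_ELEMENT | SHOW_TEXT;

# shows(): hypothetical helper, returns 1 when the flag's bit is set in
# the mask, 0 otherwise.
sub shows
{
    my( $what_to_show, $flag ) = @_;
    return( ( $what_to_show & $flag ) == $flag ? 1 : 0 );
}

printf( "elements: %d, text: %d, comments: %d\n",
    shows( $mask, SHOW_ELEMENT ),
    shows( $mask, SHOW_TEXT ),
    shows( $mask, SHOW_COMMENT ) );
printf( "SHOW_ALL includes comments: %d\n", shows( SHOW_ALL, SHOW_COMMENT ) );
```

Because SHOW_ALL has every bit set, testing a single flag against it always succeeds, which is why a separate comparison with SHOW_ALL is rarely needed when checking for one node type.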

METHODS

firstChild

Moves the current Node to the first visible child of the current node and returns that child. If no such child exists, returns undef and the current node is not changed.

Example:

my $treeWalker = $doc->createTreeWalker(
    $doc->body,
    SHOW_ELEMENT,
    sub{ return( FILTER_ACCEPT ); },
);
my $node = $treeWalker->firstChild(); # returns the first visible child of the root element, or undef if none

See also Mozilla documentation

lastChild

Moves the current Node to the last visible child of the current node and returns that child. If no such child exists, undef is returned and the current node is not changed.

Example:

my $treeWalker = $doc->createTreeWalker(
    $doc->body,
    SHOW_ELEMENT,
    sub{ return( FILTER_ACCEPT ); },
);
my $node = $treeWalker->lastChild(); # returns the last visible child of the root element

See also Mozilla documentation

nextNode

Moves the current Node to the next visible node in document order and returns that node. If no such node exists, returns undef and the current node is not changed.

Example:

my $treeWalker = $doc->createTreeWalker(
    $doc->body,
    SHOW_ELEMENT,
    sub{ return( FILTER_ACCEPT ); },
);
my $node = $treeWalker->nextNode(); # returns the first child of root, as it is the next node in document order

See also Mozilla documentation
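
To illustrate what "document order" means here, the following toy model lists the nodes of a small tree in pre-order, that is, a node first and then its children from left to right. It is only a sketch of the ordering, not of this module's implementation: the nested hash tree and the document_order() helper are hypothetical, and no filtering is applied.

```perl
use strict;
use warnings;

# Toy tree standing in for a parsed document:
# body > ( div > ( p, span ), ul )
my $tree = {
    name     => 'body',
    children => [
        {
            name     => 'div',
            children => [
                { name => 'p',    children => [] },
                { name => 'span', children => [] },
            ],
        },
        { name => 'ul', children => [] },
    ],
};

# document_order(): hypothetical helper returning the node names in
# pre-order (a node first, then its children from left to right).
sub document_order
{
    my $node = shift;
    return( $node->{name}, map { document_order( $_ ) } @{$node->{children}} );
}

my @order = document_order( $tree );
print( join( ' ', @order ), "\n" );
```

With a walker rooted at body and showing elements only, currentNode would start at body, and successive calls to nextNode would be expected to yield div, p, span and ul in that order before returning undef. A loop such as while( defined( my $node = $treeWalker->nextNode ) ) is one way to visit them all.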

nextSibling

Moves the current Node to its next sibling, if any, and returns the found sibling. If there is no such node, undef is returned and the current node is not changed.

Example:

my $treeWalker = $doc->createTreeWalker(
    $doc->body,
    SHOW_ELEMENT,
    sub{ return( FILTER_ACCEPT ); },
);
$treeWalker->firstChild();
my $node = $treeWalker->nextSibling(); # returns undef if the first child of the root element has no sibling

See also Mozilla documentation

parentNode

Moves the current Node to its first visible ancestor and returns that node. If no such node exists, or if it lies above the root node defined at object construction, returns undef and the current node is not changed.

Example:

my $treeWalker = $doc->createTreeWalker(
    $doc->body,
    SHOW_ELEMENT,
    sub{ return( FILTER_ACCEPT ); },
);
my $node = $treeWalker->parentNode(); # returns undef, as the walker is at its root and cannot move above it

See also Mozilla documentation
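
The root boundary described above can be sketched with a toy model. The hash-based nodes and the parent_within() helper are hypothetical and ignore visibility filtering altogether; they only show that climbing stops at the walker's root even when the underlying document has nodes above it.

```perl
use strict;
use warnings;

# Toy nodes with parent links: html > body > div
my $html = { name => 'html', parent => undef };
my $body = { name => 'body', parent => $html };
my $div  = { name => 'div',  parent => $body };

# parent_within(): hypothetical helper mimicking the root boundary.
# Returns the parent node, or undef when already at the walker's root.
sub parent_within
{
    my( $current, $root ) = @_;
    return if( $current == $root );
    return( $current->{parent} );
}

my $from_div  = parent_within( $div,  $body );
my $from_body = parent_within( $body, $body );
printf( "from div: %s, from body: %s\n",
    $from_div->{name},
    defined( $from_body ) ? $from_body->{name} : 'undef' );
```

Here body has a parent in the document (html), yet the second call yields undef because body is the root the walk was started from.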

previousNode

Moves the current Node to the previous visible node in document order and returns that node. If no such node exists, or if it lies before the root node defined at object construction, returns undef and the current node is not changed.

Example:

my $treeWalker = $doc->createTreeWalker(
    $doc->body,
    SHOW_ELEMENT,
    sub{ return( FILTER_ACCEPT ); },
);
my $node = $treeWalker->previousNode(); # returns undef, as no node precedes the root within this walker

See also Mozilla documentation

previousSibling

Moves the current Node to its previous sibling, if any, and returns the found sibling. If there is no such node, undef is returned and the current node is not changed.

Example:

my $treeWalker = $doc->createTreeWalker(
    $doc->body,
    SHOW_ELEMENT,
    sub{ return( FILTER_ACCEPT ); },
);
my $node = $treeWalker->previousSibling(); # returns undef as there is no previous sibling

See also Mozilla documentation

AUTHOR

Jacques Deguest <jack@deguest.jp>

SEE ALSO

Mozilla documentation

COPYRIGHT & LICENSE

Copyright(c) 2022 DEGUEST Pte. Ltd.

All rights reserved

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.