NAME
HTML::Object::DOM::NodeIterator - HTML Object DOM Node Iterator Class
SYNOPSIS
With just one argument, this defaults to showing everything (SHOW_ALL) and to using the default filter, which always returns FILTER_ACCEPT:
use HTML::Object::DOM::NodeIterator;
my $nodes = HTML::Object::DOM::NodeIterator->new( $root_node ) ||
die( HTML::Object::DOM::NodeIterator->error, "\n" );
Or, passing an anonymous subroutine as the filter:
my $nodes = HTML::Object::DOM::NodeIterator->new(
$root_node,
$what_to_show_bit,
sub{ return( FILTER_ACCEPT ); }
) || die( HTML::Object::DOM::NodeIterator->error, "\n" );
Or, passing a hash reference with a property 'acceptNode' whose value is an anonymous subroutine, as the filter:
my $nodes = HTML::Object::DOM::NodeIterator->new(
$root_node,
$what_to_show_bit,
{
acceptNode => sub{ return( FILTER_ACCEPT ); }
}
) || die( HTML::Object::DOM::NodeIterator->error, "\n" );
Or, passing an object that implements the method "acceptNode":
my $nodes = HTML::Object::DOM::NodeIterator->new(
$root_node,
$what_to_show_bit,
# This object must implement the acceptNode method
My::Customer::NodeFilter->new
) || die( HTML::Object::DOM::NodeIterator->error, "\n" );
There is also HTML::Object::DOM::TreeWalker, which performs a somewhat similar function.
Choose HTML::Object::DOM::NodeIterator when you only need a simple iterator to filter and browse the selected nodes, and choose HTML::Object::DOM::TreeWalker when you need access to the node and its siblings.
VERSION
v0.2.0
DESCRIPTION
The NodeIterator interface represents an iterator over the members of a list of the nodes in a subtree of the DOM. The nodes will be returned in document order.
A NodeIterator can be created using the "createNodeIterator" method in HTML::Object::DOM::Document, as follows:
use HTML::Object::DOM;
my $parser = HTML::Object::DOM->new;
my $doc = $parser->parse_data( $some_html_data ) || die( $parser->error );
my $nodeIterator = $doc->createNodeIterator( $root, $whatToShow, $filter ) ||
die( $doc->error );
PROPERTIES
expandEntityReferences
Normally this is read-only, but under Perl you can set whatever boolean value you want.
Under JavaScript, this is a boolean value indicating whether, when discarding an EntityReference, its whole sub-tree must be discarded at the same time.
Example:
use HTML::Object::DOM::NodeFilter qw( :all );
my $nodeIterator = $doc->createNodeIterator(
$doc->body,
SHOW_ELEMENT,
sub{ return( FILTER_ACCEPT ); },
# or
# { acceptNode => sub{ return( FILTER_ACCEPT ); } },
);
my $expand = $nodeIterator->expandEntityReferences;
See also Mozilla documentation
filter
Normally this is read-only, but under Perl you can set it to any new HTML::Object::DOM::NodeFilter object, even after object instantiation.
Returns an HTML::Object::DOM::NodeFilter used to select the relevant nodes.
Example:
use HTML::Object::DOM::NodeFilter qw( :all );
my $nodeIterator = $doc->createNodeIterator(
$doc->body,
SHOW_ELEMENT,
sub{ return( FILTER_ACCEPT ); },
# or
# { acceptNode => sub{ return( FILTER_ACCEPT ); } },
);
my $nodeFilter = $nodeIterator->filter;
See also Mozilla documentation
pointerBeforeReferenceNode
Normally this is read-only, but under Perl you can set whatever boolean value you want. Defaults to true.
Returns a boolean flag that indicates whether the NodeIterator is anchored before (true) or after (false) the anchor node.
Example:
use HTML::Object::DOM::NodeFilter qw( :all );
my $nodeIterator = $doc->createNodeIterator(
$doc->body,
SHOW_ELEMENT,
sub{ return( FILTER_ACCEPT ); },
# or
# { acceptNode => sub{ return( FILTER_ACCEPT ); } },
);
my $flag = $nodeIterator->pointerBeforeReferenceNode;
See also Mozilla documentation
pos
Read-only.
This is a non-standard property, which returns the 0-based position in the array of the anchor element's children.
You can poll this to find out where the iterator currently is.
Example:
use feature 'say';
use HTML::Object::DOM::NodeFilter qw( :all );
# You need to first declare $nodeIterator to be able to use it in the callback
my $nodeIterator;
$nodeIterator = $doc->createNodeIterator(
$doc->body,
SHOW_ELEMENT,
sub
{
say( "Current position is: ", $nodeIterator->pos );
return( $_->getName eq 'div' ? FILTER_ACCEPT : FILTER_SKIP );
},
);
referenceNode
Read-only.
Returns the Node to which the iterator is anchored.
Example:
use HTML::Object::DOM::NodeFilter qw( :all );
my $nodeIterator = $doc->createNodeIterator(
$doc->body,
SHOW_ELEMENT,
sub{ return( FILTER_ACCEPT ); },
# or
# { acceptNode => sub{ return( FILTER_ACCEPT ); } },
);
my $node = $nodeIterator->referenceNode;
See also Mozilla documentation
root
Normally this is read-only, but under Perl you can set whatever node value you want.
Returns a Node representing the root node as specified when the NodeIterator was created.
Example:
use HTML::Object::DOM::NodeFilter qw( :all );
my $nodeIterator = $doc->createNodeIterator(
$doc->body,
SHOW_ELEMENT,
sub{ return( FILTER_ACCEPT ); },
# or
# { acceptNode => sub{ return( FILTER_ACCEPT ); } },
);
my $root = $nodeIterator->root; # $doc->body in this case
whatToShow
Normally this is read-only, but under Perl you can set whatever number value you want.
Returns an unsigned long being a bitmask made of constants describing the types of Node that must be presented. Non-matching nodes are skipped, but their children may be included, if relevant.
Possible constant values (exported by HTML::Object::DOM::NodeFilter) are:
- SHOW_ALL (4294967295)
Shows all nodes.
- SHOW_ELEMENT (1)
Shows Element nodes.
- SHOW_ATTRIBUTE (2)
Shows Attribute nodes. This is meaningful only when creating a NodeIterator with an Attribute node as its root; in this case, it means that the attribute node will appear in the first position of the iteration or traversal. Since attributes are never children of other nodes, they do not appear when traversing over the document tree.
- SHOW_TEXT (4)
Shows Text nodes.
Example:
use HTML::Object::DOM::NodeFilter qw( :all );
my $nodeIterator = $doc->createNodeIterator(
$doc->body,
( SHOW_ELEMENT | SHOW_COMMENT | SHOW_TEXT ),
sub{ return( FILTER_ACCEPT ); },
# or
# { acceptNode => sub{ return( FILTER_ACCEPT ); } },
);
# SHOW_ALL is compared with ==, because "& SHOW_ALL" is true for any non-zero bitmask
if( ( $nodeIterator->whatToShow == SHOW_ALL ) ||
( $nodeIterator->whatToShow & SHOW_COMMENT ) )
{
# $nodeIterator will show comments
}
- SHOW_CDATA_SECTION (8)
Always returns nothing, because XML documents are not supported.
- SHOW_ENTITY_REFERENCE (16)
Legacy, no longer used.
- SHOW_ENTITY (32)
Legacy, no longer used.
- SHOW_PROCESSING_INSTRUCTION (64)
Shows ProcessingInstruction nodes.
- SHOW_COMMENT (128)
Shows Comment nodes.
- SHOW_DOCUMENT (256)
Shows Document nodes.
- SHOW_DOCUMENT_TYPE (512)
Shows DocumentType nodes.
- SHOW_DOCUMENT_FRAGMENT (1024)
Shows HTML::Object::DOM::DocumentFragment nodes.
- SHOW_NOTATION (2048)
Legacy, no longer used.
- SHOW_SPACE (4096)
Shows Space nodes. This is a non-standard extension under this Perl framework.
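The way these constants combine into a bitmask can be sketched in plain Perl. The snippet below is only an illustration: it defines local stand-ins for a few of the constants, using the values listed above, instead of importing them from HTML::Object::DOM::NodeFilter, and the helper shows() is hypothetical and not part of this module.

```perl
use strict;
use warnings;

# Local stand-ins for the constants listed above; in real code they
# would be imported from HTML::Object::DOM::NodeFilter instead.
use constant {
    SHOW_ALL     => 4294967295,
    SHOW_ELEMENT => 1,
    SHOW_TEXT    => 4,
    SHOW_COMMENT => 128,
};

# Hypothetical helper: true when every bit of $flag is set in $mask.
sub shows
{
    my( $mask, $flag ) = @_;
    return( ( $mask & $flag ) == $flag );
}

# Combine node types with a bitwise OR
my $mask = SHOW_ELEMENT | SHOW_COMMENT;

printf( "mask=%d\n", $mask );
print( "comments: ", ( shows( $mask, SHOW_COMMENT ) ? 'yes' : 'no' ), "\n" );
print( "text: ", ( shows( $mask, SHOW_TEXT ) ? 'yes' : 'no' ), "\n" );
# SHOW_ALL has every bit set, so it includes each individual type
print( "all includes text: ", ( shows( SHOW_ALL, SHOW_TEXT ) ? 'yes' : 'no' ), "\n" );
```

Testing with ( $mask & $flag ) == $flag rather than a bare & keeps the check correct for multi-bit flags such as SHOW_ALL, for which a bare & is true with any non-zero mask.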
CONSTRUCTOR
new
Provided with a root node, an optional bitmask representing what to show, and an optional filter callback, this returns a new node iterator.
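The three filter forms shown in the synopsis (a code reference, a hash reference with an 'acceptNode' entry, and an object with an acceptNode method) can all be reduced to a single code reference. The sketch below illustrates one possible way to do that; it is not this module's actual implementation. The helper normalize_filter() is hypothetical, the "nodes" are plain strings standing in for tag names, the FILTER_* values are those of the DOM standard defined locally, and the node is assumed to be passed as the first argument.

```perl
use strict;
use warnings;
use Scalar::Util qw( blessed );

# Values from the DOM standard, defined locally for this sketch; real code
# would import them from HTML::Object::DOM::NodeFilter.
use constant {
    FILTER_ACCEPT => 1,
    FILTER_REJECT => 2,
    FILTER_SKIP   => 3,
};

# Hypothetical filter class: any object with an acceptNode method would do.
{
    package My::Customer::NodeFilter;
    sub new { return( bless( {}, shift( @_ ) ) ); }
    sub acceptNode
    {
        my( $self, $node ) = @_;
        return( $node eq 'div' ? main::FILTER_ACCEPT() : main::FILTER_SKIP() );
    }
}

# Hypothetical helper: turn any of the three filter forms into a code ref.
sub normalize_filter
{
    my $filter = shift( @_ );
    # No filter provided: accept every node
    return( sub{ return( FILTER_ACCEPT ); } ) if( !defined( $filter ) );
    return( $filter ) if( ref( $filter ) eq 'CODE' );
    if( ref( $filter ) eq 'HASH' && ref( $filter->{acceptNode} ) eq 'CODE' )
    {
        return( $filter->{acceptNode} );
    }
    if( blessed( $filter ) && $filter->can( 'acceptNode' ) )
    {
        return( sub{ return( $filter->acceptNode( @_ ) ); } );
    }
    die( "Unsupported filter\n" );
}

my @filters = (
    sub{ return( $_[0] eq 'div' ? FILTER_ACCEPT : FILTER_SKIP ); },
    { acceptNode => sub{ return( $_[0] eq 'div' ? FILTER_ACCEPT : FILTER_SKIP ); } },
    My::Customer::NodeFilter->new,
);

my @results;
foreach my $f ( @filters )
{
    my $code = normalize_filter( $f );
    # Apply the normalized filter to two toy nodes
    push( @results, join( ',', map { $code->( $_ ) } qw( div p ) ) );
}
print( "$_\n" ) for( @results );
```

Each of the three forms yields the same decisions for the two toy nodes, which is the point of normalizing them early: the traversal code then only ever deals with one calling convention.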
METHODS
detach
This operation is a no-op. Previously, it told the web browser engine that the NodeIterator was no longer in use, but this is now unnecessary.
See also Mozilla documentation
nextNode
Returns the next Node in the document, or undef if there are none.
Example:
use HTML::Object::DOM::NodeFilter qw( :all );
my $nodeIterator = $doc->createNodeIterator(
$doc->body,
SHOW_ELEMENT,
sub{ return( FILTER_ACCEPT ); },
# or
# { acceptNode => sub{ return( FILTER_ACCEPT ); } },
0 # false; this optional argument is not used any more
);
my $currentNode = $nodeIterator->nextNode(); # returns the next node
See also Mozilla documentation
previousNode
Returns the previous Node in the document, or undef if there are none.
Example:
use HTML::Object::DOM::NodeFilter qw( :all );
my $nodeIterator = $doc->createNodeIterator(
$doc->body,
SHOW_ELEMENT,
sub{ return( FILTER_ACCEPT ); },
# or
# { acceptNode => sub{ return( FILTER_ACCEPT ); } },
0 # false; this optional argument is not used any more
);
my $currentNode = $nodeIterator->nextNode(); # returns the next node
my $previousNode = $nodeIterator->previousNode(); # same result, since we backtracked to the previous node
See also Mozilla documentation
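Why calling previousNode right after nextNode returns the same node can be seen in a toy model of the traversal algorithm described in the DOM standard. The class Toy::Iterator below is hypothetical and is not this module's implementation: its "nodes" are plain strings already flattened in document order, and it assumes a non-empty list and a filter receiving the node as its first argument.

```perl
use strict;
use warnings;

# Values from the DOM standard, defined locally for this sketch.
use constant { FILTER_ACCEPT => 1, FILTER_SKIP => 3 };

{
    package Toy::Iterator;

    sub new
    {
        my( $class, $nodes, $filter ) = @_;
        return( bless({
            nodes  => $nodes,
            filter => $filter || sub{ return( main::FILTER_ACCEPT() ); },
            index  => 0,   # position of the reference node
            before => 1,   # plays the role of pointerBeforeReferenceNode
        }, $class ) );
    }

    sub pointerBeforeReferenceNode { return( $_[0]->{before} ); }
    sub referenceNode { return( $_[0]->{nodes}->[ $_[0]->{index} ] ); }
    sub nextNode { return( $_[0]->_traverse( 1 ) ); }
    sub previousNode { return( $_[0]->_traverse( -1 ) ); }

    sub _traverse
    {
        my( $self, $dir ) = @_;
        my( $i, $before ) = @$self{qw( index before )};
        while( 1 )
        {
            if( $dir > 0 )
            {
                # Pointer before the reference: flip it and yield the reference
                if( $before ) { $before = 0; }
                # Otherwise move forward, or give up at the end of the list
                else { return if( ++$i > $#{$self->{nodes}} ); }
            }
            else
            {
                # Pointer after the reference: flip it and yield the reference
                if( !$before ) { $before = 1; }
                # Otherwise move backward, or give up at the start of the list
                else { return if( --$i < 0 ); }
            }
            last if( $self->{filter}->( $self->{nodes}->[$i] ) == main::FILTER_ACCEPT() );
        }
        # Only commit the new state once an accepted node was found
        @$self{qw( index before )} = ( $i, $before );
        return( $self->{nodes}->[$i] );
    }
}

my $it = Toy::Iterator->new( [qw( body div p span )] );
my $first  = $it->nextNode;      # the pointer moves after "body"
my $second = $it->nextNode;      # the pointer moves after "div"
my $back   = $it->previousNode;  # the pointer moves back before "div"
print( join( ' ', $first, $second, $back ), "\n" );
print( "before: ", ( $it->pointerBeforeReferenceNode ? 1 : 0 ), "\n" );
```

In this model the iterator sits between nodes rather than on one, so reversing direction first crosses the reference node again, and only the following call in the same direction reaches a different node.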
AUTHOR
Jacques Deguest <jack@deguest.jp>
SEE ALSO
Mozilla documentation, StackOverflow topic on NodeIterator, W3C specifications
COPYRIGHT & LICENSE
Copyright(c) 2021 DEGUEST Pte. Ltd.
All rights reserved
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.