NAME
HTML::Object::DOM::NodeFilter - HTML Object DOM Node Filter
SYNOPSIS
use HTML::Object::DOM::NodeFilter;
my $filter = HTML::Object::DOM::NodeFilter->new ||
die( HTML::Object::DOM::NodeFilter->error, "\n" );
VERSION
v0.2.0
DESCRIPTION
The NodeFilter interface represents an object used to filter the nodes in an HTML::Object::DOM::NodeIterator or HTML::Object::DOM::TreeWalker. A NodeFilter knows nothing about the document or about traversing nodes; it only knows how to evaluate a single node against the provided filter.
PROPERTIES
There are no properties.
METHODS
acceptNode
Returns an unsigned short that will be used to tell if a given Node must be accepted or not by the HTML::Object::DOM::NodeIterator or HTML::Object::DOM::TreeWalker iteration algorithm.
This method is expected to be written by the user of a NodeFilter. Possible return values are:
- FILTER_ACCEPT
-
Value returned by the "acceptNode" method when a node should be accepted.
- FILTER_REJECT
-
Value to be returned by the "acceptNode" method when a node should be rejected. For HTML::Object::DOM::TreeWalker, child nodes are also rejected.
For NodeIterator, this flag is synonymous with FILTER_SKIP.
- FILTER_SKIP
-
Value to be returned by "acceptNode" for nodes to be skipped by the HTML::Object::DOM::NodeIterator or HTML::Object::DOM::TreeWalker object.
The children of skipped nodes are still considered. This is treated as "skip this node but not its children".
Example:
    use feature 'say';
    use HTML::Object::DOM::NodeFilter qw( :all );

    my $nodeIterator = $doc->createNodeIterator(
        # Node to use as root
        $doc->getElementById('someId'),
        # Only consider nodes that are text nodes (nodeType 3)
        SHOW_TEXT,
        # Object containing the sub to use for the acceptNode method
        # of the NodeFilter
        {
            acceptNode => sub
            {
                my $node = shift( @_ ); # also available as $_
                # Logic to determine whether to accept, reject or skip node
                # In this case, only accept nodes that have content other than whitespace
                if( $node->data !~ /^\s*$/ )
                {
                    return( FILTER_ACCEPT );
                }
                # Whitespace-only text nodes are rejected explicitly
                return( FILTER_REJECT );
            }
        },
        0 # false
    );

    # Show the content of every non-empty text node that is a descendant of root
    my $node;
    while( ( $node = $nodeIterator->nextNode() ) )
    {
        say( $node->data );
    }
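The difference between FILTER_REJECT and FILTER_SKIP for a tree walker can be sketched without the module at all. The following is a minimal, self-contained illustration, not the library's implementation: the toy tree, the walk and names_with helpers, and the numeric values of the FILTER_* constants (assumed here to follow the DOM standard) are all hypothetical and exist only for this example.

```perl
use strict;
use warnings;

# Assumed values, following the DOM standard; real code should import the
# module's own constants instead of redefining them.
use constant {
    FILTER_ACCEPT => 1,
    FILTER_REJECT => 2,
    FILTER_SKIP   => 3,
};

# Hypothetical toy tree: each node is a hash with a name and children.
my $tree = {
    name     => 'div',
    children => [
        { name => 'nav', children => [ { name => 'a',  children => [] } ] },
        { name => 'p',   children => [ { name => 'em', children => [] } ] },
    ],
};

# Walk the tree the way a tree walker would treat the filter's verdict.
sub walk
{
    my( $node, $filter, $out ) = @_;
    my $verdict = $filter->( $node );
    push( @$out, $node->{name} ) if( $verdict == FILTER_ACCEPT );
    # A rejected node hides its whole subtree; a skipped node does not.
    return if( $verdict == FILTER_REJECT );
    walk( $_, $filter, $out ) for( @{$node->{children}} );
}

# Collect the accepted node names when 'nav' receives the given verdict.
sub names_with
{
    my $nav_verdict = shift( @_ );
    my @out;
    walk( $tree, sub { $_[0]->{name} eq 'nav' ? $nav_verdict : FILTER_ACCEPT }, \@out );
    return( join( ' ', @out ) );
}

print( names_with( FILTER_REJECT ), "\n" ); # div p em
print( names_with( FILTER_SKIP ), "\n" );   # div a p em
```

With FILTER_REJECT the a element under nav never reaches the filter; with FILTER_SKIP only nav itself is omitted.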
See also Mozilla documentation
CONSTANTS
- SHOW_ALL (4294967295)
-
Shows all nodes.
- SHOW_ELEMENT (1)
-
Shows Element nodes.
- SHOW_ATTRIBUTE (2)
-
Shows Attribute nodes. This is meaningful only when creating a NodeIterator with an Attribute node as its root; in this case, it means that the attribute node will appear in the first position of the iteration or traversal. Since attributes are never children of other nodes, they do not appear when traversing over the document tree.
- SHOW_TEXT (4)
-
Shows Text nodes.
Example:
    use HTML::Object::DOM::NodeFilter qw( :all );

    my $nodeIterator = $doc->createNodeIterator(
        $doc->body,
        SHOW_ELEMENT | SHOW_COMMENT | SHOW_TEXT,
        { acceptNode => sub{ return( FILTER_ACCEPT ); } },
        0 # false
    );
    # SHOW_ALL has every bit set, so testing the SHOW_COMMENT bit alone is enough
    if( $nodeIterator->whatToShow & SHOW_COMMENT )
    {
        # $nodeIterator will show comments
    }
- SHOW_CDATA_SECTION (8)
-
Always returns nothing, because XML documents are not supported.
- SHOW_ENTITY_REFERENCE (16)
-
Legacy; no longer used.
- SHOW_ENTITY (32)
-
Legacy; no longer used.
- SHOW_PROCESSING_INSTRUCTION (64)
-
Shows ProcessingInstruction nodes.
- SHOW_COMMENT (128)
-
Shows Comment nodes.
- SHOW_DOCUMENT (256)
-
Shows Document nodes.
- SHOW_DOCUMENT_TYPE (512)
-
Shows DocumentType nodes.
- SHOW_DOCUMENT_FRAGMENT (1024)
-
Shows HTML::Object::DOM::DocumentFragment nodes.
- SHOW_NOTATION (2048)
-
Legacy; no longer used.
- SHOW_SPACE (4096)
-
Shows Space nodes. This is a non-standard extension specific to this Perl framework.
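The SHOW_* values above are bit flags, so they can be combined with a bitwise OR and tested with a bitwise AND. The sketch below is self-contained and only illustrative: it redefines a few constants locally with the values listed above (real code would import them from the module with the :all tag), and the shows helper is a hypothetical name introduced for this example.

```perl
use strict;
use warnings;

# Values copied from the list above, redefined locally so that the sketch
# runs on its own.
use constant {
    SHOW_ALL     => 4294967295,
    SHOW_ELEMENT => 1,
    SHOW_TEXT    => 4,
    SHOW_COMMENT => 128,
};

# Hypothetical helper: returns 1 if the mask has the given flag set, else 0.
sub shows
{
    my( $mask, $flag ) = @_;
    return( ( $mask & $flag ) == $flag ? 1 : 0 );
}

# Combine two flags into a single whatToShow mask.
my $mask = SHOW_ELEMENT | SHOW_TEXT;

print( shows( $mask, SHOW_TEXT ), "\n" );       # 1
print( shows( $mask, SHOW_COMMENT ), "\n" );    # 0
print( shows( SHOW_ALL, SHOW_COMMENT ), "\n" ); # 1
```

Because SHOW_ALL has every bit set, testing a mask against a single flag also covers the SHOW_ALL case.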
And for the callback control: FILTER_ACCEPT, FILTER_REJECT and FILTER_SKIP, as described under "acceptNode" above.
AUTHOR
Jacques Deguest <jack@deguest.jp>
SEE ALSO
Mozilla documentation, W3C specifications
COPYRIGHT & LICENSE
Copyright(c) 2022 DEGUEST Pte. Ltd.
All rights reserved
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.