NAME

HTML::Zoom::FilterBuilder - Add Filters to a Stream

SYNOPSIS

Create an HTML::Zoom instance:

use HTML::Zoom;
my $root = HTML::Zoom
    ->from_html(<<MAIN);
<html>
  <head>
    <title>Default Title</title>
  </head>
  <body bad_attr='junk'>
    Default Content
  </body>
</html>
MAIN

Create a new attribute on the body tag:

$root = $root
  ->select('body')
  ->set_attribute(class=>'main');

Add an extra value to an existing attribute:

$root = $root
  ->select('body')
  ->add_to_attribute(class=>'one-column');

Set the content of the title tag:

$root = $root
  ->select('title')
  ->replace_content('Hello World');

Set content from another HTML::Zoom instance:

my $body = HTML::Zoom
    ->from_html(<<BODY);
<div id="stuff">
    <p>Well Now</p>
    <p id="p2">Is the Time</p>
</div>
BODY

$root = $root
  ->select('body')
  ->replace_content($body);

Set an attribute on multiple matches:

$root = $root
  ->select('p')
  ->set_attribute(class=>'para');

Remove an attribute:

$root = $root
  ->select('body')
  ->remove_attribute('bad_attr');

Together, these calls will produce:

<html>
  <head>
    <title>Hello World</title>
  </head>
  <body class="main one-column"><div id="stuff">
    <p class="para">Well Now</p>
    <p id="p2" class="para">Is the Time</p>
</div>
</body>
</html>

DESCRIPTION

Given an HTML::Zoom stream, this class provides methods that apply filters to alter the content of that stream.

METHODS

This class defines the following public API:

set_attribute

Sets an attribute of a given name to a given value for all matching selections.

$html_zoom
  ->select('p')
  ->set_attribute(class=>'paragraph')
  ->select('div')
  ->set_attribute({name=>'class', value=>'divider'});

Overrides existing values, if such exist. When multiple "set_attribute" calls are made against the same or overlapping selection sets, the final call wins.
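
As a minimal sketch of the "final call wins" rule (the markup and class names below are invented for illustration), two "set_attribute" calls against the same selection leave only the later value in the output:

```perl
use strict;
use warnings;
use HTML::Zoom;

# Both calls target the same <p>; the filters run in order,
# so the second value replaces the first.
my $html = HTML::Zoom
  ->from_html('<p class="original">Hello</p>')
  ->select('p')
  ->set_attribute(class => 'first')
  ->select('p')
  ->set_attribute(class => 'second')
  ->to_html;

print "$html\n";
```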

add_to_attribute

Adds a value to an existing attribute, or creates one if the attribute does not yet exist. You may call this method with either an Array or HashRef of Args.

Here's the 'long form' HashRef:

$html_zoom
  ->select('p')
  ->set_attribute(class=>'paragraph')
  ->then
  ->add_to_attribute({name=>'class', value=>'divider'});

And the exact same effect using the 'short form' Array:

$html_zoom
  ->select('p')
  ->set_attribute(class=>'paragraph')
  ->then
  ->add_to_attribute(class=>'divider');

Attributes with more than one value will have a dividing space.
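
The following sketch, using made-up markup, illustrates both cases described above: an element that already has the attribute gets the new value appended after a space, and an element without it gets the attribute created.

```perl
use strict;
use warnings;
use HTML::Zoom;

# The first <p> already has a class; the second has none.
my $html = HTML::Zoom
  ->from_html('<p class="lead">One</p><p>Two</p>')
  ->select('p')
  ->add_to_attribute(class => 'note')
  ->to_html;

# Expect the first <p> to carry "lead note" and the second just "note".
print "$html\n";
```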

remove_attribute

Removes an attribute and all its values.

$html_zoom
  ->select('p')
  ->set_attribute(class=>'paragraph')
  ->then
  ->remove_attribute('class');

This works both for attributes present in the original stream and for attributes added by earlier filters.

transform_attribute

Transforms (or creates or deletes) an attribute by running the passed coderef on it. If the coderef returns nothing, the attribute is removed.

$html_zoom
  ->select('a')
  ->transform_attribute( href => sub {
        # copy the current value, rewrite the host, and return the new value
        ( my $href = shift ) =~ s/localhost/example.com/;
        return $href;
      },
    );
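
The removal case can be sketched in the same way. In this hypothetical example the coderef returns nothing, so the "target" attribute is dropped while "href" is left alone:

```perl
use strict;
use warnings;
use HTML::Zoom;

my $html = HTML::Zoom
  ->from_html('<a href="http://localhost/" target="_blank">Home</a>')
  ->select('a')
  # A bare return yields no value, which signals removal of the attribute.
  ->transform_attribute( target => sub { return } )
  ->to_html;

print "$html\n";
```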

collect

Collects and extracts results of "select" in HTML::Zoom. It takes the following optional common options as a hash reference.

into [ARRAY REFERENCE]

Where to save collected events (selected elements).

$z1->select('#main-content')
   ->collect({ into => \@body })
   ->run;
$z2->select('#main-content')
   ->replace(\@body)
   ->memoize;

filter [CODE]

Runs the given code reference on the collected elements. The stream is passed as the first argument and is also available as a localized $_. The code reference must return the filtered stream.

$z->select('.outer')
  ->collect({
    filter => sub { $_->select('.inner')->replace_content('bar!') },
    passthrough => 1,
  })

It can also be used to narrow a selection further. For example:

$z->select('tr')
  ->collect({
    filter => sub { $_->select('td') },
    passthrough => 1,
  })

is equivalent to a descendant selector combination (not implemented yet), i.e.

$z->select('tr td')

passthrough [BOOLEAN]

Extracts a copy of the elements and leaves the stream unchanged (the collected elements are not removed). For example, without 'passthrough':

HTML::Zoom->from_html('<foo><bar /></foo>')
  ->select('foo')
  ->collect({ content => 1 })
  ->to_html

returns '<foo></foo>', while with the passthrough option

HTML::Zoom->from_html('<foo><bar /></foo>')
  ->select('foo')
  ->collect({ content => 1, passthrough => 1 })
  ->to_html

returns '<foo><bar /></foo>'.
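
As a rough sketch of combining "passthrough" with "into" (the markup and variable names here are made up for illustration), the collected events land in the array while the rendered output keeps the element:

```perl
use strict;
use warnings;
use HTML::Zoom;

my @events;   # hypothetical destination for the collected events
my $html = HTML::Zoom
  ->from_html('<ul><li>one</li></ul>')
  ->select('li')
  ->collect({ into => \@events, passthrough => 1 })
  ->to_html;

# The stream is only processed once to_html drives it, so @events is
# populated at this point and $html still contains the <li> element.
printf "%d events collected; output: %s\n", scalar(@events), $html;
```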

content [BOOLEAN]

Collect content of the element, and not the element itself.

For example

HTML::Zoom->from_html('<h1>Title</h1><p>foo</p>')
  ->select('h1')
  ->collect
  ->to_html

would return '<p>foo</p>', while

HTML::Zoom->from_html('<h1>Title</h1><p>foo</p>')
  ->select('h1')
  ->collect({ content => 1 })
  ->to_html

would return '<h1></h1><p>foo</p>'.

See also "collect_content".

flush_before [BOOLEAN]

Generates a flush event before collecting, to ensure that the HTML generated up to the selected element is flushed through to the browser. Usually used in "repeat" or "repeat_content".

collect_content

Collects the contents of a "select" in HTML::Zoom result.

HTML::Zoom->from_file($foo)
          ->select('#main-content')
          ->collect_content({ into => \@foo_body })
          ->run;
$z->select('#foo')
  ->replace_content(\@foo_body)
  ->memoize;

Equivalent to running "collect" with the content option set.

add_before

Given a "select" in HTML::Zoom result, adds the given content (which may be a string, a scalar reference, an array of events or another HTML::Zoom object) before it.

$html_zoom
    ->select('input[name="foo"]')
    ->add_before(\ '<span class="warning">required field</span>');

add_after

Like "add_before", but adds the content after the "select" in HTML::Zoom result.

$html_zoom
    ->select('p')
    ->add_after("\n\n");

You can also add zoom events directly:

$html_zoom
    ->select('p')
    ->add_after([ { type => 'TEXT', raw => 'O HAI' } ]);

prepend_content

Similar to "add_before", but adds the content inside the match, at the start of its content.

HTML::Zoom
  ->from_html(q[<p>World</p>])
  ->select('p')
  ->prepend_content("Hello ")
  ->to_html
  
## <p>Hello World</p>

Acceptable values are strings, scalar refs and HTML::Zoom objects.

append_content

Similar to "add_after", but adds the content inside the match, at the end of its content.

HTML::Zoom
  ->from_html(q[<p>Hello </p>])
  ->select('p')
  ->append_content("World")
  ->to_html
  
## <p>Hello World</p>

Acceptable values are strings, scalar refs and HTML::Zoom objects.

replace

Given a "select" in HTML::Zoom result, replace it with a string, array or another HTML::Zoom object. It takes the same optional common options as "collect" (via hash reference).
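
A minimal sketch of "replace", assuming invented markup and ids: the whole matched element, not just its content, is swapped for the new fragment. The replacement is passed as a scalar reference here so that it is parsed as HTML, in the same way as the "add_before" example.

```perl
use strict;
use warnings;
use HTML::Zoom;

my $html = HTML::Zoom
  ->from_html('<div><p id="old">Old</p></div>')
  ->select('#old')
  # The scalar reference marks the replacement as HTML to be parsed.
  ->replace(\'<p id="new">New</p>')
  ->to_html;

print "$html\n";
```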

replace_content

Given a "select" in HTML::Zoom result, replace the content with a string, array or another HTML::Zoom object.

$html_zoom
  ->select('title, #greeting')
  ->replace_content('Hello world!');

repeat

For a given selection, repeat over transformations, typically for the purposes of populating lists. Takes either an array of anonymous subroutines or a zoomable object consisting of transformations.

Example of the array reference style (for when it doesn't matter that all iterations are pre-generated):

$zoom->select('table')->repeat([
  map {
    my $elem = $_;
    sub {
      $_->select('td')->replace_content($elem);
    }
  } @list
]);

Each subroutine is run with $_ localized to the result of "select" in HTML::Zoom (the collected elements), and with that result also passed as the first parameter.

You might want to use CodeStream when you don't have all elements upfront:

$zoom->select('.contents')->repeat(sub {
  HTML::Zoom::CodeStream->new({
    code => sub {
      while (my $line = $fh->getline) {
        return sub {
          $_->select('.lno')->replace_content($fh->input_line_number)
            ->select('.line')->replace_content($line)
        }
      }
      return
    },
  })
});

In addition to common options as in "collect", it also supports:

repeat_between [SELECTOR]

Selects the object to be repeated between items. For an array, this object is placed between elements; for an iterator, between the results of subsequent iterations; for a streamable object, between events (->to_stream->next).

See the documentation for "repeat_content".

repeat_content

Given a "select" in HTML::Zoom result, runs the provided iterator, passing the content of this result to the iterator. Accepts the same options as "repeat".

Equivalent to using the content option with "repeat".

$html_zoom
   ->select('#list')
   ->repeat_content(
      [
         sub {
            $_->select('.name')->replace_content('Matt')
              ->select('.age')->replace_content('26')
         },
         sub {
            $_->select('.name')->replace_content('Mark')
              ->select('.age')->replace_content('0x29')
         },
         sub {
            $_->select('.name')->replace_content('Epitaph')
              ->select('.age')->replace_content('<redacted>')
         },
      ],
      { repeat_between => '.between' }
   );

ALSO SEE

HTML::Zoom

AUTHORS

See HTML::Zoom for authors.

LICENSE

See HTML::Zoom for the license.