NAME

HTML::RewriteAttributes - concise attribute rewriting

SYNOPSIS

Locate a tag in a provided block of HTML and delete, add, or rewrite the attributes associated with that tag. The updated HTML is returned.

Delete an attribute by returning undef.

$html = HTML::RewriteAttributes->rewrite($html, sub {
    my ($tag, $attr, $value) = @_;

    # delete any attribute whose value mentions COBOL
    return if $value =~ /COBOL/i;

    $value =~ s/\brocks\b/rules/g;
    return $value;
});

Add an attribute by appending it to the $attr_list arrayref and adding the value to the $attrs hashref. For example, you could add loading="lazy" to all img tags.

$html = HTML::RewriteAttributes->rewrite($html, sub {
    my ( $tag, $attr, $value, $attrs, $attr_list ) = @_;
    return $value unless $tag eq 'img' && !$attrs->{loading};
    $attrs->{loading} = 'lazy';
    push @$attr_list, 'loading';
    return $value;
});

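Modify an existing attribute by returning the new value. The example below uses HTML::RewriteAttributes::Resources, whose callback receives the resource URI, to replace the src attribute of an img in an email with a cid: reference to attached content. Here render_template, generate_cid_from, and $mime are placeholders for your own code.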

$html = HTML::RewriteAttributes::Resources->rewrite($html, sub {
    my $uri = shift;
    my $content = render_template($uri);
    my $cid = generate_cid_from($content);
    $mime->attach($cid => $content);
    return "cid:$cid";
});

When passed a base URL, HTML::RewriteAttributes::Links resolves relative links, such as href and src attributes, against that URL, changing <img src="/bar.gif"> to <img src="https://search.cpan.org/bar.gif">. See also HTML::ResolveLink.

$html = HTML::RewriteAttributes::Links->rewrite($html, "https://search.cpan.org");

When passed a subroutine reference, HTML::RewriteAttributes::Links can extract all links, similar to HTML::LinkExtor.

my @links;
HTML::RewriteAttributes::Links->rewrite($html, sub {
    my ($tag, $attr, $value) = @_;
    push @links, $value;   # collect every link seen
    return $value;         # leave the attribute unchanged
});

DESCRIPTION

HTML::RewriteAttributes rewrites HTML attributes through a callback. You supply a callback that runs for each attribute, and the module handles parsing the HTML and producing the rewritten output.

This module is designed to be subclassable to make handling special cases easier. See the source for methods you can override.

See the SYNOPSIS above and included tests in the t directory for more examples.

METHODS

new

You don't need to call new explicitly - it's done in "rewrite". It takes no arguments.

rewrite HTML, callback -> HTML

This is the main interface of the module. You pass in some HTML and a callback; the callback may be invoked many times, and you get back similar HTML with the attributes rewritten.

As rewrite parses the HTML, it calls the callback for each attribute, passing the current tag name, the attribute name, and the attribute value. Subclasses may pass different arguments; HTML::RewriteAttributes::Resources does. The callback inspects these arguments and decides what to do with the attribute: return the current value unchanged to leave it alone, return undef to remove the attribute, or return any other value to set the attribute to that value.
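
The three possible outcomes of the callback (remove, replace, keep) can be sketched in one small script. The input markup, the attribute names, and the /old to /new substitution are made up for illustration; only the rewrite call itself comes from this module.

```perl
use strict;
use warnings;
use HTML::RewriteAttributes;

# Illustrative input; the URL and attribute values are invented.
my $html = '<a href="http://example.com/old" onclick="track()" title="Old page">link</a>';

my $rewritten = HTML::RewriteAttributes->rewrite($html, sub {
    my ($tag, $attr, $value) = @_;

    # Returning undef removes the attribute.
    return undef if $attr eq 'onclick';

    # Returning a different string replaces the value.
    if ($tag eq 'a' && $attr eq 'href') {
        (my $new = $value) =~ s{/old$}{/new};
        return $new;
    }

    # Returning the value unchanged keeps the attribute as it was.
    return $value;
});

print "$rewritten\n";
```

Because every attribute passes through the callback, the final return $value matters: a callback that falls off the end without returning the value would remove attributes it did not mean to touch.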

The callback is also passed $attrs, a hashref that maps each attribute name on the current tag to its current value, and $attr_list, an arrayref containing the names of all attributes on the current tag. To add a new attribute, push its name onto the $attr_list arrayref and store its value in $attrs.
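
As a further sketch of adding an attribute, the script below adds rel="noopener" to links that open in a new window. It assumes the installed version passes $attrs and $attr_list to the callback as described above; the input markup is made up for illustration.

```perl
use strict;
use warnings;
use HTML::RewriteAttributes;

# Illustrative input.
my $html = '<a href="http://example.com/" target="_blank">external</a>';

my $rewritten = HTML::RewriteAttributes->rewrite($html, sub {
    my ($tag, $attr, $value, $attrs, $attr_list) = @_;

    # The callback runs once per attribute, so guard on the hashref
    # to make sure the new attribute is added only once per tag.
    if (   $tag eq 'a'
        && ($attrs->{target} // '') eq '_blank'
        && !exists $attrs->{rel})
    {
        $attrs->{rel} = 'noopener';    # value of the new attribute
        push @$attr_list, 'rel';       # name of the new attribute
    }

    # Leave the attribute currently being visited untouched.
    return $value;
});

print "$rewritten\n";
```

Checking $attrs rather than the current $attr means the result does not depend on the order in which attributes appear on the tag.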

SEE ALSO

HTML::Parser, HTML::ResolveLink, Email::MIME::CreateHTML, HTML::LinkExtor

THANKS

Some code was inspired by, and tests borrowed from, Miyagawa's HTML::ResolveLink.

AUTHOR

Best Practical Solutions, LLC <modules@bestpractical.com>

LICENSE

Copyright 2008-2024 Best Practical Solutions, LLC. HTML::RewriteAttributes is distributed under the same terms as Perl itself.