NAME

WWW::LinkRot - check web page link rot

SYNOPSIS

use WWW::LinkRot;
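A fuller sketch of a typical run follows. The explicit import list and the file names are assumptions for illustration, not part of the module's documented interface:

    use WWW::LinkRot qw(get_links check_links html_report);
    
    # Collect every link found in a set of HTML files.
    my @files = ('index.html', 'about.html');
    my $links = get_links (\@files);
    
    # Try each link and record the statuses in a JSON file.
    check_links ($links, out => 'link-statuses.json');
    
    # Turn the JSON file into an HTML report.
    html_report (in => 'link-statuses.json');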

VERSION

This documents version 0.01 of WWW-LinkRot corresponding to git commit 669d2c37bcc74b435d2e2d7e6983c48565e65081 released on Mon Mar 8 22:48:31 2021 +0900.

DESCRIPTION

Scan HTML pages for links, try to access the links, and make a report.

The HTML files need to be in UTF-8 encoding.

FUNCTIONS

check_links ($links);

Check each of the links returned by "get_links" by trying to access it, and write the statuses to a JSON file specified by the out option.

check_links ($links, out => "link-statuses.json");

get_links

my $links = get_links (\@files);

Given a list of HTML files in @files, extract all of the links from them. The return value $links is a hash reference whose keys are the links and whose values are array references listing the files of @files which contain that link.
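For example, if two pages both contain a link to the same URL, $links might look like the following. This is a sketch; the file names and URL are hypothetical:

    my $links = get_links (['page1.html', 'page2.html']);
    # $links = {
    #     'https://example.com/' => ['page1.html', 'page2.html'],
    # };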

html_report

html_report (in => 'link-statuses.json', ...);

Write an HTML report using the JSON file output by "check_links".
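For example, to write the report to a file. The out option here is an assumption, by analogy with "check_links"; it is not confirmed by this documentation:

    html_report (
        in  => 'link-statuses.json',
        out => 'link-report.html',
    );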

replace

replace (\%links, @files);

Make a regex from the links which have 3** (redirect) statuses and new locations, then go through @files, replacing each old link with its new location.
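One possible flow is to read the statuses written by "check_links" back in and feed them to "replace". This is a sketch which assumes the JSON file has the structure "replace" expects; read_json is from JSON::Parse:

    use JSON::Parse 'read_json';
    
    # Read the link statuses recorded by check_links.
    my $links = read_json ('link-statuses.json');
    
    # Rewrite redirected links in the files to their new locations.
    replace ($links, @files);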

DEPENDENCIES

LWP::UserAgent

This is used to request the links and get their HTTP statuses.

HTML::Make

This is used to make the HTML report about the links.

HTML::Make::Page

This is used to make the HTML report about the links.

JSON::Parse
JSON::Create

These are used to read and write the JSON report file about the links.

Trav::Dir

This is used to traverse the directory of HTML files.

File::Slurper

This is used for reading and writing files.

Convert::Moji

SEE ALSO

CPAN

HTTP::SimpleLinkChecker
W3C::LinkChecker

AUTHOR

Ben Bullock, <bkb@cpan.org>

COPYRIGHT & LICENCE

This package and associated files are copyright (C) 2021 Ben Bullock.

You can use, copy, modify and redistribute this package and associated files under the Perl Artistic Licence or the GNU General Public Licence.