NAME
WWW::LinkRot - check web page link rot
SYNOPSIS
use WWW::LinkRot;
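The following is a minimal workflow sketch rather than a definitive synopsis; it assumes the functions can be imported by name and uses illustrative file names:

    use WWW::LinkRot qw(get_links check_links html_report);

    # Gather every link from a set of HTML files.
    my @files = glob '*.html';
    my $links = get_links (\@files);

    # Try each link and record the statuses as JSON.
    check_links ($links, out => 'link-statuses.json');

    # Turn the JSON report into an HTML page.
    html_report (in => 'link-statuses.json');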
VERSION
This documents version 0.01 of WWW-LinkRot corresponding to git commit 669d2c37bcc74b435d2e2d7e6983c48565e65081 released on Mon Mar 8 22:48:31 2021 +0900.
DESCRIPTION
Scan HTML pages for links, try to access the links, and make a report.
The HTML files need to be in UTF-8 encoding.
FUNCTIONS
check_links
check_links ($links);
Check the links returned by "get_links" and write the results to a JSON file specified by the out option.
check_links ($links, out => "link-statuses.json");
get_links
my $links = get_links (\@files);
Given a list of HTML files in @files, extract all the links from them. The return value $links is a hash reference whose keys are the links and whose values are array references listing the files of @files which contain each link.
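For example, the returned structure might be examined as follows; this is a sketch which assumes only the hash-of-array-references layout described above, with illustrative file names:

    my $links = get_links (['index.html', 'about.html']);
    for my $url (sort keys %$links) {
        my $files = $links->{$url};
        print "$url appears in: @$files\n";
    }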
html_report
html_report (in => 'link-statuses.json', ...);
Write an HTML report using the JSON file output by "check_links".
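As a usage sketch only: the in option is documented above, but the out option and the report file name below are assumptions made for illustration and may not exist in this version:

    html_report (
        in  => 'link-statuses.json',
        # Assumed option, not documented above: where to write the HTML.
        out => 'link-report.html',
    );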
replace
replace (\%links, @files);
Make a regex of the links which have 3xx (redirect) statuses and new locations, then go through @files and replace the old links with their new locations.
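A hedged sketch of one way replace might be called, assuming that the hash read back from the JSON report written by "check_links" is in the form replace expects:

    use JSON::Parse 'json_file_to_perl';
    use WWW::LinkRot 'replace';

    # Read back the link statuses written by check_links; the structure
    # replace expects is assumed here, not documented above.
    my $links = json_file_to_perl ('link-statuses.json');

    # Rewrite redirected links in these HTML files.
    my @files = glob '*.html';
    replace ($links, @files);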
DEPENDENCIES
- LWP::UserAgent
This is used to access the links.
- HTML::Make
This is used to make the HTML report about the links.
- HTML::Make::Page
This is used to make the HTML report about the links.
- JSON::Parse
- JSON::Create
These are used to make and read the JSON report file about the links.
- Trav::Dir
This is used to traverse the directory of HTML files.
- File::Slurper
This is used for reading and writing files.
- Convert::Moji
SEE ALSO
CPAN
AUTHOR
Ben Bullock, <bkb@cpan.org>
COPYRIGHT & LICENCE
This package and associated files are copyright (C) 2021 Ben Bullock.
You can use, copy, modify and redistribute this package and associated files under the Perl Artistic Licence or the GNU General Public Licence.