NAME
W3C::LogValidator::HTMLValidator - [W3C Log Validator] Batch HTML validation (using the W3C Markup Validator)
SYNOPSIS
use W3C::LogValidator::HTMLValidator;
my %config = ("verbose" => 2);
my $validator = W3C::LogValidator::HTMLValidator->new(\%config);
$validator->uris('http://www.w3.org/Overview.html', 'http://www.yahoo.com/index.html');
my %results = $validator->process_list;
DESCRIPTION
This module is part of the W3C::LogValidator suite, and checks HTML validity of a given document via the W3C HTML validator service.
API
Constructor
- $val = W3C::LogValidator::HTMLValidator->new
-
Constructs a new W3C::LogValidator::HTMLValidator processor. You may pass it a configuration hash reference (see "config_module" in W3C::LogValidator and W3C::LogValidator::Config):
$validator = W3C::LogValidator::HTMLValidator->new(\%config);
Main processing method
- $val->process_list
-
Processes a list of sorted URIs through the W3C Markup Validator.
The list can be set with uris. If $val was given a config hash when constructed, and if that hash has a "tmpfile" key, process_list will try to read this file as a hash of URIs and "hits" (popularity) with DB_File.
Returns a result hash. Keys for this hash are:
  name (string): the name of the module, i.e. "HTMLValidator"
  intro (string): introduction to the processing results
  thead (array): headers of the results table
  trows (array of arrays): rows of the results table
  outro (string): conclusion of the processing results
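To illustrate the shape of the result hash described above, here is a minimal sketch that builds a hash with the same five keys and renders it as tab-separated text. The sample values, the column headers, and the format_results helper are all hypothetical; a real hash would come from process_list.

```perl
use strict;
use warnings;

# Hypothetical sample data shaped like the documented result hash;
# a real one would be returned by $validator->process_list.
my %results = (
    name  => 'HTMLValidator',
    intro => 'Sample introduction text.',
    thead => [ 'Rank', 'Hits', 'Errors', 'Address' ],
    trows => [
        [ 1, 120, 3, 'http://www.example.org/a.html' ],
        [ 2,  45, 1, 'http://www.example.org/b.html' ],
    ],
    outro => 'Sample conclusion text.',
);

# Hypothetical helper (not part of the module): renders a result
# hash as plain text, one table row per line.
sub format_results {
    my (%res) = @_;
    my @lines = ("[$res{name}] $res{intro}");
    push @lines, join("\t", @{ $res{thead} });           # table header
    push @lines, join("\t", @$_) for @{ $res{trows} };   # table rows
    push @lines, $res{outro};
    return join("\n", @lines) . "\n";
}

print format_results(%results);
```

Since thead and trows are array references, they must be dereferenced before being joined or iterated.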
General methods
- $val->uris
-
Returns a list of URIs to be processed (unless the configuration gives the location of the Berkeley DB file holding the URI/hits hash; see process_list). If an array is given as a parameter, also sets the list of URIs and returns it.
- $val->trim_uris
-
Given a list of URIs of documents to process, returns a subset of this list containing the URIs of documents the module supposedly can handle. The decision is made based on file extensions (see auth_ext), content type (see HEAD_check), and the ExcludedAreas setting.
- $val->HEAD_check
-
Checks whether a document with no extension is actually an HTML/XML document through an HTTP HEAD request. Returns 1 if the URI is of an expected content type, 0 otherwise.
- $val->auth_ext
-
Returns the file extensions (space-separated entries in a string) supported by the module. Public method accessing $self->{AUTH_EXT}, which comes from either the AuthorizedExtensions configuration setting or a default value.
- $val->valid
-
Sets / Returns whether the document being processed has been found to be valid or not. If an argument is given, sets the variable, otherwise returns the current variable.
- $val->valid_err_num
-
Sets / Returns the number of validation errors for the document being processed. If an argument is given, sets the variable, otherwise returns the current variable.
- $val->valid_success
-
Sets / Returns whether the module was able to process validation of the current document successfully (regardless of valid/invalid result). If an argument is given, sets the variable, otherwise returns the current variable.
- $val->valid_head
-
Sets / Returns all HTTP headers returned by the markup validator when attempting to validate the current document. If an argument is given, sets the variable, otherwise returns the current variable.
- $val->new_doc
-
Resets all validation variables to 'undef'. In effect, prepares the processing module for handling a new document.
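The extension-based part of the decision described under trim_uris and auth_ext can be sketched as follows. This is not the module's own code: filter_by_extension is a hypothetical helper, the extension string is an assumed example of what auth_ext might return, and the ExcludedAreas setting is ignored. URIs with no extension are set aside, since those are the ones HEAD_check is described as handling.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's implementation: separates URIs
# with an authorized extension from URIs with no extension at all
# (candidates for a HEAD request). All other URIs are dropped.
sub filter_by_extension {
    my ($auth_ext, @uris) = @_;
    # $auth_ext is assumed to be a space-separated string, e.g. ".html .htm"
    my %allowed = map { lc($_) => 1 } split ' ', $auth_ext;
    my (@accepted, @need_head_check);
    for my $uri (@uris) {
        (my $path = $uri) =~ s/[?#].*//s;            # drop query string and fragment
        $path =~ s{^[a-z][a-z0-9+.-]*://[^/]*}{}i;   # drop scheme and host
        my ($last) = $path =~ m{([^/]*)\z};          # last path segment (may be empty)
        if ($last =~ /(\.[^.]+)\z/) {
            push @accepted, $uri if $allowed{ lc $1 };
        }
        else {
            push @need_head_check, $uri;
        }
    }
    return (\@accepted, \@need_head_check);
}

my ($ok, $check) = filter_by_extension(
    '.html .htm .xhtml',                    # assumed example value
    'http://www.example.org/index.html',
    'http://www.example.org/logo.png',
    'http://www.example.org/about',
);
print "accepted: @$ok\n";
print "HEAD check: @$check\n";
```

Keeping the two groups separate means the network request is only needed for the URIs that cannot be classified by name alone.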
BUGS
Public bug-tracking interface at http://www.w3.org/Bugs/Public/
AUTHOR
Olivier Thereaux <ot@w3.org>
SEE ALSO
W3C::LogValidator, perl(1). Up-to-date complete info at http://www.w3.org/QA/Tools/LogValidator/