NAME
HTML::SummaryBasic - Basic summary info from HTML.
SYNOPSIS
use HTML::SummaryBasic;
my $p = HTML::SummaryBasic->new({
    PATH => "input.html",
    # or HTML => '<html>...</html>',
    NOT_AVAILABLE => undef,
});
foreach (keys %{ $p->{SUMMARY} }) {
    warn "$_ ... $p->{SUMMARY}->{$_}\n";
}
DEPENDENCIES
use HTML::TokeParser;
use HTML::HeadParser;
DESCRIPTION
From a file or string of HTML, creates a hash of useful summary information from the meta and body elements of an HTML document.
GLOBAL VARIABLE
- $NOT_AVAILABLE
-
Value for empty fields. Default is [Not Available]. May be overridden by supplying the constructor with a field of the same name. See "THE SUMMARY STRUCTURE".
CONSTRUCTOR (new)
Accepts a hash-like structure...
- HTML or PATH
-
Ref to a scalar of HTML, or plain string that is the path to an HTML file to process.
- SUMMARY
-
Filled after get_summary is called (see "METHOD get_summary" and "THE SUMMARY STRUCTURE").
- FIELDS
-
An array of meta tag names whose content value should be placed into the respective slots of the SUMMARY field after get_summary has been called.
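As a rough illustration of how a FIELDS-style lookup can work, the sketch below uses HTML::HeadParser (one of the listed dependencies) to read the content of named meta tags. It is not the module's own code: the helper name meta_fields, the sample HTML, and the choice to key the result by the names as given are all assumptions made for the example.

```perl
use strict;
use warnings;
use HTML::HeadParser;

# Hypothetical helper, not part of HTML::SummaryBasic: return a hash ref
# mapping each requested meta tag name to its content, or to a placeholder
# when the tag is missing or empty.
sub meta_fields {
    my ($html, $fields, $not_available) = @_;
    my $head = HTML::HeadParser->new;
    $head->parse($html);
    my %summary;
    for my $name (@$fields) {
        # HTML::HeadParser exposes <meta name="foo" content="..."> as the
        # header X-Meta-Foo; header lookups are case-insensitive.
        my $value = $head->header("X-Meta-$name");
        $summary{$name} = (defined $value && length $value)
            ? $value
            : $not_available;
    }
    return \%summary;
}

# Sample document, invented for this example.
my $html = <<'HTML';
<html><head>
<title>Example</title>
<meta name="author" content="A. Writer">
<meta name="keywords" content="perl, html">
</head><body><p>Hello.</p></body></html>
HTML

my $summary = meta_fields($html, [qw(author keywords generator)], '[Not Available]');
print "$_: $summary->{$_}\n" for sort keys %$summary;
```

The missing generator tag falls back to the placeholder, which mirrors the role $NOT_AVAILABLE plays for empty fields.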
THE SUMMARY STRUCTURE
A field of the object which is a hash, with key/values as follows:
- AUTHOR
-
The HTML meta tag X-META-AUTHOR.
- TITLE
-
Text of the element of the same name.
- DESCRIPTION
-
Content of the meta tag named X-META-DESCRIPTION.
- LAST_MODIFIED_META, LAST_MODIFIED_FILE
-
Time of the last modification of the file: respectively, according to any meta tag of the same name with an X-META- prefix; failing that, according to the file system.
- CREATED_META, CREATED_FILE
-
As above, but relating to the creation date of the file.
- FIRST_PARA
-
The first HTML p element of the document.
- HEADLINE
-
The first h1 tag; failing that, the first h2; failing that, the value of $NOT_AVAILABLE.
- PLUS...
-
Any meta-fields specified in the FIELDS field.
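The HEADLINE fallback (first h1, then first h2, then $NOT_AVAILABLE) and the FIRST_PARA lookup can be sketched with HTML::TokeParser, another listed dependency. This is a minimal sketch under stated assumptions, not the module's implementation: the helpers first_element_text and headline and the sample HTML are hypothetical, and $NOT_AVAILABLE is redeclared locally with the documented default.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Local stand-in for the module's global, using the documented default.
our $NOT_AVAILABLE = '[Not Available]';

# Hypothetical helper: trimmed text of the first <$tag> element, or undef
# if the document has no such element or the element is empty.
sub first_element_text {
    my ($html, $tag) = @_;
    my $parser = HTML::TokeParser->new(\$html) or return undef;
    $parser->get_tag($tag) or return undef;
    my $text = $parser->get_trimmed_text("/$tag");
    return length $text ? $text : undef;
}

# Hypothetical helper: first h1, failing that the first h2,
# failing that the placeholder value.
sub headline {
    my ($html) = @_;
    for my $tag (qw(h1 h2)) {
        my $text = first_element_text($html, $tag);
        return $text if defined $text;
    }
    return $NOT_AVAILABLE;
}

# Sample document, invented for this example; it has no h1.
my $html = '<html><body><h2>Release notes</h2>'
         . '<p>First  paragraph.</p></body></html>';

my $headline   = headline($html);
my $first_para = first_element_text($html, 'p') // $NOT_AVAILABLE;
print "HEADLINE: $headline\n";
print "FIRST_PARA: $first_para\n";
```

Each lookup re-parses the document from the start, which keeps the sketch simple; a single pass over the token stream would be cheaper for large files.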
TODO
Maybe work on URI as well as file paths.
SEE ALSO
HTML::TokeParser, HTML::HeadParser.
AUTHOR
Lee Goddard (LGoddard@CPAN.org)
COPYRIGHT
Copyright 2000-2001 Lee Goddard.
This library is free software; you may use, redistribute, or modify it under the same terms as Perl itself.