NAME
HTML::DublinCore - Extract Dublin Core metadata from HTML
SYNOPSIS
use HTML::DublinCore;
## pass HTML to constructor
my $dc = HTML::DublinCore->new( $html );
## get the title element and print its content
my $title = $dc->element( 'Title' );
print "title: ", $title->content(), "\n";
## get the same title content in one step
print "title: ", $dc->element( 'Title' )->content(), "\n";
## list context will retrieve all of a particular element
foreach my $element ( $dc->element( 'Creator' ) ) {
    print "creator: ", $element->content(), "\n";
}
## qualified dublin core
my $creation = $dc->element( 'Date.created' )->content();
DESCRIPTION
HTML::DublinCore is a module for extracting Dublin Core metadata embedded in HTML documents. The Dublin Core is a small set of metadata elements for describing information resources. Dublin Core is typically stored in the <HEAD> of an HTML document using the <META> tag. For more information on embedding Dublin Core in HTML, see RFC 2731 http://www.ietf.org/rfc/rfc2731. For definitions of the individual Dublin Core elements, see http://www.dublincore.org/documents/dces/.
HTML::DublinCore extends Brian Cassidy's excellent DublinCore::Record framework by adding some asHtml() methods and a new constructor.
METHODS
new()
Constructor. Pass it the HTML content to extract metadata from.
$dc = HTML::DublinCore->new( $html );
asHtml()
Serialize your Dublin Core metadata as HTML <META> tags.
print $dc->asHtml();
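The two methods above can be combined into a small round trip. The following is a minimal sketch, not a definitive recipe: the sample HTML, its element values, and the names in it are made up for illustration, and the <META> naming ("DC.Creator", "DC.Date.created") assumes the RFC 2731 convention. The module is loaded at run time so the script still reports something useful when HTML::DublinCore is not installed.

```perl
use strict;
use warnings;

# Hypothetical sample document: the values below are invented for this
# example. Dublin Core elements are embedded as <META> tags in the <HEAD>,
# named following the RFC 2731 convention (an assumption of this sketch).
my $html = <<'HTML';
<html>
<head>
<title>Example</title>
<meta name="DC.Title" content="An Example Page">
<meta name="DC.Creator" content="Jane Doe">
<meta name="DC.Creator" content="John Roe">
<meta name="DC.Date.created" content="2004-01-15">
</head>
<body></body>
</html>
HTML

# Load the module at run time so a missing install gives a clear message
# instead of a compile-time failure.
if ( eval { require HTML::DublinCore; 1 } ) {
    # Parse the HTML and collect its Dublin Core elements.
    my $dc = HTML::DublinCore->new( $html );

    # List context returns every element with the given name.
    foreach my $creator ( $dc->element( 'Creator' ) ) {
        print "creator: ", $creator->content(), "\n";
    }

    # Serialize the extracted metadata back to <META> tags.
    print $dc->asHtml(), "\n";
}
else {
    print "HTML::DublinCore is not installed\n";
}
```

Looping over element() in list context avoids calling content() on an undefined value when a document has no matching element.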
TODO
More comprehensive tests.
Handle HTML entities properly.
Collect error messages so that callers can retrieve them from the object.
SEE ALSO
DublinCore::Record
Dublin Core http://www.dublincore.org/
RFC 2731 http://www.ietf.org/rfc/rfc2731
HTML::Parser
perl4lib http://perl4lib.perl.org/
AUTHORS
Ed Summers <ehs@pobox.com>
Brian Cassidy <bricas@cpan.org>
COPYRIGHT AND LICENSE
Copyright 2004 by Ed Summers, Brian Cassidy
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.