NAME

SVG::Metadata - Perl module to capture metadata info about an SVG file

SYNOPSIS

use SVG::Metadata;

my $svgmeta = new SVG::Metadata;

$svgmeta->parse($filename)
    or die "Could not parse $filename: " . $svgmeta->errormsg();
my $svgmeta2 = new SVG::Metadata;
$svgmeta2->parse($filename2)
    or die "Could not parse $filename2: " . $svgmeta2->errormsg();

# Do the files have the same metadata (author, title, license)?
if (! $svgmeta->compare($svgmeta2) ) {
    print "$filename is different from $filename2\n";
}

if ($svgmeta->title() eq '') {
    $svgmeta->title('Unknown');
}

if ($svgmeta->author() eq '') {
    $svgmeta->author('Unknown');
}

if ($svgmeta->license() eq '') {
    $svgmeta->license('Unknown');
}

if (! $svgmeta->keywords()) {
    $svgmeta->addKeyword('unsorted');
} elsif ($svgmeta->hasKeyword('unsorted') && $svgmeta->keywords()>1) {
    $svgmeta->removeKeyword('unsorted');
}

print $svgmeta->to_text();

DESCRIPTION

This module provides a way of extracting, browsing and using RDF metadata embedded in an SVG file.

The SVG spec itself does not provide any particular mechanism for handling metadata; instead it relies on embedded, namespaced RDF sections, as per XML philosophy. Unfortunately, many SVG tools don't support the concept of RDF metadata; indeed, many don't support the idea of embedded XML "islands" at all. Some will even silently drop the RDF data when they encounter it.

The motivation for this module is twofold. First, it provides a mechanism for accessing this metadata from the SVG files. Second, it provides a means of validating SVG files to detect if they have the metadata.

This module was written primarily for the Open Clip Art Library (http://www.openclipart.org), as a way of filtering out submissions that lack metadata before they are included in the official distributions. A secondary use is as a testing tool for SVG editors like Inkscape (http://www.inkscape.org).

FUNCTIONS

new()

Creates a new SVG::Metadata object. Optionally, arguments such as 'title', 'author', and 'license' can be passed in.

my $svgmeta = new SVG::Metadata;
my $svgmeta = new SVG::Metadata(title=>'My title', author=>'Me', license=>'Public Domain');

keywords_to_rdf()

Generates an rdf:Bag based on the data structure of keywords. This can then be used to populate the subject section of the metadata. For example:

$svgmeta->subject($svgmeta->keywords_to_rdf());

See:

* http://www.w3.org/TR/rdf-schema/#ch_bag
* http://www.w3.org/TR/rdf-syntax-grammar/#section-Syntax-list-element
* http://dublincore.org/documents/2002/05/15/dcq-rdf-xml/#sec2
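As a rough illustration of what an rdf:Bag of keywords looks like, here is a minimal standalone sketch. The helper name keywords_to_bag is hypothetical and is not part of SVG::Metadata; the exact whitespace and ordering produced by keywords_to_rdf() may differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SVG::Metadata): wrap a list of
# keywords in an rdf:Bag, with one rdf:li element per keyword.
sub keywords_to_bag {
    my @keywords = @_;
    my $rdf = "<rdf:Bag>\n";
    for my $kw (sort @keywords) {
        # Escape the characters that would break the surrounding XML.
        (my $text = $kw) =~ s/&/&amp;/g;
        $text =~ s/</&lt;/g;
        $text =~ s/>/&gt;/g;
        $rdf .= "  <rdf:li>$text</rdf:li>\n";
    }
    $rdf .= "</rdf:Bag>\n";
    return $rdf;
}

print keywords_to_bag('fruit', 'apple & pear');
```

Sorting the keywords is an assumption made here only to keep the output stable; an rdf:Bag is an unordered container, so any order is equally valid.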

errormsg()

Returns the last encountered error message. Most of the error messages are encountered during file parsing.

print $svgmeta->errormsg();

parse($filename)

Extracts RDF metadata out of an existing SVG file.

$svgmeta->parse($filename) || die "Error: " . $svgmeta->errormsg();

This routine looks for a field in the rdf:RDF section of the document named 'ns:Work' and then attempts to load the following keys from it: 'dc:title', 'dc:rights'->'ns:Agent', and 'ns:license'. If any are missing, it

The $filename parameter can be a filename, or a text string containing the XML to parse, or an open 'IO::Handle', or a URL.

Returns false if there was a problem parsing the file, and sets an error message appropriately. The conditions under which it will return false are as follows:

* No 'filename' parameter given.
* Filename does not exist.
* Document is not parseable XML.
* No rdf:RDF element was found in the document, and the try_harder
  option was not set.
* The rdf:RDF element did not have a ns:Work sub-element, and the
  try_harder option was not set.
* Strict validation mode was turned on, and the document didn't
  strictly comply with one or more of its extra criteria.

title()

Gets or sets the title.

$svgmeta->title('My Title');
print $svgmeta->title();

description()

Gets or sets the description.

subject()

Gets or sets the subject. Note that the parse() routine pulls the keywords out of the subject and places them in the keywords collection, so subject() will normally return undef. If you assign to subject() it will override the internal keywords() mechanism, but this may later be discarded again in favor of the keywords, if to_rdf() is called, either directly or indirectly via to_svg().

publisher()

Gets or sets the publisher name. E.g., 'Open Clip Art Library'

publisher_url()

Gets or sets the web URL for the publisher. E.g., 'http://www.openclipart.org'

creator()

Gets or sets the creator.

$svgmeta->creator('Bris Geek');
print $svgmeta->creator();

creator_url()

Gets or sets the URL for the creator.

author()

Alias for creator() - does the same thing

$svgmeta->author('Bris Geek');
print $svgmeta->author();

owner()

Gets or sets the owner.

$svgmeta->owner('Bris Geek');
print $svgmeta->owner();

owner_url()

Gets or sets the owner URL for the item.

license()

Gets or sets the license.

$svgmeta->license('Public Domain');
print $svgmeta->license();

license_date()

Gets or sets the date that the item was licensed.

language()

Gets or sets the language for the metadata. This should be a two-letter language code, such as 'en'.

retain_xml()

Gets or sets the XML retention option, which (if true) will cause any subsequent call to parse() to retain the XML. You have to turn this on if you want to_svg() to work later.

strict_validation()

Gets or sets the strict validation option, which (if true) will cause subsequent calls to parse() to be pickier about how things are structured and possibly set an error and return undef when it otherwise would succeed.

try_harder()

Gets or sets the try harder option, which causes subsequent calls to parse() to try to return a valid Metadata object even if it can't find any metadata at all. The resulting object may contain mostly empty fields.

parse() will still fail and return undef if the input file does not exist or cannot be parsed as XML, but otherwise it will attempt to return an object.

If you set both this option and the strict validation option at the same time, the Undefined Behavior Fairy will come and zap you with a frap ray blaster and take away your cookie.

keywords()

Gets or sets an array of keywords. Keywords are a categorization mechanism, and can be used, for example, to sort the files topically.

addKeyword($kw1 [, $kw2 ...])

Adds one or more new keywords. Note that the keywords are stored internally as a set, so only one copy of a given keyword will be stored.

$svgmeta->addKeyword('Fruits and Vegetables');
$svgmeta->addKeyword('Fruit','Vegetable','Animal','Mineral');
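
The set behaviour described above can be pictured with a plain Perl hash whose keys are the keywords. This is only a sketch of the idea; the function names below are hypothetical, and the module's internal storage may be organised differently.

```perl
use strict;
use warnings;

# Illustrative only: a keyword set modelled as a hash keyed by keyword.
my %keywords;

# Adding the same keyword twice leaves a single entry.
sub add_keyword {
    $keywords{$_} = 1 for @_;
    return;
}

# Returns the keyword removed, or nothing if it was not present.
sub remove_keyword {
    my ($kw) = @_;
    return unless exists $keywords{$kw};
    delete $keywords{$kw};
    return $kw;
}

sub has_keyword {
    my ($kw) = @_;
    return exists $keywords{$kw} ? 1 : 0;
}

add_keyword('Fruit', 'Vegetable', 'Fruit');
print scalar(keys %keywords), " distinct keywords\n";
```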

removeKeyword($kw)

Removes a given keyword.

$svgmeta->removeKeyword('Fruits and Vegetables');

Return value: The keyword removed.

hasKeyword($kw)

Returns true if the metadata includes the given keyword.

compare($meta2)

Compares this metadata to another metadata for equality.

Two SVG file metadata objects are considered equivalent if they have exactly the same author, title, and license. Keywords can vary, as can the SVG file itself.
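
The equivalence rule can be sketched as follows, using plain hash references as stand-ins for metadata objects. The function name same_metadata is hypothetical, and treating an undefined field as an empty string is an assumption of this sketch rather than documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical stand-in for compare(): two records are equivalent when
# author, title, and license all match. Keywords are ignored.
sub same_metadata {
    my ($m1, $m2) = @_;
    for my $field (qw(author title license)) {
        # Assumption: an undefined field counts as an empty string.
        my $v1 = defined $m1->{$field} ? $m1->{$field} : '';
        my $v2 = defined $m2->{$field} ? $m2->{$field} : '';
        return 0 if $v1 ne $v2;
    }
    return 1;
}

my $sign1 = { author => 'John Cliff', title => 'SVG Road Signs',
              license => 'Public Domain', keywords => ['signs'] };
my $sign2 = { %$sign1, keywords => ['unsorted'] };
my $sign3 = { %$sign1, title => 'Other Signs' };

print same_metadata($sign1, $sign2) ? "same\n" : "different\n";
print same_metadata($sign1, $sign3) ? "same\n" : "different\n";
```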

to_text()

Creates a plain text representation of the metadata, suitable for debuggery, emails, etc. Example output:

Title:    SVG Road Signs
Author:   John Cliff
License:  http://web.resource.org/cc/PublicDomain
Keywords: unsorted

Return value is a string containing the title, author, license, and keywords, each value on a separate line. The text always ends with a newline character.
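
A standalone sketch that reproduces the layout of the example output is shown here. The function name metadata_to_text is hypothetical, and joining multiple keywords with a comma is an assumption; the module's own separator may differ.

```perl
use strict;
use warnings;

# Hypothetical formatter: one "Label:    value" line per field, with
# the label padded to ten characters as in the example output.
sub metadata_to_text {
    my (%meta) = @_;
    my $text = '';
    $text .= sprintf "%-10s%s\n", 'Title:',   $meta{title};
    $text .= sprintf "%-10s%s\n", 'Author:',  $meta{author};
    $text .= sprintf "%-10s%s\n", 'License:', $meta{license};
    # Assumption: multiple keywords are joined with ", ".
    $text .= sprintf "%-10s%s\n", 'Keywords:', join(', ', @{ $meta{keywords} });
    return $text;
}

print metadata_to_text(
    title    => 'SVG Road Signs',
    author   => 'John Cliff',
    license  => 'http://web.resource.org/cc/PublicDomain',
    keywords => ['unsorted'],
);
```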

esc_ents($text)

Escapes '<', '>', '&', and single and double quote characters so that they do not make the generated RDF invalid.
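
The kind of escaping involved can be sketched with the five predefined XML entities. The function name escape_entities is hypothetical; this is not the module's own implementation.

```perl
use strict;
use warnings;

# Hypothetical equivalent of esc_ents(): replace the five characters
# that are special in XML with their predefined entities.
sub escape_entities {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first, or later entities get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    $text =~ s/'/&apos;/g;
    return $text;
}

# prints: Tom &amp; &quot;Jerry&quot; &lt;cat&apos;s&gt;
print escape_entities(q{Tom & "Jerry" <cat's>}), "\n";
```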

to_rdf()

Generates an RDF snippet to describe the item. This includes the author, title, license, etc. The text always ends with a newline character.

to_svg()

Returns the SVG with the updated metadata embedded. This can only be done if parse() was called with the retain_xml option. Note that the code's layout can change a little, especially in terms of whitespace, but the semantics SHOULD be the same, except for the updated metadata.
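
A possible round trip using only the methods documented above might look like the following sketch. It assumes SVG::Metadata is installed and that the input and output filenames are given on the command line; the script layout and the choice of 'Unknown' as a fallback title are illustrative only.

```perl
use strict;
use warnings;
use SVG::Metadata;

my ($infile, $outfile) = @ARGV;
die "Usage: $0 input.svg output.svg\n"
    unless defined $infile && defined $outfile;

my $svgmeta = SVG::Metadata->new;

# retain_xml must be turned on before parse(), or to_svg() has
# no document to write the metadata back into.
$svgmeta->retain_xml(1);

$svgmeta->parse($infile)
    or die "Could not parse $infile: " . $svgmeta->errormsg() . "\n";

# Fill in a placeholder title if the file has none.
my $title = $svgmeta->title();
$svgmeta->title('Unknown') if !defined $title || $title eq '';

open my $out, '>', $outfile or die "Cannot write $outfile: $!\n";
print {$out} $svgmeta->to_svg();
close $out or die "Cannot close $outfile: $!\n";
```

Writing to a separate output file rather than overwriting the input keeps the original available for comparison, which is useful given that whitespace and layout may change.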

PREREQUISITES

XML::Twig

AUTHOR

Bryce Harrington <bryce@bryceharrington.org>

COPYRIGHT

Copyright (C) 2004 Bryce Harrington. All Rights Reserved.

This script is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

perl, XML::Twig