NAME
Plucene::SearchEngine::Index::RSS - Index RSS files
SYNOPSIS
my @articles = Plucene::SearchEngine::Index::URL->examine(
    "http://planet.perl.org/rss10.xml"
);
$indexer->index($_->document) for @articles;
DESCRIPTION
This module examines RSS files and creates a document hash for each item in the feed. Each resulting object has the following Plucene fields:
- modified
  The date that this article was published.
- creator
  The creator, if one was specified.
- feed
  The name of the feed from which this was taken.
- id
  The URL that the article links to, and the URL of the feed.
- text
  The text of the article.
- title
  The title of the article.
WARNING
Since Plucene::SearchEngine::Index uses MIME types to determine the type of a file, this module doesn't work particularly well using the File frontend. It works OK with the URL frontend if the webserver sends the right content type header. If not, you may have to fudge it by registering your own handlers:
Plucene::SearchEngine::Index::RSS->register_handler("text/xml"); # For instance
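As a rough sketch of how the workaround might fit into a complete indexing run, the following registers the extra MIME type before fetching the feed. The index directory "/tmp/rss-index" is a hypothetical path chosen for illustration. The `examine` class method on the URL frontend and the `dir` argument to the indexer constructor are assumptions drawn from the wider Plucene::SearchEngine::Index distribution rather than from this page.

```perl
use strict;
use warnings;
use Plucene::SearchEngine::Index;
use Plucene::SearchEngine::Index::URL;
use Plucene::SearchEngine::Index::RSS;

# Assumption: the web server labels its feed as text/xml, so claim that
# type for the RSS backend before anything is fetched.
Plucene::SearchEngine::Index::RSS->register_handler("text/xml");

# Hypothetical index directory; any writable path should do.
my $indexer = Plucene::SearchEngine::Index->new(dir => "/tmp/rss-index");

# Fetch the feed; each item is expected to come back as its own object.
my @articles = Plucene::SearchEngine::Index::URL->examine(
    "http://planet.perl.org/rss10.xml"
);

# Add every article's document to the index.
$indexer->index($_->document) for @articles;
```

Registering a type as generic as text/xml presumably routes every document served with that type to the RSS backend, so it is probably best done only in programs that index feeds.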
SEE ALSO
AUTHOR
Simon Cozens, <simon@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2004 by Simon Cozens
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.