NAME
Introduction - A Simple Introduction To RSS
RSS/RDF
Rich Site Summary/Resource Description Framework technology is a simple way for a site to describe what it has, so that another site can summarise the content and provide links back to the original.
RSS was pioneered by Netscape Communications http://my.netscape.com/publish/formats/rss-spec-0.91.html for their My Netscape portal, and was quickly adopted by many others, notably UserLand http://backend.userland.com/rss092 .
XML
A simple XML http://www.w3.org/XML/ file is produced by the site originating the articles. This file, easily obtainable by HTTP, is downloaded and parsed by the client, which can then present the site summary in whatever way suits it. XML provides a simple, human-readable format that is easy to generate and read using typical web tools.
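As a minimal sketch of the client side of this exchange, the following Python fragment parses a small RSS 0.91 document and lists its items. It is written in Python with only the standard library, purely for illustration, and is not part of this module. The feed content, the example.com addresses and the function name summarise are all invented.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 0.91 document, standing in for a file fetched over HTTP.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example Site</title>
    <link>http://www.example.com/</link>
    <description>An invented feed used only for illustration</description>
    <language>en</language>
    <item>
      <title>First article</title>
      <link>http://www.example.com/articles/1</link>
    </item>
    <item>
      <title>Second article</title>
      <link>http://www.example.com/articles/2</link>
    </item>
  </channel>
</rss>
"""


def summarise(feed_xml):
    """Return the channel title and a list of (title, link) pairs."""
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    items = [
        (item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]
    return channel.findtext("title"), items


if __name__ == "__main__":
    title, items = summarise(SAMPLE_FEED)
    print(title)
    for item_title, link in items:
        print(" -", item_title, link)
```

The client is free to do anything with the resulting titles and links; printing them is just the simplest presentation.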
Principle
The module came from a simple idea: gather RSS feeds, convert them into HTML fragments, and then template them into a web page on a local web server.
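The last two steps of that idea can be sketched roughly as follows. This is an illustration in Python using only the standard library, not the module's own code; the page template and the names PAGE, to_fragment and build_page are hypothetical, and each feed is assumed to have already been reduced to a channel title plus a list of (title, link) pairs.

```python
from html import escape
from string import Template

# Hypothetical page template; a real site would keep this in its own file.
PAGE = Template("""<html>
<head><title>$page_title</title></head>
<body>
$fragments
</body>
</html>
""")


def to_fragment(channel_title, items):
    """Render one feed as an HTML fragment: a heading plus a list of links."""
    lines = ["<h2>%s</h2>" % escape(channel_title), "<ul>"]
    for title, link in items:
        # Escape both the URL and the text so feed content cannot break the page.
        lines.append('<li><a href="%s">%s</a></li>' % (escape(link), escape(title)))
    lines.append("</ul>")
    return "\n".join(lines)


def build_page(page_title, feeds):
    """Join the fragments for all feeds and drop them into the page template."""
    fragments = "\n".join(to_fragment(title, items) for title, items in feeds)
    return PAGE.substitute(page_title=escape(page_title), fragments=fragments)


if __name__ == "__main__":
    feeds = [("Example Site", [("First article", "http://www.example.com/articles/1")])]
    print(build_page("My news page", feeds))
```

Keeping the fragment step separate from the page step means each feed can be regenerated on its own schedule while the page layout stays untouched.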
Downloading RSS files
Originally I used wget http://www.gnu.org/software/wget/wget.html to pull files down from their servers. Other tools that can do this include cURL http://curl.haxx.se/ and any web browser. I cached the RSS feeds on my web server's disk to avoid unnecessary downloads.
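A download-and-cache step along these lines might look like the sketch below. It is written in Python with only the standard library and is not how the module itself works; the function names, the one-hour default for max_age and the fetcher parameter are assumptions made for the example.

```python
import os
import time
import urllib.request


def fetch(url):
    """Download a URL and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def cached_feed(url, cache_path, max_age=3600, fetcher=fetch):
    """Return the feed body, downloading only if the cached copy is stale.

    max_age is in seconds. fetcher can be swapped out, for example to
    test the caching logic without touching the network.
    """
    try:
        age = time.time() - os.path.getmtime(cache_path)
    except OSError:
        age = None  # no cached copy yet
    if age is None or age > max_age:
        body = fetcher(url)
        with open(cache_path, "wb") as handle:
            handle.write(body)
        return body
    with open(cache_path, "rb") as handle:
        return handle.read()
```

A call such as cached_feed("http://www.example.com/rss.xml", "example.xml") would then contact the (invented) server at most once an hour, however often the page is rebuilt.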
RSS Normalisation
RSS feeds come in several incompatible families. To make conversion to HTML simple I opted to convert all RSS feeds to RSS version 0.91 as this is very simple to convert to HTML via XSLT http://www.w3.org/Style/XSL/ . You can turn off normalisation if you plan to use just one XSLT stylesheet.
The underlying XML::RSS http://perl-rss.sourceforge.net/ core, up to version 0.97, can parse and interconvert RSS versions 0.9, 0.91 and 1.0. XML::RSS 0.98 and later can additionally process RSS version 2.0. It is unlikely ever to be able to process the largely unused versions 0.92, 0.93 and 0.94, which are the evolutionary steps from 0.91 to 2.0.
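Before a feed can be normalised, its version has to be recognised. The sketch below shows one way this could be done, in Python with only the standard library. It is not how XML::RSS does it, and the function name rss_version is invented. It relies on two observations: versions 0.91 and 2.0 use an rss root element with a version attribute, while versions 0.9 and 1.0 use an rdf:RDF root element and differ in their default namespace.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Default namespaces used by the two RDF-based versions.
RDF_FAMILY = {
    "http://my.netscape.com/rdf/simple/0.9/": "0.9",
    "http://purl.org/rss/1.0/": "1.0",
}


def rss_version(feed_xml):
    """Guess the RSS version of a feed, or return None if unrecognised."""
    root = ET.fromstring(feed_xml)
    if root.tag == "rss":
        # The 0.91 to 2.0 line states its version explicitly.
        return root.get("version")
    if root.tag == "{%s}RDF" % RDF_NS:
        # ElementTree writes namespaced tags as "{namespace}name".
        for child in root:
            if isinstance(child.tag, str) and child.tag.startswith("{"):
                namespace = child.tag[1:].partition("}")[0]
                if namespace in RDF_FAMILY:
                    return RDF_FAMILY[namespace]
    return None
```

A normaliser could use the returned value to decide whether a feed needs converting to 0.91 at all, or whether it should be rejected as unrecognised.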
RSS Conversion
Most online examples of RSS use the XML::RSS module to convert the feed into HTML programmatically, either directly or via one of the many quality HTML templating tools. I felt this was inefficient, so I opted to use Extensible Stylesheet Language Transformations (XSLT), which is an industry standard and does not require programming. There are several XSLT processors available: Saxon http://saxon.sourceforge.net/ , Xalan http://xml.apache.org/ , MSXML, and Sablotron http://www.gingerall.com/charlie/ga/xml/p_sab.xml . However, the fastest and easiest one for Perl is Matt Sergeant's XML::LibXSLT, which is based on the XML C Library for Gnome http://xmlsoft.org/ .
Script to Module
After developing the script to do this I realised that much of the code could be converted into a module and distributed to the world. After a popular post to PerlMonks, I moved the module up to CPAN. The code should be considered pre-release code, and the API may be extended in the future.
Examples
Some basic examples of how to use this module are provided in the examples folder. The following examples and this page should print fine if you have a modern standards-compliant browser. Simple explanations can be found here:
Example 1 - simple example
Example 2 - complete example
Example 3 - concise example
Example 4 - single stylesheet example with no normalisation
Example 5 - a complete RSS client mini-application
RSS With XSLT - An article from The Perl Review
rss.xml - the RSS feed for this module
Resources
http://backend.userland.com/rss092 - the home of key RSS developments
http://soapclient.com/rss/rss.html - an RSS to HTML online tool
Netscape Communications, home of the original specifications - http://my.netscape.com/publish/formats/rss-spec-0.91.html
W3C Standards body: RDF http://www.w3.org/RDF/ and RDF Validator http://www.w3.org/RDF/Validator/
http://blogspace.com/rss/ - Blogspace RSS FAQ
http://www.webreference.com/authoring/languages/xml/rss/1/8.html Webreference.com RSS Versions
http://rss.benhammersley.com/ Ben Hammersley's Webloggery and recent book Content Syndication with RSS - http://www.oreilly.com/catalog/consynrss/
Dave Beckett's Resource Description Framework Resource Guide - http://www.ilrt.bristol.ac.uk/discovery/rdf/resources/
Mark Pilgrim and Sam Ruby's RSS Validator http://feeds.archive.org/validator/
brian d foy has an interesting article in The Perl Review - http://www.theperlreview.com/Issues/v0i6.shtml
Mark Pilgrim has some nice RSS articles on XML.com (in Python): http://www.xml.com/lpt/a/2002/12/18/dive-into-xml.html http://www.xml.com/lpt/a/2003/01/22/dive-into-xml.html and http://www.xml.com/lpt/a/2003/02/26/dive-into-xml.html
Bob DuCharme wrote a simple introduction to using XSLT with RSS on XML.com: http://www.xml.com/lpt/a/2003/01/02/tr.html
The Perl-RSS Group has a nice selection of articles on http://perl-rss.sourceforge.net/bibliography.html
O'Reilly has a dedicated RSS section in O'Reilly Network: RSS DevCenter http://www.oreillynet.com/rss/
Some Example RSS Feeds
http://www.bbc.co.uk/syndication/feeds/news/ukfs_news/front_page/rss091.xml
http://freshmeat.net/backend/fm.rdf
http://slashdot.org/slashdot.rdf
http://www.oreillynet.com/meerkat/?_fl=rss10
http://www.theregister.co.uk/tonys/slashdot.rdf - this one has often been malformed
http://www.sophos.com/virusinfo/infofeed/tenalerts.xml
http://www.perl.com/pace/perlnews.rdf