NAME
WWW::FetchStory::Fetcher - fetching module for WWW::FetchStory
VERSION
version 0.1704
DESCRIPTION
This is the base class for story-fetching plugins for WWW::FetchStory.
METHODS
new
$obj = WWW::FetchStory::Fetcher->new();
init
Initialize the object.
$obj->init(%args);
name
The name of the fetcher, which is the last component of the module name. This works as either a class function or a method.
$name = $self->name();
$name = WWW::FetchStory::Fetcher::name($class);
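The dual calling convention described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the packages My::Fetcher and My::Fetcher::LiveJournal are hypothetical stand-ins, and the regex used to take the last component of the package name is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in base class; NOT the real WWW::FetchStory::Fetcher.
package My::Fetcher {
    sub new {
        my $class = shift;
        return bless {}, ref($class) || $class;
    }

    # Accepts either an object or a class name as its first argument.
    sub name {
        my $invocant = shift;
        my $fullname = ref($invocant) || $invocant;
        (my $name = $fullname) =~ s/.*:://;   # keep only the last component
        return $name;
    }
}

# Hypothetical plugin that inherits name() unchanged.
package My::Fetcher::LiveJournal {
    our @ISA = ('My::Fetcher');
}

my $obj = My::Fetcher::LiveJournal->new();
my $as_method   = $obj->name();                                  # method call
my $as_function = My::Fetcher::name('My::Fetcher::LiveJournal'); # function call
print "$as_method\n$as_function\n";
```

Both calls print LiveJournal, because the invocant is reduced to a package name before the last component is extracted.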
info
Information about the fetcher. By default this just returns the formatted name.
$info = $self->info();
priority
The priority of this fetcher. Fetchers with higher priority get tried first. This is useful where there may be a generic fetcher for a particular site, and then a more specialized fetcher for particular sections of a site. For example, there may be a generic LiveJournal fetcher, and then refinements for particular LiveJournal communities, such as the sshg_exchange community. This works as either a class function or a method.
This must be overridden by the specific fetcher class.
$priority = $self->priority();
$priority = WWW::FetchStory::Fetcher::priority($class);
allow
If this fetcher can be used for the given URL, then this returns true. This must be overridden by the specific fetcher class.
if ($obj->allow($url))
{
    ...
}
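How priority and allow interact can be sketched as below. This is an illustration under stated assumptions, not the dispatch code used by WWW::FetchStory: the two plugin packages, the pick_fetcher helper, the priority values, and the example URLs are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical generic plugin (a stand-in, not a real WWW::FetchStory plugin).
package My::Fetcher::LiveJournal {
    sub new      { my $class = shift; return bless {}, $class; }
    sub priority { return 1; }
    sub allow {
        my ($self, $url) = @_;
        return $url =~ m{\.livejournal\.com/} ? 1 : 0;
    }
}

# Hypothetical refinement for one community; higher priority, narrower match.
package My::Fetcher::SSHGExchange {
    our @ISA = ('My::Fetcher::LiveJournal');
    sub priority { return 2; }
    sub allow {
        my ($self, $url) = @_;
        return $url =~ m{sshg-exchange\.livejournal\.com/} ? 1 : 0;
    }
}

# Try fetchers from highest to lowest priority; return the first that allows the URL.
sub pick_fetcher {
    my ($url, @fetchers) = @_;
    for my $fetcher (sort { $b->priority() <=> $a->priority() } @fetchers) {
        return $fetcher if $fetcher->allow($url);
    }
    return;
}

my @fetchers = (My::Fetcher::LiveJournal->new(), My::Fetcher::SSHGExchange->new());
my $specific = pick_fetcher('http://sshg-exchange.livejournal.com/1234.html', @fetchers);
my $generic  = pick_fetcher('http://example.livejournal.com/5678.html', @fetchers);
print ref($specific), "\n", ref($generic), "\n";
```

The community URL is claimed by the specialized fetcher because it is tried first; any other LiveJournal URL falls through to the generic one.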
fetch
Fetch the story, with the given options.
%story_info = $obj->fetch(
    url=>$url,
    basename=>$basename,
    toc=>0,
    yaml=>0);
- basename
  Optional basename used to construct the filenames. If this is not given, the basename is derived from the title of the story.
- epub
  Create an EPUB file, deleting the HTML files which have been downloaded.
- toc
  Build a table-of-contents file if this is true.
- yaml
  Build a YAML file with meta-data about this story if this is true.
- url
  The URL of the story. The page is scraped for meta-information about the story, including the title and author. Site-specific Fetcher plugins can find additional information, including the URLs of all the chapters in a multi-chapter story.
Private Methods
get_story_basename
Figure out the file basename for a story by using its title.
$basename = $self->get_story_basename($title);
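One way to derive a filesystem-safe basename from a title is sketched below. The exact rule used by get_story_basename is not stated here, so the function name title_to_basename and its lowercase-and-underscore rule are assumptions made purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a story title into a filename-friendly basename.
sub title_to_basename {
    my ($title) = @_;
    my $base = lc $title;
    $base =~ s/[^a-z0-9]+/_/g;   # collapse every run of other characters into "_"
    $base =~ s/^_+|_+$//g;       # trim leading and trailing underscores
    return $base;
}

my $basename = title_to_basename('The Long Road Home!');
print "$basename\n";   # prints "the_long_road_home"
```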
extract_story
Extract the story-content from the fetched content.
my ($story, $title) = $self->extract_story(content=>$content,
    title=>$title);
make_css
Create site-specific CSS styling.
$css = $self->make_css();
tidy
Make a tidy, compliant XHTML page from the given story-content.
$content = $self->tidy(story=>$story,
    title=>$title);
get_toc
Get a table-of-contents page.
get_page
Get the contents of a URL.
parse_toc
Parse the table-of-contents file.
This must be overridden by the specific fetcher class.
%info = $self->parse_toc(content=>$content,
    url=>$url);
This should return a hash containing:
- chapters
  An array of URLs for the chapters of the story. If the story takes only one page, that page is the only chapter.
- title
  The title of the story.
It may also return additional information, such as Summary.
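The shape of the returned hash can be sketched with a standalone function. This is an assumption-laden illustration, not a real plugin: parse_toc_sketch, the regexes for finding the title and chapter links, and the example.com URLs are all hypothetical, and a real override would be a method on the fetcher class.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a plugin's parse_toc override.
sub parse_toc_sketch {
    my %args    = @_;
    my $content = $args{content};
    my %info    = (title => '', chapters => []);

    # Assumed: the story title lives in the page's <title> element.
    if ($content =~ m{<title>\s*(.*?)\s*</title>}is) {
        $info{title} = $1;
    }

    # Assumed: chapter links contain the word "chapter" in their href.
    while ($content =~ m{<a\s+href="([^"]*chapter[^"]*)"}ig) {
        push @{ $info{chapters} }, $1;
    }

    # A single-page story: the page itself is the only chapter.
    @{ $info{chapters} } = ($args{url}) unless @{ $info{chapters} };

    return %info;
}

my $html = '<html><head><title>An Example Story</title></head><body>'
    . '<a href="http://www.example.com/story/chapter1.html">One</a>'
    . '<a href="http://www.example.com/story/chapter2.html">Two</a>'
    . '</body></html>';

my %info = parse_toc_sketch(content => $html, url => 'http://www.example.com/story/');
print "$info{title}: ", scalar @{ $info{chapters} }, " chapters\n";
```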
parse_chapter_urls
Figure out the URLs for the chapters of this story.
parse_epub_url
Figure out the URL for the EPUB version of this story, if there is one.
parse_title
Get the title from the content.
parse_ch_title
Get the chapter title from the content.
parse_author
Get the author from the content.
parse_summary
Get the summary from the content.
parse_characters
Get the characters from the content.
parse_universe
Get the universe/fandom from the content.
parse_recipient
Get the recipient from the content.
parse_category
Get the categories from the content.
parse_rating
Get the rating from the content.
derive_values
Calculate additional Meta values, such as current date.
get_chapter
Get an individual chapter of the story, tidy it, and save it to a file.
$filename = $obj->get_chapter(base=>$basename,
    count=>$count,
    url=>$url,
    title=>$title);
get_epub
Get the EPUB version of the story, tidy it, and save it to a file.
$filename = $obj->get_epub(base=>$basename,
    url=>$url);
epub_replace_description
Replace or add the description to an EPUB file.
epub_add_meta
Add the given meta-data to an EPUB file.
epub_parse_one_node
Parse a node of meta-information from an EPUB file.
wordcount
Figure out the word-count.
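A simple way to count words in HTML story content is sketched below. How wordcount actually computes its result is not documented here, so the count_words helper and its strip-tags-then-split approach are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: count whitespace-separated words after dropping tags.
sub count_words {
    my ($html) = @_;
    (my $text = $html) =~ s/<[^>]*>/ /g;   # replace each tag with a space
    my @words = split ' ', $text;          # split on runs of whitespace
    return scalar @words;
}

my $count = count_words('<p>Once upon a <em>time</em> there was a story.</p>');
print "$count\n";   # prints 8
```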
build_toc
Build a local table-of-contents file from the meta-info about the story.
$self->build_toc(info=>\%info);
build_epub
Create an EPUB file from the story files and meta information.
$self->build_epub();
tidy_chars
Remove nasty encodings.
$content = $self->tidy_chars($content);
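The kind of cleanup this suggests can be sketched as follows. Which characters tidy_chars really handles is not specified here, so the tidy_chars_sketch function and its choice of replacements (typographic quotes, long dash, ellipsis) are assumptions, not the module's behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: map a few typographic characters to plain ASCII.
sub tidy_chars_sketch {
    my ($string) = @_;
    $string =~ s/[\x{2018}\x{2019}]/'/g;   # single curly quotes
    $string =~ s/[\x{201C}\x{201D}]/"/g;   # double curly quotes
    $string =~ s/\x{2014}/--/g;            # U+2014 long dash
    $string =~ s/\x{2026}/.../g;           # horizontal ellipsis
    return $string;
}

my $clean = tidy_chars_sketch("It\x{2019}s \x{201C}fine\x{201D}\x{2026}");
print "$clean\n";   # prints: It's "fine"...
```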