NAME

WWW::Bookmark::Crawler - Personal bookmark search engine

SYNOPSIS

    use WWW::Bookmark::Crawler;

    my $crawler = WWW::Bookmark::Crawler->new({
        SOURCE    => 'bookmarks.html',
        DBNAME    => 'mybookmark.db',
        PEEK      => 1,
        TOKENIZER => \&my_tokenizer,
    });

    $crawler->peek();      # turn debugging output on
    $crawler->crawl();     # fetch the pages and build the index

    $crawler->nopeek();    # turn debugging output off

    $crawler->query('Ars longa');

DESCRIPTION

WWW::Bookmark::Crawler is a web spider and search engine for personal bookmarks. It first extracts the links from either a browser-generated bookmark file or a plain HTML file, then retrieves the content of each page online and builds an index file. You can use this module to build a personal bookmark search engine.

METHODS

new

Parameters:

SOURCE

Either the name of a bookmark file or a reference to an array of URLs.

DBNAME

The name of the index file.

PROXY

The proxy server. This is passed on to the LWP agent.

TIMEOUT

The request timeout. This is also passed on to the LWP agent. Default is 10 seconds.

PEEK

Set this to a non-undef value to dump the debugging log to STDOUT. Default is undef.

TOKENIZER

A reference to a custom tokenizer that replaces the default one. By default, WWW::Bookmark::Crawler uses OurNet::FuzzyIndex for this role.
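
As a rough illustration of the TOKENIZER parameter, the sketch below defines a simple tokenizer in plain Perl. The name my_tokenizer is taken from the synopsis. The calling convention (one string in, a list of tokens out) is an assumption, because this document does not state what the module passes to the tokenizer or what it expects back.

```perl
use strict;
use warnings;

# Hypothetical tokenizer: lowercases the text, splits it into
# alphanumeric words, and drops duplicates while keeping first-seen order.
# ASSUMPTION: the module calls the tokenizer with a single string and
# expects a list of tokens; check the module source before relying on this.
sub my_tokenizer {
    my ($text) = @_;
    return () unless defined $text;
    my %seen;
    return grep { !$seen{$_}++ } map { lc($_) } $text =~ /(\w+)/g;
}

# Standalone check of the tokenizer, independent of the module.
my @tokens = my_tokenizer('Ars longa, vita brevis; ars LONGA');
print join(' ', @tokens), "\n";
```

If the assumed calling convention holds, a reference to such a sub could be passed as TOKENIZER => \&my_tokenizer, as in the synopsis.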

crawl

Starts fetching the pages and building the index file.

query

Returns an array of hashes holding the URLs and titles related to the given terms. The default tokenizer treats a space between terms as intersection, so a result must match every term. The first time this method is called in a script, it builds an in-memory inverted file from the index file.

No advanced information retrieval (IR) techniques are used.
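
To make "space as intersection" over an in-memory inverted file concrete, here is a minimal self-contained sketch in plain Perl. It does not use WWW::Bookmark::Crawler and is not the module's implementation. The sample pages, the search function, and the url and title key names are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Toy document set standing in for crawled pages (hypothetical data).
my @pages = (
    { url => 'http://example.com/a', title => 'Ars longa, vita brevis' },
    { url => 'http://example.com/b', title => 'Ars poetica' },
    { url => 'http://example.com/c', title => 'Vita nuova' },
);

# Build the inverted file: term => { page index => 1 }.
my %index;
for my $i (0 .. $#pages) {
    $index{ lc $_ }{$i} = 1 for $pages[$i]{title} =~ /(\w+)/g;
}

# Split the query on whitespace and intersect the posting sets,
# so a page is returned only if it contains every term.
sub search {
    my ($query) = @_;
    my @terms = map { lc($_) } split ' ', $query;
    return () unless @terms;
    my $first = shift @terms;
    my %hits  = %{ $index{$first} || {} };
    for my $term (@terms) {
        my $postings = $index{$term} || {};
        delete @hits{ grep { !$postings->{$_} } keys %hits };
    }
    return map { $pages[$_] } sort { $a <=> $b } keys %hits;
}

print "$_->{url}\t$_->{title}\n" for search('Ars longa');
```

In this sketch, the query 'Ars longa' matches only the first page, because the second page contains 'ars' but not 'longa'.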

peek

Turns on the debugging output. Has the same effect as passing PEEK to new.

nopeek

Turns off the debugging output.

proxy

Sets the proxy server. Has the same effect as passing PROXY to new.

timeout

Sets the TIMEOUT value. Has the same effect as passing TIMEOUT to new.

AUTHOR

xern <xern@cpan.org>

LICENSE

Released under The Artistic License.
