NAME

NNexus::Index::Dispatcher - High-level dispatcher to the correct domain indexer classes.

SYNOPSIS

    use NNexus::Index::Dispatcher;
    my $dispatcher = NNexus::Index::Dispatcher->new(db=>$db,domain=>$domain,verbosity=>0|1);
    my $invalidated_URLs = $dispatcher->index_step(%options);
    while (my $payload = $dispatcher->index_step ) {
      push @$invalidated_URLs, @{$payload};
    }

DESCRIPTION

The NNexus::Index::Dispatcher class provides a high-level API for indexing web domains.

It requires that each $domain has its own NNexus::Index::$domain indexer plug-in, whose class name follows the ucfirst(lc($domain)) naming convention.
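The naming convention can be illustrated with a minimal sketch. The helper indexer_class_for and the sample domain names are hypothetical and used only for illustration; the dispatcher's own class resolution may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of NNexus): maps a domain name to the
# indexer class name implied by the ucfirst(lc($domain)) convention.
sub indexer_class_for {
  my ($domain) = @_;
  return 'NNexus::Index::' . ucfirst(lc($domain));
}

# Sample domain names, chosen for illustration only.
print indexer_class_for('PLANETMATH'), "\n";  # NNexus::Index::Planetmath
print indexer_class_for('wikipedia'), "\n";   # NNexus::Index::Wikipedia
```

Under this convention the case of the domain string passed in does not matter, since it is normalized before the class name is built.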

Additionally, NNexus::Index::Dispatcher computes the concept diffs when re-indexing an already visited page, and updates the database as needed. Lastly, the return value of an indexing step is a list of suggested URLs to be relinked, a process called "invalidation" in previous NNexus releases.
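As a rough illustration of what a concept diff involves, the sketch below compares two lists of concept names. It assumes concepts can be treated as plain strings; the helper concept_diff is hypothetical and is not the module's actual implementation, which works against the database.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of NNexus): returns the concepts that
# were removed and the concepts that were added between two indexing
# passes over the same URL, as two array references.
sub concept_diff {
  my ($old, $new) = @_;
  my %in_old = map { $_ => 1 } @$old;
  my %in_new = map { $_ => 1 } @$new;
  my @removed = grep { !$in_new{$_} } @$old;
  my @added   = grep { !$in_old{$_} } @$new;
  return (\@removed, \@added);
}

my ($removed, $added) = concept_diff(
  ['group', 'ring', 'field'],     # concepts from the earlier pass
  ['group', 'field', 'module'],   # concepts from the current pass
);
print "removed: @$removed\n";  # removed: ring
print "added: @$added\n";      # added: module
```

Only the removed and added concepts would need database updates in such a scheme; unchanged concepts can be left alone.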

METHODS

my $dispatcher = NNexus::Index::Dispatcher->new(domain=>$domain,db=>$db,verbosity=>0|1, start=>$url, dom=>$dom);

The object constructor prepares a domain crawler object ( NNexus::Index::ucfirst(lc($domain)) ) and requires a NNexus::DB object, $db, for database interactions.

The returned dispatcher object can be used to iteratively index the domain, via the index_step method.

The constructor accepts the following options:

- start - the initial URL, required for the first invocation
- dom - optional; provides a Mojo::DOM object for the current URL instead of performing an HTTP GET to retrieve it
- verbosity - 0 for quiet, 1 for detailed progress messages

my $invalidated_URLs = $dispatcher->index_step(%options);

Performs an indexing step, which:

- dispatches a crawl request to the domain indexer
- computes a diff between the previously and currently indexed concepts for the given object/URL
- updates the database tables
- computes and returns an impact graph of previously linked objects (aka "invalidation")

Accepts no options; all customization is to be achieved through the "new" constructor.
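The iteration pattern can be tried without a database or network access by using a stand-in object. The sketch below assumes, as the loop in the SYNOPSIS suggests, that index_step returns a false value once there is nothing left to index. StubDispatcher, the example.org URLs, and the de-duplication step are all hypothetical additions for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a real dispatcher: hands out canned
# payloads one per call, then undef once the queue is exhausted.
package StubDispatcher;
sub new {
  my ($class, @payloads) = @_;
  return bless { queue => [@payloads] }, $class;
}
sub index_step {
  my ($self) = @_;
  return shift @{ $self->{queue} };
}

package main;

my $dispatcher = StubDispatcher->new(
  ['http://example.org/a', 'http://example.org/b'],
  ['http://example.org/b'],
);

my (%seen, @invalidated);
while (my $payload = $dispatcher->index_step) {
  # Keep each suggested URL once, in first-seen order.
  push @invalidated, grep { !$seen{$_}++ } @$payload;
}
print "$_\n" for @invalidated;
```

De-duplicating is a design choice of this sketch: the same URL may plausibly be suggested by more than one step, and relinking it once is enough.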

AUTHOR

Deyan Ginev <d.ginev@jacobs-university.de>

COPYRIGHT

Research software, produced as part of work done by the KWARC group at Jacobs University Bremen. Released under the MIT License (MIT).