NAME

bin/harvest_centroid.pl - extract centroid from SOIF or Harvest Broker/Gatherer

SYNOPSIS

bin/harvest_centroid.pl [-d] [-t tmpdb] [-h host] [-p port] [-s serverhandle]

DESCRIPTION

This program tries to extract a WHOIS++ compatible centroid from one of the following:

A Harvest Broker
A Harvest Gatherer
A collection of SOIF templates

If invoked with a host name or IP address to contact, this program will try to establish whether it is talking to a Harvest Gatherer or Broker, and send the appropriate command to fetch a dump of the entire contents of the Gatherer or Broker's database.

With no -h argument, this program expects to receive a collection of SOIF templates on STDIN, such as the output of

gzip -dc /usr/local/harvest/gatherers/*/All-Templates.gz

or

gdbmutil dump /usr/local/harvest/gatherers/*/PRODUCTION.gdbm

Note that when generating a centroid from a flat file collection of SOIF templates, the -s argument should be used to specify a serverhandle for the resulting centroid.
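To illustrate what reading SOIF templates and reducing them to a centroid involves, here is a minimal Perl sketch. It is not the code used by harvest_centroid.pl: the sub names parse_soif and unique_words are hypothetical, the sample template is invented, and the sketch assumes the usual SOIF layout of an "@TYPE { URL" line, "Name{size}:<TAB>value" attribute lines, and a closing "}". It collects the unique lower-cased words seen for each template type and attribute, which is the basic idea of a centroid.

```perl
use strict;
use warnings;

# Parse SOIF templates from a filehandle; returns a list of hash refs.
# The {size} byte count is honoured so that values may span several lines.
sub parse_soif {
    my ($fh) = @_;
    my (@templates, $current);
    while (defined(my $line = <$fh>)) {
        if ($line =~ /^\@(\S+)\s*\{\s*(\S*)/) {
            # Start of a template: "@TYPE { URL"
            $current = { type => $1, url => $2, attrs => {} };
        }
        elsif ($current && $line =~ /^([^\s{]+)\{(\d+)\}:\t(.*)\z/s) {
            my ($name, $size, $value) = ($1, $2, $3);
            # Keep reading while the value is shorter than its declared size.
            while (length($value) < $size) {
                my $more = <$fh>;
                last unless defined $more;
                $value .= $more;
            }
            $current->{attrs}{$name} = substr($value, 0, $size);
        }
        elsif ($current && $line =~ /^\}/) {
            # End of the template.
            push @templates, $current;
            undef $current;
        }
    }
    return @templates;
}

# Build a nested hash: template type -> attribute -> unique lower-cased words.
sub unique_words {
    my (@templates) = @_;
    my %seen;
    for my $t (@templates) {
        for my $attr (keys %{ $t->{attrs} }) {
            $seen{ $t->{type} }{$attr}{ lc($_) } = 1
                for grep { length } split /\W+/, $t->{attrs}{$attr};
        }
    }
    return \%seen;
}

# Invented sample input, read through an in-memory filehandle.
my $sample = "\@FILE { http://www.example.com/\n"
           . "Title{11}:\tHello World\n"
           . "Keywords{13}:\thello harvest\n"
           . "}\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $centroid = unique_words(parse_soif($fh));
close $fh;

for my $type (sort keys %$centroid) {
    for my $attr (sort keys %{ $centroid->{$type} }) {
        print "$type $attr: ",
              join(' ', sort keys %{ $centroid->{$type}{$attr} }), "\n";
    }
}
```

A real centroid also carries header information such as the serverhandle, which is why -s matters when the input is a flat file rather than a live server.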

A Berkeley DB database is used as temporary working storage - your Perl installation must support Berkeley DB via the DB_File module.
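As a rough illustration of why a disk-backed database is useful here, the following Perl sketch ties a hash to a DB_File database so that de-duplicated words are kept on disk rather than in memory. It is only a sketch under stated assumptions: the sub name dedupe_on_disk and the file name centroid.db are hypothetical, and the choice of a B-tree database is an assumption, since this page does not say which DB_File database type the program uses. It will fail at the "use DB_File" line if your Perl lacks that module, which makes it a quick way to check.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use DB_File;
use File::Temp qw(tempdir);

# Store each lower-cased word as a key in a disk-backed hash and return
# the unique keys. A B-tree database keeps its keys in sorted order.
sub dedupe_on_disk {
    my ($path, @words) = @_;
    my %seen;
    tie %seen, 'DB_File', $path, O_RDWR | O_CREAT, 0600, $DB_BTREE
        or die "Cannot tie $path: $!";
    $seen{ lc($_) } = 1 for @words;
    my @unique = keys %seen;
    untie %seen;
    return @unique;
}

# Hypothetical temporary location, removed automatically on exit.
my $dir = tempdir(CLEANUP => 1);
print join(' ', dedupe_on_disk("$dir/centroid.db",
                               qw(Harvest harvest Broker Gatherer))), "\n";
```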

OPTIONS

-d

Turn on debugging output - very verbose!

-t tmpdb

The path prefix of the temporary database for building the centroid. The size of this database is typically three times that of the final centroid.

-h host

The host name or IP address of the server to contact, if talking to a Gatherer or a Broker.

-p port

The port number to use when connecting to a Gatherer or a Broker. This defaults to 8501 if not set, which is Harvest's default port for a newly created Broker.

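-s serverhandle

The serverhandle to use for the resulting centroid. This should be specified when generating a centroid from SOIF templates read on STDIN.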
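The options above map naturally onto Getopt::Std. The Perl sketch below shows one way the switches in the SYNOPSIS and the default port of 8501 could be handled. It is an assumption about how such parsing might look, not the program's actual code; the sub name parse_options and the host name in the example are hypothetical.

```perl
use strict;
use warnings;
use Getopt::Std;

# Parse the switches listed in the SYNOPSIS from a list of arguments.
# -d is a flag; -t, -h, -p and -s each take a value.
sub parse_options {
    local @ARGV = @_;
    my %opt;
    getopts('dt:h:p:s:', \%opt)
        or die "usage: harvest_centroid.pl [-d] [-t tmpdb] [-h host] "
             . "[-p port] [-s serverhandle]\n";
    # Fall back to the documented default port when -p is not given.
    $opt{p} = 8501 unless defined $opt{p};
    return %opt;
}

# Example with an invented host name.
my %opt = parse_options('-d', '-h', 'broker.example.org');
print "host=$opt{h} port=$opt{p} debug=", ($opt{d} ? 'on' : 'off'), "\n";
```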

BUGS

We should let people specify the starting time for the poll, and pass this on to the Broker/Gatherer, so that it's possible to do a relative "poll" of the Harvest server.

We don't do anything special about character sets/encodings.

Not up to date with current CIP specifications - this is really intended for use with a WHOIS++ server which speaks the old RFC 1913 indexing protocol.

Should be integrated with wpp_shim.pl, so that WHOIS++ servers which cannot load a centroid from a flat file can think they're polling a WHOIS++ server - when in fact the shim would simply be returning a centroid which had been calculated already.

SEE ALSO

"harvest_shim.pl" in bin, RFC 1913

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHOR

Martin Hamilton <martinh@gnu.org>, Peter Valkenburg <valkenburg@terena.nl>