NAME
bin/harvest_centroid.pl - extract centroid from SOIF or Harvest Broker/Gatherer
SYNOPSIS
bin/harvest_centroid.pl [-d] [-t tmpdb] [-h host] [-p port] [-s serverhandle]
DESCRIPTION
This program tries to extract a WHOIS++ compatible centroid from one of the following sources:
If invoked with a host name or IP address to contact, this program will try to establish whether it is talking to a Harvest Gatherer or Broker, and send the appropriate command to fetch a dump of the entire contents of the Gatherer or Broker's database.
With no -h argument, this program expects to receive a collection of SOIF templates on STDIN, such as you could get by running
gzip -dc /usr/local/harvest/gatherers/*/All-Templates.gz
or
gdbmutil dump /usr/local/harvest/gatherers/*/PRODUCTION.gdbm
Note that when generating a centroid from a flat file collection of SOIF templates, the -s argument should be used to specify a serverhandle for the resulting centroid.
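To illustrate what building a centroid from a flat collection of SOIF templates involves, here is a minimal Python sketch. It is not the Perl code of harvest_centroid.pl. It assumes the usual SOIF layout of an "@TYPE { URL" header followed by "Name{size}:" attribute lines, where size is the byte count of the value. The function names, the Latin-1 decoding, the word-splitting rule and the lowercasing are all assumptions made for this example, and the result is a plain mapping rather than the RFC 1913 wire format.

```python
import re
from collections import defaultdict

# "@TYPE { URL" line that opens a SOIF template.
HEAD_RE = re.compile(rb"@(\S+)[ \t]*\{[ \t]*(\S*)[^\n]*\n")
# "Name{byte-count}:" followed by a single tab or space, then the value.
ATTR_RE = re.compile(rb"([^\s{}:]+)\{(\d+)\}:[\t ]")
# Assumption: a "word" is any run of letters, digits or underscores.
WORD_RE = re.compile(r"\w+")


def parse_soif(data: bytes):
    """Yield (template_type, url, {attribute: value}) for each template."""
    pos = 0
    while True:
        head = HEAD_RE.search(data, pos)
        if head is None:
            return
        ttype, url = head.group(1).decode(), head.group(2).decode()
        pos = head.end()
        attrs = {}
        while True:
            m = ATTR_RE.match(data, pos)
            if m is None:
                break  # no further attribute: we are at the closing "}"
            size = int(m.group(2))
            # The byte count lets values contain newlines safely.
            value = data[m.end():m.end() + size]
            attrs[m.group(1).decode()] = value.decode("latin-1")
            pos = m.end() + size
            # Skip the line ending that follows the value.
            while data[pos:pos + 1] in (b"\n", b"\r"):
                pos += 1
        yield ttype, url, attrs


def build_centroid(templates):
    """Map (template_type, attribute) to a sorted list of unique words."""
    words = defaultdict(set)
    for ttype, _url, attrs in templates:
        for name, value in attrs.items():
            words[(ttype, name)].update(
                w.lower() for w in WORD_RE.findall(value)
            )
    return {key: sorted(vals) for key, vals in words.items()}
```

In a script, the templates could be read with `sys.stdin.buffer.read()` and passed through `parse_soif` into `build_centroid`. The serverhandle is not present in the templates themselves, which is why it has to be supplied separately with -s.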
A Berkeley DB database is used as temporary working storage, so your Perl installation must support Berkeley DB via the DB_File module.
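The idea of keeping working data in an on-disk database rather than in memory can be sketched as follows. This is only an analogy in Python, using the standard dbm.dumb module in place of Berkeley DB and DB_File. The key layout (template, field and word joined by tabs) and the function names are hypothetical and are not taken from the program.

```python
import dbm.dumb


def add_words(db, template, field, words):
    """Record each word once under a "template<TAB>field<TAB>word" key."""
    for word in words:
        key = "\t".join((template, field, word.lower())).encode("utf-8")
        db[key] = b"1"  # the value is unused; only the key's presence matters


def dump_centroid(db):
    """Return sorted (template, field, word) triples read back from disk."""
    triples = []
    for key in db.keys():
        template, field, word = key.decode("utf-8").split("\t")
        triples.append((template, field, word))
    return sorted(triples)


def demo(prefix):
    """Build a tiny working database at the given path prefix and dump it."""
    db = dbm.dumb.open(prefix, "c")
    try:
        add_words(db, "FILE", "Title", ["Hello", "World"])
        add_words(db, "FILE", "Title", ["hello", "Again"])  # duplicate collapses
        return dump_centroid(db)
    finally:
        db.close()
```

Because every distinct word becomes a key, duplicates are discarded as they arrive and the final centroid can be produced by walking the keys once.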
OPTIONS
- -d
  Turn on debugging output - very verbose!
- -t tmpdb
  The path prefix of the temporary database used for building the centroid. This database is typically about three times the size of the final centroid.
- -h host
  The host name or IP address of the server to contact when talking to a Gatherer or a Broker.
- -p port
  The port number to use when connecting to a Gatherer or a Broker. Defaults to 8501, which is Harvest's default port for a newly created Broker.
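- -s serverhandle
  The serverhandle to use for the resulting centroid. Specify this when reading SOIF templates from STDIN.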
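As a compact summary of how these options fit together, the following Python sketch mirrors the SYNOPSIS with argparse. It is an illustration only, not the program's own option handling. The destination names and the `input_source` helper are hypothetical, and the only default taken from this page is the port, 8501.

```python
import argparse


def build_parser():
    # add_help=False frees "-h" so that it can mean "host", as in the SYNOPSIS.
    parser = argparse.ArgumentParser(prog="harvest_centroid", add_help=False)
    parser.add_argument("-d", dest="debug", action="store_true")
    parser.add_argument("-t", dest="tmpdb", default=None)
    parser.add_argument("-h", dest="host", default=None)
    parser.add_argument("-p", dest="port", type=int, default=8501)
    parser.add_argument("-s", dest="serverhandle", default=None)
    return parser


def input_source(args):
    """Return "network" when -h was given, otherwise "stdin"."""
    return "network" if args.host else "stdin"
```

With no -h the sketch falls back to STDIN, which is the case where -s matters most.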
BUGS
We should let people specify the starting time for the poll, and pass this on to the Broker/Gatherer, so that it's possible to do a relative "poll" of the Harvest server.
We don't do anything special about character sets/encodings.
Not up to date with current CIP specifications - this is really intended for use with a WHOIS++ server which speaks the old RFC 1913 indexing protocol.
Should be integrated with wpp_shim.pl, so that WHOIS++ servers which cannot load a centroid from a flat file can think they're polling a WHOIS++ server - when in fact the shim would simply be returning a centroid which had been calculated already.
SEE ALSO
"harvest_shim.pl" in bin, RFC 1913
COPYRIGHT
Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.
AUTHOR
Martin Hamilton <martinh@gnu.org>, Peter Valkenburg <valkenburg@terena.nl>