NAME

Configure - Installing the ROADS software

DESCRIPTION

The ROADS software is a suite of programs intended to aid in the setting up and day-to-day running of World-Wide Web based catalogues of on-line resources. A number of these so-called subject gateways have been funded under the Access to Network Resources strand of the UK Electronic Libraries Programme (eLib), each specializing in a particular subject area.

Although designed specifically to meet the requirements of the eLib subject gateways, we believe that the ROADS software will be useful in a variety of other situations and for a variety of other purposes. You can get an idea of what people have done with the ROADS software by looking at some of their WWW sites, e.g.

http://adam.ac.uk/

Art, Design, and Media (ADAM)

http://bized.ac.uk/

Business Education (Biz/ed)

http://ihr.sas.ac.uk/

History on the Internet

http://omni.ac.uk/

Organizing Medical Networked Information (OMNI)

http://sosig.ac.uk/

The Social Sciences Information Gateway (SOSIG)

Initial versions of the ROADS software were derived from code developed by Jon Knight for the Social Sciences Information Gateway (SOSIG). As the software was used in other situations, it was re-written so as to address problems or add new features.

DOCUMENTATION

The ROADS software is self-documenting, using the Perl POD system. This means that you can get to the documentation for any program or library module by typing perldoc followed by the Unix path to the relevant file, e.g.

perldoc bin/addsl.pl

to learn about the addsl tool.

You can also get to the ROADS documentation in a variety of formats via our software and documentation server:

http://www.roads.lut.ac.uk/

SYSTEM REQUIREMENTS

The ROADS software requires that your machine be running the following:

a variant of the Unix operating system

e.g. Linux, FreeBSD, NetBSD, OpenBSD, SunOS, Solaris, Digital Unix (OSF/1), IRIX, HP/UX.

an HTTP server

e.g. Apache, NCSA or CERN httpd.

the Perl language.

We will not be producing a version of the ROADS software to run under legacy operating systems such as MacOS, DOS, Windows 3, Windows 3.11, Windows for Workgroups, Windows 95, Windows NT, OS/2, VMS, VME, MPE, CTSS, Multics, GEORGE IV... !

The reasons for this decision are manifold. A couple of the more important ones are that most versions of Unix do not have problems handling lots of processes operating simultaneously (multi-tasking), or talking to other computers via the Internet. The combination of the basic Unix features, a very high level development language (Perl), and HTML as a user interface means that we can easily deliver a customisable package which works on multiple platforms. It would be very difficult to do this in the first place, and a great deal more difficult to support it, if we had to produce multiple independent versions of the ROADS software for each supported operating system and hardware platform.

It should be noted in passing that there are numerous implementations of Unix or Unix-like operating systems for PCs based on Intel hardware, and several for Macintosh hardware. Many of these are freely available, notably FreeBSD, NetBSD, and Linux.

We do not require that you use any particular combination of hardware, operating system, and HTTP server software for running ROADS. In general, any HTTP server which supports the Common Gateway Interface (CGI) should be capable of running the ROADS software. ROADS places no great demands upon your machine other than the normal process of serving up some static HTML documents and some dynamic ones generated via CGI. Your hardware and operating system configuration should, needless to say, be powerful enough that it can cope with user demand.

Primary development of the ROADS software is being done under SunOS 4.1.4 and Linux 2.x, with the NCSA and Apache HTTP servers and Perl 5.

DOWNLOAD

You can get hold of the ROADS software by pointing your WWW browser at the URL:

http://www.roads.lut.ac.uk/

This is the ROADS software and technical documentation server.

UNPACKING

The software distribution comes in the compressed archive format conventionally used to distribute Unix software over the Internet. To unpack it, use the zcat and tar commands, e.g.

% zcat roads-v2.3.tar.Z | tar xvf -

The software distribution will be unpacked into a directory called roads-v2.3, in whatever directory you happen to be in at the time. If you choose to keep all or most of your ROADS related files in this directory, you may want to make a symbolic link from it to some common name which you will carry on using for some time, e.g.

% ln -s /usr/local/roads-v2.3 /usr/local/roads

or perhaps rename the directory:

% mv /usr/local/roads-v2.3 /usr/local/roads

Now you can carry on using the name /usr/local/roads to refer to the ROADS installation, no matter where it actually lives.
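The symbolic link approach can be wrapped in a small shell function, so that switching between unpacked releases is a single step. This is only a sketch for Bourne-style shells: the function name link_release is made up for illustration and is not part of the ROADS distribution.

```shell
# Hypothetical helper: point a stable name at a versioned ROADS directory.
# Usage: link_release /usr/local/roads-v2.3 /usr/local/roads
link_release() {
    target=$1
    link=$2
    if [ ! -d "$target" ]; then
        echo "link_release: $target is not a directory" >&2
        return 1
    fi
    # Refuse to touch a real file or directory; only replace an old link.
    if [ -e "$link" ] && [ ! -h "$link" ]; then
        echo "link_release: $link exists and is not a symbolic link" >&2
        return 1
    fi
    rm -f "$link"
    ln -s "$target" "$link"
}
```

Because the function refuses to replace anything other than an existing symbolic link, it should not be able to remove a real installation directory by accident.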

Once the software has been unpacked, you should see the following in your roads-v2.3 (or whatever :-) directory:

% ls -l roads-v2.3
total 466
-rw-r--r--  1 martin       7459 Aug  3 20:13 CHANGES
-r-xr-xr-x  1 martin      27386 Aug  3 20:42 Configure
-rw-r--r--  1 martin        522 Jul 31 04:43 Makefile
-rw-r--r--  1 martin       8535 Aug 24 05:02 README
-rw-r--r--  1 martin       1526 Aug  3 20:12 TODO
drwxrwxr-x  3 martin       1024 Aug 15 23:02 admin-cgi
drwxrwxr-x  3 martin       1024 Aug 24 04:53 bin
drwxrwxr-x  3 martin        512 Aug 16 09:12 cgi-bin
drwxrwxr-x 19 martin       1024 Jul 24 03:45 config.dist
drwxrwxrwx  4 martin        512 Aug 24 04:33 guts
drwxrwxr-x  7 martin        512 Aug 19 19:28 htdocs
drwxrwxr-x  3 martin       1024 Aug 24 03:04 lib
drwxrwxrwx  2 martin        512 May 23 12:27 logs
drwxr-xr-x  2 martin       1024 Jul 27 17:38 source.dist

Note that the owner, group, date stamps and permissions may be different on your system.

HOW IT WORKS

The ROADS software consists of a series of programs written in the Perl language. Perl is a very high level language which is well suited to text processing, database handling and networking. It also lends itself to rapid prototyping - since programs can be written and re-written quickly.

``ROADS'' consists of the following:

Administrative programs

e.g. to index your templates, delete a template from the database, reindex it, and generate HTML breakdowns of the templates in the database. These are found in the bin directory.

CGI programs which provide a WWW front-end to the software

e.g. searching, and database administration. The user-oriented programs are found in the cgi-bin directory, and the admin oriented programs in the admin-cgi directory.

Configuration files which let you alter the behaviour of the ROADS software

e.g. by letting you specify the directories where the database files may be found, the format of any HTML documents produced by the software, and add definitions for new templates. These are found in the config directory.

Files which are used internally by ROADS and are not user serviceable

e.g. the inverted index used by the search engine, and library files used by the ROADS programs. These are found in the guts directory.

Static HTML documents which are intended to be inserted into your HTTP server's document tree

e.g. help files and manual pages. These are found in the htdocs directory.

Log files generated by the software

e.g. the search log. These are found in the logs directory.

You can configure ROADS to place most of these directories anywhere you like, or keep them in the directory where the software distribution was unpacked. This is still not quite as flexible as we would like - part of the installation process has to modify some of the programs to tell them where their library files are, for instance. Consequently it is inadvisable to move the software once installed unless you either make sure the previous filenames are still valid (by making a symbolic link from their old location), or re-install it. The consistency checking tool should tell you whether there are any problems.

CONFIGURING YOUR WWW SERVER

To run the CGI programs, you will need to either have them installed in your HTTP server's cgi-bin directory (or equivalent), or add a new bin directory to its configuration. We strongly urge you to use any protection schemes your WWW server may provide - such as password and/or IP address access controls. In particular, you should make sure that the administrative CGI programs are only accessible by the people who you intend to be using them. If your machine is shared with other users, you may also want to protect the ROADS files from being read from or written to by arbitrary passers-by.

For example, if you were using the NCSA or Apache HTTP servers, had installed the ROADS software under /usr/local/roads, and wanted to make the ROADS documents and CGI directories visible on the Web as /ROADS/, /ROADS/cgi-bin/, and /ROADS/admin-cgi/, your HTTP server's srm.conf would contain something like this:

ScriptAlias /ROADS/cgi-bin/ /usr/local/roads/cgi-bin/
ScriptAlias /ROADS/admin-cgi/ /usr/local/roads/admin-cgi/
Alias /ROADS/ /usr/local/roads/htdocs/

Note that the order of the above lines is significant: the more specific /ROADS/cgi-bin/ and /ROADS/admin-cgi/ entries must come before the general /ROADS/ alias, or the latter will match their URLs first. This approach may be attractive since it means that you can have several versions of the ROADS software installed on your machine, and use the HTTP server's alias mechanism to control which is made visible to the world at large as your ``production service''. See below for more information on how this may be done.
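To see why the order of the alias lines matters, it can help to model the matching as ``first prefix wins''. The following shell sketch is a simplified illustration of that idea and not the HTTP server's actual code; the function name resolve_alias and the two rule lists are invented for the example.

```shell
# Illustration only: map a URL path to a file name by trying each alias
# in the order listed, and stopping at the first prefix that matches.
# The rules are "URL-prefix directory" pairs, one pair per line.
resolve_alias() {
    url=$1
    rules=$2
    printf '%s\n' "$rules" | while read -r prefix dir; do
        case "$url" in
            "$prefix"*)
                # Replace the matched prefix with the directory name.
                echo "$dir${url#"$prefix"}"
                break
                ;;
        esac
    done
}

# Specific entry first, as recommended.
GOOD_ORDER='/ROADS/cgi-bin/ /usr/local/roads/cgi-bin/
/ROADS/ /usr/local/roads/htdocs/'

# General entry first: it swallows the CGI URLs as well.
BAD_ORDER='/ROADS/ /usr/local/roads/htdocs/
/ROADS/cgi-bin/ /usr/local/roads/cgi-bin/'
```

With these definitions, resolve_alias /ROADS/cgi-bin/search.pl "$GOOD_ORDER" prints /usr/local/roads/cgi-bin/search.pl, whereas the same call with "$BAD_ORDER" prints /usr/local/roads/htdocs/cgi-bin/search.pl, which is not where the CGI programs live.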

You should protect the admin-cgi directory so that only the people who are meant to have access to it can run the CGI programs within. If you used the NCSA or Apache servers, by putting the following lines in access.conf you could restrict HTTP access to the admin-cgi directory to those people who were calling from the machine with the IP address 158.125.96.46 and were members of the basic authentication group roadies:

<Directory /usr/local/roads/admin-cgi>
Options ExecCGI
AuthName ROADS server
AuthType Basic
AuthUserFile /usr/local/roads/.htpasswd
AuthGroupFile /usr/local/roads/.htgroup
<Limit GET POST>
order deny,allow
deny from all
allow from 158.125.96.46
require group roadies
</Limit>
</Directory>

Some of the ROADS tools do not have a WWW based user interface (yet), and it is necessary to run these from your shell, e.g. in an xterm or rxvt window or via a telnet session to the server. You will need to make sure that the directory which contains these tools is in your PATH, otherwise your Unix shell will not be able to find them.
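As a minimal sketch, users of Bourne-style shells (sh, ksh, bash) could add something like the following to their .profile. The location /usr/local/roads/bin is an assumption carried over from the examples above; use whichever directory you gave to Configure.

```shell
# Assumed install location; change this to match your own system.
ROADS_BIN=/usr/local/roads/bin

# Append the ROADS bin directory to PATH unless it is already listed.
case ":$PATH:" in
    *":$ROADS_BIN:"*) ;;
    *) PATH="$PATH:$ROADS_BIN"; export PATH ;;
esac
```

Users of csh or tcsh would instead put a line such as set path = ($path /usr/local/roads/bin) in their .cshrc.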

MORE ON SECURITY AND ACCESS CONTROL

Note that it may be necessary for the user ID your World-Wide Web server runs under to write to the following directories:

  • The logs directory, which will be used to store search statistics and survey results.

  • The template source directory, where the actual database records live.

  • The guts directory under the top level ROADS installation directory. This is used to store various internal files which the ROADS server administrator is discouraged from fiddling with!

  • The temporary directory, e.g. /tmp where working copies of files are created by many of the ROADS tools.

  • The various directories you have configured the ROADS tools to create HTML files in - typically under your WWW server's main HTML document tree, or in a separate directory tree which is aliased to appear in the main HTML tree.

If you are going to mix manual and WWW based administration, you will need to make sure that any files or directories produced in the process are still writeable by the WWW server. If you do not, you may encounter problems creating templates, indexing them, or generating What's New and subject breakdown listings. We suggest that you use the bogus.pl tool's WWW front end to flag any problems with permissions.
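If you want a quick check from the shell as well, the directories listed above can be tested by hand. The function below is a hypothetical sketch, not a description of what bogus.pl does; it only gives a meaningful answer when run as the user ID that your WWW server runs under.

```shell
# Report any of the named directories that the current user cannot write to.
# Prints one line per problem and returns non-zero if any were found.
check_writable() {
    status=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            status=1
        elif [ ! -w "$dir" ]; then
            echo "not writable: $dir"
            status=1
        fi
    done
    return $status
}
```

For example, assuming the default layout under /usr/local/roads:

check_writable /usr/local/roads/logs /usr/local/roads/guts /usr/local/roads/source /tmp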

In order to prevent your ROADS server's administrative capabilities from being abused by outsiders, we urge you to protect them using the strongest authentication scheme your World-Wide Web server supports - for example, message digest authentication or end-to-end cryptographic techniques such as the Secure Sockets Layer (SSL).

If you share the machine ROADS is running on with other people, you may also find it worthwhile using Unix permissions to control who has access to the files and directories where ROADS lives. Note that anyone who has super user (or root) status on the machine will be able to subvert these restrictions!

Finally, it is commonplace to run Unix based World-Wide Web servers as the user ``nobody''. This is not necessarily the most appropriate approach to take with the ROADS software, because its files and directories may need to be manipulated both by CGI based tools and by hand. If you do not have to share your WWW server with anyone else, we suggest that you might prefer to create a special user, say www, or perhaps even roads, which has write access to all of these files, and under whose identity the WWW server runs. There are a number of similar alternative approaches:

  • Doing this purely with group IDs, by giving the files and directories the group ID which the WWW server runs as.

  • Running a separate WWW server for the ROADS based material (most modern WWW servers and Unix versions have virtual hosting features).

  • Using a wrapper program such as CGIwrap, which is launched by the WWW server but then changes its user and/or group ID before running any of the ROADS tools.

SITTING COMFORTABLY ?

To install the ROADS software on your machine, you will need to run the Configure shell script which can be found in the directory you unpacked the distribution in, e.g.

% cd roads-v2.3
% ./Configure

Configure will ask you a number of questions about your machine, and how the ROADS software should be set up for it. If you make any mistakes answering these questions, or change your mind, you can run the script again and change your answers. To make this easier for you, the Configure script will offer to re-use any answers which you gave on a previous attempt to install the software.

QUESTIONS AND ANSWERS

In this section we go through the questions you will be asked and explain them in detail. Most of this information is also available during the installation as ``tips'', which Configure will give you the option of reading as you install the software.

In version 2 of the ROADS software we've simplified the installation process, so that you're presented with the questions which the ROADS installer program would like to ask you, and have the option of simply accepting the default values.

Introductory questions

Where is perl located on your system?

Configure will attempt to find programs like Perl automatically, but will sometimes need helping out. You might have several versions of Perl on your machine, for instance.

What is the email address of the person running the machine?

Some of the ROADS tools will send this person email.

What is the email address of the ROADS database admin person ?

Some of the ROADS tools will send this person email.

What is the name of your service?

This will appear on some of the Web pages generated by the ROADS software.

What is your machine's domain name?

This is the name which will be used by the ROADS software to refer to your machine. If it has an alias, such as www.swedish-chef.org, you may want to use this here.

What port number is your WWW server running on?

Normally HTTP servers run on port 80.

WWW server executables directory?

You must have an HTTP daemon running that supports the Common Gateway Interface mechanism, aka CGI. The NCSA 1.x, Apache 1.x, and CERN 3.x servers do, for example. This installation program will need to know the name of the directory where CGI programs are placed.

NB: you don't have to use the server's main cgi-bin directory. For instance, you could have a CGI directory which was dedicated to ROADS tools. You might want to keep the ROADS cgi-bin programs in the ROADS directory where you unpacked them, for instance.

The factory default is the cgi-bin subdirectory of the directory in which you unpacked the ROADS software.

WWW path to ROADS executables directory on HTTP server?

Now we need to know what name the outside world will use to get to the ROADS executable directory on your Web server, e.g. /ROADS/cgi-bin/, or just /cgi-bin/.

The factory default is /ROADS/cgi-bin/, which could be configured as outlined above.

Some of the ROADS tools create Web pages - you can have these put anywhere you like in your computer's filing system, including the directory where you unpacked the ROADS software. Watch out, though! If the directory where you opt to have these files installed already has any files in it, these may be overwritten. For example, an index.html file is automatically copied into place.

We suggest that you put the ROADS related Web documents into a directory of their own.

The factory default is to use the htdocs subdirectory of the directory in which you unpacked the ROADS software.

What is the Web path to the ROADS HTML documents directory?

Most WWW servers let you choose the path name by which your HTML documents are made visible to the outside world. For example, although you may have installed the ROADS Web documents tree in say /local/roads/htdocs, you can usually tell your Web server to make this area visible to the rest of the world as say /ROADS/.

You might want to export these documents using the name of your service.

The factory default is to make the ROADS related HTML files appear under /ROADS/.

WWW based admin programs directory?

The ROADS software comes with a number of World-Wide Web based tools for administration. It is essential that you have these in an area of your WWW server which is protected from casual access - the only people able to run these programs should be the people who are looking after the ROADS installation.

If you install these programs in your server's regular directory for CGI executables, you will need to be running a WWW server which lets you impose access controls on individual programs.

The factory default is to install them in a directory of their own - admin-cgi under the directory where the ROADS software was unpacked.

WWW path to admin executables directory?

Now we need to know what the outside world sees this directory as. For example, if you put the administrative programs in the directory which was mapped on your Web server to /ROADS/admin-cgi/, you would enter this here. Note that if this directory is not your regular cgi-bin, you will probably need to enable CGI processing for it.

The factory default is to use /ROADS/admin-cgi/.

Enter the URL of your proxy server

The link checking tool distributed with the ROADS software can make use of a caching proxy server. We strongly advise you to use this feature if you are going to run the link checker.

The factory default is not to use a proxy server.

Directory names

Do you want to run the ROADS software from this directory?

You can choose to have the ROADS package configured to run in the directory where it was unpacked, or have individual parts of it installed in separate directories - e.g. you might want the HTML documents to be copied to your WWW server's file space, and so on.

The factory default is to run everything out of the directory in which the ROADS software was unpacked.

If you answer no to this question, you'll be prompted for the directories where the individual ROADS components should be installed:

Absolute pathname for the directory to hold the internal programs?

The ROADS internal programs all live in a single directory. By default it is the 'bin' subdirectory of the ROADS installation directory. You can opt to change this to be any directory in your computer's filing system - e.g. /usr/local/bin.

Default config file directory?

The ROADS package uses a number of configuration files. Normally these live in a subdirectory of the ROADS installation directory called config, but you can have them put anywhere you like.

Absolute path to guts directory?

Some of the ROADS programs create files for internal use, e.g. the index of your resource descriptions. The default location for these is in the guts subdirectory of the top level ROADS directory, but you can have them placed wherever you like.

Absolute path to library directory?

There are a number of library files which are used by various parts of the ROADS software. These normally live in the lib directory, under the top level ROADS installation directory. To have them installed in an alternative directory, type its name in here.

Absolute path to log directory?

Some of the ROADS tools generate log files as part of their operation and these are normally placed in a directory called logs, under the top level ROADS installation directory. You can override this by typing in a new log file directory here.

Absolute path to ROADS template directory?

This package has a default directory in which the ROADS resource description templates are held. If you would like to change this, enter a new directory name.

Server handle?

Your ROADS WHOIS++ server needs to be given a unique ``server handle'' which identifies it. The installation process will attempt to come up with one of these for you, based on your machine's hostname - e.g. lutacuk01. If you are already using this server handle, or would prefer to use another, enter it below.

What port number will your WHOIS++ server be listening on?

We recommend that you use an unprivileged port, such as the factory default of 6663.

The configuration script will ask you, towards the end of the configuration process, whether you wish to rebuild your database and start the index server. If you tell it not to, you'll need to start the WHOIS++ server yourself by running the program wppd.pl in the ROADS bin directory! You'll also want to add a line to your startup scripts to restart the wppd.pl server when the machine is rebooted. Where these scripts live depends on the version of Unix you are running; see your OS documentation for details. Typically they are in /etc/rc.local or /etc/rc.d, but this is so variable that it would be very difficult for us to produce a portable section of code that could do this for you.
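As a starting point only, a fragment along the following lines could be adapted for a local startup script. It is a sketch under several assumptions: the ROADS_HOME variable and the start_wppd function are invented names, the install location is taken from the earlier examples, and we have not assumed anything about whether wppd.pl puts itself into the background.

```shell
# Sketch of a fragment for a local startup script such as /etc/rc.local.
# ROADS_HOME is an assumed install location; adjust it for your machine.
ROADS_HOME=${ROADS_HOME:-/usr/local/roads}

start_wppd() {
    if [ -x "$ROADS_HOME/bin/wppd.pl" ]; then
        echo "Starting ROADS WHOIS++ server"
        # Run in the background in case wppd.pl does not detach by itself.
        "$ROADS_HOME/bin/wppd.pl" &
    else
        echo "wppd.pl not found under $ROADS_HOME/bin" >&2
        return 1
    fi
}
```

Calling start_wppd at the end of the startup script would then launch the server at boot time. Startup scripts normally run as root, so you may prefer to arrange for the server to be started under a less privileged user ID.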

Maximum number of results with full response?

What is the upper limit on the number of results which your WHOIS++ server should return in full? For example, if the limit were 100, and a search matched over 100 resource descriptions, only a summary response would be returned.

The factory default is to return 100 records in full.

What is the upper limit on the number of results to return?

What is the upper limit on the number of results which your WHOIS++ server should be allowed to return? For example, if the limit were 25, and a user asked for 100, they would only be returned the first 25.

The factory default is an upper limit of 100.

Default maximum number of results to return?

The factory default is 100.
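The way the limits described above interact can be illustrated with a little shell arithmetic. This is a sketch of the behaviour as described in the text, not the WHOIS++ server's own code, and the names MAX_FULL, MAX_RETURNED, results_returned and response_type are made up for the example.

```shell
# Illustration of the limits described above; not the server's own code.
# Assumed names: MAX_FULL is the "full response" limit and MAX_RETURNED is
# the upper limit on results, both set to the factory default of 100.
MAX_FULL=100
MAX_RETURNED=100

# How many results a client gets when it asks for a given number.
results_returned() {
    requested=$1
    if [ "$requested" -gt "$MAX_RETURNED" ]; then
        echo "$MAX_RETURNED"
    else
        echo "$requested"
    fi
}

# Whether a search with a given number of matches is answered in full.
response_type() {
    matches=$1
    if [ "$matches" -gt "$MAX_FULL" ]; then
        echo summary
    else
        echo full
    fi
}
```

With MAX_RETURNED set to 25, results_returned 100 prints 25, matching the example given for the upper limit; with the default MAX_FULL of 100, response_type 101 prints summary.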

Do you want to return a hit count with the search results?

You can choose whether or not your WHOIS++ server should return a hit count with its response to each search. If you choose not to return a hit count, your users will have no way of knowing how many records there are in your database. This may be desirable for some applications, e.g. White Pages databases.

The factory default is to return a hit count as a separate COUNT record at the end of the search results.

Miscellaneous questions

Absolute path to sort program?

Some of the ROADS programs use the Unix sort utility. Where can this be found on your system?

NB: the GNU version of sort may be preferable to the one distributed by your vendor, since it does not impose a size limit on the line length.

The factory default is determined by searching in well-known places for sort.

Absolute path to mv executable?

Some of the ROADS programs use the Unix mv utility. Where can this be found on your system?

The factory default is determined by searching in well-known places for mv.

Absolute path to cp executable?

Some of the ROADS programs use the Unix cp utility. Where can this be found on your system?

The factory default is determined by searching in well-known places for cp.

Absolute path to uniq executable?

Some of the ROADS programs use the Unix uniq utility. Where can this be found on your system?

The factory default is determined by searching in well-known places for uniq.

Absolute path to cat executable?

Some of the ROADS programs use the Unix cat utility. Where can this be found on your system?

The factory default is determined by searching in well-known places for cat.

Absolute path to ci executable?

You must have the GNU Revision Control System (RCS) Checkin program (ci) available on your machine if you wish to archive copies of templates that have been deindexed.

The factory default is determined by searching in well-known places for a program called ci.

Absolute path to mailer executable?

Some of the ROADS programs can email information to administrators. What program should they use to do this?

The factory default is to look for some common names for a user oriented program for sending email, e.g. mailx, Mail, and so on. This program should be capable of accepting a list of email addresses, and a subject line via the -s command line option - like the BSD incarnation of mail.
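To check that a candidate mailer behaves in the expected way, you could try it from the shell with a call of the following shape. This is a sketch: MAILER and mail_report are names invented for the example, and mailx is only assumed as a fallback.

```shell
# MAILER is an assumption: set it to whatever path you gave Configure.
MAILER=${MAILER:-mailx}

# Send the text on standard input to one or more addresses, passing the
# subject with -s in the style of BSD mail.
# Usage: mail_report "subject line" address [address ...] < message
mail_report() {
    subject=$1
    shift
    "$MAILER" -s "$subject" "$@"
}
```

If a test message sent this way arrives with the right subject at every address listed, the program should be suitable.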

Absolute path to temporary directory?

Some ROADS programs make use of temporary files which are held in one of your system's temporary directories. What is the absolute path of the directory you wish to use for temporary files on this machine?

The factory default is to use /tmp.

URL of bullet point image on your Web server?

The default WWW interface to the ROADS software uses an inlined image as a bullet point. You can specify the image which should be used here.

The factory default is to use /ROADS/icons/redball.gif, which may be found in the htdocs/icons subdirectory of the directory in which the ROADS software was unpacked.

COMPLETING THE INSTALLATION

Substitutions on internal programs

Once you have answered all of Configure's questions, it will begin to install and configure the ROADS software for use on your machine. You'll see a series of messages like this:

Performing substitutions on internal programs...
Performing substitutions on WWW CGI programs...
Performing substitutions on WWW based admin programs...
Done substituting!

Unpacking shrink wrapped config files...

You'll be asked if you want to install the sample database of eLib projects which comes with the ROADS software. If you say yes, you should see the message:

Unpacking sample data...

Sample config files and data can be found, respectively, in the config.dist and source.dist subdirectories of the directory in which the ROADS software was unpacked.

If you already have either of these directories, e.g. because you keep your data (from a previous installation) separate, you will see messages of the form:

You already have the config directory "/usr/local/roads/config"
Do you want to overwrite it with the factory defaults? [n]

You already have the templates directory "/usr/local/roads/source"
Do you want to overwrite it with the factory samples? [n]

The default is not to overwrite them.

Checking directories

Next, the Configure program will check to see that the directories you have named actually exist, and attempt to create them if necessary. Some of this will have happened already as part of the question and answer process.

You should see messages of the form:

Checking directories exist:

  /usr/local/roads/bin ........ ok!
  /usr/local/roads/admin-cgi ........ ok!
  /usr/local/roads/cgi-bin ........ ok!
  /usr/local/roads/guts ........ ok!
  /usr/local/roads/htdocs ........ ok!
  /usr/local/roads/lib ........ ok!
  /usr/local/roads/logs ........ ok!
  /usr/local/roads/source ........ ok!
  /tmp ........ ok!

Special files

The installation process creates a small number of special files:

List of databases

This will be created as the file databases in the ROADS config directory. If you already have this file, you'll be asked whether you want to make a backup copy of it:

You already have "/usr/local/roads/config/databases"
Do you want to keep a safe copy of it ?

Run-time configuration - ROADS.pm

All of the ROADS tools read their run-time settings from a file called ROADS.pm in the ROADS lib directory. This is created as part of the installation process, but you can change it by hand. You should see a message like this as it is created:

Dumping configuration to lib/ROADS.pm...

search.pl vs. admin.pl

The search.pl CGI program serves a dual role as the search program for both admin users and end users - altering its behaviour depending on the name by which it is invoked. Because of the split between admin oriented CGI programs and user oriented CGI programs, it has to be available under a second name, admin.pl, in the admin-cgi directory. Configure arranges this with a symbolic link, and as this is done, you'll see the message:

Symlinking /usr/local/roads/cgi-bin/search.pl to
  /usr/local/roads/admin-cgi/admin.pl...
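The general technique of one program behaving differently according to the name it was run under can be sketched in shell as follows. The real program is written in Perl and its internals are not shown here; mode_for_name is an invented name, and the two modes are simply labels for the example.

```shell
# Illustration only: choose a mode from the name the program was run as,
# as a script would do by inspecting $0.
mode_for_name() {
    case "$(basename "$1")" in
        admin.pl) echo admin ;;
        *)        echo search ;;
    esac
}
```

A script using this would call mode_for_name "$0" near the top, so that the same file reached through the admin-cgi link selects the admin behaviour.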

Installing directories

Having updated the relevant files to reflect your preferences, Configure will now install the various directories which together form the ROADS software:

Installing individual ROADS components...
 From "bin" into "/usr/local/roads/bin" ... already there!
 From "admin-cgi" into "/usr/local/roads/admin-cgi" ... already there!
 From "cgi-bin" into "/usr/local/roads/cgi-bin" ... already there!
 From "htdocs" into "/usr/local/roads/htdocs" ... already there!
 From "lib" into "/usr/local/roads/lib" ... already there!

If the inode number of the destination directory is the same as that of the source directory, the message ``already there!'' is displayed, and the files aren't copied.
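You can make the same kind of comparison yourself from the shell. The functions below are a minimal sketch of the idea rather than Configure's own code, and their names are invented; note that inode numbers are only unique within a single filesystem.

```shell
# Illustration only: decide whether two directory names refer to the same
# directory by comparing inode numbers, as reported by "ls -di".
inode_of() {
    ls -di "$1/." | awk '{ print $1 }'
}

same_directory() {
    [ "$(inode_of "$1")" = "$(inode_of "$2")" ]
}
```

For example, same_directory /usr/local/roads /usr/local/roads-v2.3 succeeds when the first name is a symbolic link to the second.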

REGISTRATION

You will be asked at the end of the installation process whether you want to register your server. The default is to do this.

We encourage you to register your server with us, so that we can get an idea of how widely the ROADS software is being used, and keep in touch with our users. If you answer yes to this question, the configuration information you supplied will be sent to us, along with information on the versions of the ROADS tools which you have, and your machine type and operating system.

If this worries you (remember The Microsoft Network!) say no, and you can use the 'server info' utility from the ROADS admin centre to see exactly what information would be transmitted. The admin centre also has a utility for registering your ROADS server, so you can register from there if you want to :-)

RESOURCE DESCRIPTION TEMPLATES

ROADS provides tools to help in the cataloguing of Internet resources, using a simple resource description format known as the IAFA template. This was initially devised as a way of cataloguing resources on Internet anonymous FTP archives, but then extended for use with the World-Wide Web in all its glory.

These templates look very much like the headers on an email message, but most of the attribute names are different - e.g.

Template-Type: SERVICE
Handle: SOSIG428
Title: UKBORDERS Information
Copyright-Owner: JISC, ESRC
Keywords: Geography, Digitised Bourndary Data (DBD), Census,
  Demography, Population, GIS, Computer Aided Mapping
Description: UKBORDERS provides the digitised boundary data
  associated with the  1991 Census of Population. The boundary
  data allows users to map 1991 Census data systematically at
  any scale from small area to the whole country and can be
  used to design new zones from the small area building blocks
  and to integrate census data fully in geographical information
  systems. Academic staff and students of UK Higher Education
  institutions may access the digitised boundary data from the
  Universities of Edinburgh only after completing the required
  registration process. The data is also available through MIDAS
  at Manchester.
Subject-Descriptor-Scheme-v1:   UDC
Subject-Descriptor-v1:  312 91
Admin-Email-v1: <ukborders@ed.ac.uk>
URI-v1: http://datalib.ed.ac.uk/UKBORDERS/start.html
Record-Last-Modified-Email:     ecdh@ssa.bris.ac.uk
Record-Last-Modified-Date:      Tue Jan 16 23:00:00 1996
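
Because the format is so close to email headers, a template can be read with very little code. The following is a minimal sketch in Python, not the parser used by the ROADS tools: it assumes that every attribute starts in the first column, that the attribute name ends at the first colon, and that indented lines continue the previous value. The function name parse_template and the shortened sample record are illustrative only.

```python
def parse_template(text):
    """Parse header-style text into a list of (attribute, value) pairs.

    Lines that start with whitespace are treated as continuations of the
    previous attribute's value (an assumption based on the sample record).
    """
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0] in " \t" and pairs:
            # Continuation line: join it onto the previous value.
            name, value = pairs[-1]
            pairs[-1] = (name, f"{value} {line.strip()}".strip())
        else:
            # Split at the first colon only, so URLs in values survive.
            name, _, value = line.partition(":")
            pairs.append((name.strip(), value.strip()))
    return pairs


# A shortened version of the sample record, for illustration.
record = parse_template(
    "Template-Type: SERVICE\n"
    "Handle: SOSIG428\n"
    "Keywords: Geography, Census,\n"
    "  Demography, Population\n"
    "URI-v1: http://datalib.ed.ac.uk/UKBORDERS/start.html\n"
)
print(dict(record)["Keywords"])
```

A list of pairs is returned rather than a dictionary so that the order of attributes in the record is preserved.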

These resource description templates may either be created by hand, using a text editor, or entered via a World-Wide Web form. Once a template is in the database it may be edited via a WWW form, or again by hand.

When creating or editing a template using the WWW forms interface, the view of the template may be constrained to a restricted set of the attributes. This means that fields which are not used on a regular basis can effectively be eliminated. These fields are, however, still available should they be needed - just not displayed by the WWW based editor.

Note that you can easily modify large numbers of templates at once using Perl's -p and -i switches (combined below as -spi), which let you edit files in place, e.g.

perl -spi -e 's/Bourndary/Boundary/g' source/*

would change all instances of the word Bourndary to Boundary, in all of the files in the source directory.
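
If you would rather script this kind of bulk change, the same idea can be sketched in Python. This is only an illustration under stated assumptions: the helper name replace_in_files is made up, templates are assumed to be ISO Latin 1 text files sitting directly in one directory, and the demonstration runs against a temporary directory with invented file names rather than a real ROADS source directory.

```python
import re
import tempfile
from pathlib import Path


def replace_in_files(directory, pattern, replacement):
    """Rewrite every regular file in `directory`; return the names changed."""
    changed = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        old = path.read_text(encoding="latin-1")
        new = re.sub(pattern, replacement, old)
        if new != old:
            # Only touch files whose contents actually differ.
            path.write_text(new, encoding="latin-1")
            changed.append(path.name)
    return changed


# Demonstration in a throwaway directory; the file names are made up.
with tempfile.TemporaryDirectory() as source:
    Path(source, "SOSIG428").write_text(
        "Keywords: Digitised Bourndary Data\n", encoding="latin-1")
    Path(source, "SOSIG429").write_text(
        "Keywords: Census\n", encoding="latin-1")
    changed = replace_in_files(source, r"Bourndary", "Boundary")
    fixed = Path(source, "SOSIG428").read_text(encoding="latin-1")

print(changed)
print(fixed, end="")
```

As with the Perl one-liner, it is wise to take a copy of the directory before running anything like this against live data.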

CATALOGUING DIFFERENT KINDS OF RESOURCE

A number of default resource description templates are provided with the software. Each of these is intended for describing a particular type of resource:

DOCUMENT

a book or technical report

FAQ

a Frequently Asked Questions document

IMAGE

a GIF or JPEG image

MAILARCHIVE

the location of a mailing list archive

ORGANIZATION

information about an organization

PROJECT

information about a project

SERVICE

information about an on-line service

SOFTWARE

information about a software package, e.g. ROADS

SOUND

an AIFF or WAV object

TRAINMAT

training materials

USENET

information about a Usenet News conference

USER

information about a person

VIDEO

an MPEG or QuickTime object

These default templates provide a framework for describing commonly occurring objects such as documents, images and sounds, and also more nebulous things such as network services. The subject service maintainer is free to change the templates distributed with the ROADS software, and new templates may be created either for describing new types of resource - or as alternatives to the distributed templates.

We suggest that the existing templates be used as much as possible. Hopefully we can avoid a situation in which separate groups independently arrive at different ways of writing the same information down in the template format. If it seems to be necessary to change a template, or create a new one, we recommend liaising with other ROADS users and the ROADS developers.

WHAT THE END USER SEES

The templates in the ROADS database are made visible to the end user in three ways:

A search capability, the results of which lead to these resource descriptions

CUSTOMIZATION

The subject service maintainer may dramatically alter the appearance of the Web pages generated by the ROADS software, so as to add local customizations. Consequently, two sites both using ROADS may have very little in common in terms of the appearance of their Web pages - even though the software used to generate them is the same.

The maintainer has a great deal of latitude in customizing the format of the resource description listings, should they wish to. The start and end of the HTML pages generated by the ROADS software can be modified using a local subject listing outline document. This also makes it possible to specify the format which should be used for each entry. This outline document is formatted just like regular HTML, but has special ROADS-specific tags in it which will be replaced with the subject listing information by the ROADS software.

Likewise, the search capability may be customized so as to restrict the attributes which may be included in a search. Version 2 of the ROADS software uses outline HTML documents to specify the format of the list of search results, and the format of an individual resource description template when it is rendered into HTML.

Finally, a list of resources which have recently been added to the subject service can be generated automatically by the ROADS software. The format of the resulting HTML document can be specified in a similar way to that of the subject listings, and it is possible to specify how long a resource remains on the list before it is automatically removed.

RUN-TIME SETTINGS

The tools we supply under the banner name ``ROADS'' all read their run-time configuration from a single file, ROADS.pm, as described in the installation guide. A separate document, ``Run-time configuration of the ROADS software'', describes the function of each of these settings.

By changing the values of the variables in ROADS.pm you will alter the behaviour of the ROADS tools in this installation the next time they're run. It's important to note that there are two cases where this does not apply:

Tools which are already running, e.g. the ROADS WHOIS++ server, which normally runs permanently.

Settings which have been used to generate static files, such as the HTML produced by the ``What's New'' and subject listing tools.

Tools which are already running will need to be re-run or re-started before changes to the run-time configuration will affect them, and static files such as the subject listing HTML tree will need to be regenerated by re-running the tools which produced them with the appropriate arguments. How best to do this for your installation is discussed in the section below on management tools.

CUSTOMISING THE USER INTERFACE

The primary user interface to the ROADS software is the World-Wide Web, and the HyperText Markup Language (HTML), though many ROADS features are also accessible from the Unix command line. Most of the ROADS tools draw in their HTML from external files, rather than having this hard coded into the tools themselves - making it trivial to modify the HTML returned to the user to include local customizations such as logos and help text which is specific to the subject area.

We also provide a number of HTML style tags which allow you to introduce run-time settings from your ROADS installation and dynamic information such as search results into the HTML generated by the ROADS tools.
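
To give a feel for how this kind of tag replacement works, here is a minimal sketch in Python. It is not the ROADS rendering code: the function expand_tags is hypothetical, and the HTML that each tag expands to is an assumption made purely for illustration. Only the tag names THISGETFORM and REALDATABASES are taken from the ROADS search form.

```python
import re


def expand_tags(html, values):
    """Replace each <NAME> tag that has an entry in `values`.

    Tags without an entry (ordinary HTML such as <HTML>) are left alone.
    """
    def substitute(match):
        return values.get(match.group(1), match.group(0))

    return re.sub(r"<([A-Z]+)>", substitute, html)


page = expand_tags(
    "<H1>Search</H1>\n<THISGETFORM>\nDatabase: <REALDATABASES>\n</FORM>",
    {
        # Both expansions below are assumed, not what ROADS really emits.
        "THISGETFORM": '<FORM METHOD="GET" ACTION="/cgi-bin/search.pl">',
        "REALDATABASES": '<SELECT NAME="database"></SELECT>',
    },
)
print(page)
```

Keeping the markup in external files and substituting values at run time is what allows the HTML to be changed without editing the tools themselves.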

Most of the HTML associated with the ROADS tools can be found in the appropriate subdirectory of the ROADS config/multilingual directory - we supply a default set of message files in config/multilingual/UK-English, and are planning to gather together message sets in other European languages as part of our work on the DESIRE project. Contributions would be very welcome! This directory space is subdivided by tool name, so, for example, the opening page of HTML returned by the ROADS search tool may be found in config/multilingual/*/search/search.html.

For example, in order to customise the initial search page returned to the end user to include a locally produced logo, make use of HTML 3.2 specific colour features, and hide some of the more advanced search options you might change search.html as follows:

<HTML>
<HEAD>
<TITLE>Supercool Web Search Thingie</TITLE>
</HEAD>
<BODY BGCOLOR="#000000" TEXT="#ffffff">
<H1>Supercool Web Search Thingie</H1>

<IMG SRC="/logo.gif" ALT="">

<THISGETFORM>
 <HR>
 Query:<INPUT TYPE="text" NAME="query" SIZE=50>
 <INPUT TYPE="submit" VALUE="Start search">
 <INPUT TYPE="reset" VALUE="Clear this form">
 <HR>
Type in your search term and press the <EM>start search</EM> button.
 <HR>
 <INPUT TYPE="checkbox" NAME="caseful" VALUE="on">Case sensitive search.
 <INPUT TYPE="checkbox" NAME="ranking" VALUE="on">Rank results in order.
Type of resource:
<ALLTEMPLATETYPES>
Database:
<REALDATABASES>
 <HR>
</FORM>
</BODY>
</HTML>

The choice of a particular set of message files is determined by the Accept-Language: and Accept-Charset: settings of the user's WWW browser, and defaults to English in the event that the user's preferred language and character set combination is unavailable. This mechanism may be controlled by editing the file config/languages, which indicates the directory which should be used for a given language and character set combination.
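
The selection logic can be pictured roughly as follows. This Python sketch makes several assumptions: it looks only at the language (not the character set), it does not read the real config/languages file, whose format is not described here, and the directory name Deutsch is invented. The function names are hypothetical.

```python
def preferred_languages(accept_language):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(accept_language.split(",")):
        tag, _, params = item.strip().partition(";")
        quality = 1.0  # the default when no q= value is given
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if tag and quality > 0:
            weighted.append((-quality, position, tag.strip().lower()))
    # Highest quality first; ties keep the order sent by the browser.
    return [tag for _, _, tag in sorted(weighted)]


def choose_directory(accept_language, directories, default="UK-English"):
    """Pick the message directory for the first language we can serve."""
    for tag in preferred_languages(accept_language):
        if tag in directories:
            return directories[tag]
    return default


# Hypothetical mapping from language tag to message directory.
directories = {"en": "UK-English", "de": "Deutsch"}
print(choose_directory("fr;q=0.8, de", directories))
print(choose_directory("fr", directories))
```

The second call falls back to the default directory because no French message set is available in the assumed mapping.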

Alternatively, you may prefer to take a static copy of the HTML dynamically generated by the ROADS software and modify this to incorporate your local customizations. Note that if you do this, it may be necessary to re-write your HTML from time to time if the ROADS installation details change, e.g. when moving your HTTP server to a different port number.

OTHER COOL STUFF

Whilst we have concentrated on a single metadata format, the IAFA template, and a single search and retrieval protocol, WHOIS++ - there are a variety of other formats and protocols which it may be useful to use to mount databases created with the ROADS tools. Similarly, there are a variety of other metadata formats which it may be useful to convert from, in order to create resource description templates for use with the ROADS tools.

If you wish to provide access to your database for Z39.50 clients, grab the Zplugin from:

<URL:http://www.ilrt.bris.ac.uk/roads/software/zplugin/>

This uses Index Data's Zebra Z39.50 server to provide access to ROADS records and includes a selection of Unix binaries to make it easy to set up without needing to compile the Index Data code.

As part of ROADS and other related projects, UKOLN have produced a number of software tools which may be useful to people running ROADS servers:

<URL:http://www.ukoln.ac.uk/metadata/software-tools/>

ADAM, one of the eLib subject gateway services using ROADS, have written a WWW indexing robot which can dump its results out as ROADS records. You can get it from:

<URL:http://www.adam.ac.uk/adam/tech/dc.bot/>

If you plan to modify the default database schema shipped with the ROADS software, see UKOLN's metadata pages:

<URL:http://www.ukoln.ac.uk/metadata/>

and in particular, the ROADS template registry and the ROADS cataloguing guidelines.

Our principal aim is a degree of interoperability with Harvest and Glimpse based systems using SOIF, LDAP servers using LDIF, and Z39.50 servers using GRS-1. We have also been investigating and participating in the development of the Dublin Core Element Set, which aims to provide a minimal set of metadata elements which may be used in information interchange between different systems, and in integrating the automated data gathering process used by WWW indexing systems such as the Harvest Gatherer with the cataloguing approach the ROADS software was originally designed around.

SUPPORT

We maintain a contact address for the ROADS team as a whole - roads-liaison@bristol.ac.uk, which we recommend that you use as your first point of contact with us in the event of any problems or questions.

We also run a number of mailing lists related to the ROADS software. In particular, you might want to subscribe to one or both of the following:

open-roads

for general discussions about the ROADS software.

roads-hackers

for explicitly technical discussions, e.g. changes to the code base.

You can join these lists by sending an email message to majordomo@net.lut.ac.uk consisting of the word subscribe followed by the mailing list's name, e.g.

subscribe roads-hackers

Our mailing lists are archived via the World-Wide Web at:

http://www.roads.lut.ac.uk/lists/

UPGRADING

People upgrading from version 1 of the ROADS software should be aware that:

More tools now use our generic HTML rendering code - see config/multilingual/* for tool names.

The format and location of some of the configuration files have changed, in particular those of the subject listing and What's New tools.

The various ROADS library routines have been repackaged as modules in the ROADS:: namespace. This process is only partially complete as of version 2 - in that there still exists a complex web of dependencies between modules, e.g. through the use of global variables.

There's no longer a separate ROADS manual - this is now generated from the embedded POD documentation within the ROADS tools and library modules.

CHANGES

Changes since version 1 of the ROADS software include the following :-

Sensible defaults.

In version 1 of the ROADS software we chose some deliberately obtuse default settings and values, in an attempt to force people to configure the software for their own services. This didn't work too well in practice, so now we try to provide sensible defaults in the first place!

Variants and clusters.

The WWW based template editor (admin-cgi/mktemp.pl) now allows you to dynamically alter the number of variants and clusters in a record. Additional buttons appear in the main template editor form for each cluster (e.g. user, organization, and so on) and variant (e.g. URI).

Trusted information providers.

admin-cgi/mktemp.pl now supports the concept of access controls on templates, so that trusted information providers may be restricted as to which templates they can edit and the options (e.g. check in to database, offline entry, email to administrator) they can use. A new WWW based admin tool admin-cgi/tempuserauth.pl has been provided to make it easier to administer these access control lists.

Identifying duplicate URLs.

A new tool bin/dup_urls.pl has been produced to do this. It also has a WWW based front end - admin-cgi/dup_urls.pl.
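
The underlying idea can be sketched in a few lines of Python. This is not how bin/dup_urls.pl is implemented: the function below is hypothetical, it compares URLs as exact strings with no normalization, and the handles and URLs in the example are invented.

```python
from collections import defaultdict


def duplicate_urls(records):
    """Map each URL used by more than one handle to the handles using it.

    `records` maps a template handle to the list of URLs in that template.
    """
    handles_by_url = defaultdict(set)
    for handle, urls in records.items():
        for url in urls:
            handles_by_url[url].add(handle)
    return {
        url: sorted(handles)
        for url, handles in handles_by_url.items()
        if len(handles) > 1
    }


# Invented handles and URLs, for illustration only.
duplicates = duplicate_urls({
    "EX001": ["http://www.example.org/a.html"],
    "EX002": ["http://www.example.org/b.html"],
    "EX003": ["http://www.example.org/a.html", "http://www.example.org/c.html"],
})
print(duplicates)
```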

More flexible search forms.

In ROADS version 1 it was only possible to have a single HTML form returned when someone called the search program cgi-bin/search.pl. With version 2 it's now possible to have multiple search forms, with the filename of the form to dump out specified by a CGI parameter form. See "search" in cgi-bin for more information on how we've implemented this.

On the search form, we now support up to three attribute/value pairs, as an alternative to the single text entry box which we supported in ROADS versions 0 and 1. The rationale is that this approach can be used to add additional search constraints via hidden HTML form fields, and also to provide end users with a simple user interface to potentially complex Boolean searching operations. The sample search configuration we ship with the ROADS software shows how this can be used to implement dynamic browsing by subject category as an example.

Multiple views of search results.

In ROADS version 1, you could configure the way search results were rendered, but they would always be rendered the same way. In version 2 we've added the ability to render the search results in multiple formats, with the format again selected by a CGI variable - view in this case.
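
As an illustration of how the form and view parameters might be combined in a link to the search program, here is a small Python sketch. The host name and the values advanced and brief are made up, and the query parameter name is taken from the sample search form; check the search.pl documentation for the values your installation actually accepts.

```python
from urllib.parse import urlencode


def search_url(base, query=None, form=None, view=None):
    """Build a search URL, including only the parameters that were given."""
    params = {}
    if query is not None:
        params["query"] = query
    if form is not None:
        params["form"] = form
    if view is not None:
        params["view"] = view
    return f"{base}?{urlencode(params)}" if params else base


# Hypothetical server; "advanced" and "brief" are invented names.
base = "http://www.example.org/cgi-bin/search.pl"
print(search_url(base, form="advanced"))
print(search_url(base, query="census data", view="brief"))
```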

Furthermore, we've made it possible to associate different rendering rules with a particular record depending on heuristics such as the type of the record (image, document, and so on) and the server it was returned by (when cross-searching multiple servers). For more information, see "ROADS/Render.pm" in lib.

Children in subject listings.

It's now possible to indicate parent/child relationships between resources when generating subject listings. See "addsl.pl" in bin and "cullsl.pl" in bin for more information.

Additional record types.

We've bundled the TRAINMAT schema definition being used by the eLib Netskills project and the PROJECT schema definition being used by UKOLN for their eLib projects database.

Improved documentation.

All programs and library files are now documented using Perl's POD format, and a Perl based Pod to SGML converter has been written. This makes it possible to provide versions of the ROADS documentation in a multiplicity of formats, including plain text, HTML, RTF, and PostScript.

The technical documentation tries to cover a number of the common scenarios encountered when installing/configuring the ROADS software and setting up a service based on it. See Configure for more information.

Inserting clusters.

It's now possible to insert other records into cluster sections of new/updated records being edited using the WWW based template editor. See "mktemp.pl" in admin-cgi for more information.

Creating new templates based on existing ones.

In the main admin-cgi/mktemp.pl editing screen, type in the name of the template handle you would like to be created whilst editing an existing template - the result will be saved as a new record with this handle.

Context sensitive help.

admin-cgi/mktemp.pl now supports context sensitive help, with the ability to display brief notes next to nominated fields of nominated template types. These are HTML fragments stored in the HTML messages directory. See "mktemp.pl" in admin-cgi for more information.

Commonly used attributes.

We now ship a Common Elements view with the ROADS template editor admin-cgi/mktemp.pl. This lets you restrict the attributes which the template editor displays to ones which a survey of ROADS users determined were the most popular.

Substring searching.

We now support the WHOIS++ "substring" search constraint, allowing for matches against substrings in indexed terms.

Configurable indexing.

It's now possible to control the way that the ROADS index builder creates its index, by setting the lib/ROADS.pm variable ROADS::IndexSplitPattern. This means that, for instance, you can arrange for whole URLs to be indexed. Just make sure you have lots of memory free :-)
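
The effect of the split pattern can be seen with a short Python sketch. The two patterns shown are examples chosen for illustration, not the default value of ROADS::IndexSplitPattern, and the function index_terms is hypothetical.

```python
import re


def index_terms(text, split_pattern):
    """Split text into index terms, dropping any empty strings."""
    return [term for term in re.split(split_pattern, text) if term]


text = "See http://omni.ac.uk/ today"

# Splitting on runs of non-word characters breaks the URL into pieces.
print(index_terms(text, r"\W+"))

# Splitting on whitespace only keeps the whole URL as a single term.
print(index_terms(text, r"\s+"))
```

Keeping whole URLs as terms means many more distinct entries in the index, which is why memory use goes up.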

Access control on admin programs.

All of the programs in the "admin" area (distributed as admin-cgi) now support fine grained access control based on the user name value supplied by the HTTP server after authentication. This appears in the process environment as the variable REMOTE_USER. For more information, see "ROADS/Auth.pm" in lib.
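
A CGI program can make this kind of check along the following lines. This Python sketch is an assumption-laden illustration, not the logic in ROADS/Auth.pm: the access list structure, the user names, and the function is_authorised are all invented. Only the REMOTE_USER environment variable comes from the description above.

```python
import os


def is_authorised(program, allowed_users, environ=os.environ):
    """Return True if the authenticated user may run `program`.

    `allowed_users` maps a program name to the set of permitted user names.
    """
    user = environ.get("REMOTE_USER")
    if user is None:
        return False  # the HTTP server did not authenticate anyone
    return user in allowed_users.get(program, set())


# Hypothetical access list and users.
acl = {"mktemp.pl": {"alice", "bob"}, "dup_urls.pl": {"alice"}}
print(is_authorised("mktemp.pl", acl, {"REMOTE_USER": "bob"}))
print(is_authorised("dup_urls.pl", acl, {"REMOTE_USER": "bob"}))
```

Passing the environment in as an argument makes the check easy to try out without a running HTTP server.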

What's new listings.

This is now more flexible, including the ability to restrict the list generated to the last N resources added, and resources added since a particular date. See "addwn.pl" in bin for more information.
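
The two restrictions can be sketched in Python as follows. This is only an illustration of the selection logic, not the code in addwn.pl: the function whats_new, its parameter names, and the handles and dates are all invented.

```python
from datetime import date


def whats_new(resources, last_n=None, since=None):
    """Select recently added resources, newest first.

    `resources` is a list of (handle, date_added) pairs. `since` keeps
    only resources added on or after that date; `last_n` keeps only the
    N most recent of those.
    """
    selected = [r for r in resources if since is None or r[1] >= since]
    selected.sort(key=lambda r: r[1], reverse=True)
    return selected if last_n is None else selected[:last_n]


# Invented handles and dates.
resources = [
    ("EX001", date(1998, 1, 5)),
    ("EX002", date(1998, 3, 20)),
    ("EX003", date(1998, 2, 11)),
]
print(whats_new(resources, last_n=2))
print(whats_new(resources, since=date(1998, 2, 1)))
```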

Subject and What's New views.

Both the subject listing and What's New tools now use the generic ROADS library code for rendering records into HTML. For more information, see "ROADS/HTMLOut.pm" in lib and "ROADS/Render.pm" in lib.

As an example of the flexibility of this approach, we include a BUBL style mailhost view with the What's New tool.

The ROADS link checker bin/lc.pl now detects when a resource has been updated by monitoring the HTTP Last-Modified and Content-Length headers. It also supports a new argument -w which may be used to flag resources which have been updated in a given number of days - i.e. as a measure of currency.
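
The two checks can be pictured with the following Python sketch. It is not the code in bin/lc.pl: the function names are hypothetical, the header values are supplied as plain dictionaries rather than fetched over the network, and the treatment of the -w threshold is our reading of the description above.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

WATCHED_HEADERS = ("Last-Modified", "Content-Length")


def has_changed(previous, current):
    """True if either watched header differs between two visits."""
    return any(previous.get(name) != current.get(name)
               for name in WATCHED_HEADERS)


def updated_within(last_modified, days, now=None):
    """True if the Last-Modified date falls within the last `days` days."""
    now = now or datetime.now(timezone.utc)
    modified = parsedate_to_datetime(last_modified)
    if modified.tzinfo is None:
        # Treat a date without a usable time zone as UTC (an assumption).
        modified = modified.replace(tzinfo=timezone.utc)
    return now - modified <= timedelta(days=days)


# Invented header values from two visits to the same resource.
before = {"Last-Modified": "Tue, 16 Jan 1996 23:00:00 GMT", "Content-Length": "2048"}
after = {"Last-Modified": "Tue, 16 Jan 1996 23:00:00 GMT", "Content-Length": "4096"}
check_time = datetime(1996, 1, 20, tzinfo=timezone.utc)

print(has_changed(before, after))
print(updated_within(before["Last-Modified"], 7, now=check_time))
```

Comparing both headers catches servers that change a document without updating its modification date, at the cost of an occasional false alarm.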

Cross-searching.

We now support the indexing and searching of other WHOIS++ servers using the WHOIS++ indexing protocol defined in RFC 1913. We also include preliminary support for the Common Indexing Protocol being promulgated by the Internet Engineering Task Force. For more information, see "wig.pl" in bin.

In addition to the native WHOIS++ indexing and searching support, we also ship code to index (by generating RFC 1913 centroids) both Harvest and Z39.50 servers, and to gateway WHOIS++ queries to both types of server. The Harvest support (see "harvest_shim.pl" in bin and "harvest_centroid.pl" in bin) is pretty much complete, and shouldn't need major changes to get working. By contrast, the Z39.50 support (see "z3950_shim.pl" in bin and "z3950_centroid.pl" in bin) is likely to require significant changes for use outside its intended environment - because of the complexity of the Z39.50 protocol and its attendant interoperability problems.
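
For readers unfamiliar with centroids, the following Python sketch shows the general idea in a much simplified form: for each template type and attribute, collect the set of distinct words that occur in any record. It does not produce the RFC 1913 wire format and is not the code in the centroid tools. Splitting on whitespace and lower-casing are assumptions, as are the function name and the second sample record.

```python
def build_centroid(records):
    """Summarize records as {template type: {attribute: sorted unique words}}.

    Each record is a dictionary with a "Template-Type" entry plus
    attribute/value pairs.
    """
    words = {}
    for record in records:
        by_attribute = words.setdefault(record["Template-Type"], {})
        for attribute, value in record.items():
            if attribute == "Template-Type":
                continue
            # Each word is recorded once, however often it occurs.
            by_attribute.setdefault(attribute, set()).update(value.lower().split())
    return {
        template_type: {attr: sorted(seen) for attr, seen in attrs.items()}
        for template_type, attrs in words.items()
    }


centroid = build_centroid([
    {"Template-Type": "SERVICE", "Title": "UKBORDERS Information"},
    {"Template-Type": "SERVICE", "Title": "Census Information"},
])
print(centroid)
```

An indexing server holding such a summary can tell which servers might answer a query for a given word without holding the full records itself.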

I18N.

It's now possible, at least in a limited sense, to use the ROADS software to enter, search for, and retrieve records which contain localized characters such as those in the ISO Latin 1 character set (ISO 8859-1). Since most WWW browsers do not currently support HTML 4's tagging of form input fields with localization information, or strip input fields to US ASCII, this feature is likely to only be of limited use for the moment.

COPYRIGHT

Copyright (c) 1988, Martin Hamilton <martinh@gnu.org> and Jon Knight <jon@net.lut.ac.uk>. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

It was developed by the Department of Computer Studies at Loughborough University of Technology, as part of the ROADS project. ROADS is funded under the UK Electronic Libraries Programme (eLib), the European Commission Telematics for Research Programme, and the TERENA development programme.

AUTHORS

Martin Hamilton <martinh@gnu.org>, Jon Knight <jon@net.lut.ac.uk>, with apologies to Tom Christiansen, and Larry Wall :-)