NAME
Net::Z3950::AsyncZ - Perl extension for the Z3950 module
SYNOPSIS
- Overview
-
use Net::Z3950::AsyncZ;
use Net::Z3950::AsyncZ qw(:record :headers :errors);
use Net::Z3950::AsyncZ qw(asyncZOptions isZ_MARC isZ_GRS isZ_RAW isZ_DEFAULT
                          noZ_Response isZ_Header isZ_ServerName Z_serverName);

my $asyncZ = Net::Z3950::AsyncZ->new(servers=>\@servers, query=>$query, cb=>\&output);

my $asyncZ = Net::Z3950::AsyncZ->new(
    servers=>\@servers, query=>$query, timeout=>$tm, num_to_fetch=>$num,
    cb=>\&output, options=>\@options, log=>$log, format=>\&format,
    timeout_min=>$min, interval=>$interval, maxpipes=>$max,
);
- Example 1
-
my @servers = (
    [ 'amicus.nlc-bnc.ca', 210, 'NL' ],
    [ 'bison.umanitoba.ca', 210, 'MARION' ],
    [ 'library.anu.edu.au', 210, 'INNOPAC' ]
);
my $query = ' @attr 1=1003 "Henry James" ';
my $asyncZ = Net::Z3950::AsyncZ->new(servers=>\@servers, query=>$query, cb=>\&output);
\&output is a reference to a callback function which outputs the records returned by the servers. The callback receives the records as an array in which each element is one line of a record. At the simplest level, you just loop through the array, printing each line followed by a newline.
- Example 2
-
my $asyncZ = Net::Z3950::AsyncZ->new(
    servers=>\@servers, query=>$query, cb=>\&output,
    log=>"errors.log", num_to_fetch=>10
);
Same as Example 1, but requesting 10 records from each server instead of the default 5, and setting a log file for debugging error output.
- Example 3
-
my @servers = (
    [ 'amicus.nlc-bnc.ca', 210, 'NL' ],
    [ 'bison.umanitoba.ca', 210, 'MARION' ],
    [ 'library.anu.edu.au', 210, 'INNOPAC' ]
);
my $query = ' @attr 1=1003 "Henry James" ';
my @options = (
    asyncZOptions(num_to_fetch=>5, log=>"bison_errors.log"),                 # amicus
    asyncZOptions(num_to_fetch=>10, query=>' @attr 1=1003 "James Joyce" '),  # bison
    undef                                                                    # library.anu.edu.au
);
$options[0]->set_GRS1();
my $asyncZ = Net::Z3950::AsyncZ->new(
    servers=>\@servers, query=>$query, cb=>\&output,
    options=>\@options, log=>"errors_main.log"
);
Here we set options which apply to individual servers in the @options array. asyncZOptions returns a reference to a Net::Z3950::AsyncZ::Options::_params object; we can pass into it the options we want to set for individual servers. We have not defined a _params object for library.anu.edu.au, so a default _params will be created for it.
As you can see, we can set different queries for different servers; we can set separate logs, assuming we want to track errors separately; we can even suppress error reporting on an individual basis. In the case of 'amicus', we have asked that the preferredRecordSyntax be set to Net::Z3950::RecordSyntax::GRS1, since the National Library of Canada uses GRS-1 as its default output; we could also have done that in the call to asyncZOptions:
asyncZOptions(preferredRecordSyntax=>Net::Z3950::RecordSyntax::GRS1);
In addition to detailed logging of error messages, there's also error reporting aimed at the user, to inform users when records haven't been returned. See "Errors" below.
ABSTRACT
Net::Z3950::AsyncZ adds additional asynchronous support for the Z3950 module through the use of multiple forked processes.
DESCRIPTION
Net::Z3950::AsyncZ adds an additional layer of asynchronous support for the Z3950 module through the use of multiple forked processes. Users may also find that it provides a convenient front end to Z3950.
Apologia
My own experience with Z3950 async mode was that I could connect to servers and get back the number of records waiting to be fetched, but I was unable to retrieve the records themselves.
The Z3950 documentation talks about this situation:
when the connection is asynchronous, the errcode() may
be zero, indicating simply that the record has not yet been fetched from
the server. In this case, the calling code should try again later. (How
much later? As a rule of thumb, after it's done ``something else'', such
as request another record or issue another search.)
The documentation promises to provide user code for asynchronous access at a later date, and since synchronous access is apparently written on top of asynchronous code, the techniques for the async mode no doubt exist. But I searched the mailing list archive and couldn't find anything relevant. So, at the risk of carrying coals to Newcastle, I wrote AsyncZ.
The Basic Mechanisms of Net::Z3950::AsyncZ
AsyncZ forks off maxpipes processes at a time. After these processes have returned and reported their results, or after a timeout period, the next set of maxpipes processes is forked off, and so forth. An Event loop is set in motion that enables AsyncZ to wait for results--either records or error messages--to return from the Z39.50 servers. Records are passed through, in the order in which they arrive, to a callback function (cb), which you supply and which outputs the records.
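The grouping into sets of maxpipes can be pictured with a short sketch. It shows only the batching step, under the assumption that servers are taken in array order; the forking, the Event loop and the timeout are left out, and batch_servers is a hypothetical name, not part of AsyncZ.

```perl
use strict;
use warnings;

# Hypothetical helper: split a list of servers into groups of at most
# $maxpipes, the way successive sets of child processes would be formed.
sub batch_servers {
    my ($maxpipes, @servers) = @_;
    my @batches;
    # splice removes up to $maxpipes items from the front on each pass;
    # the loop ends when nothing is left to remove.
    while (my @batch = splice(@servers, 0, $maxpipes)) {
        push @batches, [@batch];
    }
    return @batches;
}

my @hosts   = map { "server$_.example.org" } 1 .. 9;   # placeholder host names
my @batches = batch_servers(4, @hosts);
printf "batch %d: %s\n", $_ + 1, join(', ', @{ $batches[$_] }) for 0 .. $#batches;
```

With nine servers and a maxpipes of 4, this yields three batches of four, four and one server.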
Each of the forked processes, in turn, runs in its own Event loop while waiting for results to return from the server. The two-fold purpose of these loops, local to each forked process, is:
[1] to help ensure that a request to a server doesn't get swallowed up on the network and never return, causing a script or program to hang;
[2] to set a timeout on how long you are prepared to wait for a response.
The loop in the child process is not always enough in itself to prevent a script from hanging; for such cases you can set a monitor which will kill the main process after a timeout period. See the discussion of monitor in Options.pod (Options.html in the HTML documentation).
Various conditions may be responsible for the failure to receive records from a server. In some circumstances, such as timing out, it may be worth a second try. In such cases AsyncZ will try the server a second time. (I refer to these two tries as two cycles.)
The constructor does not return a reference to Net::Z3950::AsyncZ until this two cycle process is completed. This reference gives you access to any errors which may have been reported, i.e. you can check to see why a server has not returned any records and provide error messages to the user as you see fit. In addition, you can keep an Error log with considerably more detailed error reporting; you can in fact keep a separate log for any one or combination of the servers you contact.
Everything essentially proceeds from the constructor. Once you provide the constructor with a list of servers and a query (or queries), and a callback function to output your records, you have nothing to do except wait for the reference which gives you access to the error messages. You can exercise a great deal of control by setting options for both the parent process and any or all of its children.
The Basic Script
- # basic.pl
-
use Net::Z3950::AsyncZ qw(isZ_Error);

my @servers = (
    [ 'amicus.nlc-bnc.ca', 210, 'NL' ],
    [ 'bison.umanitoba.ca', 210, 'MARION' ],
    [ 'library.anu.edu.au', 210, 'INNOPAC' ],
    [ '130.17.3.75', 210, 'MAIN*BIBMAST' ],
    [ 'library.usc.edu', 2200, 'unicorn' ],
    [ 'z3950.loc.gov', 7090, 'Voyager' ],
    [ 'fc1n01e.fcla.edu', 210, 'FI' ],
    [ 'axp.aacpl.lib.md.us', 210, 'MARION' ],
    [ 'jasper.acadiau.ca', 2200, 'UNICORN' ]
);
my $query = ' @attr 1=1003 "Henry James" ';
my $asyncZ = Net::Z3950::AsyncZ->new(servers=>\@servers, query=>$query, cb=>\&output);
showErrors($asyncZ);
exit;

#------END MAIN------#

sub output {
    my($index, $array) = @_;
    foreach my $line (@$array) {
        print "$line\n" if $line;
    }
    print "\n--------\n\n";
}

sub showErrors {
    my $asyncZ = shift;
    print "The following servers have not responded to your query: \n";
    for (my $i = 0; $i < $asyncZ->getMaxErrors(); $i++) {
        my $err = $asyncZ->getErrors($i);
        next if !isZ_Error($err);
        print "$servers[$i]->[0]\n";
        print " $err->[0]->{msg}\n" if $err->[0]->{msg};
        print " $err->[1]->{msg}\n" if $err->[1]->{msg};
    }
}
You will notice that I have retained the @servers array used in Mike Taylor's sample scripts for the Net::Z3950 module, i.e. an array of references to 3-element arrays of servers, ports, and databases.
When you run this script at the terminal, you will find several types of headers and detailed error messages interspersed with the query results. For a "clean" output see basic_pretty.pl, which is included in the distribution.
Also, see "Errors" and "Headers".
Constructor, Methods, and Exports
Constructor
- Net::Z3950::AsyncZ::new
-
my $asyncZ = Net::Z3950::AsyncZ->new(
    servers=>\@servers,   # array of references to servers in form: [ $host, $port, $database ]
    query=>$query,        # format depends on Z3950 querytype: defaults to 'prefix'
    timeout=>25,          # total timeout in seconds for all processes
    timeout_min=>5,       # minimum timeout in secs to exit event loop if all processes are finished
    interval=>1,          # Event loop timer interval
    maxpipes=>4,          # maximum number of forks to be executed at one time
    log=>undef,           # undef, name of log file to which extended error messages are written,
                          # or Net::Z3950::AsyncZ::Errors::suppressErrors()
    cb=>\&cb,             # callback function to which records will be sent as available
    format=>\&format,     # callback function to format individual lines of records
    num_to_fetch=>$num,   # number of records to fetch from each server
    options=>\@options,   # array of references to Net::Z3950::AsyncZ::Options::_params objects
    monitor=>0            # timeout in seconds for a monitoring child process:
                          # if 0, no monitor is created
);
A Word about Parameters and Options
AsyncZ::new() takes a set of named parameters. Some of them, like maxpipes and timeout, apply to the overall functioning of Net::Z3950::AsyncZ, i.e. to the parent process. Others, like num_to_fetch and format, can be set individually for each server in the servers array, i.e. for each child process. Settings for the child processes are made using the options parameter and the Net::Z3950::AsyncZ::Options::_params array. If a _params object does not exist for a child process, one is automatically created using default values. The indices of the _params array must be synchronized with the indices of the servers array.
Options are treated fully in the separate Options documentation.
For the HTML documentation see: Options.html
Required Parameters for Constructor
For every query sent to a server you must supply three required parameters: servers, query, and cb. That is, you must supply an array reference holding the server's $host, $port, and $database; the query itself; and a callback function, which is responsible for outputting the data returned from the Z39.50 server. This is the minimal configuration, the one shown above in "The Basic Script".
Optional Parameters for Constructor
The optional parameters have either default values or default behaviors. Some of the optional parameters are exclusive to the functioning of the parent process, for instance timeout and interval. Others are for use only in the child processes, for instance format and num_to_fetch, while log is used in both the parent and its children.
Methods
There are three kinds of methods in AsyncZ:
- [1] Methods to set options for Net::Z3950::AsyncZ::Options::_params objects
- [2] Methods to deal with errors and error messages
- [3] Methods to handle several types of headers which AsyncZ attaches to records
Object Methods
- Net::Z3950::AsyncZ::getErrors
-
$err_array_ref = $asyncZ->getErrors($index);
- params:
-
$index: index of the server for which error inquiry is being made. (See the servers=>\@servers parameter of "Constructor".)
- return value:
-
$err_array_ref: a reference to an array of two Net::Z3950::AsyncZ::ErrMsg objects, or undef if the server pointed to by this $index had no errors.
This array reference must be tested using isZ_Error() to determine whether it represents a valid error. The two ErrMsg objects are referred to as $err_array_ref->[0] and $err_array_ref->[1].
$err_array_ref->[0] references a cycle 1 error if it exists
$err_array_ref->[1] references a cycle 2 error if it exists
- Net::Z3950::AsyncZ::getMaxErrors
-
$error_number = $asyncZ->getMaxErrors();
- return value:
-
$error_number: the maximum number of possible errors which have occurred for all servers during the current session. Because of the two-cycle process, some errors reported in the first cycle are nullified by successful outcomes during the second cycle; the class method isZ_Error() tests whether a cycle 1 error has been nullified by a successful second attempt. See Net::Z3950::AsyncZ::isZ_Error.
- Net::Z3950::AsyncZ::_printError
-
$asyncZ->_printError($err)
- outputs an error string of the following format:
-
[error_number] error_message Type_of_Error is_Retry_able
- For example:
-
[111] Connection refused NET
- or:
-
[225] An error occurred when accessing the library database. --Z3950 ERROR --RETRY
(This is an internal method I used for debugging but leave here for its possible utility.)
See "Net::Z3950::AsyncZ::Errors" for explanations of error types, etc.
Class Methods
- Net::Z3950::AsyncZ::asyncZOptions
-
$params_ref = asyncZOptions([option_1=>opt_1, option_2=>opt_2, . . .option_n=>opt_n]);
- params:
-
an optional list of named parameters which set the options for a child process. When called without parameters, the _params object is created with a set of default values. Unless you plan to override the default values, it's not necessary to call asyncZOptions: AsyncZ.pm will create a default _params object for you.
There is a full range of accessor methods by which each option can be set and queried, in the form of $params_ref->set_option_1(value) and $value = $params_ref->get_option_1(). This makes it possible to set options dynamically.
Options are treated fully in the separate Options documentation.
For HTML documentation see: Options.html
- return:
-
$params_ref: reference to a Net::Z3950::AsyncZ::Options::_params object.
Net::Z3950::AsyncZ::Options::_params objects are used internally by AsyncZ and hence treated as private. Creating a _params object directly by calling its new method is not recommended. See Net::Z3950::AsyncZ::Options::_params.
- Net::Z3950::AsyncZ::isZ_MARC
- Net::Z3950::AsyncZ::isZ_GRS
- Net::Z3950::AsyncZ::isZ_RAW
- Net::Z3950::AsyncZ::isZ_DEFAULT
-
$bool = isZ_<TYPE>($line)
- params:
-
$line: current $line of record array
- returns:
-
$bool: true if header $line designates that current record is of <TYPE>, otherwise false
These utilities test for the type of record which is currently being presented to the callback function. Each record is sent to the callback prefaced with headers that provide information about the record, including its type. If you are querying a variety of servers, some might send back MARC records, others GRS-1.
foreach my $line (@$array) {
    isZ_MARC($line) and do_something();
    isZ_GRS($line) and do_something_else();
    . . . . . .
}
See also Net::Z3950::AsyncZ::isZ_Header, which tests whether a $line is a type-header at all, as opposed to whether it designates a particular type of record.
Records are sent to the callback function as an array of lines in which records are separated from one another by a set of headers; you can determine the number of the current record by extracting the record number from its type-header using getZ_RecNum. See "Headers" and "getZ_RecNum".
- Net::Z3950::AsyncZ::isZ_Header
-
$bool = isZ_Header($line);
This function tests whether $line is a type-header (i.e. whether this is a USMARC record, GRS-1, etc).
- Net::Z3950::AsyncZ::getZ_RecNum
-
$recnum = getZ_RecNum($line)
- params:
-
$line: The current $line of the records array.
- returns:
-
$recnum: The number of the current record in the Record Set, i.e. if there are 20 records matching the query, and you have asked for 5 at a time, the record number is not one of five, but one of 20. You must first test the line to make sure it is a header:
if (isZ_Header($line)) {
    print "Recnum = ", getZ_RecNum($line), "\n";
}
- getZ_RecSize
-
$recsize = getZ_RecSize($index);
- Net::Z3950::AsyncZ::isZ_Error
-
$retv = isZ_Error($err_array_ref)
- params:
-
$err_array_ref: an array reference returned by Net::Z3950::AsyncZ::getErrors (the array holds two Net::Z3950::AsyncZ::ErrMsg objects).
Because of the two-cycle process, some errors reported in the first cycle are nullified by successful outcomes during the second cycle; this method tests whether a cycle 1 error has been nullified by a successful second attempt.
- return:
-
$retv: 0 if not an error; 1 if non-recoverable cycle 1 error; 2 if cycle 2 error.
In other words, it returns false if there has been no error and true if there has been. The particular true value it returns is used by Net::Z3950::AsyncZ::isZ_nonRetryable to determine whether this error was non-recoverable.
- Net::Z3950::AsyncZ::isZ_nonRetryable
-
$retv = isZ_Error($err);
$bool = isZ_nonRetryable($retv);

$bool = isZ_nonRetryable(isZ_Error($err))
- params:
-
$retv: the return value from isZ_Error.
- return:
-
$bool: true if $err is non-recoverable, otherwise false
This is a convenience method in which the idiom isZ_nonRetryable(isZ_Error($err)) tests whether $err is a non-recoverable cycle 1 error. Since such errors often occur at the system level, this enables you to side-step outputting what might be gobbledygook (e.g. "illegal seek") to the user:
print "There has been an error in contacting this server\n"
    if isZ_nonRetryable(isZ_Error($err));
Since there are some non-recoverable cycle 1 errors which might be of interest to the user (e.g. "connection refused", which is identified as a network error), you might test whether it is also a system error:
print "There has been an error in contacting this server\n"
    if isZ_nonRetryable(isZ_Error($err)) && $err->isSystem();
- Net::Z3950::AsyncZ::isZ_Info
-
$bool = isZ_Info($line);
- params:
-
$line: current $line of record array
- returns:
-
$bool: true if header $line contains internal data, otherwise false
See "Headers", Net::Z3950::AsyncZ::isZ_PID, and Net::Z3950::AsyncZ::noZ_Response.
- Net::Z3950::AsyncZ::isZ_PID
-
$bool = isZ_PID($line);
- params:
-
$line: current $line of record array
- returns:
-
$bool: true if header $line contains pid of child process, otherwise false
The preferred method for testing for the PID header is isZ_Info. Therefore, isZ_PID is not explicitly exported and requires the full package name: Net::Z3950::AsyncZ::isZ_PID.
- Net::Z3950::AsyncZ::delZ_header
- Net::Z3950::AsyncZ::delZ_pid
- Net::Z3950::AsyncZ::delZ_serverName
-
These functions are used as follows:
$line = delZ_header($line, $gmodifier, $subst);
- params:
-
$line: string or reference to a string: current $line of record data
$gmodifier: boolean; if true, the g modifier is applied to substitutions: s///g
$subst: the value to be substituted for the item being deleted
- returns:
-
$line: either string or reference to string, depending on whether a reference or a string was initially passed in parameter $_[0].
-
These functions are used internally by AsyncZ but they can be a useful supplement to isZ_Header, isZ_ServerName, and isZ_PID; instead of testing for these headers, they enable you to either delete them or substitute another string for them.
You might, for instance, find it useful to substitute the name of an institution for the name of a server:
$line = delZ_serverName($line, 0, "University of Manitoba Libraries");
- Net::Z3950::AsyncZ::prep_Raw
-
This function and get_ZRawRec are used to retrieve raw record data, which is returned when raw is set to true and render is set to false in the _params array.
$recs = prep_Raw($array);
- param:
-
$array: reference to array of raw records passed into the callback function when render=>0
- returns:
-
$recs: reference to string representing all records in records array when raw is true and render is false.
This function "preps" an array of raw records for use with get_ZRawRec. To use this function and get_ZRawRec you must set render=>0 in the options array.
- Net::Z3950::AsyncZ::get_ZRawRec
-
$rec = get_ZRawRec($recs)
- params:
-
$recs: reference to a string representing array of record data
- returns:
-
$rec: string representing the next record in array, or undef if no record is available.
get_ZRawRec behaves as a "get-next" function: with each call to get_ZRawRec, the next record is returned and deleted from the string of records created in prep_Raw.
Exported Names
Exports from Net::Z3950::AsyncZ
- @EXPORT_OK
-
asyncZOptions isZ_MARC isZ_GRS isZ_RAW isZ_Error isZ_nonRetryable isZ_Info isZ_DEFAULT noZ_Response isZ_Header isZ_ServerName Z_serverName getZ_RecNum getZ_RecSize delZ_header delZ_pid delZ_serverName prep_Raw get_ZRawRec
- :record
-
isZ_MARC isZ_GRS isZ_RAW isZ_DEFAULT getZ_RecNum
- :errors
-
isZ_Error isZ_nonRetryable
- :header
-
isZ_ServerName Z_serverName noZ_Response isZ_Header isZ_Info delZ_header delZ_pid delZ_serverName
Exports from Net::Z3950::AsyncZ::Errors
Exports from Net::Z3950::AsyncZ::ErrMsg
Callback Functions
For the record: A callback is a function which you supply and which AsyncZ calls upon as required.
AsyncZ uses two callback functions. One handles the general output of records fetched from the servers queried. The second formats individual lines of the record to your specifications. The format callback is not required.
Output Callback (required)
- parameters:
-
$index: index of the server to which the current records belong, i.e. the index of the server in the @servers array which you pass into the constructor: servers=>\@servers.
$array_ref: array of records which have been returned from the server
The output callback is called whenever records become available from one of the child processes. The most basic callback would be something like this:
sub output {
    my($index, $array_ref) = @_;
    foreach my $line (@$array_ref) {
        print "$line\n" if $line;
    }
    print "\n--------\n\n";
}
Note: Pay attention to the sequence in which the parameters are passed to the callback:
-
my($index, $array_ref) = @_;
The array which is referenced by $array_ref contains all of the records fetched from the current server. Each element of the array holds either one line of the record or one of the AsyncZ headers. The headers separate the records, while the format of the record and its lines depends upon two factors:
- the type of record:
-
MARC, RAW, GRS, etc.
- the format function:
-
either the format callback, or the default HTML or Plain Text method (if no format callback is specified)
Here is typical output from the default Plain Text method:
<!--jasper.acadiau.ca-->
<#--4498-->
[MARC 4]
020 ISBN: 0472110101 (cloth : alk. paper)
050 LC call number: PS2123.A4 1999
100 author: James, Henry,1843-1916.Correspondence.Selections.
245 title: Dear munificent friends :Henry James's letters to four women /edited by Susan E. Gunter.
260 publication: Ann Arbor :University of Michigan Press,c1999.
300 description: xxiv, 288 p. ;24 cm.
650 subject: Authors, American19th centuryCorrespondence.
650 subject: Authors, American20th centuryCorrespondence.
700 auth, illus, ed: Gunter, Susan E.,1947-
<!--130.17.3.75-->
<#--4518-->
[MARC 5]
020 ISBN: 080066755
050 LC call number: G62.T7 1968
245 title: Trends in geography;an introductory survey.Edited by Ronald U. Cooke and James H. Johnson.
250 edition: [1st ed.]
260 publication: Oxford,New York,Pergamon Press[1969]
300 description: x, 287 p.illus.23 cm.
500 note: Collection of essays originally presented at a conference organized by the University of London Institute of Education and held at University College London in 1968.
500 note: Pergamon Oxford geographies.
650 subject: Geography
700 auth, illus, ed: Johnson, James Henry,1930-
700 auth, illus, ed: Cooke, Ronald U.
The first three lines of each record are headers, indicating that you have encountered a new record. The headers hold the following information:
Server name
pid of child process
type of record and record number.
At the very least you would probably want to ignore the headers and add a newline to separate one record from another. The set of class methods provided by Net::Z3950::AsyncZ allows you to deal with the headers as you see fit: you can ignore them, you can identify the record type and extract the record number, and you can extract the server name.
If a server fails to return any records, the array will consist of one line of the following form:
{!-- library.anu.edu.au --}
This line does not tell us which server has failed, only that one of the child processes has not returned any records.
- Using the $index
-
While the server's name is given in the headers to each record, knowing the $index will enable you to track the servers you've queried. For instance, you might want to create an array with the names of the institutions at which servers are located, so that you can tell your users that the current record is a response from Acadia University in Wolfville, N.S., rather than from jasper.acadiau.ca. Knowing the index in the callback enables you to do this.
See "Headers" and basic_pretty.pl, included with the distribution, for some ways of testing for and handling headers.
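One way to use the $index is sketched below. It assumes an @institutions array kept in the same order as the @servers array passed to the constructor; the array, its labels and the institution_for helper are illustrative names, not part of AsyncZ.

```perl
use strict;
use warnings;

# Hypothetical display names, in the same order as the @servers array.
my @institutions = (
    'National Library of Canada',
    'University of Manitoba Libraries',
    'Australian National University',
);

# Return the label for a server index, falling back to a generic one
# when no name has been recorded for that index.
sub institution_for {
    my ($index) = @_;
    return $institutions[$index] // "server #$index";
}

# Output callback: same shape as the basic callback, plus a heading
# derived from $index.
sub output {
    my ($index, $array_ref) = @_;
    print "Response from ", institution_for($index), ":\n";
    foreach my $line (@$array_ref) {
        print "$line\n" if $line;
    }
    print "\n--------\n\n";
}
```

The callback would be passed to the constructor as cb=>\&output, exactly as in the basic script.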
Format Callback (not required)
- parameters:
-
$row: a reference to a 2 element array:
$row->[0]: a MARC tag or the null string if there is no tag
$row->[1]: the field's data string
Records are formatted one row at a time. There are two default behaviors-- plain text and HTML. The plain text is as illustrated in Output Callback:
050 LC call number: PS2123.A4 1999
100 author: James, Henry,1843-1916.Correspondence.Selections.
245 title: Dear munificent friends
The first column is a MARC tag, the second a string name for that tag, and the third is the field data. The HTML default would output the following:
<tr><td>ISBN<td>0472110101 (cloth : alk. paper)
<tr><td>LC call number<td>PS2123.A4 1999
<tr><td>author<td>James, Henry,1843-1916.Correspondence.Selections.
<tr><td>title<td>Dear munificent friends
In the HTML each field is placed within a <td>. It would then be up to you, in your output callback, to complete the HTML by adding the <TABLE>. . .</TABLE> tags and any attributes to those tags. You could also, for instance, format the table using CSS.
The functions which create this output are in Net::Z3950::AsyncZ::Report:
sub _defaultRecordRowHTML {
    my ($row) = @_;
    return "<tr><td>" . $MARC_FIELDS{$row->[0]} . "<td>" . $row->[1] . "\n";
}
sub _defaultRecordRow {
    my ($row) = @_;
    return $row->[0] . "\t" . $MARC_FIELDS{$row->[0]} . ":\t" . $row->[1] . "\n";
}
You can specify your own row formatter using the format parameter of AsyncZ's constructor. It will always be passed a reference to a two-element array, but if there is no MARC tag, then $row->[0] will be set to the null string and $row->[1] will hold whatever data is available.
Tip: The default row formatter is _defaultRecordRow. To make _defaultRecordRowHTML your default, set the constructor's format parameter to Net::Z3950::AsyncZ::Report::_defaultRecordRowHTML:
format=>\&Net::Z3950::AsyncZ::Report::_defaultRecordRowHTML
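As an illustration of a custom row formatter, here is a minimal sketch. The name bracket_row and the bracketed layout are made up for this example; the trailing newline simply mirrors the default formatters shown above and is an assumption about what a formatter should return.

```perl
use strict;
use warnings;

# Hypothetical row formatter: receives a reference to [ $tag, $data ]
# and returns one formatted line.
sub bracket_row {
    my ($row) = @_;
    my ($tag, $data) = @$row;
    $data = '' unless defined $data;

    # No MARC tag: pass the data through on its own line.
    return "$data\n" unless defined $tag && length $tag;

    # Otherwise show the tag in brackets ahead of the field data.
    return "[$tag] $data\n";
}

print bracket_row([ '245', 'Dear munificent friends' ]);
print bracket_row([ '',    'untagged data' ]);
```

It would be handed to the constructor as format=>\&bracket_row, in the same way as the Tip above.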
Headers
Types of Headers
As noted under Output Callback there are four types of headers:
[1] server name:
<!--library.anu.edu.au-->
[2] pid of the child process which accessed the server:
<#--13076-->
[3] type of record and its record number:
[MARC 2]
[4] failure of the child process to return any records:
{!-- library.anu.edu.au --}
The first three headers occur at the start of each new record:
<!--library.anu.edu.au-->
<#--13076-->
[MARC 2]
020 ISBN: 0060154497
100 author: Henry, James F.,1930-
245 title: The manager's guide to resolving legal disputes
250 edition: 1st ed.
260 publication: New York :Harper & Row,c1985.
300 description: v, 162 p. ;22 cm.
But the fourth header occurs as a single line by itself:
{!-- library.anu.edu.au --}
This fourth header tells us that one of the servers failed to return records--but not which one failed. library.anu.edu.au is not the server which failed to respond but the last server which did respond. (The reasons for this have to do with asynchronicity and shared memory.)
Dealing with Headers in the Callback Function
The header-handling methods detailed in Class Methods are used in the callback function. Their use is demonstrated in the callback function from basic_pretty.pl:
sub output {
    my($index, $array) = @_;
    foreach my $line (@$array) {
        return if noZ_Response($line);
        next if isZ_Info($line);     # remove internal data
        next if isZ_Header($line);   # again remove internal data
        # you could first test for type of output:
        # isZ_MARC, etc. or extract the record number
        # extract server name from header
        (print "\nServer: ", Z_serverName($line), "\n"), next
            if isZ_ServerName($line);
        print "$line\n" if $line;
    }
    print "\n--------\n\n";
}
This produces the following result:
Server: bison.umanitoba.ca
050 LC call number: PS2124.H46
245 title: Henry James review. --
260 publication: [Louisville, KY :Dept. of English, University of Louisville/,1979-
300 description: v. ;25-28 cm.
650 subject: Ejournals -- UML
700 auth, illus, ed: Fogel, Daniel Mark,1948-
If you wanted to get the Record Number, you could replace
next if isZ_Header($line);
with
$recnum = getZ_RecNum($line) if isZ_Header($line);
This may be useful when you are requesting additional records for the same query. If you are getting 5 records at a time, in your second request to the server, the first of the records returned would be number 6.
If you wanted to get rid of the MARC tags and the following white space you could put each line through this filter:
$line =~ s/\d+\s+//;
Incorporating both these modifications would give us the following:
sub output {
    my($index, $array) = @_;
    my $recnum = 1;
    foreach my $line (@$array) {
        return if noZ_Response($line);
        next if isZ_Info($line);   # remove internal data
        if (isZ_Header($line)) {
            print "Record: ", getZ_RecNum($line), "\n";
            next;
        }
        # extract server name from header
        (print "\nServer: ", Z_serverName($line), "\n"), next
            if isZ_ServerName($line);
        $line =~ s/\d+\s+//;
        print "$line\n" if $line;
    }
    print "\n--------\n\n";
}
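If you would rather work with whole records than with single lines, the headers can be used to split the array. The sketch below is one way to do that in plain Perl. Its three patterns are inferred from the header examples in this section and are stand-ins, not the tests that isZ_ServerName, isZ_Info or isZ_Header actually perform; split_records is a hypothetical name.

```perl
use strict;
use warnings;

# Stand-in patterns inferred from the header examples in this section;
# they are assumptions, not the module's own tests.
my $SERVER_RE = qr/^<!--\s*(.+?)\s*-->$/;    # <!--library.anu.edu.au-->
my $PID_RE    = qr/^<#--\s*(\d+)\s*-->$/;    # <#--13076-->
my $TYPE_RE   = qr/^\[(\S+)\s+(\d+)\]$/;     # [MARC 2]

# Split the callback's array of lines into one hash per record.
sub split_records {
    my ($array_ref) = @_;
    my @records;
    foreach my $line (@$array_ref) {
        next unless defined $line && length $line;
        if ($line =~ $SERVER_RE) {
            # A server-name header opens a new record.
            push @records, { server => $1, fields => [] };
        }
        elsif ($line =~ $PID_RE) {
            next;    # internal data, not needed here
        }
        elsif (@records && $line =~ $TYPE_RE) {
            # Record type and record number from the type-header.
            @{ $records[-1] }{qw(type recnum)} = ($1, $2);
        }
        elsif (@records) {
            # Anything else is a line of the current record.
            push @{ $records[-1]{fields} }, $line;
        }
    }
    return @records;
}
```

Each element of the returned list holds the server name, the record type, the record number and the record's lines, which makes it straightforward to sort or filter records before printing them.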
Errors
There are two sets of error messages in AsyncZ:
[1] detailed messages for debugging and tracking: these are handled by the Net::Z3950::AsyncZ::Errors module
[2] informational messages for the user: these are handled by Net::Z3950::AsyncZ::ErrMsg
Net::Z3950::AsyncZ::Errors
The detailed messages contain a number of different kinds of information:
1. a trace back 3 levels
2. server name and query string
3. Z3950 error messages where available
4. system error messages
Detailed errors are either sent to a file or to the terminal or are suppressed. How they are dealt with depends on the log options of Net::Z3950::AsyncZ::new and Net::Z3950::AsyncZ::Options::_params. This means that you can have different error reporting mechanisms for each of your servers as well as for the parent process.
The default behavior is to write all error messages to the terminal. To write them to a log file you set log to a filename:
log=>$filespec
NOTE: Do not open the file yourself. All files are automatically opened and closed by AsyncZ.
To suppress all errors you do the following:
log=>Net::Z3950::AsyncZ::Errors::suppressErrors()
Since suppressErrors() is exported, you can do this:
use Net::Z3950::AsyncZ::Errors qw(suppressErrors);
log=>suppressErrors()
System error messages and Perl library messages are routinely sent to STDERR; AsyncZ sends its error messages to STDOUT. This means that if you don't do something to redirect the AsyncZ messages and you are operating in a web browser, the AsyncZ messages will go to the browser.
Net::Z3950::AsyncZ::ErrMsg
AsyncZ keeps a record of which processes have returned records and which have not. It also keeps track of the exit codes of each process. For each process which has not returned records, it creates a Net::Z3950::AsyncZ::ErrMsg object, based on its exit code. There is a separate set of Net::Z3950::AsyncZ::ErrMsg objects for each of the two AsyncZ cycles (see "The Basic Mechanisms of Net::Z3950::AsyncZ"). A query which reported failure in the first cycle may have been successful in its second attempt. Net::Z3950::AsyncZ::isZ_Error returns true if a server has not returned any records, false if it has.
Net::Z3950::AsyncZ::ErrMsg Object
- errno
-
the error number
- msg
-
the error string
- type
-
System, Network, Z3950, Success
See "Net::Z3950::AsyncZ::ErrMsg methods for ErrMsg Handling"
- retry
-
returns true from doRetry
- abort
-
returns true from doAbort
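As a rough illustration of how these fields might be read, the sketch below formats one ErrMsg object as a single line. It touches only the errno, msg and type fields listed above. The helper name describe_errmsg and the sample values are made up for the example; real objects come from getErrors.

```perl
use strict;
use warnings;

# Format one ErrMsg-like object as a single display line.
# Returns an empty string when there is no object or no message.
sub describe_errmsg {
    my ($e) = @_;
    return '' unless $e && defined $e->{msg} && length $e->{msg};
    my $type  = defined $e->{type}  ? $e->{type}  : 'Unknown';
    my $errno = defined $e->{errno} ? $e->{errno} : '?';
    return "[$type $errno] $e->{msg}";
}

# Stand-in data for illustration only (plain hash, invented values).
my $sample = { errno => 111, msg => 'Connection refused', type => 'Network' };
print describe_errmsg($sample), "\n";   # prints "[Network 111] Connection refused"
```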
Net::Z3950::AsyncZ methods for ErrMsg handling
Net::Z3950::AsyncZ supplies four methods, two "Object Methods" and two "Class Methods".
- getErrors
-
$err = $asyncZ->getErrors($index);
This method returns a reference to an array of two ErrMsg objects, one for each cycle:
[$errors[$index]->[0], $errors[$index]->[1]]
$index is the index of the server in the servers=>\@servers array. See Net::Z3950::AsyncZ::getErrors.
- getMaxErrors
-
$error_number = $asyncZ->getMaxErrors();
The maximum possible number of errors encountered. Some of these may not in fact be errors and therefore will not test true in isZ_Error($err). See Net::Z3950::AsyncZ::getMaxErrors.
- isZ_Error
-
$retv = isZ_Error($err)
See Net::Z3950::AsyncZ::isZ_Error.
- isZ_nonRetryable
-
$bool = isZ_nonRetryable(isZ_Error($err))
See Net::Z3950::AsyncZ::isZ_nonRetryable.
Net::Z3950::AsyncZ::ErrMsg methods for ErrMsg Handling
Net::Z3950::AsyncZ::ErrMsg supplies eight object methods, which enable you to determine the general category under which an error falls and how serious it is. They all return true or false.
The basic syntax for all of these methods is:
$err->method();
- isSystem
-
These are usually errors reported back from Perl or C library routines. For instance:
Device or resource busy
Too many users
Permission denied
Software caused connection abort
Invalid argument
An "Invalid argument" error will often come back when a query fails and a library routine attempts to do something which can't be done without the return value.
- isNetwork
-
These can be various problems, for instance:
Connection timed out
Network is down
Network is unreachable
Connection refused
- isTryAgain
-
This applies to two cases:
[1] EAGAIN: the system error which returns a "try again" message
[2] a process which has been created but never gets far enough to return an exit code, presumably because it has timed out
- isSuccess
-
An error which answers true to isSuccess is one for which the exit code is 0, i.e. one in which the process ended without an error but did not return any records.
- isUnspecified
-
An Unspecified error is generally one which has been reported by the system but which I have not included among the errors worth reporting back to ordinary users. (You will, however, find them reported in the log file.) Even some of the errors which I do list might not be worth reporting back to the user (usually those which answer true to isZ_nonRetryable).
- isZ3950
-
These are error messages returned from the Z3950 module.
- doRetry
-
Temporary errors, for which retrying is a worthwhile prospect.
- doAbort
-
Fatal errors
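The category methods above can be combined to turn an ErrMsg object into a message suitable for ordinary users. The sketch below is one possible arrangement, not the module's own behavior: the function name user_message and the message wording are illustrative, and My::FakeErrMsg is a hypothetical stand-in class that only mimics the documented method names so that the example runs without contacting a server.

```perl
use strict;
use warnings;

# Translate an ErrMsg object into a short message for end users.
# isTryAgain is tested before isSystem because EAGAIN is itself a
# system error.
sub user_message {
    my ($e) = @_;
    return 'The server returned no records.'         if $e->isSuccess();
    return 'Temporary problem; please try again.'    if $e->isTryAgain();
    return 'The server could not be reached.'        if $e->isNetwork();
    return "The server reported: $e->{msg}"          if $e->isZ3950();
    return 'A system error occurred on the server.'  if $e->isSystem();
    return 'An unspecified error occurred.';
}

# Hypothetical stand-in class, NOT part of AsyncZ.
package My::FakeErrMsg;
sub new        { my ($class, %f) = @_; return bless {%f}, $class }
sub isSuccess  { return $_[0]{kind} eq 'success' }
sub isTryAgain { return $_[0]{kind} eq 'tryagain' }
sub isNetwork  { return $_[0]{kind} eq 'network' }
sub isZ3950    { return $_[0]{kind} eq 'z3950' }
sub isSystem   { return $_[0]{kind} eq 'system' }

package main;

my $e = My::FakeErrMsg->new(kind => 'network', msg => 'Connection refused');
print user_message($e), "\n";   # prints "The server could not be reached."
```

With real objects you would pass $err->[0] or $err->[1] from getErrors instead of the stand-in.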
Examples of Net::Z3950::AsyncZ::ErrMsg Error Handling
A very basic routine for handling errors is demonstrated in basic.pl:
sub showErrors {
    my $asyncZ = shift;                                        # [1]
    print "The following servers have not responded to your query: \n";
    for (my $i = 0; $i < $asyncZ->getMaxErrors(); $i++) {
        my $err = $asyncZ->getErrors($i);                      # [2]
        next if !isZ_Error($err);                              # [3]
        print "$servers[$i]->[0]\n";
        print " $err->[0]->{msg}\n" if $err->[0]->{msg};       # [4]
        print " $err->[1]->{msg}\n" if $err->[1]->{msg};       # [5]
    }
}
[1] Get reference to the Net::Z3950::AsyncZ object
[2] Get reference to array of ErrMsg Objects for index $i
[3] Check to see whether this array holds a valid error
[4] print the cycle 1 error if it exists (it should if you've gotten this far)
[5] print the cycle 2 error if it exists (it will not, if cycle 1 was non-retryable)
A more useful error routine is demonstrated in basic_pretty.pl:
sub showErrors {
    my $asyncZ = shift;                                        # [1]
    # substitute a general statement for a system-level error instead
    # of something puzzling to the user like: 'illegal seek'
    my $systemerr = "A system error occurred on the server\n";
    print "The following servers have not responded to your query: \n";
    for (my $i = 0; $i < $asyncZ->getMaxErrors(); $i++) {
        my $err = $asyncZ->getErrors($i);                      # [2]
        next if !isZ_Error($err);                              # [3]
        print "$servers[$i]->[0]\n";
        if ($err->[0]->isSystem()) {
            print $systemerr;                                  # [4]
        }
        else {
            print " $err->[0]->{msg}\n" if $err->[0]->{msg};   # [5]
        }
        if ($err->[1] && $err->[1]->isSystem()) {
            print $systemerr;                                  # [6]
        }
        else {
            # compare the messages as strings (ne), not as numbers (!=)
            print " $err->[1]->{msg}\n"                        # [7]
                if $err->[1] && $err->[1]->{msg}
                && $err->[1]->{msg} ne $err->[0]->{msg};
        }
    }
}
The first three steps are a repeat of basic.pl:
[1] Get reference to the Net::Z3950::AsyncZ object
[2] Get reference to array of ErrMsg Objects for index $i
[3] Check to see whether this array holds a valid error
Cycle 1 Error:
[4] If this is a system-type error, print a non-specialist message
[5] Otherwise, print the error message for this error
Cycle 2 Error:
[6] If this is a system-type error, print a non-specialist message
[7] Otherwise, print the error message for this error but only if
the cycle 2 error message is not the same as the cycle one message
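Beyond printing messages, the doRetry and doAbort methods can be used to decide what to do about each server. The sketch below looks at the ErrMsg from the latest cycle that produced one and returns a suggested action. The function names last_errmsg and next_action, the action strings, and the class My::FakeRetryErr are all hypothetical; the stand-in class exists only so that the example runs on its own.

```perl
use strict;
use warnings;

# Given the [cycle 1, cycle 2] pair from getErrors(), return the ErrMsg
# from the latest cycle that produced one.
sub last_errmsg {
    my ($pair) = @_;
    return $pair->[1] ? $pair->[1] : $pair->[0];
}

# Suggest an action for one server: 'abort', 'retry' or 'report'.
sub next_action {
    my ($pair) = @_;
    my $e = last_errmsg($pair);
    return 'report' unless $e;
    return 'abort' if $e->doAbort();
    return 'retry' if $e->doRetry();
    return 'report';
}

# Hypothetical stand-in class, NOT part of AsyncZ.
package My::FakeRetryErr;
sub new     { my ($class, %f) = @_; return bless {%f}, $class }
sub doRetry { return $_[0]{retry} }
sub doAbort { return $_[0]{abort} }

package main;

# Invented sample: a cycle 1 error flagged as retryable, no cycle 2 error.
my $timeout = My::FakeRetryErr->new(msg => 'Connection timed out', retry => 1, abort => 0);
print next_action([ $timeout, undef ]), "\n";   # prints "retry"
```

Testing doAbort first is a design choice: a fatal error in the latest cycle takes precedence over any retry flag.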
AUTHOR
Myron Turner <turnermm@shaw.ca> or <mturner@ms.umanitoba.ca>
COPYRIGHT AND LICENSE
Copyright 2003 by Myron Turner
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.