Setting Options
Net::Z3950::AsyncZ sets options by means of named parameters for both the parent process and each of its child processes. Options for the parent are set in Net::Z3950::AsyncZ->new. Options for the child processes are set via the options parameter of Net::Z3950::AsyncZ->new; the value of this parameter must be a reference to an array of Net::Z3950::AsyncZ::Options::_params objects.
If a _params object doesn't exist for a child process, Net::Z3950::AsyncZ->new will create one with a set of default options. There will always be a _params object for every server in the servers array, and the two arrays are cross-indexed: $_params_object[0] is used for $servers[0], and so on. So, if you are creating your own array of _params objects, you must keep this parallelism in mind.
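The parallelism between the two arrays can be pictured with a small stand-alone sketch. It uses plain hashes in place of real _params objects, and the %defaults hash, the host names and the fill_defaults helper are hypothetical names introduced only for illustration; it does not show how the module itself is implemented.

```perl
use strict;
use warnings;

# Hypothetical stand-ins: plain hashes model _params objects.
my %defaults = (num_to_fetch => 5, HTML => 0);

my @servers = (
    ['host1.example.org', 210, 'db1'],
    ['host2.example.org', 210, 'db2'],
    ['host3.example.org', 210, 'db3'],
);

# One slot per server; undef means "use the defaults for this server".
my @options = (
    { num_to_fetch => 10 },
    undef,
    { HTML => 1 },
);

# Fill every slot so that an options entry exists for each server index.
sub fill_defaults {
    my ($servers, $options, $defaults) = @_;
    return map { +{ %$defaults, %{ $options->[$_] || {} } } } 0 .. $#$servers;
}

my @filled = fill_defaults(\@servers, \@options, \%defaults);
for my $i (0 .. $#servers) {
    printf "%s: num_to_fetch=%d HTML=%d\n",
        $servers[$i][0], $filled[$i]{num_to_fetch}, $filled[$i]{HTML};
}
```

The undef in the second slot keeps the third entry aligned with the third server, which is the point of the cross-indexing.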
Types of Options
[1] Options set in Net::Z3950::AsyncZ::new, which control the parent process and selected features of the child processes for which no alternatives are present; the alternatives are set as indicated in [2] and [3].
[2] Options set in a Net::Z3950::AsyncZ::Options::_params object, which is returned by Net::Z3950::AsyncZ::asyncZOptions(). There is one _params object for each server; if you don't create one, it is created for you with the default values. If you don't create a _params object for a server, then the log and query options set in the AsyncZ constructor will be used. The rationale is that you will usually be asking one question across all servers and using only one log file for debugging.
But in all other cases where an option for the child can be set in both the AsyncZ constructor and _params, the _params setting will be used. At the moment this affects the format and num_to_fetch options.
[3] Options set in the Net::Z3950::Manager by using the Z3950_options option of the _params object. These take precedence over any others and must be passed in with the first _params object, that is, $_params_object[0], because AsyncZ uses only one Net::Z3950::Manager. The Manager is created when setting up the first server passed into the constructor.
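The rules in [2] for query and num_to_fetch can be summarized in a stand-alone sketch. It is only a model of the behaviour described in this document, using plain hashes; effective_option, %module_default and %falls_back_to_ctor are hypothetical names, and the module's real code may be organized quite differently.

```perl
use strict;
use warnings;

# Default taken from the option listings in this document.
my %module_default = (num_to_fetch => 5);

# Options whose constructor value is still honoured when a user-made
# _params object leaves them unset (documented here for query).
my %falls_back_to_ctor = (query => 1);

# $ctor:   hash of constructor options.
# $params: hash modelling a user-created _params object, or undef
#          if the user supplied none for this server.
sub effective_option {
    my ($name, $ctor, $params) = @_;
    if (!defined $params) {
        return defined $ctor->{$name} ? $ctor->{$name} : $module_default{$name};
    }
    return $params->{$name} if defined $params->{$name};
    return $ctor->{$name}   if $falls_back_to_ctor{$name};
    return $module_default{$name};
}

my %ctor = (query => '@attr 1=4 perl', num_to_fetch => 20);

my $n_without = effective_option('num_to_fetch', \%ctor, undef);
my $n_with    = effective_option('num_to_fetch', \%ctor, { HTML => 1 });
my $q_with    = effective_option('query',        \%ctor, { HTML => 1 });

print "no _params:   num_to_fetch=$n_without\n";
print "with _params: num_to_fetch=$n_with, query=$q_with\n";
```

Once a _params object exists, the constructor's num_to_fetch is ignored and the default of 5 applies, while the constructor's query still carries over.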
- Note:
-
Default values for options are shown to the right of the => operator:
HTML=>0
In some instances, the type of variable is shown and defaults are detailed in commentary:
format=>\&format
Net::Z3950::AsyncZ::new
- cb
-
cb=>\&cb
callback function to which records will be sent as available. See Output Callback.
- format
-
format=>\&format
callback function to format individual lines of records. See Format Callback. If you create a _params object for a server and do not set its format option, then the default format will be used, even if you set the format option of the AsyncZ constructor to another value.
- interval
-
interval=>1
Event loop timer interval in seconds. This controls how frequently AsyncZ checks whether servers have responded and whether the timeout period is up.
- log
-
log=>undef
controls how extended error messages are handled. There are two sets of error messages: those handled through Net::Z3950::AsyncZ::ErrMsg, which are meant for the user, and those meant for debugging. The latter are generated by both AsyncZ and the Perl library and can accumulate at a rapid clip. AsyncZ writes its debugging messages to STDOUT, while those coming from library routines almost always go to STDERR. There are 3 options for log:
[1] undef, the default, in which case all debugging messages go to the terminal, and those written to STDOUT will end up in a browser if you are on the web.
[2] log=>Net::Z3950::AsyncZ::Errors::suppressErrors() (or log=>suppressErrors() if you import the function), in which case these messages will be suppressed.
[3] log=>$filespec, in which case all of these messages will go to the file specified in $filespec.
The Net::Z3950::AsyncZ::Options::_params object also has a log option, which means that you can specify a log file for each child process (i.e. for each server queried) while keeping a separate one for the parent. Or you can set up a system where parent and child_1 write to log.1, while child_2 and child_3 write to log.2, etc.
Note: All error logs are automatically opened and closed. Do NOT open or close them yourself!
- maxpipes
-
maxpipes=>4
maximum number of forks to be executed at one time. The greater the number, the more resources are used, both memory and CPU.
- monitor
-
monitor=>0
timeout in seconds for a monitoring child process, or 0, in which case a monitor is not set.
The monitor is a child process which runs a timer and kills the parent process if it exceeds the timeout period. You run the monitor only if your software hangs. An orderly shutdown of all running processes is put into effect, the purpose of which is to prevent the development of zombie processes and to release all shared memory.
- num_to_fetch
-
num_to_fetch=>5
number of records to fetch; this setting will be used only if you have not created a _params object. This means that if you create a _params object for the server and do not set its num_to_fetch option, then num_to_fetch will default to 5 even if you have set another value for num_to_fetch in the AsyncZ constructor.
- options
-
options=>\@options
reference to an array of references to "Net::Z3950::AsyncZ::Options::_params" objects. Each reference is obtained from a call to "Net::Z3950::AsyncZ::asyncZOptions". For instance:
    @options = (
        asyncZOptions(option_1=>opt_1, option_2=>opt_2, . . .),
        undef,
        asyncZOptions(option_1=>opt_1, option_2=>opt_2, . . .)
    );
This array parallels the servers array:
    @servers = (
        [$host_1, $port_1, $database_1],
        [$host_2, $port_2, $database_2],
        [$host_3, $port_3, $database_3]
    );
$options[0] is used for $servers[0] and $options[2] for $servers[2]. If a _params object is not found or if it is not defined, as for $servers[1], then a default _params object is created for the server.
- query
-
query=>undef
the query string: its format depends on the Z3950 querytype, which defaults to 'prefix' (as in Net::Z3950). You can set a separate Z3950 querytype for each query, or you can change the querytype for all servers by using Z3950_options.
If you create a _params object for a server but do not set the query option in _params, then this query will be used. This means that you can set one query for all of your servers without having to re-set it for each of the _params objects you create. But if you create a _params object with a different query, then the query set in _params will be used.
- servers
-
servers=>\@servers
array of references to servers in the form: [$host, $port, $database]
See also basic.pl, options above, and AsyncZ.pod: "The Basic Script".
- swap_attempts
-
swap_attempts=>5
the number of times that a swap check will be done before exiting; see swap_check for details.
- swap_check
-
swap_check=>0
the number of seconds between checks for swapping activity, used when querying a great number of servers and requesting large amounts of data. It instructs AsyncZ to sleep for swap_check number of seconds before processing any further connections. If you are attempting to process too much data for the size of your RAM, the system will have to swap out of memory into the swap space on your disk; too much swapping causes loss of data and disk "thrashing" (i.e. repeated disk access) and will overburden the system. When swap_check is set, AsyncZ will check for signs of swap activity; if it finds swap activity it will go to sleep for the number of seconds set in swap_check and then re-check, up to swap_attempts number of times. If the swap activity continues beyond this number of checks, AsyncZ dies. For large throughput, you will probably want to set the monitor, and to set it for a long period of time, for instance, 3000 seconds. This means that you can set swap_check to a period of 10, 20, or 30 seconds. The values you set on these variables will depend on your own system memory resources and the amount of data you are processing.
Note: This has been tested only on Linux but should also work on Unix, at least on Solaris.
- timeout
-
timeout=>25
total timeout in seconds for all processes to complete their work.
- timeout_min
-
timeout_min=>5
minimum timeout in seconds to exit the Event loop if all processes are finished; a security blanket to make sure all processes get a chance to report their results to the parent process before exiting the loop.
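The interplay of swap_check and swap_attempts described above can be read as a check, sleep, re-check cycle. The following is a minimal sketch of that reading, not the module's code: wait_out_swapping, the $is_swapping detector and the $nap sleeper are hypothetical, and how AsyncZ actually detects swap activity is not shown in this document.

```perl
use strict;
use warnings;

# $is_swapping: code ref that returns true while swap activity is seen.
# $nap:         code ref that sleeps; defaults to Perl's built-in sleep.
# Returns 1 to carry on, 0 when swapping persisted through every check.
sub wait_out_swapping {
    my ($swap_check, $swap_attempts, $is_swapping, $nap) = @_;
    $nap ||= sub { sleep $_[0] };
    return 1 unless $swap_check;            # swap_check=>0: no checking
    for my $attempt (1 .. $swap_attempts) {
        return 1 unless $is_swapping->();   # system has settled
        $nap->($swap_check);                # wait, then check again
    }
    return 0;                               # still swapping: caller gives up
}

# Demonstration with a fake detector that reports swapping twice and a
# no-op sleeper that only counts seconds, so the example ends at once.
my @readings = (1, 1, 0);
my $slept    = 0;
my $result   = wait_out_swapping(
    10, 5,
    sub { shift @readings },
    sub { $slept += $_[0] },
);
print "result=$result, seconds slept=$slept\n";
```

With swap_check=>10 and swap_attempts=>5 the worst case is about 50 seconds of waiting, which is why a much longer monitor timeout is suggested for large jobs.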
Net::Z3950::AsyncZ::Options::_params
Where a _params option duplicates an AsyncZ::new option, consult the AsyncZ::new description for more details.
- HTML
-
HTML=>0
if true, use default HTML formatting for records; if false, format as plain text. See "Row Formatting Priorities".
- Z3950_options
-
Z3950_options=>undef
reference to a hash of additional Z3950 options.
These options are passed to the Z3950 Manager and take precedence over _params options and options set in Net::Z3950::AsyncZ->new. Z3950_options makes it possible to implement Z3950 options which may not be specifically accounted for in any of the options to the AsyncZ module. For instance, to ask for "full" as opposed to "brief" records (which is the Z3950 default):
    @options = (asyncZOptions(Z3950_options=>{elementSetName=>'f'}) <, asyncZOptions(. . .), . . .>);
Note: To use this option, it must appear in the first _params object of the _params array, $options[0], as in the above example. It is ignored in any subsequent uses. This means that you cannot set these options on a per-server basis; they apply across the board to all the servers you are querying. In the above example, for instance, you could not ask for brief records from some servers and full from others.
- cb
-
cb=>\&cb
reference to callback function to which records will be sent as available
- format
-
format=>\&format
reference to a callback function that formats each row of a record
- interval
-
interval=>5
timer interval for this forked process. See interval above under Net::Z3950::AsyncZ::new.
- log
-
log=>undef
controls how extended error messages are handled for this child process. A separate log file can be opened for each process.
Note: All error logs are automatically opened and closed.
See log above under Net::Z3950::AsyncZ::new.
- marc_fields
- marc_subst
- marc_userdef
- marc_xcl
-
These options are fully described and illustrated in Report.pod under the heading "MARC Bibliographic Format".
- num_to_fetch
-
num_to_fetch=>5
number of records to fetch from this server.
- pipetimeout
-
pipetimeout=>20
timeout in seconds for this child process
- preferredRecordSyntax
-
preferredRecordSyntax=>Net::Z3950::RecordSyntax::USMARC
the Z3950 preferredRecordSyntax for this child process
- query
-
query=>undef
the query for this process
- querytype
-
querytype=>'prefix'
Z3950 querytype for this child process; it can also be set to 'ccl' or 'ccl2rpn'.
- raw
-
raw=>0
(boolean) if true, the raw record data for this process is returned; its format is dependent on the render option.
- render
-
render=>1
(boolean) if true, the raw record data for this process is returned filtered through the Z3950 Record::render function; this is the default. If false, the raw data is returned unfiltered in its original state. The unfiltered raw data can be read using Net::Z3950::AsyncZ::prep_Raw and Net::Z3950::AsyncZ::get_ZRawRec.
- startrec
-
startrec=>1
number of the record in the Record Set with which to start the results.
- utf8
-
utf8=>0
when set to true, conversions will be made to utf8/unicode characters from the character codes used in MARC records to represent non-latin1 and accented latin1 characters. When outputting utf8, you must call binmode on the output stream, for example:
    binmode(STDOUT, ":utf8");
When outputting to a browser, you should also notify the browser:
    print "Content-type: text/html;charset=utf-8\n\n";
    print '<head><META http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body>';
See the sample script: MARC_HTML.pl.
Note: To use utf8 you must have the MARC::Charset module installed. Otherwise, the utf8 option will be ignored.
Row Formatting Priorities
If more than one option is set that affects the formatting of a record's rows, the following priority sequence is in effect:
raw, format, HTML, plaintext (default)
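As an illustration only, the priority sequence can be expressed as a chain of early returns. The row_format_mode function is hypothetical and treats format as a user-supplied code reference; it is a sketch of the ordering, not the module's implementation.

```perl
use strict;
use warnings;

# Return the name of the formatting mode that wins for a set of options.
sub row_format_mode {
    my (%opt) = @_;
    return 'raw'    if $opt{raw};                       # highest priority
    return 'format' if ref($opt{format}) eq 'CODE';     # user callback
    return 'HTML'   if $opt{HTML};
    return 'plaintext';                                 # default
}

my $mode_a = row_format_mode(HTML => 1, format => sub { $_[0] });
my $mode_b = row_format_mode(HTML => 1, raw => 1);
my $mode_c = row_format_mode();

print "$mode_a $mode_b $mode_c\n";
```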
Methods for Setting _params Options
get/set methods
Net::Z3950::AsyncZ::Options::_params provides a full range of get_option / set_option methods, enabling the dynamic setting of option values.
$_params_object->set_HTML(0);
$num_to_fetch = $_params_object->get_num_to_fetch();
In addition there are functions for setting options with fixed values:
Function           Equivalent
set_marc_xtra()    set_marc_fields($Net::Z3950::AsyncZ::Report::xtra)
set_marc_all()     set_marc_fields($Net::Z3950::AsyncZ::Report::all)
set_marc_std()     set_marc_fields($Net::Z3950::AsyncZ::Report::std)
set_raw_on()       set_raw(1)
set_raw_off()      set_raw(0)
set_plaintext()    set_HTML(0)
set_HTML()         set_HTML(1)
set_prefix()       set_querytype('prefix')
set_ccl()          set_querytype('ccl')
set_GRS1()         set_preferredRecordSyntax(Net::Z3950::RecordSyntax::GRS1)
set_USMARC()       set_preferredRecordSyntax(Net::Z3950::RecordSyntax::USMARC)
The get/set methods guarantee that you have in fact set or queried the option you are interested in and, in the case of the fixed value options, that you have set it to the value required. You don't have to be concerned that a meaningless hash key will spring into existence through misspelling:
$_params_object = asyncZOptions(leg=>'Error.LOG', num_to_fish=>3);
In the case of some of the fixed value methods, one advantage is the obvious simplicity of calling set_GRS1() instead of set_preferredRecordSyntax(Net::Z3950::RecordSyntax::GRS1).
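The difference between a misspelled hash key and a misspelled method name can be seen in a stand-alone model. My::ParamsModel below is a hypothetical class written for this illustration, with accessors generated for three option names only; it is not the _params class itself.

```perl
use strict;
use warnings;

package My::ParamsModel;   # hypothetical stand-in for a _params class

my @valid = qw(log num_to_fetch HTML);

sub new { my ($class, %opt) = @_; return bless { %opt }, $class }

# Generate get_/set_ accessors for the known option names only.
for my $name (@valid) {
    no strict 'refs';
    *{"My::ParamsModel::get_$name"} = sub { $_[0]{$name} };
    *{"My::ParamsModel::set_$name"} = sub { $_[0]{$name} = $_[1] };
}

package main;

# A misspelled named parameter is silently stored as a stray key ...
my $p = My::ParamsModel->new(num_to_fish => 3);
print "stray key stored\n" if exists $p->{num_to_fish};

# ... whereas a misspelled method name fails as soon as it is called.
my $ok = eval { $p->set_num_to_fish(3); 1 };
print "misspelled method rejected\n" unless $ok;

$p->set_num_to_fetch(3);
print "num_to_fetch=", $p->get_num_to_fetch, "\n";
```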
Net::Z3950::AsyncZ::Options::_params::option
This method works to both get and set values.
$value = $_params_obj->option('option');
$old_options_ref = $_params_obj->option(option=>value,option=>value,option=>value. . . );
params
in get mode: 'option' to be queried
in set mode: list of option=>value pairs to be set (or %hash)
returns
in get mode: $value of option being queried
in set mode: $old_options_ref -- reference to a hash of option=>value pairs
which have been replaced by list or %hash
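Because set mode returns the replaced values, a temporary change can be undone by passing that hash back. The sketch below models the get/set behaviour described above with a hypothetical My::OptionModel class; it assumes nothing about the real method beyond what this section states.

```perl
use strict;
use warnings;

package My::OptionModel;   # hypothetical model, not the module's own code

sub new { my ($class, %opt) = @_; return bless { %opt }, $class }

sub option {
    my ($self, @args) = @_;
    return $self->{ $args[0] } if @args == 1;   # get mode: one option name
    my %new = @args;                            # set mode: option=>value pairs
    my %old;
    for my $name (keys %new) {
        $old{$name}    = $self->{$name};        # remember the replaced value
        $self->{$name} = $new{$name};
    }
    return \%old;
}

package main;

my $obj = My::OptionModel->new(num_to_fetch => 5, HTML => 0);

my $old = $obj->option(num_to_fetch => 3, HTML => 1);
print "was $old->{num_to_fetch}, now ", $obj->option('num_to_fetch'), "\n";

# Restore the previous values from the returned hash reference.
$obj->option(%$old);
print "restored to ", $obj->option('num_to_fetch'), "\n";
```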
Net::Z3950::AsyncZ::Options::_params::validOption
$bool = $_params_obj->validOption('option');
Net::Z3950::AsyncZ::Options::_params::invalidOption
$bool = $_params_obj->invalidOption('option');
Both of the above methods enable you to determine whether an option you choose to set is a valid option. They are useful when using Net::Z3950::AsyncZ::Options::_params::option.
$option = 'num_to_fetch';
$_params_obj->validOption($option) ? $_params_obj->option($option=>3) :
die "invalid option: $option";
Net::Z3950::AsyncZ::Options::_params::test
$_params_obj->test();
Calling this function will print a listing of defined options and values for $_params_obj.
AUTHOR
Myron Turner <turnermm@shaw.ca> or <mturner@ms.umanitoba.ca>
COPYRIGHT AND LICENSE
Copyright 2003 by Myron Turner
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.