NAME
PICA::Record - Perl module for handling PICA+ records
VERSION
version 0.585
SYNOPSIS
To get a deeper insight into the API, have a look at the documentation, the examples (directory examples) and the tests (directory t). Here are some additional two-liners:
# create a field
my $field = PICA::Field->new(
"028A", "9" => "117060275", "d" => "Martin", "a" => "Schrettinger" );
# create a record and add some fields (note that fields can be repeated)
my $record = PICA::Record->new();
$record->append( '044C', 'a' => "Perl", '044C', 'a' => "Programming", );
# read all records from a file
my @records = PICA::Parser->new->parsefile( $filename )->records();
# read one record from a file
my $record = readpicarecord( $filename );
# read one record from a string
my ($record) = PICA::Parser->parsedata( $picadata, Limit => 1)->records();
# get two fields of a record
my ($f1, $f2) = $record->field( 2, "028B/.." );
# extract some subfield values
my ($given, $surname) = ($record->sf(1,'028A$d'), $record->sf(1,'028A$a'));
# read records from STDIN and print them to STDOUT if field 003@ exists
PICA::Parser->new->parsefile( \*STDIN, Record => sub {
my $record = shift;
print $record if $record->field('003@');
return;
});
# print record in normalized format and in HTML
print $record->normalized;
print $record->html;
# write some records in XML to a file
my $writer = PICA::Writer->new( $filename, format => 'xml' );
$writer->write( @records );
DESCRIPTION
PICA::Record is a module for handling PICA+ records as Perl objects.
Clients and examples
This module includes and installs the scripts parsepica, picaimport, and winibw2pica. They provide most functionality on the command line without having to deal with Perl code. Have a look at the documentation of these scripts! More examples are included in the examples directory. The application you need may already be included, so have a look!
On character encoding
Character encoding is an issue of permanent confusion both in library databases and in Perl. PICA::Record treats character encoding the following way: Internally all strings are stored as Perl strings. If you directly read from or write to a file that you specify by filename only, the file will be opened with binmode utf8, so the content will be decoded or encoded in UTF-8 Unicode encoding.
If you read from or write to a handle (for instance a file that you have already opened), binmode utf8 will also be enabled unless you have already specified another encoding layer:
open FILE, "<$filename";
$record = readpicarecord( \*FILE ); # implies binmode FILE, ":utf8"
open FILE, "<$filename";
binmode FILE,':encoding(iso-8859-1)';
$record = readpicarecord( \*FILE ); # does not imply binmode FILE, ":utf8"
If you read from or write to Perl strings, UTF-8 is never implied. This means you must explicitly enable utf8 on your strings. As long as you only read and write PICA record data from files and other sources or stores, you should not need to do anything, but if you modify records in your scripts, use utf8.
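The difference between Perl strings and encoded bytes can be seen without PICA::Record at all. The following is a minimal sketch in core Perl only. The sample name and the temporary file are illustrative assumptions, and the encoding layers are set by hand here rather than by the module.

```perl
use strict;
use warnings;
use utf8;                         # literals in this source file are UTF-8
use File::Temp qw(tempfile);

my $name = "Voß";                 # three characters as a Perl string

# Write the string through an explicit UTF-8 encoding layer.
my ($out, $path) = tempfile( UNLINK => 1 );
binmode $out, ':encoding(UTF-8)';
print {$out} $name;
close $out;

# Read the file back once as raw bytes ...
open my $raw, '<:raw', $path or die "cannot open $path: $!";
my $bytes = do { local $/; <$raw> };
close $raw;

# ... and once decoded into a Perl string.
open my $in, '<:encoding(UTF-8)', $path or die "cannot open $path: $!";
my $chars = do { local $/; <$in> };
close $in;

printf "%d bytes on disk, %d characters after decoding\n",
    length $bytes, length $chars;
```

Without the decoding layer the same data has a different length and would not compare equal to a literal written under use utf8.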
If you download PICA+ records with the WinIBW3 client software, you may first need to convert the records to valid PICA+ syntax. For this reason this module contains the script winibw2pica.
INTRODUCTION
What is PICA+?
PICA+ is the internal data format of the Local Library System (LBS) and the Central Library System (CBS) of OCLC, formerly PICA. Similar library formats are the MAchine Readable Cataloging format (MARC) and the Maschinelles Austauschformat für Bibliotheken (MAB). In addition to PICA+, CBS has the cataloging format Pica3, which can be losslessly converted to PICA+ and vice versa.
What is PICA::Record?
PICA::Record is a Perl package that provides an API for PICA+ record handling. The package contains a parser interface module PICA::Parser to parse PICA+ (PICA::PlainParser) and PICA XML (PICA::XMLParser). Corresponding modules exist to write data (PICA::Writer and PICA::XMLWriter). PICA+ data is handled in records (PICA::Record) that contain fields (PICA::Field). To fetch records from databases via SRU or Z39.50 there is the interface PICA::Source and to access a record store via CWS webcat interface there is PICA::Store.
You can use PICA::Record for instance to:
convert between PICA+ and PicaXML
download records in native format via SRU or Z39.50
process PICA+ records that you have downloaded with WinIBW
store PICA+ records in a database
CONSTRUCTORS
new ( [ ...data... | $filehandle ] )
Base constructor for the class. A single string will be parsed line by line into PICA::Field objects; empty lines and start record markers will be skipped. More than one parameter, or any non-scalar parameter, will be passed to append, so you can use the constructor in the same way:
my $record = PICA::Record->new('037A','a' => 'My note');
If no data is given then it just returns a completely empty record. To load PICA records from a file, see PICA::Parser, to load records from a SRU or Z39.50 server, see PICA::Source.
If you provide a file handle or IO::Handle, the first record is read from it. Each of the following four lines has the same result:
$record = PICA::Record->new( IO::File->new( $filename, "<" ) ); # requires: use IO::File
($record) = PICA::Parser->parsefile( $filename, Limit => 1 )->records();
open (F, "<:utf8", $plainpicafile); $record = PICA::Record->new( \*F ); close F;
$record = readpicarecord( $filename );
copy
Returns a clone of a record by copying all fields.
$newrecord = $record->copy;
ACCESSOR METHODS
field ( [ $limit, ] { $field }+ [ $filter ] ) or f ( ... )
Returns a list of PICA::Field objects with tags that match the field specifier, or in scalar context, just the first matching field.
You may specify multiple tags and use regular expressions.
my $field = $record->field("021A","021C");
my $field = $record->field("009P/03");
my @fields = $record->field("02..");
my @fields = $record->field( qr/^02..$/ );
my @fields = $record->field("039[B-E]");
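To see which tags such specifiers select, the matching can be imitated in plain Perl. This is only a sketch: the tag list is hypothetical, the helper matching_tags is not part of the module, and it assumes that a string specifier behaves like a regular expression anchored at both ends of the tag (as the qr/^02..$/ variant above suggests).

```perl
use strict;
use warnings;

# Hypothetical tag list, for illustration only (not read from a real record).
my @tags = qw(021A 021C 028A 039B 039D 039F 009P/03);

# Assumption: a string specifier acts like a regular expression that is
# anchored at both ends of the tag; compiled expressions are used as given.
sub matching_tags {
    my ($spec, @candidates) = @_;
    my $re = ref($spec) eq 'Regexp' ? $spec : qr/^$spec$/;
    return grep { $_ =~ $re } @candidates;
}

my @all_02   = matching_tags( '02..',      @tags );
my @range    = matching_tags( '039[B-E]',  @tags );
my @compiled = matching_tags( qr/^02..$/,  @tags );

print "02..     => @all_02\n";
print "039[B-E] => @range\n";
print "qr//     => @compiled\n";
```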
If the first parameter is an integer, it is used to limit the number of returned fields, for instance to get only two fields:
my ($f1, $f2) = $record->field( 2, "028B/.." );
The last parameter can be a function to filter returned fields in the same way as a field handler of PICA::Parser. For instance you can select only those fields that contain a given subfield:
my @fields = $record->field( "021A", sub { $_ if $_->sf('a'); } );
subfield ( [ $limit, ] { [ $field, $subfield ] | $fullspec }+ ) or sf ( ... )
Shortcut method to get subfield values. Returns a list of subfield values that match or in scalar context, just the first matching subfield or undef. Fields and subfields can be specified in several ways. You may use wildcards in the field specifications.
These are equivalent (in scalar context):
my $title = $pica->field('021A')->subfield('a');
my $title = $pica->subfield('021A','a');
You may also specify both field and subfield separated by '$' (don't forget to quote the dollar sign) or '_'.
my $title = $pica->subfield('021A$a');
my $title = $pica->subfield("021A\$a");
my $title = $pica->subfield("021A$a"); # wrong: $ not escaped, so Perl interpolates $a
my $title = $pica->subfield("021A_a"); # _ instead of $
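Why the dollar sign needs quoting can be checked in plain Perl, independent of this module. In the sketch below the value assigned to $a is an arbitrary placeholder chosen only to make the interpolation visible.

```perl
use strict;
use warnings;

# $a is a package variable that Perl predefines for sort blocks, so
# "use strict" does not complain when it appears inside a string.
our $a = 'oops';                  # placeholder value for the demonstration

my $single  = '021A$a';           # single quotes: no interpolation
my $escaped = "021A\$a";          # double quotes with escaped dollar sign
my $broken  = "021A$a";           # double quotes: $a is interpolated

print "single:  $single\n";
print "escaped: $escaped\n";
print "broken:  $broken\n";
```

The first two strings are identical field specifications; the third silently becomes a different string, and if $a is undefined only a warning points to the problem.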
You may also use wildcards like in the field() method of PICA::Record and the subfield() method of PICA::Field:
my @values = $pica->subfield('005A', '0a'); # 005A$0 and 005A$a
my @values = $pica->subfield('005[AIJ]', '0'); # 005A$0, 005I$0, and 005J$0
If the first parameter is an integer, it is used to limit the number of returned values, for instance to get only two subfield values:
my ($f1, $f2) = $record->subfield( 2, '028B/..$a' );
Zero or negative limit values are ignored.
values ( [ $limit ] { [ $field, $subfield ] | $fullspec }+ )
Same as subfield but always returns an array.
fields
Returns an array of all the fields in the record. The array contains a PICA::Field object for each field in the record. An empty array is returned if the record is empty.
size
Returns the number of fields in this record.
occurrence or occ
Returns the occurrence of the first field of this record. This is only useful if the first field has an occurrence.
main
Get the main record (level 0, all tags starting with '0').
holdings ( [ $iln ] )
Get a list of local records (holdings, level 1 and 2) or the local record with given ILN. Returns an array of PICA::Record objects or a single holding. This method also sorts level 1 and level 2 fields.
items
Get an array of PICA::Record objects with fields of each copy/item included in the record. Copy records are located at level 2 (tags starting with '2') and differ by tag occurrence.
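The level structure used by main, holdings, and items can be pictured with a few lines of plain Perl. This is a sketch under stated assumptions, not the module's implementation: the tag list is hypothetical, and it assumes that the level of a field is simply the first character of its tag (stated above for levels 0 and 2, inferred here for level 1).

```perl
use strict;
use warnings;

# Hypothetical tags, for illustration only.
my @tags = ( '003@', '021A', '101@', '203@/01', '203@/02' );

# Assumption: the first character of a tag is its level (0, 1, or 2).
my %by_level;
for my $tag (@tags) {
    my $level = substr $tag, 0, 1;
    push @{ $by_level{$level} }, $tag;
}

for my $level ( sort keys %by_level ) {
    print "level $level: @{ $by_level{$level} }\n";
}
```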
empty
Return true if the record is empty (no fields or all fields empty).
ACCESSOR AND MODIFICATION METHODS
ppn ( [ $ppn ] )
Get or set the identifier (PPN) of this record (field 003@, subfield 0). This is equivalent to $self->subfield('003@$0') and always returns a scalar or undef. Pass undef to remove the PPN.
epn ( [ $epn[s] ] )
Get zero or more EPNs (item numbers) of this record, which is field 203@/.., subfield 0. Returns the first EPN (or undef) in scalar context or a list in array context. Each copy record (get them with method items) should have only one EPN.
iln
Get zero or more ILNs (internal library numbers) of this record, which is field 101@$a. Returns the first ILN (or undef) in scalar context or a list in array context. Each holdings record is identified by its ILN.
MODIFICATION METHODS
append ( ...fields or records... )
Appends one or more fields to the end of the record. Parameters can be PICA::Field objects or parameters that are passed to PICA::Field->new.
my $field = PICA::Field->new( '037A','a' => 'My note' );
$record->append( $field );
is equivalent to
$record->append('037A','a' => 'My note');
You can also append multiple fields with one call:
my $field = PICA::Field->new('037A','a' => 'First note');
$record->append( $field, '037A','a' => 'Second note' );
$record->append(
'037A', 'a' => '1st note',
'037A', 'a' => '2nd note',
);
Please note that passed PICA::Field objects are not copied but used directly:
my $field = PICA::Field->new('037A','a' => 'My note');
$record->append( $field );
$field->update( 'a' => 'Your note' ); # Also changes $record's field!
You can avoid this by cloning fields or by using the appendif method:
$record->append( $field->copy() );
$record->appendif( $field );
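The underlying effect is ordinary Perl reference sharing. The sketch below uses plain hashes as stand-ins (they are not PICA::Field objects, and the keys are made up for illustration) to show why a stored reference sees later changes while a copy does not.

```perl
use strict;
use warnings;

# Stand-in for a field; a plain hash, not a PICA::Field object.
my %field = ( tag => '037A', a => 'My note' );

my @shared = ( \%field );         # stores a reference to the same hash
my @copied = ( +{ %field } );     # stores a shallow copy of the hash

$field{a} = 'Your note';          # modify the original afterwards

print "shared: $shared[0]{a}\n";  # follows the change
print "copied: $copied[0]{a}\n";  # keeps the old value
```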
You can also append copies of all fields of another record:
$record->append( $record2 );
The append method returns the number of fields appended.
appendif ( ...fields or records... )
Optionally appends one or more fields to the end of the record. Parameters can be PICA::Field objects or parameters that are passed to PICA::Field->new.
In contrast to the append method this method always copies values, it ignores empty subfields and empty fields (that are fields without subfields or with empty subfields only), and it returns the resulting PICA::Record object.
For instance this command will not add a field if $country is undef or "":
$r->appendif( "119@", "a" => $country );
update ( $tag, ( $field | @fieldspec | $coderef ) )
Replace a field. You must pass a tag and a field. If you pass a code reference, the code will be called for each field and the field is replaced by the result unless the result is undef.
Please do not use this to replace repeatable fields because they would all be set to the same values.
remove ( $tag(s) )
Deletes fields specified by tags and returns the number of deleted fields. You can also use wildcards and compiled regular expressions as tag selectors.
sort
Sort the fields of this record. Respects level 0, 1, and 2.
add_headers ( [ %options ] )
Add header fields to a PICA::Record. You must specify two named parameters (eln and status). This method is experimental. There is no test whether the header fields already exist. This method may be removed in a later release.
SERIALIZATION METHODS
string ( [ %options ] )
Returns a string representation of the record for printing. See also PICA::Writer for printing to a file or file handle.
normalized ( [ $prefix ] )
Returns record as a normalized string. Optionally adds prefix data at the beginning.
print $record->normalized();
print $record->normalized("##TitleSequenceNumber 1\n");
See also PICA::Writer for printing to a file or file handle.
xml ( [ $xmlwriter | %params ] )
Write the record to an XML::Writer or return an XML string of the record. If you pass an existing XML::Writer object, the record will be written with it and nothing is returned. Otherwise the passed parameters are used to create a new XML writer. Unless you specify an XML writer or an OUTPUT parameter, the resulting XML is returned as a string. By default the PICA-XML namespace with namespace prefix 'pica' is included. In addition to the parameters of XML::Writer, this method knows the 'header' parameter, which first adds the XML declaration, and the 'xslt' parameter, which adds an XSLT stylesheet.
html ( [ %options ] )
Returns an HTML representation of the record for browser display. See also the pica2html.xsl script to generate a more elaborate HTML view from PICA-XML.
write ( [ $output ] [ format => $format ] [ %options ] )
Write a single record to a file or stream and end the output. You can pass the same parameters as known to the constructor of PICA::Writer. Returns the PICA::Writer object that was used to write the record. You can check the status of the writer with a simple boolean check.
FUNCTIONS
The functions readpicarecord and writepicarecord are exported by default. On request you can also export the function picarecord which is a shortcut for the constructor PICA::Record->new and the functions pgrep and pmap. To export all functions, import the module via:
use PICA::Record qw(:all);
pgrep { COND } $record
Evaluates the COND for each field of $record (locally setting $_ to each field) and returns a new PICA::Record containing only those fields that match. Instead of a PICA::Record object you can also pass any values that will be passed to the record constructor. An example:
# all fields that contain a subfield 'a' which starts with '2'
pgrep { $_ =~ /^2/ if ($_ = $_->sf('a')); } $record;
# all fields that contain a subfield '0' in level 0
pgrep { defined $_->sf('0') } $record->main;
pmap { COND } $record
Evaluates the COND for each field of $record (locally setting $_ to each field), treats the return value as a PICA::Field (optionally passed to its constructor), and returns a new record built of these fields. Instead of a PICA::Record object you can also pass any values that will be passed to the record constructor.
readpicarecord ( $filename [, %options ] )
Read a single record from a file. Returns a non-empty PICA::Record object or undef. Shortcut for:
PICA::Parser->parsefile( $filename, Limit => 1 )->records();
In array context you can use this function as a shortcut to read multiple records if you specify a Limit parameter. Use Limit => 0 to read all records from a file. The following statements are equivalent:
@records = readpicarecord( $filename, Limit => 0 );
@records = PICA::Parser->parsefile( $filename )->records();
writepicarecord ( $record, [ $output ] [ format => $format ] [ %options ] )
Write a single record to a file or stream. Shortcut for
$record->write( [ $output ] [ format => $format ] [ %options ] )
as described above - see the constructor of PICA::Writer for more details. Returns the PICA::Writer object that was used to write the record - you can use a simple if to check whether an error occurred.
picarecord ( ... )
Shortcut for PICA::Record->new( ... )
SEE ALSO
At CPAN there are the modules MARC::Record, MARC, and MARC::XML for MARC records and Encode::MAB2 for MAB records. The deprecated module Net::Z3950::Record also had a subclass Net::Z3950::Record::MAB for MAB records. You should now use Net::Z3950::ZOOM instead, which is also needed if you query Z39.50 servers with PICA::Source.
AUTHOR
Jakob Voß <voss@gbv.de>
COPYRIGHT AND LICENSE
This software is copyright (c) 2014 by Verbundzentrale Goettingen (VZG) and Jakob Voss.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.