NAME

picadata - parse and validate PICA+ data

SYNOPSIS

picadata [<command>] {path} {options} {files}

DESCRIPTION

Convert, analyze and validate PICA+ data from the command line.

COMMANDS

convert

Convert between PICA+ serialization formats (the default command).

split

Split records into multiple records for each level. Implies -o.

count

Count number of records, holdings, items, and fields.

fields/subfields/sf

List distinct fields or subfields in the data. Provide an Avram schema (-s/--schema) to include documentation.

explain

Look up (sub)fields in an Avram schema given by option or from stdin. Each (sub)field is marked as optional (o/*) or mandatory (./+), and as repeatable (+/*).

validate

Validate data against an Avram schema (-s/--schema).

build

Build an Avram schema from input data, optionally based on an existing schema (-s/--schema). Add option -B/--abbrev to abbreviate.
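
The commands above can be chained into a small workflow. The following is a minimal sketch, not an authoritative recipe: the file names sample.pica and schema.json and the record content are made up for illustration, and it assumes that build writes the schema to standard output.

```shell
# Create a tiny sample record in PICA plain syntax (made-up data).
cat > sample.pica <<'EOF'
003@ $012345
021A $aEin Buch$hzum Lesen
EOF

if command -v picadata >/dev/null 2>&1; then
    # Count records, holdings, items, and fields.
    picadata count -f plain sample.pica
    # List the distinct fields found in the data.
    picadata fields -f plain sample.pica
    # Assumption: build prints the Avram schema to standard output.
    picadata build -f plain sample.pica > schema.json
    # Validate the same data against the schema just built.
    picadata validate -f plain sample.pica -s schema.json
else
    echo "picadata not found; skipping the example commands" >&2
fi
```

Validating data against a schema built from that same data is only a smoke test; in practice the schema would come from a separate source.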

OPTIONS

--from, -f

PICA serialization type (plain, plus, binary, xml, ppxml), with plain as default. Guessed from the first input filename unless specified. See format documentation at http://format.gbv.de/pica.

--to, -t

PICA serialization type for output. Setting it enables writing of parsed PICA data.

--number, -n

Stop parsing after n records. Can be abbreviated as -1, -2...

--order, -o

Sort record fields by field identifier and by occurrence at level 2.

--annotate, -a, -A

Force annotated PICA as output format (-a) or prevent it (-A). Combined with --schema, this sets the annotations ! and ? to mark validation errors.

--path, -p

Select fields or subfield values specified by PICA Path expressions. Multiple expressions can be separated by | or by repeating the option.

--schema, -s

Avram schema to validate against. Can be a file or a URL.

--unknown, -u

Report unknown fields and subfields on validation (disabled by default).

--abbrev, -B

Abbreviate the Avram schema (with command build).

--color, -C

Colorize output. Only supported for PICA plain and PICA plus format.

--mono, -M

Monochrome (don't colorize output).

--version, -V

Print version number and exit.
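
Several of the options above can be combined in one call. The following sketch uses only options documented here; the file names options-demo.pica and first.xml and the record content are hypothetical, and it assumes that converted output is written to standard output.

```shell
# Create a tiny sample record in PICA plain syntax (made-up data).
cat > options-demo.pica <<'EOF'
003@ $012345
021A $aEin Buch$hzum Lesen
EOF

if command -v picadata >/dev/null 2>&1; then
    # Read PICA plain, stop after the first record, sort its fields, write XML.
    picadata -f plain -t xml -n 1 -o options-demo.pica > first.xml
    # Select one field and one subfield, separated by | in a single expression.
    picadata -f plain -p '003@|021A$a' options-demo.pica
    # The same selection, given by repeating the option.
    picadata -f plain -p '003@' -p '021A$a' options-demo.pica
else
    echo "picadata not found; skipping the example commands" >&2
fi
```

The path expressions are single-quoted so that the shell treats neither | as a pipe nor $a as a variable.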

EXAMPLES

picadata pica.dat -t xml                    # convert binary to XML
picadata count -f plain < pica.plain        # parse and count records
picadata 003@ pica.xml                      # extract field 003@
picadata validate pica.xml -s schema.json   # validate against Avram schema

# document fields used in a record
picadata fields pica.xml -s https://format.k10plus.de/avram.pl?profile=k10plus

SEE ALSO

See catmandu for a more elaborate command-line tool for data processing (transformation, API access...), including PICA+ with Catmandu::PICA.