NAME

picadata - parse and validate PICA+ data

SYNOPSIS

picadata [<command>] {path} {options} {files}

DESCRIPTION

Convert, analyze and validate PICA+ data from the command line.

COMMANDS

convert

Convert between PICA+ serialization formats (the default command).

split

Split records into multiple records for each level. Implies -o.

count

Count number of records, holdings, items, and fields.

fields/subfields/sf

List distinct fields or subfields in the data. Provide an Avram schema (-s/--schema) to include documentation.

explain

Look up (sub)fields in an Avram schema given by option or from stdin. Markers indicate optional (o/*), mandatory (./+), and repeatable (+/*).
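
The marker pairs listed for explain imply four combinations of required and repeatable. The following Python sketch decodes them; the names MARKERS and describe are hypothetical and not part of picadata, and the mapping is only inferred from the three pairs given here.

```python
# Hypothetical helper, not part of picadata: decode the one-character
# markers, assuming the pairs optional (o/*), mandatory (./+) and
# repeatable (+/*) combine as shown below.
MARKERS = {
    ".": (True, False),   # mandatory, not repeatable
    "o": (False, False),  # optional, not repeatable
    "+": (True, True),    # mandatory, repeatable
    "*": (False, True),   # optional, repeatable
}


def describe(marker):
    """Return a readable description of a single marker character."""
    required, repeatable = MARKERS[marker]
    need = "mandatory" if required else "optional"
    count = "repeatable" if repeatable else "not repeatable"
    return f"{need}, {count}"


if __name__ == "__main__":
    for marker in ".o+*":
        print(marker, describe(marker))
```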

validate

Validate data against an Avram schema (-s/--schema).

diff

Compare PICA records from two inputs. Output is always annotated PICA Plain.

patch

Apply modifications given in annotated PICA Plain.

build

Build an Avram schema from input data, optionally based on an existing schema (-s/--schema). Add option -B/--abbrev to abbreviate.

OPTIONS

--from, -f

PICA serialization type (plain, plus, binary, XML, ppxml) with Plain as default. Guessed from first input filename unless specified. See format documentation at http://format.gbv.de/pica.

--to, -t

PICA serialization type to enable writing parsed PICA data.

--number, -n

Stop parsing after n records. Can be abbreviated as -1, -2...

--order, -o

Sort record fields by field identifier and by occurrence at level 2.

--annotate, -a, -A

Enforce annotated PICA as output format or prevent with -A. Combined with --schema this will set annotations ! and ? to mark validation errors.

--path, -p

Select fields or subfield values specified by PICA Path expressions. Multiple expressions can be separated by | or by repeating the option. Positions such as /3-7 are read as occurrence ranges.

--schema, -s

Avram schema to validate against. Can be a file or a URL.

--unknown, -u

Report unknown fields and subfields on validation (disabled by default).

--abbrev, -B

Abbreviate the Avram schema (with command build).

--color, -C

Colorize output. Only supported for PICA plain and PICA plus format.

--mono, -M

Monochrome (don't colorize output).

--version, -V

Print version number and exit.

EXAMPLES

picadata pica.dat -t xml                    # convert binary to XML
picadata count -f plain < pica.plain        # parse and count records
picadata 003@ pica.xml                      # extract field 003@
picadata validate pica.xml -s schema.json   # validate against Avram schema

# document fields used in a record
picadata fields pica.xml -s https://format.k10plus.de/avram.pl?profile=k10plus

SEE ALSO

See catmandu for a more elaborate command line tool for data processing (transformation, API access...), including PICA+ support with Catmandu::PICA.