NAME
picadata - parse and validate PICA+ data
SYNOPSIS
picadata [<command>] {path} {options} {files}
DESCRIPTION
Convert, analyze and validate PICA+ data from the command line.
COMMANDS
convert
Convert between PICA+ serialization formats (the default command).
get
Print subfield values.
levels
Split records into multiple records for each level. Implies -o.
join
Join multiple records into one and sort afterwards.
count
Count number of records, holdings, items, and fields.
filter
Filter records that include any of some given (sub)fields.
fields/subfields/sf
List distinct fields or subfields in the data. Provide an Avram schema (-s/--schema) to include documentation.
explain
Look up (sub)fields in an Avram schema given by option or from stdin. Each (sub)field is marked as optional (o or *), mandatory (. or +), and repeatable (+ or *).
validate
Validate data against an Avram schema (-s/--schema).
diff
Compare PICA records from two inputs. Output is always annotated PICA Plain.
patch
Apply modifications given in annotated PICA Plain.
modify
Change subfield values and return the result or a patch (option -a).
build
Build an Avram schema from input data, optionally based on an existing schema (-s/--schema). Add option -B/--abbrev to abbreviate.
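As a rough illustration of the commands above, the following sketch writes two minimal PICA Plain records to a file and runs a few read-only commands on it. The file name sample.pica and its contents are made up for this example, and passing the path 021A.a as a positional argument to get is assumed by analogy with the synopsis; the picadata calls are skipped if the tool is not installed.

```shell
# Hypothetical sample data: two PICA Plain records separated by an empty line.
cat > sample.pica <<'EOF'
003@ $012345
021A $aFirst Title

003@ $067890
021A $aSecond Title
EOF

# Run the commands only if picadata is available on this system.
if command -v picadata >/dev/null 2>&1; then
  picadata count -f plain sample.pica        # count records, holdings, items, and fields
  picadata fields -f plain sample.pica       # list distinct fields in the data
  picadata get 021A.a -f plain sample.pica   # print values of subfield a in field 021A
fi
```

The format is given explicitly with -f plain because the sample file extension is arbitrary and may not be guessed correctly.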
OPTIONS
--from, -f
PICA serialization type (plain, plus/normalized, binary, import, XML, ppxml, pixml, patch) with Plain as default. Guessed from first input filename unless specified. See format documentation at http://format.gbv.de/pica.
--to, -t
PICA serialization type to enable writing parsed PICA data.
--number, -n
Stop parsing after n records. Can be abbreviated as -1, -2, ...
--order, -o
Sort record fields by field identifier and by occurrence at level 2.
--level, -l
Split record into selected level, includes higher level identifiers.
--annotate, -a, -A
Enforce annotated PICA as output format, or prevent it with -A. Combined with --schema, this sets the annotations ! and ? to mark validation errors.
--path, -p
Select fields or subfield values specified by PICA Path expressions. Multiple expressions can be separated by | or by repeating the option. Positions such as /3-7 are read as occurrence ranges.
--schema, -s
Avram schema given by file or URL. The default is set via the environment variable PICA_SCHEMA.
--unknown, -u
Report unknown fields and subfields on validation (disabled by default).
--abbrev, -B
Abbreviate the Avram schema (with command build).
--color, -C
Colorize output. Only supported for PICA plain and PICA plus format.
--mono, -M
Monochrome (don't colorize output).
--version, -V
Print version number and exit.
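The options above can be combined as in the following sketch, which only uses flags described in this section. The sample file and its contents are hypothetical, and the exact output is not shown because it depends on the installed version; the picadata calls are skipped if the tool is not installed.

```shell
# Hypothetical sample data: two PICA Plain records separated by an empty line.
cat > sample.pica <<'EOF'
003@ $012345
021A $aFirst Title

003@ $067890
021A $aSecond Title
EOF

if command -v picadata >/dev/null 2>&1; then
  # Stop after the first record: long form and abbreviated form of -n.
  picadata count -n 1 -f plain sample.pica
  picadata count -1 -f plain sample.pica

  # Select two fields: expressions separated by | or given by repeating -p.
  picadata -p '003@|021A' -f plain sample.pica
  picadata -p 003@ -p 021A -f plain sample.pica

  # Sort fields (-o) and write PICA plus instead of the input format.
  picadata -o -f plain -t plus sample.pica
fi
```

Quoting the path expression keeps the shell from interpreting | as a pipe.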
EXAMPLES
picadata pica.dat -t xml # convert binary to XML
picadata count -f plain < pica.plain # parse and count records
picadata 003@ pica.xml # extract field 003@
picadata validate pica.xml -s schema.json # validate against Avram schema
picadata modify 021A.a "New Title" pica.pp # modify subfield value
# document fields used in a record
picadata fields pica.xml -s https://format.k10plus.de/avram.pl?profile=k10plus
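A further sketch shows how modify and diff might be chained. The file names sample.pica and modified.pica are hypothetical, and passing two file names to diff is an assumption based on the description "two inputs", not a confirmed invocation; the picadata calls are skipped if the tool is not installed.

```shell
# Hypothetical sample data: one PICA Plain record.
cat > sample.pica <<'EOF'
003@ $012345
021A $aOld Title
EOF

if command -v picadata >/dev/null 2>&1; then
  # Change the value of subfield a in field 021A and keep the result.
  picadata modify 021A.a "New Title" -f plain sample.pica > modified.pica

  # Return the change as a patch in annotated PICA instead of the result.
  picadata modify 021A.a "New Title" -a -f plain sample.pica

  # Assumption: diff takes the two inputs as two file arguments.
  picadata diff -f plain sample.pica modified.pica
fi
```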
SEE ALSO
See catmandu for a more elaborate command line tool for data processing (transformation, API access...), including PICA+ support via Catmandu::PICA.