NAME
TBX::XCS - Extract data from an XCS file
VERSION
version 0.05
SYNOPSIS
use TBX::XCS;
my $xcs = TBX::XCS->new(file=>'/path/to/file.xcs');
my $languages = $xcs->get_languages();
my $ref_objects = $xcs->get_ref_objects();
my $data_cats = $xcs->get_data_cats();
DESCRIPTION
This module allows you to extract and edit the information contained in an XCS file. In the future, it may also be able to serialize the contained information into a new XCS file.
METHODS
new
Creates a new TBX::XCS object.
parse
Takes a named argument, either file
for a filename or string
for a string pointer.
This method parses the XCS content given by the specified file or string pointer. The contents of the XCS can then be accessed via get_ref_objects
, get_languages
, and get_data_cats
.
get_languages
Returns a pointer to a hash containing the languages allowed in the langSet xml:lang
attribute, as specified by the XCS languages
element. The keys are abbreviations, values the full names of the languages.
get_ref_objects
Returns a pointer to a hash containing the reference objects specified by the XCS. For example, the XML below:
<refObjectDef>
<refObjectType>Foo</refObjectType>
<itemSpecSet type="validItemType">
<itemSpec type="validItemType">data</itemSpec>
<itemSpec type="validItemType">name</itemSpec>
</itemSpecSet>
</refObjectDef>
</refObjectDefSet>
will yield the following structure:
{ Foo => ['data', 'name'] },
get_data_cats
Returns a hash pointer containing the data category specifications. For example, the XML below:
<datCatSet>
<descripSpec name="context" datcatId="ISO12620A-0503">
<contents/>
<levels>term</levels>
</descripSpec>
<descripSpec name="descripFoo" datcatId="">
<contents/>
<levels/>
</descripSpec>
<termNoteSpec name="animacy" datcatId="ISO12620A-020204">
<contents datatype="picklist" forTermComp="yes">animate inanimate
otherAnimacy</contents>
</termNoteSpec>
<xrefSpec name="xrefFoo" datcatId="">
<contents targetType="external"/>
</xrefSpec>
</datCatSet>
would yield the data structure below:
{
'descrip' =>
[
{
'datatype' => 'noteText',
'datCatId' => 'ISO12620A-0503',
'levels' => ['term'],
'name' => 'context'
},
{
'datatype' => 'noteText',
'levels' => ['langSet', 'termEntry', 'term'],
'name' => 'descripFoo'
}
],
'termNote' => [{
'choices' => ['animate', 'inanimate', 'otherAnimacy'],
'datatype' => 'picklist',
'datCatId' => 'ISO12620A-020204',
'forTermComp' => 'yes',
'name' => 'animacy'
}],
'xref' => [{
'datatype' => 'plainText',
'name' => 'xrefFoo',
'targetType' => 'external'
}]
};
get_title
Returns the title of the document, as contained in the title element.
get_name
Returns the name of the XCS file, as found in the TBXXCS element.
FUTURE WORK
extract
datCatDoc
extract
refObjectDefSet
Setter methods for XCS data
Print an XCS file
SEE ALSO
The XCS and the TBX specification can be found on GitHub.
AUTHOR
Nathan Glenn <garfieldnate@gmail.com>
COPYRIGHT AND LICENSE
This software is copyright (c) 2013 by Alan K. Melby.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.