NAME
Convert::TBX::Basic - Convert TBX-Basic data into TBX-Min
VERSION
version 0.03
SYNOPSIS
use Convert::TBX::Basic 'basic2min';
# create a TBX-Min document from the TBX-Basic file, using EN
# as the source language and DE as the target language
print ${ basic2min('/path/to/file.tbx', 'EN', 'DE')->as_xml };
DESCRIPTION
TBX-Basic is a subset of TBX-Default which is meant to contain a smaller number of data categories suitable for most needs. To some users, however, TBX-Basic can still be too complicated. This module allows you to convert TBX-Basic into TBX-Min, a minimal, DCT-style dialect that stresses human-readability and bare-bones simplicity.
METHODS
basic2min
# example usage
basic2min('path/to/file.tbx', 'EN', 'DE');
Given TBX-Basic input and the source and target languages, this function returns a TBX::Min object containing a rough equivalent of the specified data. The source and target languages are necessary because TBX-Basic can contain many languages, while TBX-Min must contain exactly two. The TBX-Basic data may be either a string containing a file name or a scalar ref containing the actual TBX-Basic document as a string.
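The two input forms described above can be sketched as follows. This is a minimal illustration, not part of the module: the helper name tbx_string_to_min_xml is hypothetical, and the file is read as raw bytes on the assumption that the underlying XML parser handles the document's declared encoding itself.

```perl
use strict;
use warnings;
use Convert::TBX::Basic 'basic2min';

# Hypothetical helper (not provided by the module): converts a TBX-Basic
# document held in memory and returns the TBX-Min XML as a plain string.
sub tbx_string_to_min_xml {
    my ($tbx_basic, $source, $target) = @_;

    # A scalar ref tells basic2min that this is document content,
    # not a file name.
    my $min = basic2min(\$tbx_basic, $source, $target);

    # as_xml returns a scalar ref (see SYNOPSIS), so dereference it.
    return ${ $min->as_xml };
}

# The file is slurped here only to obtain a string; in practice the
# document might come from a database or an HTTP response instead.
my $path = shift @ARGV;
if (defined $path) {
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $tbx_basic = do { local $/; <$fh> };
    close $fh;

    print tbx_string_to_min_xml($tbx_basic, 'EN', 'DE');
}
```

When the data is already on disk, passing the file name directly, as in the SYNOPSIS, is simpler.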
TBX-Min allows much less structured information than TBX-Basic, so the conversion is necessarily lossy. <termNote>, <descrip>, and <admin> elements are converted where TBX-Min has a corresponding category; those whose type attribute value has no correspondence in TBX-Min are simply pasted into a note, prefixed with the name of the category and a colon. This is only possible for elements at the term level (children of a <termEntry> element) because TBX-Min only allows notes inside of its <termGrp> elements.
As quite a bit of data can be packed into a single <note> element, the result can be quite messy. Log::Any is used to record the following:
- (info) the elements that are stuffed into a note
- (info) the elements that are skipped altogether during the conversion process
- (warn) the entries that are skipped because they contained no relevant language sets
- (warn) the entries that are skipped because they did not have any language specified
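Log::Any discards messages unless the calling program selects an adapter. The sketch below shows one way to make the messages listed above visible. It assumes Log::Any 1.0 or later, where the Stderr adapter is bundled and accepts a log_level option; the input path is taken from the command line purely for illustration.

```perl
use strict;
use warnings;
use Log::Any::Adapter;
use Convert::TBX::Basic 'basic2min';

# Route Log::Any messages of level "info" and above to STDERR, so that
# both the (info) and (warn) messages from the conversion are shown.
# Assumption: the bundled Stderr adapter with its log_level option.
Log::Any::Adapter->set('Stderr', log_level => 'info');

my $path = shift @ARGV;
if (defined $path) {
    # Log messages go to STDERR; the TBX-Min XML goes to STDOUT.
    my $min = basic2min($path, 'EN', 'DE');
    print ${ $min->as_xml };
}
```

Setting log_level to 'warning' instead would limit the output to the skipped-entry messages.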
TODO
It would be nice to preserve the xml:id attributes in order to make the conversion process more transparent to the user.
SEE ALSO
- basic2min (the included script)
- TBX::Min
- Convert::TBX::Min
AUTHOR
BYU Translation Research Group <akmtrg@byu.edu>
COPYRIGHT AND LICENSE
This software is copyright (c) 2016 by Alan K. Melby.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.