NAME

create_database.pl - Build CLDR SQLite Database

SYNOPSIS

# From https://github.com/unicode-org/cldr
# Or, from https://cldr.unicode.org/index/downloads
# download the archive, extract it, and point this script at the extracted directory
create_database.pl /some/where/cldr-common-45.0
create_database.pl --debug 4 /some/where/cldr-common-45.0
create_database.pl --debug 4 \
    --maintainer "John Doe" \
    --replace \
    --extended-timezones-cities /some/where/timezones_supplemental_cities.json \
    --db-file /some/where/db.sqlite3 \
    --created 2024-07-01 /some/where/cldr-common-45.0
create_database.pl --noapply-patch /some/where/cldr-common-45.0
create_database.pl --debug 4 \
    --extend \
    --extended-timezones-cities /some/where/timezones_supplemental_cities.json \
    --db-file /some/where/db.sqlite3

Get help:

create_database.pl --help

Access this documentation:

create_database.pl --man

or:

perldoc create_database.pl

Beware that the output on STDOUT is quite verbose, so you may want to redirect it to files instead:

create_database.pl /some/where/cldr-common-45.0 >/tmp/cldr_debug.log 2>/tmp/cldr_debug.err

DESCRIPTION

This script builds the SQLite database file by reading the CLDR (Common Locale Data Repository) data, collecting it, and storing it in the various SQL tables documented in Locale::Unicode::Data.

It requires the following files from the IANA time zone database: zone1970.tab and backward, which you need to place in the scripts directory before running this script.
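Since the script dies when its expectations are not met, it can help to confirm that both IANA files are in place before a long build. The following is a minimal sketch; the function name check_iana_files and its directory argument are hypothetical and not part of this distribution.

```shell
#!/bin/sh
# Hypothetical pre-flight check: verify that zone1970.tab and backward
# are readable in the given directory (defaults to the current directory).
check_iana_files() {
    dir=${1:-.}
    missing=0
    for f in zone1970.tab backward; do
        if [ ! -r "$dir/$f" ]; then
            echo "Missing IANA file: $dir/$f" >&2
            missing=1
        fi
    done
    # Returns 0 when both files are present, 1 otherwise
    return "$missing"
}
```

For example, check_iana_files ./scripts && create_database.pl /some/where/cldr-common-45.0 would only start the build when both files are found, assuming ./scripts is where this script lives.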

This script is unforgiving by design: it has strict expectations about the data it reads, and will die if those expectations are not met. If this happens, it most likely means that something has changed in the CLDR data, and that this script, and possibly the module Locale::Unicode::Data, need to be adjusted accordingly.

Please note that building the database can take some time, depending on your computer's CPU. However, you should not have to build it, since one is already shipped with this distribution.

Once the SQLite database has been built, you should move it to ./lib/Locale/Unicode/unicode_cldr.sqlite3 where Locale::Unicode::Data expects to find it.

Then, you can install the distribution, as usual:

perl Makefile.PL
make
make test
make install

OPTIONS

--apply-patch

Boolean value indicating whether to apply known corrections to the CLDR data. Use --noapply-patch to skip them.

Currently, this covers a few fixes for calendar interval formats, languages missing from the territories data, and territories missing from the languages data.

--cldr-version

The CLDR version number. If not provided, this will be derived from the data directory name.
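To illustrate what deriving the version from the directory name can look like, here is a minimal sketch. The function name cldr_version_from_dir and the matching pattern are assumptions made for this example; the script's actual logic may differ.

```shell
#!/bin/sh
# Hypothetical helper: guess the CLDR version from a data directory name
# such as /some/where/cldr-common-45.0. The pattern is an assumption,
# not the script's own code.
cldr_version_from_dir() {
    dir=${1%/}        # drop one trailing slash, if any
    base=${dir##*/}   # keep only the last path component
    # Print a trailing "digits" or "digits.digits" group that follows a
    # hyphen; print nothing when the name does not end that way.
    printf '%s\n' "$base" |
        sed -n 's/^.*-\([0-9][0-9]*\(\.[0-9][0-9]*\)*\)$/\1/p'
}

cldr_version_from_dir /some/where/cldr-common-45.0   # prints 45.0
```

When the directory name carries no such suffix, the function prints nothing, which is the case where passing --cldr-version explicitly is the safer choice.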

--created

The SQLite database creation date, for example 2017-11-10.

This defaults to the current date and time.

--db-file

The file path to the SQLite database that will be created.

This defaults to a system temporary location. You will need to move the file to its final location once the build is done, unless you have enabled the option --replace.

If the option --replace is not enabled, this script will tell you the location of the temporary SQLite database, so you can move it yourself.
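Moving the temporary file by hand can be sketched as follows. The function name install_cldr_db is hypothetical; the destination default is the path documented for Locale::Unicode::Data, and the header check relies on the fact that every SQLite 3 database file starts with the string "SQLite format 3".

```shell
#!/bin/sh
# Hypothetical helper: move a freshly built database into place after a
# basic sanity check. Not part of this distribution.
install_cldr_db() {
    src=$1
    dest=${2:-./lib/Locale/Unicode/unicode_cldr.sqlite3}
    if [ ! -s "$src" ]; then
        echo "Missing or empty file: $src" >&2
        return 1
    fi
    # Read the first 15 bytes of the file, which hold the SQLite magic string
    magic=$(dd if="$src" bs=15 count=1 2>/dev/null)
    if [ "$magic" != "SQLite format 3" ]; then
        echo "Not an SQLite 3 database: $src" >&2
        return 1
    fi
    mkdir -p "$(dirname "$dest")" && mv "$src" "$dest"
}
```

It would be called with the temporary path reported by the script, for example install_cldr_db /some/where/db.sqlite3, run from the root of the distribution.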

--debug

create_database.pl --debug 1

Enable debug mode and set its verbosity level with an integer. The output is already considerable at low levels; above 4, it becomes even more extensive.

--nodebug

Disable debug mode.

--extend

Extends the existing SQLite database by adding the extended time zone cities data, and then quits.

This option requires that the option --extended-timezones-cities also be provided.

--extended-timezones-cities

Path to a JSON-formatted file containing extended data for time zone cities.

By default, the Unicode CLDR data provide very few time zone cities for use with the v or V format pattern characters. With this option, the script loads the supplemental data into the table timezones_cities_supplemental, and makes them automatically available through the SQL view timezones_cities_extended, which is built as a union of the original table timezones_cities and the supplemental table timezones_cities_supplemental.

The format of the JSON data must be as follows:

{
   "Asia/Tokyo" : {
      "locales" : {
         "ar" : "طوكيو",
         "az" : "Tokio",
         "be" : "Токіо",
         "bg" : "Токио",
         "bn" : "টোকিও",
         # etc...
      }
   }
}

You can get a list of all known time zones with the method timezones_cities.
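The example above is abbreviated: the # etc... marker and the trailing comma before it are not valid in strict JSON. Whether the script's parser tolerates them is not stated here, so the sketch below writes a small, strictly valid file with the same structure. The function name write_sample_cities_json is hypothetical, and the two locale entries are taken from the example above.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal, strictly valid supplemental
# cities file to the given path (no comments, no trailing commas).
write_sample_cities_json() {
    cat > "$1" <<'EOF'
{
   "Asia/Tokyo" : {
      "locales" : {
         "ar" : "طوكيو",
         "az" : "Tokio"
      }
   }
}
EOF
}
```

After write_sample_cities_json /some/where/timezones_supplemental_cities.json, the file can be passed to --extended-timezones-cities as shown in the synopsis.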

--help, -h, -?

Print a short help message.

--log-file

File path to a log file to write to. Defaults to create_database.log in the same directory as this script.

This is only used if the option --use-log is enabled.

--maintainer

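create_database.pl --maintainer "John Doe"

The name of the database maintainer. Quote the value if it contains spaces, so that the shell passes it as a single argument.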

--man

Print this documentation as a man page.

--replace

Boolean value indicating whether to move the temporary SQLite database, once built, to its location in the module lib directory at lib/Locale/Unicode/unicode_cldr.sqlite3.

Defaults to false.

When this option is not enabled, the script only shows the file path of the temporary SQLite database file.

--use-log

Boolean value indicating whether to write verbose output to a log file.

This is automatically enabled if debugging is enabled. See the option --debug.

-v

Show the version number and exit.

--verbose

Enable verbose mode.

Currently, this has no effect.

--noverbose

Disable verbose mode.

Currently, this has no effect.

AUTHOR

Jacques Deguest <jack@deguest.jp>

COPYRIGHT & LICENSE

Copyright(c) 2024 DEGUEST Pte. Ltd.

All rights reserved

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.