NAME
App::CELL::Guide - Introduction to App::CELL (POD-only module)
VERSION
Version 0.231
SYNOPSIS
$ perldoc App::CELL::Guide
INTRODUCTION
App::CELL is the Configuration, Error-handling, Localization, and Logging (CELL) framework for applications written in Perl. In the "GENERAL APPROACH" section, this Guide describes the CELL approach to each of these four areas, separately. Then, in the "RATIONALE" section, it presents the author's reasons for bundling them together.
HISTORY
The original App::CELL was written by Nathan Cutler in 2013 and 2014, initially as part of the App::Dochazka::REST project. Later, with a view to its generic nature, it was spun off into a separate project.
GENERAL APPROACH
This section presents CELL's approach to each of its four principal functions: "Configuration", "Error handling", "Localization", and "Logging".
Approach to configuration
CELL provides the application developer and site administrator with a straightforward and powerful way to define configuration parameters as needed by the application. If you are familiar with Request Tracker, you will know that there is a directory (/opt/... by default) which contains two files, called RT_Config.pm and RT_SiteConfig.pm -- as their names would indicate, they are actually Perl modules. The former is provided by the upstream developers and contains all of RT's configuration parameters and their "factory default" settings. The content of the latter is entirely up to the RT site administrator and contains only those parameters that need to be different from the defaults. Parameter settings in RT_SiteConfig.pm, then, override the defaults set in RT_Config.pm.
App::CELL provides this same functionality in a drop-in Perl module, with some subtle differences. While RT uses a syntax like this:
set( 'MY_PARAM', ...arguments...);
where ...arguments... is a list of scalar values (as with any Perl subroutine), App::CELL uses a slightly different format:
set( 'MY_PARAM', $scalar );
where $scalar can be any scalar value, i.e. including references.
(Another difference is that App::CELL provides both immutable site parameters _and_ mutable meta configuration parameters, whereas RT's meta parameters are only used by RT itself.) For more information on configuration, see "Configuration in depth".
Error handling
To facilitate error handling and make the application's source code easier to read and understand, or at least mitigate its impenetrability, CELL provides the App::CELL::Status module, which enables functions in the application to return status objects if desired.
Status objects have the following principal attributes: level, code, args, and payload, which are given by the programmer when the status object is constructed, as well as attributes like text, lang, and caller, which are derived by CELL. In addition to the attributes, Status.pm also provides some useful methods for processing status objects.
In order to signify an error, subroutine foo_dis could for example do this:
return $CELL->status_err( code => 'Gidget displacement %s out of range',
args => [ $displacement ],
);
(Instead of having the error text in the code, it could be placed in a message file in the sitedir with a code like DISP_OUT_OF_RANGE.)
On success, foo_dis could return an 'OK' status with the gidget displacement value in the payload:
return $CELL->status_ok( payload => $displacement );
The calling function could check the return value like this:
my $status = foo_dis();
return $status if $status->not_ok;
my $displacement = $status->payload;
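The pattern can be tried out without CELL installed. What follows is only a minimal sketch in plain Perl: the package My::Status is a hypothetical stand-in and not App::CELL::Status, the permitted range of 0 to 100 is made up, and filling the %s placeholder with sprintf is an assumption about how the text is derived.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a status class; NOT App::CELL::Status.
package My::Status;

sub new {
    my ( $class, %args ) = @_;
    my $self = { level => 'OK', args => [], %args };
    # Assumption: the text is the code with its %s placeholders filled in.
    $self->{text} = sprintf( $self->{code}, @{ $self->{args} } )
        if defined $self->{code};
    return bless $self, $class;
}

sub ok      { return $_[0]->{level} eq 'OK' }
sub not_ok  { return !$_[0]->ok }
sub level   { return $_[0]->{level} }
sub text    { return $_[0]->{text} }
sub payload { return $_[0]->{payload} }

package main;

# Returns a status object in both the success and the failure case.
sub foo_dis {
    my ($displacement) = @_;
    return My::Status->new(
        level => 'ERR',
        code  => 'Gidget displacement %s out of range',
        args  => [$displacement],
    ) if $displacement < 0 || $displacement > 100;    # made-up range
    return My::Status->new( payload => $displacement );
}

my $good = foo_dis(42);
my $bad  = foo_dis(250);
print $good->payload, "\n" unless $good->not_ok;
print $bad->level, ': ', $bad->text, "\n" if $bad->not_ok;
```

The caller never needs to trap an exception: it inspects the returned object and either passes it up the call chain or unpacks the payload.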
For details, see App::CELL::Status and App::CELL::Message.
CELL's error-handling logic is inspired by brian d foy's article "Return error objects instead of throwing exceptions"
L<http://www.effectiveperlprogramming.com/2011/10/return-error-objects-instead-of-throwing-exceptions/>
Localization
This CELL component, called "Localization", gives the programmer a way to encapsulate a "message" (in its simplest form, a string) within a message object and then use that object in various ways.
So, provided the necessary message files have been loaded, the programmer could do this:
my $message = $CELL->message( code => 'FOOBAR' );
print $message->text, "\n"; # message FOOBAR in the default language
print $message->text( lang => 'de' ); # same message, in German
Messages are loaded when CELL is initialized, from files in the site configuration directory. Each file contains messages in a particular language. For example, the file Dochazka_Message_en.conf contains messages relating to the Dochazka application, in the English language. To provide the same messages in German, the file would be copied to Dochazka_Message_de.conf and translated.
Since message objects are used by App::CELL::Status, it is natural for the programmer to put error messages, warnings, etc. in message files and refer to them by their codes.
App::CELL::Message could also be extended to provide methods for encrypting messages and/or converting them into various target formats (JSON, HTML, Morse code, etc.).
For details, see "Localization in depth" and App::CELL::Message.
Logging
For logging, CELL uses Log::Any and optionally extends it by adding the caller's filename and line number to each message logged.
Message and status objects have 'log' methods, of course, and by default all statuses (except 'OK') are logged upon creation.
Here's how to set up (and do) logging in the application:
use App::CELL::Log qw( $log );
$log->init( ident => 'AppFoo' );
$log->debug( "Three lines into AppFoo" );
App::CELL::Log provides its own singleton, but since all method calls are passed to Log::Any, anyway, the App::CELL::Log singleton behaves just like its Log::Any counterpart. This is useful, e.g., for testing log messages:
use Log::Any::Test;
$log->contains_only_ok( "Three lines into AppFoo" );
To actually see your log messages, you have to do something like this:
use Log::Any::Adapter ( 'File', $ENV{'HOME'} . '/CELLtest.log' );
or, even simpler:
use Log::Any::Adapter ( 'Stderr' );
DETAILED SPECIFICATIONS
Configuration
Three types of parameters
CELL recognizes three types of configuration parameters: meta, core, and site. These parameters and their values are loaded from files prepared and placed in the sitedir in advance.
Meta parameters
Meta parameters are by definition mutable: the application can change a meta parameter's value any number of times, and App::CELL will not care. Initial meta param settings are placed in a file entitled $str_MetaConfig.pm (where $str is a string free of underscore characters) in the sitedir. For example, if the application name is FooApp, its initial meta parameter settings could be contained in a file called FooApp_MetaConfig.pm. At initialization time, App::CELL looks in the sitedir for files matching this description, and attempts to load them. (See "How configuration files are named".)
Core parameters
As in Request Tracker, core parameters have immutable values and are intended to be used as "factory defaults", set by the developer, that the site administrator can override by setting site parameters. If the application is called FooApp, its core configuration settings could be contained in a file called FooApp_Config.pm located in the sitedir. (See "How configuration files are named" for details.)
Site parameters
Site parameters are kept separate from core parameters, but are closely related to them. As far as the application is concerned, there are only site parameters. How this works is best explained by two examples.
Let FOO be an application that uses App::CELL.
In the first example, core param FOO is set to "Bar" and site param FOO is not set at all. When the application calls $site->FOO, the core parameter value "Bar" is returned.
In the second example, the core param FOO is set to "Bar" and site param FOO is also set, but to a different value: "Whizzo". In this scenario, when the application calls $site->FOO, the site parameter value ("Whizzo") is returned.
This setup allows the site administrator to customize the application.
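The lookup rule in the two examples can be modelled in a few lines of plain Perl. This is only a sketch: the hashes %core and %site and the function site_param are hypothetical stand-ins for CELL's internal storage and for the $site accessor, and the BAZ parameter is made up to show the fallback case.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-ins for the loaded core and site parameters.
my %core = ( FOO => 'Bar', BAZ => 'Default' );
my %site = ( FOO => 'Whizzo' );

# Return the site value if one is set, otherwise fall back to the core value.
sub site_param {
    my ($name) = @_;
    return $site{$name} if exists $site{$name};
    return $core{$name};
}

print site_param('FOO'), "\n";    # a site value is set, so it wins
print site_param('BAZ'), "\n";    # no site value, so the core default is used
```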
Site parameters are set in a file called $str_SiteConfig.pm, where $str could be the appname.
Conclusion
How these three types of parameters are defined and used is up to the application. As far as App::CELL is concerned, they are all optional.
App::CELL itself has its own internal meta, core, and site parameters, but these are located elsewhere -- in the so-called "sharedir", a directory that is internal to the App::CELL distro/package.
All these internal parameters start with CELL_ and are stored in the same namespaces as the application's parameters. That means the application programmer should avoid using parameters starting with CELL_.
Where configuration files are located
sitedir
Configuration parameters are placed in specially-named files within a directory referred to by App::CELL as the "site configuration directory", or "sitedir". This directory is not a part of the App::CELL distribution and App::CELL does not create it. Instead, the application is expected to provide the full path to this directory to CELL's initialization routine, either via an argument to the function call or with the help of an environment variable. CELL's initialization routine calls App::CELL::Load::init to do the actual work of walking the directory.
This "sitedir" (site configuration directory) is assumed to be the place (or a place) where the application can store its configuration information in the form of core, site, and meta parameters. For "LOCALIZATION" purposes, message codes and their corresponding texts (in one or more languages) can be stored here as well, if desired.
sharedir
CELL itself has an analogous configuration directory, called the "sharedir", where its own internal configuration defaults are stored. CELL's own core parameters can be overridden by the application's site params, and in some cases this can even be desirable. For example, the parameter CELL_DEBUG_MODE can be overridden in the site configuration to tell CELL to include debug-level messages in the log.
During initialization, CELL walks first the sharedir, and then the sitedir, looking through those directories and all their subdirectories for meta, core, site, and message configuration files.
The sharedir is part of the App::CELL distro and CELL's initialization routine finds it via a call to the dist_dir routine in the File::ShareDir module.
How the sitedir is specified
The sitedir must be created and populated with configuration files by the application programmer. Typically, this directory would form part of the application distro and the site administrator would be expected to make a site configuration file for application-specific parameters. The application developer and site administrator have flexibility in this regard -- CELL's initialization routine, $CELL->load, will work without a sitedir, with one sitedir, or even with multiple sitedirs.
No sitedir
It is possible, but probably not useful, to call $CELL->load without a sitedir parameter and without any sitedir specified in the environment. In this case, CELL just loads the sharedir and returns OK.
One sitedir
If there is only one sitedir, there are three possible ways to specify it to CELL's load routine: (1) a sitedir parameter, (2) an enviro parameter, or (3) the hard-coded CELL_SITEDIR environment variable.
Multiple sitedirs
If the application needs to load configuration parameters from multiple sitedirs, this can be accomplished simply by calling $CELL->load multiple times with different sitedir arguments.
Sitedir search algorithm
Every time it is called, the load routine uses the following algorithm to search for a/the sitedir:
- 1. sitedir parameter -- a sitedir parameter containing the full path to the sitedir can be passed. If it is present, CELL will try it first. If needed for portability, the path can be constructed using File::Spec (e.g. the catfile method) or similar. It should be a string containing the full path to the directory. If the sitedir argument points to a valid sitedir, it is loaded and OK is returned. If a sitedir argument is present but invalid, an ERR status results. If no sitedir argument was given, CELL continues to the next step.
- 2. enviro parameter -- if no sitedir parameter is given, $CELL->load looks for a parameter called enviro, which it interprets as the name of an environment variable containing the sitedir path. If the enviro argument points to a valid sitedir, it is loaded and OK is returned. If an enviro argument is present but invalid, an ERR status results. If there is no enviro argument at all, CELL continues to the next step.
- 3. CELL_SITEDIR environment variable -- if no viable sitedir can be found by consulting the function call parameters, CELL's load routine falls back to this hardcoded environment variable. If the CELL_SITEDIR environment variable exists and points to a valid sitedir, it is loaded and OK is returned. If it exists but the directory is invalid, an ERR status is returned. If the environment variable doesn't exist, CELL writes a warning to the log (all attempts to find the sitedir failed). The return status in this case can be either WARN (if no sitedir was found in a previous call to the function) or OK if at least one sitedir has been loaded.
The load routine is re-entrant: it can be called any number of times. On first call, it will load CELL's own sharedir, as well as any sitedir that can be found using the above algorithm. All further calls will just run the sitedir search algorithm again. Each time it will find and load at most one sitedir. CELL maintains a list of loaded sitedirs in $meta->CELL_META_SITEDIR_LIST.
For examples of how to call the load routine, see "SYNOPSIS" in App::CELL.
How configuration files are named
Once it finds a valid sitedir, CELL walks it (including all its subdirectories), assembling a list of filenames matching one of four regular expressions:
^.+_MetaConfig.pm$ (meta)
^.+_Config.pm$ (core)
^.+_SiteConfig.pm$ (site)
^.+_Message(_[^_]+){0,1}.conf$ (message)
Files with names that don't match any of the above regexes are ignored.
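To see how a given file name would be classified, the four patterns can be applied in a short plain-Perl sketch. The function classify and the sample file names are hypothetical; the regular expressions are copied verbatim from the list above, including the unescaped dots.

```perl
use strict;
use warnings;

# The four patterns, copied verbatim from the list above.
my @patterns = (
    [ meta    => qr/^.+_MetaConfig.pm$/ ],
    [ core    => qr/^.+_Config.pm$/ ],
    [ site    => qr/^.+_SiteConfig.pm$/ ],
    [ message => qr/^.+_Message(_[^_]+){0,1}.conf$/ ],
);

# Return the file type for a bare file name, or undef if it would be ignored.
sub classify {
    my ($filename) = @_;
    for my $entry (@patterns) {
        my ( $type, $regex ) = @$entry;
        return $type if $filename =~ $regex;
    }
    return;
}

for my $file (
    qw( FooApp_MetaConfig.pm FooApp_Config.pm FooApp_SiteConfig.pm
        FooApp_Message_en.conf README.txt )
    )
{
    printf "%-24s %s\n", $file, classify($file) // 'ignored';
}
```

Because each pattern requires an underscore immediately before its keyword, a name such as FooApp_SiteConfig.pm matches only the site pattern and not the core one.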
After the directory is walked, the files are loaded (i.e. parsed for config params and messages).
The syntax of these files is simple and should be obvious from an examination of CELL's own configuration files in the sharedir (config/ in the distro). All four types of configuration file are there, with comments.
Since the configuration files are Perl modules, Perl itself is leveraged to parse them. Values can be any legal scalar value, so references to arrays, hashes, or subroutines can be used, as well as simple numbers and strings. For details, see "SITE CONFIGURATION DIRECTORY", App::CELL::Config and App::CELL::Load.
Message file parsing is done by a parsing routine that resides in App::CELL::Load. For details on the syntax and how the parser works, see LOCALIZATION.
Configuration diagnostics
CELL provides several ways for the application to find out if the configuration files were loaded properly. First of all, the load routine ($CELL->load) returns a status object: if the status is not OK, something went wrong and the application should look at the status more closely.
After program control returns from the load routine, the following methods and attributes can be used to find out what happened:
$site->CELL_SHAREDIR_LOADED (boolean value)
$meta->CELL_META_SITEDIR_LOADED (boolean value: true if at least one sitedir has been loaded)
$meta->CELL_META_SITEDIR_LIST (reference to a list of all sitedirs that have been loaded -- full paths)
Verbose and debug mode
The load routine takes two options to increase its verbosity. The first option, verbose, can be passed like this:
my $status = $CELL->load( verbose => 1 );
It causes the load routine to write additional information to the log. Since even this can easily be too much, the default value for verbose is zero (terse logging).
The load routine also has a debug mode which should be activated in combination with verbose. Debug mode is actually a function of the CELL logger, and is activated like this:
$log->init( debug_mode => 1 );
Ordinarily the logger suppresses all log messages below info level (i.e., debug and trace). When debug_mode is activated, all messages are logged, regardless of level.
Error handling
STATUS OBJECTS
The most frequent case will be a status code of "OK" with no message (shown here with optional "payload", which is whatever the function is supposed to return on success):
# all green
return App::CELL::Status->new( level => 'OK',
payload => $my_return_value,
);
To ensure this is as simple as possible in cases when no return value (other than the simple fact of an OK status) is needed, we provide a special constructor method:
# all green
return App::CELL::Status->ok;
In most other cases, we will want the status message to be linked to the filename and line number where the new method was called. If so, we call the method like this:
# relative to me
App::CELL::Status->new( level => 'ERR',
code => 'CODE1',
args => [ 'foo', 'bar' ],
);
It is also possible to report the caller's filename and line number:
# relative to my caller
App::CELL::Status->new( level => 'ERR',
code => 'CODE1',
args => [ 'foo', 'bar' ],
caller => [ CORE::caller() ],
);
It is also possible to pass a message object in lieu of code and args (this could be useful if we already have an appropriate message on hand):
# with pre-existing message object
App::CELL::Status->new( level => 'ERR',
msg_obj => $my_msg,
);
Permitted levels are listed in the @permitted_levels package variable in App::CELL::Log.
Localization
Introduction
To an application programmer, localization may seem like a daunting proposition. All strings the application displays to users must be replaced by variable names. Then you have to figure out where to put all the strings, translate them into multiple languages, and write a library (or find an existing one) to display the right string in the right language at the right time and place. What is more, the application must be configurable, so the language can be changed by the user or the site administrator.
All of this is a lot of work, particularly for already existing, non-localized applications, but even for new applications designed from the start to be localizable.
App::CELL's objective is to provide a simple, straightforward way to write and maintain localizable applications in Perl. Notice the key word "localizable" -- the application may not, and most likely will not, be localized in the initial stages of development, but that is the time when localization-related design decisions need to be made. App::CELL tries to take some of the guesswork out of those decisions.
Later, when it really is time for the application to be translated into one or more additional languages, this becomes a relatively simple matter of translating a bunch of text strings that are grouped together in one or more configuration files with syntax so trivial that no technical expertise is needed to work with them. (Often, the person translating the application is not herself technically inclined.)
Localization with App::CELL
All strings that may potentially need to be localized (even if we don't have them translated into other languages yet) are placed in message files under the site configuration directory. In order to be found and parsed by App::CELL, message files must meet some basic conditions:
- 1. file name format: AppName_Message_lang.conf
- 2. file location: anywhere under the site configuration directory
- 3. file contents: must be parsable
Format of message file names
At initialization time, App::CELL walks the site configuration directory tree looking for filenames that meet certain regular expressions. The regular expression for message files is:
^.+_Message(_[^_]+){0,1}.conf$
In less-precise human terms, this means that the initialization routine looks for filenames consisting of at least three, but possibly four, components:
- 1. the application name (this can be anything)
- 2. followed by _Message
- 3. optionally followed by _languagetag, where "languagetag" is a language tag (see "..link.." for details)
- 4. ending in .conf
Examples:
CELL_Message.conf
CELL_Message_en.conf
CELL_Message_cs-CZ.conf
DifferentApplication_Message.conf
Location of message files
As noted above, message files will be found as long as they are readable and located anywhere under the base site configuration directory. For details on how this base site configuration directory is searched for and determined, see "..link..".
How message files are parsed
Message files are parsed line-by-line. The parser routine is parse_message_file in the App::CELL::Load module. Lines beginning with a hash sign ('#') are ignored. The remaining lines are divided into "stanzas", which must be separated by one or more blank lines.
Stanzas are interpreted as follows: the first line of the stanza should contain a message code, which is simply a string. Any legal Perl scalar value can be used, as long as it doesn't contain white space. CELL itself uses ALL_CAPS strings starting with CELL_.
The remaining lines of the stanza are assumed to be the message text. Two caveats here:
- 1. In the configuration file, message text strings can be written on multiple lines.
- 2. However, this is intended purely as a convenience for the application programmer. When parse_message_file encounters multiple lines of text, it simply concatenates them together to form a single, long string.
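The stanza rules can be illustrated with a small plain-Perl sketch. The function parse_messages is hypothetical and is not the parse_message_file routine itself; in particular, joining continuation lines with a single space is an assumption, since the text above only says that the lines are concatenated. The sample message codes and texts are made up.

```perl
use strict;
use warnings;

# Hypothetical parser modelling the rules described above.
# Takes the file contents as one string; returns a hash ref of code => text.
sub parse_messages {
    my ($content) = @_;
    my ( %messages, $code );
    for my $line ( split /\n/, $content ) {
        next if $line =~ /^#/;      # comment lines are ignored
        if ( $line !~ /\S/ ) {      # a blank line ends the current stanza
            undef $code;
            next;
        }
        $line =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        if ( !defined $code ) {     # first line of a stanza: the message code
            $code = $line;
            $messages{$code} = '';
        }
        else {                      # remaining lines: the message text
            # Assumption: continuation lines are joined with a single space.
            $messages{$code} .= ' ' if length $messages{$code};
            $messages{$code} .= $line;
        }
    }
    return \%messages;
}

my $sample = <<'EOF';
# Sample message file
DISP_OUT_OF_RANGE
Gidget displacement %s
is out of range

FOOBAR
Foo is bar
EOF

my $messages = parse_messages($sample);
print "$_ => $messages->{$_}\n" for sort keys %$messages;
```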
For details, see the parse_message_file function in App::CELL::Load, as well as App::CELL's own message file(s) in the config/CELL directory of the App::CELL distro.
How the language is determined
Internally, each message text string is stored along with a language tag, which defines which language the message text is written in. The language tag is derived from the filename using a regular expression like this one:
_Message_([^_]+).conf$
(The part in parentheses signifies the part between _Message_ and .conf -- this is stored in the language attribute of the message object.)
No sanity checks are conducted on the language tag. Whatever string the regular expression produces becomes the language tag for all messages in that file. If no language tag is found, CELL first looks for a config parameter called CELL_DEFAULT_LANGUAGE and, failing that, the hard-coded fallback value is en.
I'll repeat that, since it's important: CELL assumes that the message file names contain the relevant language tag. If the message file name is MyApp_Message_foo-bar.conf, then CELL will tag all messages in that file as being in the foo-bar language. Message files can also be named like this: MyApp_Message.conf, i.e. without a language tag. In this case, CELL will attempt to determine the default language from a site configuration parameter (CELL_DEFAULT_LANGUAGE). If this parameter is not set, then CELL will give up and assume that all message text strings are in English (language tag en -- CELL's author's native tongue).
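The derivation of the language tag can be sketched in plain Perl. The function language_tag is hypothetical; its second argument stands in for whatever default language the site configuration provides, and the regular expression is the one quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper: derive the language tag from a message file name.
# $default stands in for the configured default language, if there is one.
sub language_tag {
    my ( $filename, $default ) = @_;
    return $1 if $filename =~ /_Message_([^_]+).conf$/;
    return $default // 'en';    # hard-coded fallback when nothing is configured
}

print language_tag('MyApp_Message_foo-bar.conf'), "\n";    # tag from file name
print language_tag( 'MyApp_Message.conf', 'cs' ), "\n";    # configured default
print language_tag('MyApp_Message.conf'), "\n";            # hard-coded fallback
```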
Language tags in general
See the W3C's "Language tags in HTML and XML" white paper for a detailed explanation of language tags:
L<http://www.w3.org/International/articles/language-tags/>
And see here for a list of all language tags:
L<http://www.langtag.net/registries/lsr-language.txt>
Note that you should use hyphens, and not underscores, to separate components within the language tag, i.e.:
MyApp_Message_cs-CZ.conf # correct
MyApp_Message_cs_CZ.conf # WRONG!!
Non-ASCII characters in config/message file names: may or may not work. Better to avoid them.
Normal usage
In normal usage, the programmer adds messages to the respective message files. After CELL initialization, these messages (or, more precisely, message code-language pairs) will be available to the programmer to use, either directly via App::CELL::Message->new or indirectly as status codes.
If a message code has text strings in multiple languages, these language variants can be obtained by specifying the lang parameter to App::CELL::Message->new. If the lang parameter is not specified, CELL will always try to use the default language (CELL_DEF_LANG or English if that parameter has not been set).
Logging
CELL's logging facility is based on Log::Any. In practice, this means that App::CELL::Log is simply a wrapper around this useful module. To use it, one imports the Log::Any singleton via App::CELL like this:
use App::CELL qw( $log );
Since this is the Log::Any singleton, all Log::Any methods can be used with it. CELL provides some conveniences, but they are optional. Actually, if the developer does not intend to use any of CELL's conveniences, there is no reason to import it through App::CELL at all and one can use Log::Any directly. In this case, CELL's log messages will go to the same log as the application's, provided the Log::Any category is the same as the CELL appname.
See "Verbose and debug mode" for a description of how to increase logging verbosity of the load routine.
CAVEATS
Internal parameters
App::CELL stores its own parameters (mostly meta and core, but also one site param) in a separate directory, but when loaded they end up in the same namespaces as the application's meta, core, and site parameters. The names of these internal parameters are always prefixed with CELL_.
Therefore, the application programmer should avoid using parameters starting with CELL_.
Mutable and immutable parameters
It is important to realize that, although core parameters can be overridden by site parameters, internally the values of both are immutable. Although it is possible to change them by cheating, the 'set' method of $core and $site will refuse to change the value of an existing core/site parameter.
Therefore, use $meta to store mutable values.
Taint mode
Since it imports configuration data at runtime from files supplied by the user, App::CELL should not be run under taint mode. The load routine checks this and will refuse to do anything if running with -T.
To recapitulate: don't run App::CELL in taint mode.
Installation issues with CELL internal sharedir
The easiest way to install App::CELL is to use a package manager (e.g. zypper). Another way is to install directly from CPAN using, e.g., cpanm. The former way installs to the vendor_perl tree, while the latter installs to the site_perl tree.
If you install two different versions of App::CELL, one via package manager and another directly from CPAN, a conflict can arise, and it may be necessary to examine CELL's log to determine which one is being used.
Even after running, e.g., cpanm -U App::CELL to uninstall from site_perl, I found that CELL's internal sharedir remained intact in the site_perl tree and had to be wiped manually.
As long as you always install either one way or the other (i.e. package manager or direct from CPAN), you won't get bitten by this.
COMPONENTS
App::CELL
This top-level module exports a singleton, $CELL, which is all the application programmer needs to gain access to CELL's key functions.
App::CELL::Config
This module provides CELL's Configuration functionality.
App::CELL::Guide
This guide.
App::CELL::Load
This module hides all the complexity of loading messages and config params from files in two directories: (1) the App::CELL distro sharedir containing App::CELL's own configuration, and (2) the site configuration directory, if present.
App::CELL::Log
Logging is accomplished by using and extending Log::Any.
App::CELL::Message
Localization is on the wish-list of many software projects. With CELL, the programmer can easily design and write an application to be localizable from the very beginning, without having to invest much effort.
App::CELL::Status
Provides CELL's error-handling functionality. Since status objects inherit from message objects, the application programmer can instruct CELL to generate localized status messages (errors, warnings, notices) if desired.
App::CELL::Test
Some routines used by CELL's test suite.
App::CELL::Util
Some generalized utility routines.
RATIONALE
In the author's experience, applications written for "users" (however that term may be defined) frequently need to:
- 1. be configurable by the user or site administrator
- 2. handle errors robustly, without hangs and crashes
- 3. potentially display messages in various languages
- 4. log various types of messages to syslog
Since these basic functions seem to work well together, CELL is designed to provide them in an integrated, well-documented, straightforward, and reusable package.