NAME
File::Tabular::Web - turn tabular files into web applications
SYNOPSIS
# start a local HTTP server
plackup -MFile::Tabular::Web -e "File::Tabular::Web->new->to_app"
# create an application scaffolding from a tabular file
perl ftw_new_app.pl path/to/some/data.txt
# use the app
http://localhost:5000/path/to/some/data.ftw?S=foo
http://localhost:5000/path/to/some/data.ftw?S=col1:bar*
http://localhost:5000/path/to/some/data.ftw?S=col2 < 123 AND col3 ~ \w\d
http://localhost:5000/path/to/some/data.ftw?L=id_of_some_record
# not displayed here
# - POST URLs to edit the data
# - integration in a real application server instead of localhost
# customize the app -- no programming involved
edit path/to/some/{data_short.tt,data_long.tt,data_edit.tt} # views
edit path/to/some/data.ftw # config
DESCRIPTION
This is a simple web application framework for searching, displaying and updating data from flat tabular files.
The framework is based on File::Tabular and Search::QueryParser for searching and editing facilities, and on Plack middleware for Web support. As a result, it will run on any Plack-supported infrastructure, like CGI, FCGI, modperl, or a local HTTP server launched from the command line through the plackup utility.
The strong point of File::Tabular::Web is that it is built around a versatile search engine, convenient for Web-style queries: this search engine spans all data fields by default, but can also retrieve words in specific fields, find prefixes, apply regular expressions, compare numerical values, use boolean combinations, etc. All of that power is available directly to all applications within the framework, without any programming. To build a new application, all that is needed is to invoke the ftw_new_app.pl script, which will create some scaffolding templates for searching, displaying and editing your data. The application is immediately usable; the templates can be customized to improve the look and feel, and the configuration file can be edited to tune some aspects like access control ... but no Perl code is needed, at least not for common needs. So if you are looking for simplicity and speed of development and deployment, and are ready to sacrifice some speed of execution, then you may have found a convenient tool.
This framework has been used successfully for about 15 years in our Intranet for managing lists of people, rooms, meetings, links, etc., and even for more sensitive information like lists of payments or the archived judgements (minutes) of Geneva courts. Of course this technology is much slower than a real database, but if the data is not too big and the frequency of requests is not too high, it can be a perfectly viable solution.
See also File::Tabular::Web::Attachments and File::Tabular::Web::Attachments::Indexed for subclasses that extend the framework with methods for managing documents attached to data fields.
QUICKSTART
HTTP server configuration
File::Tabular::Web is designed so that it only needs to be installed once and for all in your HTTP server configuration. Then all applications can be added or modified on the fly, without restarting the server.
For this to work, you need to tell the HTTP server which URLs are going to be served by File::Tabular::Web. Although there are several ways to achieve this, the recommended way is to choose a file extension to be associated with this module and define a general rule performing the mapping. Here is an example using Apache, Plack and mod_perl :
<LocationMatch "\.ftw$">
SetHandler perl-script
PerlResponseHandler Plack::Handler::Apache2
PerlSetVar psgi_app /path/to/ftw.psgi
</LocationMatch>
where /path/to/ftw.psgi is a path to a simple PSGI file containing the following code :
use File::Tabular::Web;
my $app = File::Tabular::Web->new->to_app;
Once this is configured, any URL ending in .ftw will be served by File::Tabular::Web.
The example above just gives the general idea; similar configurations can be obtained with FCGI or other architectures, setting the rules either within the HTTP server or within the Plack middleware; see your web server documentation and the Plack documentation.
For development purposes, an application server can be started from the command line, thanks to the Plack infrastructure :
plackup /path/to/ftw.psgi
or the ftw.psgi file can even be dispensed with, through the command
plackup -MFile::Tabular::Web -e "File::Tabular::Web->new->to_app"
Setting up a particular application
An application consists of a data file, a configuration file and a couple of template files. In the simplest setting, all of these should be located in the same directory, at some path under an app_root directory, which by default is the same as the DOCUMENT_ROOT of your HTTP server. The data file is all you need to get started; the other files will be generated automatically.
We will show this through the example of a simple people directory application, assuming an Apache server where the document root is the htdocs directory. If you want to try with a local server instead, using the plackup command shown above, your document root is the current directory.
First create directory htdocs/people.
Let's assume that you already have a list of people, in a spreadsheet or a database. Export that list into a flat text file named htdocs/people/dir.txt. If you export from an Excel Spreadsheet, do NOT export as CSV format ; choose "text (tab-separated)" instead. The datafile should contain one line per record, with a character like '|' or TAB as field separator, and field names on the first line (see File::Tabular for details).
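If the list lives in a Perl program rather than in a spreadsheet, the data file can also be produced by a few lines of code. The sketch below only illustrates the layout described above (field names on the first line, one record per line, TAB as separator); the helper name write_tabular_file, the field names and the sample values are invented for this example, and the file is written to a temporary directory instead of htdocs/people.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (not part of File::Tabular::Web): writes a list of
# hashrefs as a tab-separated file with the field names on the first line.
sub write_tabular_file {
    my ($path, $fields, $records) = @_;
    open my $fh, '>', $path or die "cannot write $path: $!";
    print {$fh} join("\t", @$fields), "\n";
    for my $rec (@$records) {
        # Tabs or newlines inside a value would break the layout: flatten them.
        my @values = map { my $v = defined $_ ? $_ : ''; $v =~ s/[\t\r\n]+/ /g; $v }
                     @{$rec}{@$fields};
        print {$fh} join("\t", @values), "\n";
    }
    close $fh or die "cannot close $path: $!";
    return $path;
}

# Sample data; field names and values are invented for this example.
my @fields  = qw(Id lastname firstname room);
my @records = (
    { Id => 1, lastname => 'Smith', firstname => 'Anna', room => 'B12' },
    { Id => 2, lastname => 'Jones', firstname => 'Bob',  room => 'A03' },
);

# In a real setup the target would be htdocs/people/dir.txt.
my $dir  = tempdir(CLEANUP => 1);
my $path = write_tabular_file("$dir/dir.txt", \@fields, \@records);
print "wrote $path\n";
```

A file written this way matches the --fieldSep \\t option used in the next step.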
Run the helper script
perl ftw_new_app.pl --fieldSep \\t htdocs/people/dir.txt
This will create in the same directory a configuration file dir.ftw, and a collection of HTML templates dir_short.tt, dir_long.tt, dir_modif.tt, etc. The --fieldSep option specifies which character acts as field separator (the default is '|'); other options are available, see perl ftw_new_app.pl --help for a list.
The URL
http://your.web.server/people/dir.ftw
is now available to access the application, ready for searching, displaying, and maybe editing the data. You may first test the default layout, and then customize the templates to suit your needs. The templating language is documented in the Template Toolkit's documentation.
Note : initially all files are placed in the same directory, because it is simple and convenient; however, data and template files are not really web resources and therefore theoretically should not belong to the htdocs tree. If you want a more structured architecture, you may move these files to a different location, and specify within the configuration how to find them (see instructions below).
In most cases, the steps just shown will be sufficient, so they can be performed by a webmaster without Perl knowledge.
For more advanced uses, application-specific Perl subclasses can be hooked up into the framework for performing particular tasks. See for example the companion File::Tabular::Web::Attachments module, which provides services for attaching documents and indexing them through Search::Indexer, therefore providing a mini-framework for storing electronic documents.
WEB API
Entry points
Various entry points into the application (searching, editing, etc.) are chosen by single-letter arguments :
H
http://myServer/some/app.ftw?H
Displays the homepage of the application (through the home view). This is the default entry point, i.e. equivalent to
http://myServer/some/app.ftw
S
http://myServer/some/app.ftw?S=<criteria>
Searches records matching the specified criteria, and displays a short summary of each record (through the short view). Here are some examples of search criteria :
word1 word2 word3 # records containing these 3 words anywhere
+word1 +word2 +word3 # idem
word1 word2 -word3 # containing word1 and word2 but not word3
word1 AND (word2 OR word3) # obvious
"word1 word2 word3" # sequence
word* # word completion
field1:word1 field2:word2 # restricted by field
field1 == val1 field2 > val2 # relational operators (will inspect the
# shape of supplied values to decide
# about string/numeric/date comparisons)
field~regex # regex
See Search::QueryParser and File::Tabular for more details.
Additional parameters may control sorting and pagination. Ex:
?S=word&orderBy=birthdate:-d.m.y,lastname:alpha&count=20&start=40
- count
-
How many items to display on one page. Default is 50.
- start
-
Index within the list of results, telling which is the first record to display (basis is 0).
- orderBy
-
How to sort results. This may be one or several field names, possibly followed by a specification like :num or :-alpha. Precise syntax is documented in "cmp" in Hash::Type.
- max
-
Maximum number of records retrieved in a search (records beyond that number will be dropped).
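Search criteria often contain spaces, '<' or backslashes, which need percent-encoding when a URL is assembled by a program. Here is a minimal sketch of a URL builder; the helper names uri_escape_simple and ftw_search_url are hypothetical and not part of the module, and the encoding is hand-rolled so that only core Perl is needed.

```perl
use strict;
use warnings;

# Percent-encode everything except RFC 3986 unreserved characters.
# Assumes byte strings: encode to UTF-8 first if criteria hold wide characters.
sub uri_escape_simple {
    my ($string) = @_;
    $string =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
    return $string;
}

# Hypothetical helper: builds a search URL from criteria plus the optional
# sorting and pagination parameters documented above.
sub ftw_search_url {
    my ($app_url, $criteria, %opt) = @_;
    my @pairs = ('S=' . uri_escape_simple($criteria));
    for my $name (qw(orderBy count start max)) {
        next unless defined $opt{$name};
        push @pairs, "$name=" . uri_escape_simple($opt{$name});
    }
    return $app_url . '?' . join('&', @pairs);
}

print ftw_search_url(
    'http://myServer/some/app.ftw',
    'col2 < 123 AND col3 ~ \w\d',
    orderBy => 'lastname:alpha',
    count   => 20,
    start   => 40,
), "\n";
```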
L
http://myServer/some/app.ftw?L=<key>
Finds the record with the given key and displays it in detail through the long view.
M
http://myServer/some/app.ftw?M=<key>
If called with method GET, finds the record with the given key and displays it through the modif view (typically this view will be an HTML form).
If called with method POST, finds the record with the given key and updates it with given field names and values. After update, displays an update message through the msg view.
A
http://myServer/some/app.ftw?A
If called with method GET, displays a form for creating a new record, through the modif view. Fields may be pre-filled by default values given in the configuration file.
If called with method POST, creates a new record, with values given by the submitted form. After record creation, displays an update message through the msg view.
D
http://myServer/some/app.ftw?D=<key>
Deletes the record with the given key. After deletion, displays an update message through the msg view.
X
http://myServer/some/app.ftw?X
Displays all records through the download view (mnemonic : eXtract).
Additional parameters
V
Name of the view (i.e. template) that will be used for displaying results. For example, assuming that the application has defined a print view, we can call that view through
http://myServer/some/app.ftw?S=<criteria>&V=print
WRITING TEMPLATES
This section assumes that you already know how to write templates for the Template Toolkit (see Template).
The path for searching templates includes :
- the application directory (where the configuration file resides)
- the directory specified within the configuration file by parameter [template]dir
- some default directories: <app_root>/../lib/tmpl/ftw/<application_name>, <app_root>/../lib/tmpl/ftw/<default>, <app_root>/../lib/tmpl/ftw.
Values passed to templates
self
-
handle to the File::Tabular::Web object; from there you can access self.url (URL of the application), self.app_root (root dir for applications, by default equal to DOCUMENT_ROOT), self.cfg (configuration information, an AppConfig object), self.mtime (modification time of the data file), and self.msg (last message). You can also call the methods "can_do" or "param", for example
[% IF self.can_do('add') %]
<a href="?A">Add a new record</a>
[% END # IF %]
or
[% self.param('myFancyParam') %]
found
-
structure containing the results of a search. Fields within this structure are :
count
-
how many records were retrieved
records
-
arrayref containing a slice of records
start
-
index of first record in the returned slice
end
-
index of last record in the returned slice
next_link
-
href link to the next slice of results (if any)
prev_link
-
href link to the previous slice of results (if any)
Using relative URLs
All pages generated by the application have the same URL; query parameters control which page will be displayed. Therefore all internal links can just start with a question mark : the browser will recognize that this is a relative link to the same URL, with a different query string. So within templates we can write simple links like
<a href="?H">Homepage</a>
<a href="?S=*">See all records</a>
<a href="?A">Add a new record</a>
[% FOREACH record IN found.records %]
<a href="?M=[% record.Id %]">Modify this record</a>
[% END # FOREACH %]
Forms
Data input
A typical form for updating or adding a record will look like
<form method="POST">
First Name <input name="firstname" value="[% record.firstname %]"><br>
Last Name <input name="lastname" value="[% record.lastname %]">
<input type="submit">
</form>
Usually there is no need to specify the action of the form : the default action sent by the browser will be the same URL (including the query parameter ?A or ?M=[% record.Id %]). When the application receives a POST request, it knows it has to update or add the record instead of displaying the form. This implies that you must use the POST method for any data modification; whereas forms for searching may use either GET or POST methods.
For convenience, deletion through a GET url of shape ?D=[% record.Id %] is supported; however, data modification through GET method is not recommended, and therefore it is preferable to write
<form method="post">
<input type="hidden" name="D" value="[% record.Id %]">
<input type="submit" value="Delete this record">
</form>
Searching
A typical form for searching will look like
<form method="POST" action="[% self.url %]">
Search :
<select name="S">
<option value="">--Choose in field1--</option>
<option value="+field1:val1">val1</option>
<option value="+field1:val2">val2</option>
...
</select>
Other : <input name="S">
<input type="submit">
</form>
So the form can combine several search criteria, all passed through the S parameter. The form method can be either GET or POST; but if you choose POST, then it is recommended that you also specify
action="[% self.url %]"
instead of relying on the implicit self-url from the browser. Otherwise the URL displayed in the browser may still contain some or all criteria from a previous search, while the current form sends other search criteria; the application will not get confused, but the user might.
Highlighting the searched words
The preMatch and postMatch parameters in the configuration file (see below) define some marker strings that will be automatically inserted in the data returned by a search, surrounding each word that was mentioned in the query. These marker strings should be chosen so that they are unlikely to clash with regular data or with HTML markup : the recommended values are
preMatch {[
postMatch ]}
Then you can exploit that marking within your templates by calling the "highlight" and "unhighlight" template filters, described below.
CONFIGURATION FILE
The configuration file is always stored within the htdocs directory, at the location corresponding to the application URL : so for application http://myServer/some/data.ftw, the configuration file is in
/path/to/http/htdocs/some/data.ftw
Because of the HTTP server configuration directives described above, the URL is always served by File::Tabular::Web, so there is no risk of users seeing the content of the configuration file.
The configuration is written in AppConfig format. This format supports comments (starting with #), continuation lines (through final \), "heredoc" quoting style for multiline values, and section headers similar to a Windows INI file. All details about the configuration file format can be found in AppConfig::File.
Below is the list of the various recognized sections and parameters.
Global section
The global section (without any section header) can contain general-purpose parameters that can be retrieved later from the viewing templates through [% self.cfg.<param> %]; this is useful for example for setting a title or other values that will be common to all templates.
The global section may also contain some options to "new" in File::Tabular : preMatch, postMatch, avoidMatchKey, fieldSep, recordSep.
Option highlightClass defines the class name used by the "highlight" filter (default is HL).
[fixed] / [default]
The fixed and default sections simulate parameters to the request. Specifications in the fixed section are stronger than HTTP parameters; specifications in the default section are weaker : the param method for the application will first look in the fixed section, then in the HTTP request, and finally in the default section. So for example with
[fixed]
count=50
[default]
orderBy=lastname
a request like
?S=*&count=20
will be treated as
?S=*&count=50&orderBy=lastname
Relevant parameters to put in fixed or in default are described in section "S" of this documentation : for example count, orderBy, etc.
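The lookup order can be pictured with a few lines of Perl. This is only a sketch of the precedence rule stated above, not the module's actual param method; the function name resolve_param and the three plain hashes standing for the [fixed] section, the HTTP request and the [default] section are assumptions made for the illustration.

```perl
use strict;
use warnings;

# Sketch of the precedence rule: [fixed] first, then the HTTP request,
# and finally [default].
sub resolve_param {
    my ($name, $fixed, $request, $default) = @_;
    for my $source ($fixed, $request, $default) {
        return $source->{$name} if defined $source->{$name};
    }
    return;    # not found anywhere
}

# Same values as in the configuration example above.
my %fixed   = (count   => 50);
my %default = (orderBy => 'lastname');
my %request = (S => '*', count => 20);    # i.e. ?S=*&count=20

for my $name (qw(S count orderBy)) {
    my $value = resolve_param($name, \%fixed, \%request, \%default);
    print "$name=$value\n";
}
```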
[application]
dir=/some/directory
-
Directory where application files reside. By default : same directory as the configuration file.
name=some_name
-
Name of the application (will be used for example as prefix to find template files). This must be a single-level name (no pathnames allowed).
data=some_name
-
Name of the tabular file containing the data. This must be a single-level name and must reside in the application directory. By default: application name with the .txt suffix appended.
class=My::File::Tabular::Web::Subclass
-
Will dynamically load the specified module and use it as class for objects of this application. The specified module must be a subclass of File::Tabular::Web.
useFileCache=1
-
If true, the whole datafile will be slurped into memory and reused across requests (except update requests).
mtime=<format>
-
Format to display the last modified time of the data file, using POSIX strftime(). The result will be available to templates in [% self.mtime %]
[permissions]
This section specifies permissions to perform operations within the application. Of course we need the HTTP server to be configured to do some kind of authentication, so that the application receives a user name through the REMOTE_USER environment variable. Otherwise the default user name received by the application is "Anonymous". Instructions for setting up authentication for an Apache server are documented at http://httpd.apache.org/docs/2.4/howto/auth.html.
The HTTP server may also be configured to do some kind of authorisation checking, but this will control access to the application as a whole, whereas here we configure fine-grained permissions for various operations.
Builtin permission names are : search, read, add, delete, modif, and download. Each name also has a negative counterpart, i.e. no_search, no_read, etc.
For each of those permission names, the configuration can give a list of user names separated by commas or spaces : the current user name will be compared to this list. A permission may also specify '*', which means 'everybody' : this is the default for permissions read, search and download. There is no builtin notion of "user groups", but you can introduce such a notion by writing a subclass which overrides the "user_match" method.
Permissions may also be granted or denied on a per-record basis : writing $fieldname (starting with a literal dollar sign) means that users can access records in which the content of fieldname matches their username. Usually this is associated with an automatic user field (see below), so that the user who created a new record can later modify it.
Example :
[permissions]
read = * # the default, could have been omitted
search = * # idem
add = andy bill
modif = $last_author # username must match content of field 'last_author'
delete = $last_author
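To make the three kinds of entries concrete ('*', plain user names, and $fieldname), here is a rough standalone sketch of how such an access list could be evaluated. It is not the module's code: the function name permission_granted is invented, negative permissions (no_read, etc.) are left out, and "matches" is simplified to a case-insensitive equality test.

```perl
use strict;
use warnings;

# Sketch only: decides whether $user is covered by an access list such as
# "andy bill", "*" or "$last_author".
sub permission_granted {
    my ($user, $acl, $record) = @_;
    for my $entry (grep { length } split /[\s,]+/, $acl) {
        return 1 if $entry eq '*';                    # everybody
        if ($entry =~ /^\$(\w+)$/) {                  # per-record rule: $fieldname
            my $owner = $record ? $record->{$1} : undef;
            return 1 if defined $owner && lc($owner) eq lc($user);
            next;
        }
        return 1 if lc($entry) eq lc($user);          # plain user name
    }
    return 0;
}

my $record = { Id => 7, last_author => 'bill' };      # invented sample record
print "bill may modify\n"   if permission_granted('bill', '$last_author', $record);
print "andy may not\n"  unless permission_granted('andy', '$last_author', $record);
print "andy may add\n"      if permission_granted('andy', 'andy bill');
```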
[fields]
The fields section specifies some specific information about fields in the tabular file.
time <field> = <format>
-
Declares field to be a time field, which means that whenever a record is updated, the current local time will be automatically inserted in that field. The format argument will be passed to POSIX strftime(). Ex :
time DateModif = %d.%m.%Y
time TimeModif = %H:%M:%S
user = <field>
-
Declares field to be a user field, which means that whenever a record is updated, the current username will be automatically inserted in that field.
default <field> = <value>
-
Default values for some fields ; will be inserted into new records.
autoNum <field>
-
Activates autonumbering for new records ; the number will be stored in the given field. Automatically implies that default <field> = '#'.
Subclasses may add more entries in this section (for example for specifying fields that will hold names of attached documents).
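To preview what a time format will produce before putting it in the configuration, the format can be tried directly with POSIX strftime(). The sketch below reuses the DateModif and TimeModif formats from the example above; the helper stamp_record is hypothetical and merely imitates the described behaviour of filling time fields on update.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Formats copied from the configuration example above.
my %time_fields = (DateModif => '%d.%m.%Y', TimeModif => '%H:%M:%S');

# Hypothetical helper: fills every time field of a record. The time defaults
# to "now", but a (sec, min, hour, mday, mon, year) list may be supplied.
sub stamp_record {
    my ($record, $formats, @time) = @_;
    @time = localtime unless @time;
    $record->{$_} = strftime($formats->{$_}, @time) for keys %$formats;
    return $record;
}

my $record = stamp_record({ Id => 1 }, \%time_fields);
print "$record->{DateModif} $record->{TimeModif}\n";
```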
[template]
This section specifies where to find templates for various views. The specified locations will be looked for in several directories: the application template directory (as specified by the dir directive, see below), the application directory, the default File::Tabular::Web template directory (as specified by the app_tmpl_default_dir method), or the subdirectory default of the above.
- dir
-
specifies the application template directory
- short
-
Template for the "short" display of records (typically a table for presenting search results).
- long
-
Template for the "long" display of records (typically for a detailed presentation of a single record ).
- modif
-
Template for editing a record (typically this will be a form with an action to call the update URL ?M=key).
- msg
-
Template for presenting special messages to the user (messages after a record update or deletion, or error messages).
- home
-
Homepage for the application.
Defaults for these templates are <application_name>_short.tt, <application_name>_long.tt, etc.
METHODS
Note on the architecture
The internal object-oriented design is a bit unorthodox, mainly because I wrote it many years ago at a time when I was less familiar with Web architectures, and also because when migrating to Plack I also had to keep the previous modperl+CGI API for preserving backwards compatibility. Unfortunately the architecture cannot be changed now, because there might be subclasses that rely on this particular design. External users need not worry, but authors of subclasses should be aware of the design.
There are two kinds of instances of the File::Tabular::Web class :
if running under Plack, one instance is the persistent Plack component that will execute the "call" method at each request.
at each HTTP request, a new transient instance of the File::Tabular::Web class is created; that instance holds temporary information needed to communicate across the various steps of request handling. It is automatically destroyed after having sent the response.
In addition, the module itself maintains a collection of application hashrefs, loaded dynamically when needed. Each application hashref holds information about its configuration file, template files, etc.
By convention, methods starting with an underscore are meant to be private, i.e. should not be redefined in subclasses.
Entry point
new
use File::Tabular::Web;
my $ftw = File::Tabular::Web->new(app_root => $some_directory);
The new method creates a Plack component which can serve requests to a collection of File::Tabular::Web applications.
The app_root optional argument tells where application files are located : relative URLs to applications will be mapped to relative paths starting from this root. If the argument is not explicitly supplied, a default value is guessed by the system, looking at
$mod_perl->document_root (if under mod_perl)
$env->{CONTEXT_DOCUMENT_ROOT} (new in Apache2.4)
$env->{DOCUMENT_ROOT}
to_app
my $app = $ftw->to_app;
Creates a Plack application from the Plack component. This method is just inherited from Plack::Component.
handler
File::Tabular::Web->handler;
Legacy code : this method used to be the main entry point into the module, to be called from mod_perl or CGI scripts. Now the entry point is Plack's to_app method shown above. The handler method remains only for backwards compatibility; new projects should not use this.
Methods for creating / initializing "application" hashrefs
_app_new
Reads the configuration file for a given application and creates a hashref storing the information. The hashref is put in a global cache of all applications loaded so far.
This method should not be overridden in subclasses; if you need specific code to be executed, use the "app_initialize" method.
_app_read_config
Glueing code to the AppConfig module.
app_initialize
Initializes the application hashref. In particular, it creates the Template object, with appropriate settings to specify where to look for templates.
If you override this method in subclasses, you should probably call SUPER::app_initialize.
app_tmpl_default_dir
Returns the default directory containing templates. The default is <app_root>/../lib/tmpl/ftw.
app_tmpl_filters
Returns a hashref of filters to be passed to the Template object (see Template::Filters).
The default contains two filters, which work together with the preMatch and postMatch parameters of the configuration file. Suppose the following configuration :
preMatch {[
postMatch ]}
Then the filters are defined as follows :
- highlight
-
Replaces strings of shape {[...]} by <span class="HL">...</span>.
The class name is HL by default, but another name can be defined through the highlightClass configuration parameter. Templates have to define a style for that class, like for example
<style> .HL {background: lightblue} </style>
- unhighlight
-
Replaces strings of shape {[...]} by ... (i.e. removes the marking).
These filters are intended to help highlighting the words matched by a search request ; usually this must happen after the data has been filtered for HTML entities. So a typical use in a template would be for example
<a href="/some/url?with=[% record.foo | unhighlight | uri %]">
link to [% record.foo | html | highlight %]
</a>
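The effect of the two filters, and the reason why HTML escaping has to come first, can be reproduced outside of the Template Toolkit with plain regular expressions. This is an approximation written for illustration, assuming the {[ and ]} markers configured above; the functions below are standalone stand-ins, not the filters shipped with the module, and html_escape is a deliberately minimal substitute for the Template Toolkit html filter.

```perl
use strict;
use warnings;

# Minimal stand-in for the Template Toolkit "html" filter.
sub html_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Turns {[word]} into <span class="HL">word</span>.
sub highlight {
    my ($text, $class) = @_;
    $class = 'HL' unless defined $class;
    $text =~ s/\{\[(.*?)\]\}/<span class="$class">$1<\/span>/g;
    return $text;
}

# Turns {[word]} back into word.
sub unhighlight {
    my ($text) = @_;
    $text =~ s/\{\[(.*?)\]\}/$1/g;
    return $text;
}

my $field = 'R&D <{[team]}>';    # invented sample value with markers
print highlight(html_escape($field)), "\n";
print unhighlight($field), "\n";
```

Escaping after highlighting would turn the generated span tags themselves into entities, which is why the template example applies html before highlight.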
app_phases_definition
As explained above in section "WEB API", various entry points into the application are chosen by single-letter arguments; here this method returns a table that specifies what happens for each of them.
A letter in the table is associated to a hashref, with the following keys :
- pre
-
name of method to be executed in the "data preparation phase"
- op
-
name of method to be executed in the "data manipulation phase"
- view
-
name of view for displaying the results
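The shape of such a table, and how a dispatcher could walk through it, might look like the sketch below. Only the letters and the keys pre, op and view come from this documentation; which method is attached to which phase is a guess made for the illustration, and Demo::Request and run_phases are invented stand-ins rather than parts of the module.

```perl
use strict;
use warnings;

# Hypothetical phase table: the method assigned to each phase is a guess.
my %phases = (
    L => { pre => 'search_key',                 view => 'long' },
    D => { pre => 'search_key', op => 'delete', view => 'msg'  },
);

# Stand-in request object that only records which methods were called.
{
    package Demo::Request;
    sub new        { return bless { calls => [] }, shift }
    sub search_key { my $self = shift; push @{ $self->{calls} }, 'search_key'; return }
    sub delete     { my $self = shift; push @{ $self->{calls} }, 'delete';     return }
}

# Runs the "pre" and "op" methods of an entry point, then returns the view name.
sub run_phases {
    my ($request, $letter) = @_;
    my $phase = $phases{$letter} or die "unknown entry point '$letter'";
    for my $step (qw(pre op)) {
        my $method = $phase->{$step} or next;
        $request->$method();
    }
    return $phase->{view};
}

my $request = Demo::Request->new;
my $view    = run_phases($request, 'D');
print "view=$view calls=@{ $request->{calls} }\n";
```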
Methods for instance creation / initialization
_new
Creates a new object, which represents an HTTP request to the application. The class for the created object is generally File::Tabular::Web, unless specified otherwise in the configuration file (see the class entry in section "CONFIGURATION FILE").
The _new method cannot be redefined in subclasses; if you need custom code to be executed, use "initialize" or "app_initialize" (both are invoked from _new).
initialize
Code to initialize the object. The default behaviour is to set up max, count and orderBy within the object hash.
_setup_phases
Reads the phases definition table and decides about what to do in the next phases.
open_data
Retrieves the name of the datafile, decides whether it should be opened for readonly or for update, and creates a corresponding File::Tabular object. The datafile may be cached in memory if directive useFileCache is activated.
_cached_content
Implementation of the memory cache; checks the modification time of the file to detect changes and invalidate the cache.
Methods that can be called from templates
param
[% self.param %]
With no argument, returns the list of parameter names to the current HTTP request.
[% self.param(param_name) %]
With an argument, returns the value that was specified under $param_name in the HTTP request, or in the configuration file (see the description of [fixed]/[default] sections). The return value is always a scalar (so this is not exactly the same as calling cgi.param(...)). If the HTTP request contains multiple values under the same name, these values are joined with a space. Initial and trailing spaces are automatically removed.
If you need to access the list of values in the HTTP request, you can call
[% self.req.param(param_name) %]
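The scalar behaviour described for param (join multiple values with a space, strip leading and trailing spaces) amounts to something like the following sketch. The function name scalar_param is made up, and whether the real method trims each value or only the joined result is not stated here, so the sketch trims the joined result.

```perl
use strict;
use warnings;

# Sketch: collapse a list of request values into one trimmed scalar.
sub scalar_param {
    my (@values) = @_;
    my $joined = join ' ', grep { defined } @values;
    $joined =~ s/^\s+|\s+$//g;
    return $joined;
}

# Two S values, as sent by a form with a select and a text input.
print scalar_param('+field1:val1', '  word*  '), "\n";
```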
can_do
[% self.can_do($action, [$record]) %]
Tells whether the current user has permission to do $action (which might be 'modif', 'delete', etc.). See explanations above about how permissions are specified in the configuration file. Sometimes permissions are set up in a record-specific way (for example one data field may contain the names of authorized users); the second optional argument is meant for those cases, so that can_do() can inspect the current data record.
Request handling : general methods
_dispatch_request
Executes the various phases of request handling
display
Finds the template corresponding to the view name, gathers its output, and prints it together with some HTTP headers.
Request handling : search methods
search_key
Searches for a record with a specific key. Puts the result into $self->{result}.
search
Searches for records matching given criteria (see File::Tabular for details). Puts results into $self->{result}.
before_search
Initializes $self->{search_string}. Overridden in subclasses for more specific searching (like for example adding fulltext search into attached documents).
sort_and_slice
Chooses a slice within the result set, according to pagination parameters count and start.
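The arithmetic behind such a slice is simple but easy to get off by one, since start is zero-based. Below is a minimal sketch under that assumption; slice_bounds and slice_records are hypothetical helpers operating on a plain array reference, not the method's real implementation.

```perl
use strict;
use warnings;

# Returns (first_index, last_index) of the slice, or an empty list when
# the requested start lies beyond the end of the results.
sub slice_bounds {
    my ($total, $start, $count) = @_;
    return if $total <= 0 || $start < 0 || $start >= $total || $count <= 0;
    my $end = $start + $count - 1;
    $end = $total - 1 if $end > $total - 1;
    return ($start, $end);
}

# Extracts the corresponding records.
sub slice_records {
    my ($records, $start, $count) = @_;
    my ($first, $last) = slice_bounds(scalar @$records, $start, $count)
        or return [];
    return [ @{$records}[$first .. $last] ];
}

my @results = (1 .. 95);                          # stand-in for 95 found records
my $page    = slice_records(\@results, 80, 20);   # ?start=80&count=20
print scalar(@$page), " records on the last page\n";
```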
_url_for_next_slice
Returns a URL to the next or previous slice, using "params_for_next_slice".
params_for_next_slice
Returns an array of strings "param=value" that will be inserted into the URL for next or previous slice.
words_queried
List of words found in the query string (to be used for example for highlighting those words in the display).
Update Methods
empty_record
Generates an empty record (preparation for adding a new record). Fields are filled with default values specified in the configuration file.
update
Checks for permission and then performs the update. Most probably you don't want to override this method, but rather the methods before_update or after_update.
before_update
Copies values from HTTP parameters into the record, and automatically fills the user name or current time/date in appropriate fields.
after_update
Hook for any code to perform after an update (useful for example for attached documents).
rollback_update
Hook for any code to roll back whatever was performed in before_update, in case the update failed (useful for example for attached documents).
Delete Methods
delete
Checks for permission and then performs the delete. Most probably you don't want to override this method, but rather the methods before_delete or after_delete.
before_delete
Hook for any code to perform before a delete.
after_delete
Hook for any code to perform after a delete.
Miscellaneous methods
prepare_download
Checks for permission to download the whole dataset.
print_help
Prints help. Not implemented yet.
user_match
$self->user_match($access_control_list)
Returns true if the current user (as stored in $self->{user}) "matches" the access control list (given as an argument string).
The meaning of "matches" may be redefined in subclasses; the default implementation just performs a regex case-insensitive search within the list for a complete word equal to the username.
Override in subclasses if you need other authorization schemes (like for example dealing with groups).
key_field
Returns the name of the key field in the data file.
key
my $key = $self->key($record);
Returns the value in the first field of the record.
AUTHOR
Laurent Dami, <dami AT cpan DOT org>
COPYRIGHT & LICENSE
Copyright 2007-2016 Laurent Dami, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.