NAME

DBIx::Perlish - a perlish interface to SQL databases

VERSION

This document describes DBIx::Perlish version 0.12

SYNOPSIS

use DBI;
use DBIx::Perlish;

my $dbh = DBI->connect(...);
DBIx::Perlish::init($dbh);

# selects:
my @rows = db_fetch {
    my $x : users;
    $x->id != 0;
    $x->name !~ /\@/;
};

# sub-queries:
my @rows = db_fetch {
    my $x : users;
    $x->id <- db_fetch {
        my $t2 : table;
        $t2->col == 2 || $t2->col == 3;
        return $t2->user_id;
    };
    $x->name !~ /\@/;
};

# updates:
db_update {
    data->num < 100;
    data->mutable;

    data->num = data->num + 1;
    data->name = "xyz";
};

# more updates:
db_update {
    my $d : data;
    $d->num < 100, $d->mutable;

    $d = {
        num  => $d->num + 1,
        name => "xyz"
    };
};

# deletes:
db_delete {
    my $t : table;
    !defined $t->age  or
    $t->age < 18;
};

# inserts:
my $id = 42;
db_insert 'users', {
    id   => $id,
    name => "moi",
};

DESCRIPTION

The DBIx::Perlish module provides the ability to work with databases supported by the DBI module using Perl's own syntax for four most common operations: SELECT, UPDATE, DELETE, and INSERT.

By using DBIx::Perlish, you can write most of your database queries using a domain-specific language with Perl syntax. Since a Perl programmer knows Perl by definition, and might not know SQL to the same degree, this approach generally leads to a more comprehensible and maintainable code.

The module is not intended to replace 100% of SQL used in your program. There is a hope, however, that it can be used to replace a substantial portion of it.

The DBIx::Perlish module quite intentionally neither implements nor cares about database administration tasks like schema design and management. The plain DBI interface is quite sufficient for that. Similarly, and for the same reason, it does not take care of establishing database connections or handling transactions. All this is outside the scope of this module.

Ideology

There are three sensible and semi-sensible ways of arranging code that works with SQL databases in Perl:

SQL sprinkling approach

One puts queries wherever one needs to do something with the database, so bits and pieces of SQL are intermixed with the program logic. This approach can easily become an incomprehensible mess that is difficult to read and maintain.

Clean and tidy approach

Everything database-related is put into a separate module, or into a collection of modules. Wherever database access is required, a corresponding sub or method from such a module is called from the main program. Whenever something is needed that the DB module does not already provide, a new sub or method is added into it.

Object-relational mapping

One carefully designs the database schema and an associated collection of classes, then formulates the design in terms of any of the existing object-relational mapper modules like Class::DBI, DBIx::Class or Tangram, then uses objects which perform all necessary queries under the hood. This approach is even cleaner than "clean and tidy" above, but it has other issues. Some schemas do not map well into the OO space. Typically, the resulting performance is an issue as well. The performance issues can in some cases be alleviated by adding hand-crafted SQL in strategic places, so in this regard the object-relational mapping approach can resemble the "clean and tidy" approach.

The DBIx::Perlish module is meant to eliminate the majority of the "SQL sprinkling" style of database interaction. It is also fully compatible with the "clean and tidy" method.

Procedural interface

init()

The init() sub initializes procedural interface to the module.

It accepts named parameters. The init() function understands only one such parameter, dbh, which must be a valid DBI database handler. This parameter is required.

All other parameters are silently ignored.

Alternatively, init() can be called with a single positional parameter, in which case it is assumed to be the DBI database handler.

If the supplied database handler is not valid, an exception is thrown.

This procedure does not return anything meaningful.

Examples:

my $dbh = DBH->connect(...);
DBIx::Perlish::init(dbh => $dbh);

my $dbh = DBH->connect(...);
DBIx::Perlish::init($dbh);

db_fetch {}

The db_fetch {} function queries and returns data from the database.

The function parses the supplied query sub, converts it into the corresponding SQL SELECT statement, and executes it.

What it returns depends on two things: the context and the return statement in the query sub, if any.

If there is a return statement which specifies exactly one column, and db_fetch {} is called in the scalar context, a single scalar representing the requested column is returned for the first row of selected data. Example:

my $somename = db_fetch { return user->name };

Borrowing DBI's terminology, this is analogous to

my $somename =
    $dbh->selectrow_array("select name from user");

If there is a return statement which specifies exactly one column, and db_fetch {} is called in the list context, an array containing the specified column for all selected rows is returned. Example:

my @allnames = db_fetch { return user->name };

This is analogous to

my @allnames =
    @{$dbh->selectcol_arrayref("select name from user")};

When there is no return statement, or if the return statement specifies multiple columns, then an individual row is represented by a hash reference with column names as the keys.

In the scalar context, a single hashref is returned, which corresponds to the first row of selected data. Example:

my $h = db_fetch { my $u : user };
print "name: $h->{name}, id: $h->{id}\n";

In DBI parlance that would look like

my $h = $dbh->selectrow_hashref("select * from user");
print "name: $h->{name}, id: $h->{id}\n";

In the list context, an array of hashrefs is returned, one element for one row of selected data:

my @users = db_fetch { my $u : user };
print "name: $_->{name}, id: $_->{id}\n" for @users;

Again, borrowing from DBI, this is analogous to

my @users = @{$dbh->selectall_arrayref("select * from user",
    {Slice=>{}})};
print "name: $_->{name}, id: $_->{id}\n" for @users;

The db_fetch {} function will throw an exception if it is unable to find a valid database handle to use, or if it is unable to convert its query sub to SQL.

In addition, if the database handle is configured to throw exceptions, the function might throw any of the exceptions thrown by DBI.

"Subqueries" are permitted in db_fetch's query subs.

Please see "Query sub syntax" below for details of the syntax allowed in query subs.

The db_fetch {} function is exported by default.

db_update {}

The db_update {} function updates rows of a database table.

The function parses the supplied query sub, converts it into the corresponding SQL UPDATE statement, and executes it.

The function returns whatever DBI's do method returns.

The function will throw an exception if it is unable to find a valid database handle to use, or if it is unable to convert its query sub to SQL.

In addition, if the database handle is configured to throw exceptions, the function might throw any of the exceptions thrown by DBI.

A query sub of the db_update {} function must refer to precisely one table (not counting tables referred to by subqueries).

Neither return statements nor last statements are allowed in the db_update {} function's query subs.

"Subqueries" are permitted in db_update's query subs.

Please see "Query sub syntax" below for details of the syntax allowed in query subs.

Examples:

db_update {
    tbl->id == 41;
    tbl->id = tbl->id - 1;
    tbl->name = "luff";
};

db_update {
    my $t : tbl;
    $t->id == 40;
    $t = {
        id   => $t->id + 2,
        name => "LIFF",
    };
};

db_update {
    tbl->id == 40;
    tbl() = {
        id   => tbl->id + 2,
        name => "LIFF",
    };
};

The db_update {} function is exported by default.

db_delete {}

The db_delete {} function deletes data from the database.

The db_delete {} function parses the supplied query sub, converts it into the corresponding SQL DELETE statement, and executes it.

The function returns whatever DBI's do method returns.

The function will throw an exception if it is unable to find a valid database handle to use, or if it is unable to convert its query sub to SQL.

In addition, if the database handle is configured to throw exceptions, the function might throw any of the exceptions thrown by DBI.

A query sub of the db_delete {} function must refer to precisely one table (not counting tables referred to by subqueries).

Neither return statements nor last statements are allowed in the db_delete {} function's query subs.

"Subqueries" are permitted in db_delete's query subs.

Please see "Query sub syntax" below for details of the syntax allowed in query subs.

Examples:

db_delete { $x : users } # delete all users

# delete with a subquery
db_delete {
    my $u : users;
    $u->name <- db_fetch {
        visitors->origin eq "Uranus";
        return visitors->name;
    }
}

The db_delete {} function is exported by default.

db_insert()

The db_insert() function inserts rows into a database table.

This function is different from the rest because it does not take a query sub as the parameter.

Instead, it takes a table name as its first parameter, and any number of hash references afterwards.

For each specified hashref, a new row is inserted into the specified table. The resulting insert statement specifies hashref keys as the column names, with corresponding values taken from hashref values. Example:

db_insert 'users', { id => 1, name => "the.user" };

The function returns the number of insert operations performed. If any of the DBI insert operations fail, the function returns undef, and does not perform remaining inserts.

The function will throw an exception if it is unable to find a valid database handle to use.

In addition, if the database handle is configured to throw exceptions, the function might throw any of the exceptions thrown by DBI.

The db_insert {} function is exported by default.

$SQL and @BIND_VALUES

The DBIx::Perlish module provides two global variables (not exported) to aid in debugging. The $DBIx::Perlish::SQL variable contains the text of the SQL which was most recently generated by the procedures above (except db_insert()). The @DBIx::Perlish::BIND_VALUES array contains the bind values to be used with the corresponding SQL code.

Special treatment of the $dbh variable

If the procedural interface is used, and the user did not call init() before issuing any of the db_query {}, db_update {}, db_delete {} or db_insert {}, those functions look for one special case before bailing out.

Namely, they try to locate a variable my $dbh or our $dbh, in that order, in the scope in which they are used. If such variable is found, and if it contains a valid DBI database handler, they will use it for performing the actual query. This allows one to write something like that, and expect the module to do the right thing:

my $dbh = DBI->connect(...);
my @r = db_fetch { users->name !~ /\@/ };

The author cannot recommend relying on this feature in the production code; if in doubt, call init() first and you won't be unpleasantly surprized.

In order for this feature to be operational, the PadWalker module must be installed.

Query sub syntax

The important thing to remember is that although the query subs have Perl syntax, they do not represent Perl, but a specialized "domain specific" database query language with Perl syntax.

A query sub can consist of the following types of statements:

  • table variables declarations;

  • query filter statements;

  • return statements;

  • assignments;

  • result limiting statements;

  • statements with label syntax.

The order of the statements is generally not important, except that table variables have to be declared before use.

Table variables declarations

Table variables declarations allow one to associate lexical variables with database tables. They look like this:

my $var : tablename;

It is possible to associate several variables with the same table; this is the preferable mechanism if self-joins are desired.

Please note that "db_update {}" and "db_delete {}" must only refer to a single table.

Query filter statements

Query filter statements have a general form of Perl expressions. Binary comparison operators, logical "or" (both high and lower precedence form), matching operators =~ and !~, binary arithmetic operators, and unary ! are all valid in the filters.

Individual terms can refer to a table column using dereferencing syntax (either table->column or $tablevar->column), to an integer, floating point, or string constant, to a scalar lexical variable from an outer scope, or to a function call.

Function calls can take an arbitrary number of arguments. Each argument to a function must currently be a term, although it is expected that more general expressions will be supported in the future. The function call appear verbatim in the resulting SQL, with the arguments translated from Perl syntax to SQL syntax. For example:

lower($t1->name) eq lower($t2->lastname);

The lc and uc builtin functions are translated to lower and upper, respectively.

Return statements

Return statements determine which columns are returned by a query under what names. Each element in the return statement can be either a reference to a table column, or a string constant, in which case it is taken as an alias to the next element in the return statement:

return ($table->col1, anothername => $table->col2);

One can also specify a "distinct" or "DISTINCT" string constant in the beginning of the return list, in which case duplicated rows will be eliminated from the result set.

Return statements are only valid in "db_fetch {}".

Query subs representing subqueries using the reverse arrow notation must have exactly one return statement returning exactly one column (see "Subqueries" below).

Assignments

Assignments can take two form: individual column assignments or bulk assignments. The former must have a reference to a table column on the left-hand side, and an expression like those accepted in filter statements on the right-hand side:

table->id = 42;
$t->column = $t->column + 1;

The bulk assignments must have a table specifier on the left-hand side, and a hash reference on the right-hand side. The keys of the hash represent column names, and the values are expressions like those in the individual column assignments:

$t = {
    id     => 42,
    column => $t->column + 1
};

or

table() = {
    id     => 42,
    column => table->column + 1
};

Please note a certain ugliness in table() in the last example, so it is probably better to either use table vars, or stick to the single assignment syntax of the first example.

Assignment statements are only valid in "db_update {}".

Result limiting statements

The last command can be used to limit the number of results returned by a fetch operation.

If it stands on its own anywhere in the query sub, it means "stop after finding the first row that matches other filters", so it is analogous to LIMIT 1 in many SQL dialects.

It can also be used in conjunction with a range .. operator, so that

last unless 5..20;

is equivalent to

OFFSET 5 LIMIT 16

Result limiting statements are only valid in "db_fetch {}".

Statements with label syntax

There is a number of special labels which query sub syntax allows.

Specifying label distinct: anywhere in the query sub leads to duplicated rows being eliminated from the result set.

Specifying label limit: followed by a number (or a scalar variable representing a number) limits the number of rows returned by the query.

Specifying label offset: followed by a number N (or a scalar variable representing a number N) skips first N rows from the returned result set.

Specifying label order:, orderby:, order_by:, sort:, sortby:, or sort_by:, followed by a list of expressions will sort the result set according to the expressions. For details about the sorting criteria see the documentation for ORDER BY clause in your SQL dialect reference manual. Before a sorting expression in a list one may specify one of the string constants "asc", "ascending", "desc", "descending" to alter the sorting order, for example:

db_fetch {
    my $t : tbl;
    order_by: asc => $t->name, desc => $t->age;
};

Specifying label group:, groupby:, or group_by:, followed by a list of column specifiers is equivalent to the SQL clause GROUP BY col1, col2, ....

All special labels are case insensitive.

Special labels are only valid in "db_fetch {}".

Subqueries

It is possible to use subqueries in "db_fetch {}", "db_update {}", and "db_delete {}".

There are two variants of subqueries. The first one is a call, as a complete statement, to "db_fetch {}" anywhere in the body of the query sub. This variant corresponds to the EXISTS (SELECT ...) SQL construct, for example:

db_delete {
    my $t : table;
    db_fetch {
        $t->id == table2->table_id;
    };
};

Another variant corresponds to the column IN (SELECT ...) SQL construct. It uses a special syntax with back-arrow <-, which signifies that the column specifier on the left gets its values from whatever is returned by a "db_fetch {}" on the right:

db_delete {
    my $t : table;
    $t->id  <-  db_fetch {
        return table2->table_id;
    };
};

This variant puts a limitation on the return statement in the sub-query query sub. Namely, it must contain a return statement with exactly one return value.

Object-oriented interface

new()

Constructs and returns a new DBIx::Perlish object.

Takes a single mandatory named parameter, dbh, which must be a valid DBI database handler.

Can throw an exception if the supplied parameters are incorrect.

fetch()

An object-oriented version of "db_fetch {}".

update()

An object-oriented version of "db_update {}".

delete()

An object-oriented version of "db_delete {}".

insert()

An object-oriented version of "db_insert()".

sql()

Takes no parameters. Returns the SQL string, most recently generated by database queries performed by the object. Returns undef if there were no queries made thus far.

Example:

$db->query(sub { $u : users });
print $db->sql, "\n";

bind_values()

Takes no parameters. Returns an array of bind values that were used in the most recent database query performed by the object. Returns an empty array if there were not queries made thus far.

Example:

$db->query(sub { users->name eq "john" });
print join(", ", $db->bind_values), "\n";

Implementation details and more ideology

To achieve its purpose, this module uses neither operator overloading nor source filters.

The operator overloading would only work if individual tables were represented by Perl objects. This means that an object-relational mapper like Tangram can do it, but DBIx::Perlish cannot.

The source filters are limited in other ways: the modules using them are often incompatible with other modules that also use source filtering, and it is very difficult to do source filtering when any degree of flexibility is required. Only perl can parse Perl!

The DBIx::Perlish module, on the other hand, leverages perl's ability to parse Perl and operates directly on the already compiled Perl code. In other words, it parses the Perl op tree (syntax tree).

The idea of this module came from Erlang. Erlang has a so called list comprehension syntax, which allows one to generate lists using generator expressions and to select the list elements using filter expressions. Furthermore, the authors of the Erlang database, Mnesia, hijacked this syntax for the purpose of doing database queries via a mechanism called parse transform. The end result was that the database queries in Erlang are expressed by using Erlang's own syntax.

I found this approach elegant, and thought "why something like this cannot be done in Perl"?

CONFIGURATION AND ENVIRONMENT

DBIx::Perlish requires no configuration files or environment variables.

DEPENDENCIES

The DBIx::Perlish module needs at least perl 5.8.2, quite possibly a somewhat higher version. I have only tested it on 5.8.8 and 5.8.4.

This module requires DBI to do anything useful.

In order to support the special handling of the $dbh variable, PadWalker needs to be installed.

Other modules used used by DBIx::Perlish are included into the standard Perl distribution.

INCOMPATIBILITIES

None reported.

BUGS AND LIMITATIONS

No bugs have been reported.

Please report any bugs or feature requests to bug-dbix-perlish@rt.cpan.org, or through the web interface at http://rt.cpan.org.

A number of features found in many SQL dialects is not supported.

The module cannot handle more than 100 tables in a single query sub.

Although variables closed over the query sub can be used in it, only simple scalars are understood at the moment. Similarly, variable interpolation inside regular expressions is also not supported.

If you would like to see something implemented, or find a nice Perlish syntax for some SQL feature, please let me know!

AUTHOR

Anton Berezin <tobez@tobez.org>

ACKNOWLEDGEMENTS

I would like to thank Dmitry Karasik, Henrik Andersen, Lars Thegler, and Phil Regnauld for discussions, suggestions and code contributions.

This work is in part sponsored by Telia Denmark.

LICENSE AND COPYRIGHT

Copyright (c) 2007, Anton Berezin <tobez@tobez.org>. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.