NAME
DirDB - use a directory as a persistence back end for (multi-level) (blessed) hashes (that may contain array references) (and can be advisorily locked)
SYNOPSIS
use DirDB;
tie my %session, 'DirDB', "./data/session";
$session{$sessionID}{email} = get_emailaddress();
$session{$sessionID}{objectcache}{fribble} ||= new fribble;
#
use Tie::File; # see below -- any array-in-a-filesystem representation
# is supported
push @{$session{$sessionID}{events}}, $event;
DESCRIPTION
DirDB is a package that lets you access a directory as a hash. The final directory in the path will be created, but not the whole path leading to it. It is similar to Tie::Persistent, but different in that all accesses are immediately reflected in the file system, and very little is kept in Perl memory. (Your OS's file caching takes care of that: DirDB only hits the disk a lot on poorly designed operating systems without file system caches, which isn't any of them any more.)
The empty string, used as a key, will be translated into ' EMPTY' for purposes of storage and retrieval. File names beginning with a space are reserved for metadata for subclasses, such as object type or array size or whatever. Key names beginning with a space get an additional space prepended to the name for purposes of naming the file to store that value.
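The naming rules above can be sketched as a small function. This is only an illustration of the rules as stated: key_to_filename is a hypothetical name, not part of DirDB's documented interface, and the real module may handle further cases.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration, not DirDB's API):
# map a hash key to the file name used to store its value.
sub key_to_filename {
    my ($key) = @_;
    return ' EMPTY' if $key eq '';      # the empty string gets a reserved name
    return " $key"  if $key =~ /\A /;   # leading space: prepend one more space
    return $key;                        # any other key is used as-is
}

my $empty = key_to_filename('');         # ' EMPTY'
my $plain = key_to_filename('email');    # 'email'
my $meta  = key_to_filename(' BLESS');   # '  BLESS' (two leading spaces)
```

Because a key with a leading space always ends up with at least two leading spaces, it cannot collide with the single-space names reserved for metadata or with ' EMPTY'.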
As of version 0.05, DirDB can store hash references. References to tied hashes are recursively copied; references to plain hashes are first tied to DirDB and then recursively copied. Storing a circular hash reference structure will cause DirDB to croak.
As of version 0.06, DirDB now recursively copies subdirectory contents into an in-memory hash and returns a reference to that hash when a previously stored hash reference is deleted in non-void context.
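To picture what "recursively copies subdirectory contents into an in-memory hash" means, here is a minimal sketch in core Perl. The function name dir_to_hash is hypothetical, and the sketch ignores DirDB's metadata files and key-name translation; it is not the module's own code.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: read a directory tree into a nested in-memory hash.
sub dir_to_hash {
    my ($dir) = @_;
    opendir my $dh, $dir or die "cannot open directory $dir: $!";
    my @names = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;                      # close explicitly before recursing
    my %copy;
    for my $name (@names) {
        my $path = "$dir/$name";
        if (-d $path) {
            $copy{$name} = dir_to_hash($path);   # subdirectory becomes a hashref
        }
        else {
            open my $fh, '<', $path or die "cannot read $path: $!";
            local $/;                  # slurp the whole file
            my $value = <$fh>;
            close $fh;
            $copy{$name} = defined $value ? $value : '';
        }
    }
    return \%copy;
}

# Build a small tree by hand, then copy it into memory.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/prefs" or die "mkdir failed: $!";
for my $pair ([ "$root/email", 'a@example.org' ], [ "$root/prefs/lang", 'en' ]) {
    open my $out, '>', $pair->[0] or die "cannot write $pair->[0]: $!";
    print {$out} $pair->[1];
    close $out or die "cannot close $pair->[0]: $!";
}
my $copy = dir_to_hash($root);
```

The returned structure is an ordinary hash, so it stays usable after the directory it was read from has been removed.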
As of version 0.07, non-HASH references are stored using Storable.
As of version 0.08, non-HASH references cause croaking again: the Storable functionality has been moved to DirDB::Storable.
Version 0.10 stores and retrieves blessed hash references, blessing them on retrieval back into the package they were in when they were stored.
Version 0.12 closes some directory handles which were not being closed automatically on cygwin, interfering with tests passing.
ARRAY tie-time argument
Version 0.11 allows storing and retrieval of references to arrays by taking an 'ARRAY' tie-time argument, which is an arrayref of the args used to tie the array before returning it. A token that is string-equal to 'DATAPATH' will be replaced with a place in the file system for the array tying implementation to do its thing. At this version, the default array implementation is
['Tie::File' => DATAPATH => recsep => "\0"]
but this may change, perhaps when a DirDB::Array package that gracefully handles references is devised. Forwards-compatibility is maintained by storing the array implementation details with each stored arrayref.
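The DATAPATH substitution can be sketched as follows. expand_tie_args is a hypothetical helper written for this illustration, and the path it is given here is an arbitrary temporary location; how DirDB itself chooses the path is not shown.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempdir);

# Hypothetical helper: replace every token that is string-equal to
# 'DATAPATH' with a real file system path.
sub expand_tie_args {
    my ($template, $path) = @_;
    return map { ($_ eq 'DATAPATH') ? $path : $_ } @$template;
}

my $dir      = tempdir(CLEANUP => 1);
my @template = ('Tie::File' => DATAPATH => recsep => "\0");   # the default above
my ($class, @args) = expand_tie_args(\@template, "$dir/events");

# Tie an array with the expanded argument list and use it normally.
tie my @events, $class, @args or die "tie failed: $!";
push @events, 'login', 'logout';
untie @events;
```

With recsep set to "\0", each array element is stored as a NUL-terminated record, so elements may contain newlines.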
lock method (package DirDB::lock)
Version 0.11 also introduces a lock method that obtains an advisory mkdir lock on either a whole tied hash or on a key in it.
tie %P, DirDB=>'/home/aurora/persistentdata';
...
my $advisory_lock1 = tied(%P)->lock; # on the whole hash
my $advisory_lock2 = tied(%P)->lock('birdy'); # on the key 'birdy'
{
my $advisory_lock3 = tied(%P)->lock(''); # on the null key
These locks last until they are DESTROYed by the garbage collector or until the release method is called on them.
$advisory_lock1->release;
release $advisory_lock2;
};
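The idea of a mkdir lock that is released on DESTROY can be sketched in core Perl. Sketch::MkdirLock is a hypothetical package, not DirDB::lock itself, and the lock directory name and retry policy are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

package Sketch::MkdirLock;    # hypothetical name, not DirDB::lock itself

# mkdir either creates the directory or fails, so only one process
# can succeed for a given path.
sub new {
    my ($class, $lockdir, $tries) = @_;
    $tries = 50 unless defined $tries;
    for my $attempt (1 .. $tries) {
        return bless { dir => $lockdir, held => 1 }, $class if mkdir $lockdir;
        die "cannot create $lockdir: $!" unless $!{EEXIST};
        select undef, undef, undef, 0.1;    # wait 0.1 s before retrying
    }
    return;                                 # gave up: lock is held elsewhere
}

sub release {
    my ($self) = @_;
    return unless $self->{held};
    $self->{held} = 0;
    rmdir $self->{dir} or warn "cannot remove $self->{dir}: $!";
    return 1;
}

sub DESTROY { $_[0]->release }              # release when the object goes away

package main;

my $base    = tempdir(CLEANUP => 1);
my $lockdir = "$base/ LOCK birdy";          # assumed naming, for illustration
my $second_attempt;
{
    my $lock = Sketch::MkdirLock->new($lockdir) or die "could not lock";
    $second_attempt = Sketch::MkdirLock->new($lockdir, 1);   # fails: already held
}
# $lock went out of scope above, so DESTROY removed the lock directory.
my $held_after_scope = -d $lockdir ? 1 : 0;
```

A process that is killed inside the block never runs DESTROY, which is how a stale lock directory can be left behind.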
croaking on permissions problems
DirDB will croak if it can't open an existing file system entity.
tie my %d, DirDB => '/tmp/foodb';
$d{ref1}->{ref2}->{ref3}->{ref4} = 'something';
# 'something' is now stored in /tmp/foodb/ref1/ref2/ref3/ref4
my %e = (1 => 2, 2 => 3);
$d{e} = \%e;
# %e is now tied to /tmp/foodb/e, and
# /tmp/foodb/e/1 and /tmp/foodb/e/2 now contain 2 and 3, respectively
$d{f} = \%e;
# like `cp -R /tmp/foodb/e /tmp/foodb/f`
$e{destination} = 'Kashmir';
# sets /tmp/foodb/e/destination
# leaves /tmp/foodb/f alone
my %g = (1 => 2, 2 => 3);
$d{g} = {%g};
# %g has been copied into /tmp/foodb/g/ without tying %g.
Pipes and so on are opened for reading and read from on FETCH, and clobbered on STORE.
The underlying object is a scalar containing the path to the directory. Keys are names within the directory, values are the contents of the files.
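That design (a blessed scalar holding the directory path, one file per key) can be illustrated with a deliberately tiny tied hash. Sketch::DirHash is a hypothetical package written for this example; it omits iteration, nested hashes, metadata, locking and key-name translation, and it is not DirDB's implementation.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

package Sketch::DirHash;    # hypothetical name; not DirDB's actual code

sub TIEHASH {
    my ($class, $dir) = @_;
    -d $dir or mkdir $dir or die "cannot create $dir: $!";
    return bless \$dir, $class;      # the object is just the directory path
}

# Path of the file that holds the value for $key.
sub _file { my ($self, $key) = @_; return ${$self} . "/$key" }

sub STORE {
    my ($self, $key, $value) = @_;
    my $file = $self->_file($key);
    open my $fh, '>', $file or die "cannot write $file: $!";
    print {$fh} $value;
    close $fh or die "cannot close $file: $!";
}

sub FETCH {
    my ($self, $key) = @_;
    open my $fh, '<', $self->_file($key) or return;
    local $/;                        # read the whole file at once
    my $value = <$fh>;
    return defined $value ? $value : '';
}

sub EXISTS {
    my ($self, $key) = @_;
    my $file = $self->_file($key);
    return -e $file;
}

sub DELETE {
    my ($self, $key) = @_;
    my $old = $self->FETCH($key);    # hand back the old value, like delete
    unlink $self->_file($key);
    return $old;
}

package main;

my $dir = tempdir(CLEANUP => 1) . '/session';
tie my %session, 'Sketch::DirHash', $dir;
$session{email} = 'a@example.org';   # creates the file "$dir/email"
my $email = $session{email};         # reads it back from disk
```

Nothing is cached in the object, so every FETCH and STORE goes straight to the file system, as described above.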
STOREMETA and FETCHMETA methods are provided for subclasses that wish to store and fetch metadata (such as array size) which will not appear in the data returned by NEXTKEY and which cannot be accessed directly through STORE or FETCH. Currently one metadatum, 'BLESS', is used to indicate what package to bless a tied hashref into.
storing and retrieving blessed objects
Blessed objects can now be stored, as long as their underlying representation is a hash. This may change. The root of a DirDB tree will not get blessed, but all blessed hash reference branches will be blessed on fetch into the package they were in when stored.
storing and retrieving array references
At this version, Tie::File is used as the default array implementation. The array implementation can be specified with an ARRAY tie-time argument, like so:
use Array::Virtual;
use DirDB 0.11;
tie my %Persistent, DirDB => './data',
ARRAY => ["Array::Virtual", DATAPATH => 0664];
RISKS
stale lock risk
"mkdir locking" is used to protect incomplete directories from being accessed while they are being written, and is now used as well for advisory locking. It is conceivable that your program might catch a signal and die while inside a critical section. If this happens, a simple
find /your/data -type d -name '* LOCK*'
at the command line will identify what you need to delete.
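The same search can be done from Perl with the core File::Find module. This is a sketch only: lock_directories is a hypothetical helper, and the lock name created below is an assumed example, not necessarily the exact name DirDB uses.

```perl
use strict;
use warnings;
use File::Find;
use File::Basename qw(basename);
use File::Temp qw(tempdir);

# Hypothetical helper: list real directories whose name contains " LOCK",
# mirroring: find ROOT -type d -name '* LOCK*'
sub lock_directories {
    my ($root) = @_;
    my @found;
    find(
        {
            no_chdir => 1,
            wanted   => sub {
                my $path = $File::Find::name;
                return if -l $path;          # -type d does not match symlinks
                return unless -d $path;
                push @found, $path if basename($path) =~ / LOCK/;
            },
        },
        $root,
    );
    my @sorted = sort @found;
    return @sorted;
}

# Demonstration on a throwaway tree.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/data"             or die "mkdir failed: $!";
mkdir "$root/data/ LOCK birdy" or die "mkdir failed: $!";   # assumed lock name
my @stale = lock_directories($root);
print "$_\n" for @stale;
```

Listing before deleting is the safer order, since a lock directory may belong to a process that is still running.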
Only the very end of the write operation is protected by the locking: during a write, other processes will be able to read the old data. They will also be able to start their own overwrites.
DirDB attempts to guarantee that written data is complete (not partial).
DirDB does not attempt to guarantee atomicity of updates.
unexpected persistence
Untied hash references assigned into a DirDB tied hash will become tied to the file system at the point they are first assigned. This has the potential to cause confusion.
unexpected copy instead of link
Tied hash references are recursively copied. This includes hash references tied due to being assigned into a DirDB tied hash.
EXPORT
None by default.
AUTHOR
David Nicol, davidnicol@cpan.org
Assistance
Version 0.04 QA was provided by members of Kansas City Perl Mongers, including Andrew Moore and Craig S. Cottingham.
LICENSE
GPL/Artistic (the same terms as Perl itself)
SEE ALSO
Read perltie before trying to extend this module.
DirDB::Storable uses Storable for storing and retrieving arbitrary types
DirDB::FTP provides complete DirDB function over the FTP protocol
Tie::Dir is concerned with accessing stat information, not file contents