NAME
Tie::Persistent - persistent data structures via tie made easy
VERSION
1.00
SYNOPSIS
use Tie::Persistent;
tie %DB, 'Tie::Persistent', 'file', 'rw'; # read data from 'file'
(tied %DB)->autosync(1); # turn on write back on every modify
# now create/add/modify datastruct
$DB{key} = "value";
(tied %DB)->sync(); # can be called manually
untie %DB; # stores data back into 'file'
# read stored data, no modification of file data
tie %ReadOnly, 'Tie::Persistent', 'file';
foreach (keys %ReadOnly) {
    print "$_ => $ReadOnly{$_}\n";
}
untie %ReadOnly; # modifications not stored back
DESCRIPTION
The Tie::Persistent package makes working with persistent data really easy by using the tie interface.
It works by storing the data contained in a variable into a file (not unlike a database). The primary advantages are speed, as the whole data structure is kept in memory (which is also a limitation), and, of course, that you can use arbitrary data structures inside the variable (unlike DB_File).
Note that it is most useful if the data structure fits into memory. For larger data structures I recommend MLDBM.
If you want to make an arbitrary object persistent, just store its ref in a scalar tied to 'Tie::Persistent'.
Beware: not every data structure or object can be made persistent. For example, it may not contain GLOB or CODE refs, as these are not really dumpable (yet?).
Also, it works only for variables; you cannot use it for file handles.
[A persistent file handle? Hmmm... Hmmm! I've got an idea: I could start a server and send the file descriptor to it via ioctl(FD_SEND) or sendmsg. Later, I could retrieve it back, so it's persistent as long as the server process keeps running. But the whole file handle may contain more than just the file descriptor. There may be an output routine associated with it that I'd somehow have to dump. Now let's see, there was some way to get the bytecode converted back into perl code... <wanders off into the darkness mumbling> ... ]
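The restriction on CODE refs mentioned above can be seen directly with 'Storable', one of the two serialisers this package uses. The following is only a sketch using 'Storable' on its own (not Tie::Persistent itself); the hashes %ok and %bad are made up for illustration.

```perl
use strict;
use warnings;
use Storable qw(freeze);

# Two illustrative structures: one plain, one containing a CODE ref.
my %ok  = (name => 'plain data', list     => [1, 2, 3]);
my %bad = (name => 'has code',   callback => sub { 42 });

for my $data (\%ok, \%bad) {
    # freeze() dies on data it cannot serialise, so trap that with eval.
    my $frozen = eval { freeze($data) };
    if (defined $frozen) {
        print "$data->{name}: can be serialised\n";
    }
    else {
        print "$data->{name}: cannot be serialised: $@";
    }
}
```

A check like this before storing a structure may help to find the offending element early.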
PARAMETERS
tie %Hash, 'Tie::Persistent', file, mode, other...;
tie @Array, 'Tie::Persistent', file, mode, other...;
tie $Scalar, 'Tie::Persistent', file, mode, other...;
- file
Filename to store the data in. No naming convention is enforced, but I personally use the suffix 'pd' for "Perl Data" (or "Persistent Data"?). No file locking is done; see the section on locking below.
- mode (optional)
Same as mode for POSIX fopen() or IO::File::open. Basically a combination of 'r', 'w', 'a' and '+'. Semantics:
'r' .... read only. Modifications in the data are not stored back into the file. A non-existing file gives an error. This is the default if no mode is given.
'rw' ... read/write. Modifications are stored back, if the file does not exist, it is created.
'w' .... write only. The file is not read, the variable starts out empty.
'a', '+' ... append. Same as 'w', but creates numbered backup files.
'ra', 'r+' ... Same as 'rw', but creates numbered backup files.
When some kind of write access is specified, a backup file of the old dataset is always created. [You'll thank me for that, believe me.] The reason is simple: when you tie a variable read-write (the contents get restored from the file), and your program isn't fully debugged yet, it may die in the middle of some modifications, but the data will still be written back to the file, possibly leaving them inconsistent. Then you always have at least the previous version that you can restore from.
The default backup filenames follow the Emacs notation, i.e. a '~' is appended; for numbered backup files (specified as 'a' or '+'), an additional number and a '~' are appended.
For a file 'data.pd', the normal backup file would be 'data.pd~' and the numbered backup files would be 'data.pd~1~', 'data.pd~2~' and so on. The latest backup file is the one with the highest number. The backup filename format can be overridden, see below.
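To restore from the latest numbered backup you have to find the highest number first. A minimal sketch, assuming the default 'file~N~' naming; latest_numbered_backup() is a hypothetical helper and not part of the package:

```perl
use strict;
use warnings;

# Hypothetical helper: from a list of filenames, return the numbered
# backup of $file with the highest number, or undef if there is none.
sub latest_numbered_backup {
    my ($file, @names) = @_;
    my ($best_name, $best_num);
    for my $name (@names) {
        # Only names of the form "<file>~<number>~" are numbered backups.
        next unless $name =~ /^\Q$file\E~(\d+)~$/;
        ($best_name, $best_num) = ($name, $1)
            if !defined $best_num || $1 > $best_num;
    }
    return $best_name;
}

my @names = ('data.pd', 'data.pd~', 'data.pd~1~', 'data.pd~2~', 'data.pd~10~');
print latest_numbered_backup('data.pd', @names), "\n";   # prints data.pd~10~
```

Note that the numbers are compared numerically, so 'data.pd~10~' wins over 'data.pd~2~'. In a real program the list of names would come from readdir() on the directory holding the data file.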
- other (optional, experimental)
This can be a reference to another (possibly tied) variable or a name of another tieable package.
If a ref is given, it is used internally to store the variable data instead of an anonymous variable ref. This allows you to make other tied data structures persistent, e.g. you could first tie a hash to Tie::IxHash to make it order-preserving and then give it to Tie::Persistent to make it persistent.
A plain name is used to create this tied variable internally. Trailing arguments are passed to the other tieable package.
Example:
tie %h, 'Tie::Persistent', 'file', 'rw', 'Tie::IxHash';
or
tie %ixh, 'Tie::IxHash';
tie %ph, 'Tie::Persistent', 'file', 'w', \%ixh;
# you can now use %ixh as an alias for %ph
NOTE: This is an experimental feature. It may or may not work with other Tie:: packages. I have only tested it with 'Tie::IxHash'. Please report success or failure.
LOCKING
The data file is not automatically locked. Locking has to be done outside of the package. I recommend using a module like 'Lockfile::Simple' for that.
There are typically two scenarios for locking: you either lock just the 'tie' and/or 'untie' calls, but not the data manipulation, or you lock the whole 'tie' - modify data - 'untie' sequence.
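The second scenario (locking the whole sequence) could be sketched as follows. This is not the 'Lockfile::Simple' interface; it uses the core flock() call on a separate lock file instead, simply because that needs no extra module. The helper with_lock() and the lock file name are assumptions made for illustration, and the tie calls are shown as comments so that the sketch runs without a data file.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Spec;

# Hypothetical helper: run $code while holding an exclusive lock on $lockfile.
sub with_lock {
    my ($lockfile, $code) = @_;
    open(my $lock, '>>', $lockfile) or die "Cannot open $lockfile: $!";
    flock($lock, LOCK_EX)           or die "Cannot lock $lockfile: $!";
    my @result = eval { $code->() };   # trap errors so the lock is released
    my $err = $@;
    close($lock);                      # closing the handle releases the lock
    die $err if $err;
    return @result;
}

# Illustrative lock file name in the system's temp directory.
my $lockfile = File::Spec->catfile(File::Spec->tmpdir, "persistent-demo.$$.lock");

my ($result) = with_lock($lockfile, sub {
    # tie %DB, 'Tie::Persistent', 'file', 'rw';
    # ... modify %DB ...
    # untie %DB;
    return 'done';
});
print "$result\n";
unlink $lockfile;
```

For the first scenario you would wrap only the 'tie' and the 'untie' calls in separate with_lock() calls. Keep in mind that flock() is advisory: it only protects against other programs that take the same lock.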
KEEPING DATA SYNCHRONIZED
It often is useful to store snapshots of the tied data struct back to the file, e.g. to safeguard against program crashes. You have two possibilities to do that:
use sync() to do it manually or
set autosync() to do it on every modification.
Note that sync() and autosync() are methods of the tied object, so you have to call them like this:
(tied %hash)->sync();
and
(tied @array)->autosync(1); # or '0' to turn off autosync
There is a global variable $Tie::Persistent::Autosync (see CONFIGURATION VARIABLES) that you can set to change the behaviour on a global level for all subsequent ties.
Enabling autosync of course means a quite hefty performance penalty, so think carefully if and how you need it. Maybe there are natural synchronisation points in your application where a manual sync is good enough. Alternatively use MLDBM (if your top-level struct is a hash).
Note: autosync only works if the top-level element of the data structure is modified. If you have more complex data structures and modify elements somewhere deep down, you have to synchronize manually. I therefore recommend the following approach, especially if the topmost structure is a hash:
fetch the top-level element into a temporary variable
modify the datastructure
store back the top-level element, thus triggering a sync.
E.g.
my $ref = $Hash{$key}; # fetch substructure
$ref->{$subkey} = $newval; # modify somewhere down under
$Hash{$key} = $ref; # store back
This programming style has the added advantage that you can switch over to other database packages (for example the MLDBM package, in case your data structures outgrow your memory) quite easily by just changing the 'tie' line!
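Why a modification deep down goes unnoticed can be shown without Tie::Persistent at all. The sketch below uses a made-up class 'CountingHash' (built on the core 'Tie::StdHash') that merely counts STORE calls; it stands in for any tied hash that reacts to top-level stores.

```perl
use strict;
use warnings;

# Hypothetical class for illustration: a tied hash that counts how often
# the tie interface sees a STORE on the top-level hash.
{
    package CountingHash;
    require Tie::Hash;
    our @ISA = ('Tie::StdHash');
    our $stores = 0;

    sub STORE {
        my ($self, $key, $value) = @_;
        $stores++;
        $self->{$key} = $value;
    }
}

tie my %Hash, 'CountingHash';

$Hash{config} = { colour => 'red' };     # top-level store
print "after top-level store: $CountingHash::stores\n";    # prints 1

$Hash{config}{colour} = 'blue';          # only a FETCH, then a plain hash write
print "after deep modification: $CountingHash::stores\n";  # still 1

my $ref = $Hash{config};                 # fetch substructure
$ref->{colour} = 'green';                # modify somewhere down under
$Hash{config} = $ref;                    # store back: STORE is called again
print "after store back: $CountingHash::stores\n";         # prints 2
```

The deep modification reaches the data but never passes through STORE, which is the only place where a tied variable could trigger a sync.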
CONFIGURATION VARIABLES
$Tie::Persistent::Readable
controls which format to use to store the data inside the file. 'false' means to use 'Storable', which is faster (and the default), 'true' means to use 'Data::Dumper', which is slower but much more readable and thus meant for debugging. This only influences the way the datastructure is written, format detection on read is automatic.
$Tie::Persistent::Autosync
gives the default for all tied vars, so modifying it affects all subsequent ties. It's set to 'false' by default.
$Tie::Persistent::BackupFile
points to a sub that determines the backup filename format. It gets the filename as $_[0] and returns the backup filename. The default is
sub { "$_[0]~"; }
which is the Emacs backup format. For NT, you might want to change this to
sub { "$_[0].bak"; }
or something.
$Tie::Persistent::NumberedBackupFile
points to a sub that determines the numbered backup filename format. It gets the filename and a number as $_[0] and $_[1] respectively and returns the backup filename. The default is
sub { "$_[0]~$_[1]~"; }
which is the extended Emacs backup format.
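As a sketch of how such subs behave, the following compares the two defaults with '.bak' style replacements. The numbered '.bak' format ('file.N.bak') is just an example I made up; pick whatever suits your platform.

```perl
use strict;
use warnings;

# The documented defaults (Emacs style).
my $emacs_backup   = sub { "$_[0]~" };
my $emacs_numbered = sub { "$_[0]~$_[1]~" };

# Illustrative replacements; the numbered format is an assumption.
my $bak_backup   = sub { "$_[0].bak" };
my $bak_numbered = sub { "$_[0].$_[1].bak" };

print $emacs_backup->('data.pd'), "\n";        # data.pd~
print $emacs_numbered->('data.pd', 3), "\n";   # data.pd~3~
print $bak_backup->('data.pd'), "\n";          # data.pd.bak
print $bak_numbered->('data.pd', 3), "\n";     # data.pd.3.bak

# Installing the replacements. Doing this before any 'tie' call is assumed
# to be the safe order.
{
    no warnings 'once';
    $Tie::Persistent::BackupFile         = $bak_backup;
    $Tie::Persistent::NumberedBackupFile = $bak_numbered;
}
```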
NOTES
'Tie::Persistent' uses 'Storable' and 'Data::Dumper' internally, so these must be installed (the CPAN module will do this for you automatically). Actually, 'Storable' is optional but recommended for speed.
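To get a feeling for the difference between the two formats, here is a sketch that writes the same structure once with 'Storable' and once with 'Data::Dumper' and reads both back. It only illustrates the two serialisers; the exact file layout that Tie::Persistent writes is not shown here, and the file and variable names are made up.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use Data::Dumper;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $data = { name => 'example', list => [1, 2, 3] };

# Binary format: fast, but not meant to be read by humans.
nstore($data, "$dir/binary.pd");
my $from_storable = retrieve("$dir/binary.pd");

# Readable format: plain Perl source that assigns to $restored.
open(my $out, '>', "$dir/readable.pd") or die "Cannot write: $!";
print {$out} Data::Dumper->Dump([$data], ['restored']);
close($out) or die "Cannot close: $!";

# Read the text back and evaluate it to rebuild the structure.
my $from_dumper = do {
    open(my $in, '<', "$dir/readable.pd") or die "Cannot read: $!";
    local $/;                 # slurp the whole file
    my $text = <$in>;
    my $restored;
    eval $text;               # only do this with files you trust
    die $@ if $@;
    $restored;
};

print "Storable:     $from_storable->{name}, @{ $from_storable->{list} }\n";
print "Data::Dumper: $from_dumper->{name}, @{ $from_dumper->{list} }\n";
```

The readable file can be inspected and even fixed with a text editor, which is what makes it handy for debugging.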
For testing, I use 'Tie::IxHash', but 'make test' still does some tests if it is not installed.
There are two mailing lists at SourceForge.net:
http://lists.sourceforge.net/mailman/listinfo/persistent-announce for announcements of new releases.
http://lists.sourceforge.net/mailman/listinfo/persistent-discuss for user feedback and feature discussions.
The package is available through CPAN and SourceForge.net http://sourceforge.net/projects/persistent/
There is an initiative at SourceForge.net to get authors of persistence-packages of any kind to talk to one another. See http://sourceforge.net/projects/POOP/
BUGS
Numbered backupfile creation might have problems if the filename (not the backup number) contains the first six digits of the speed of light (in m/s).
All other bugs, please tell me!
AUTHORS
Original version by Roland Giersig <RGiersig@cpan.org>
Benjamin Liberman <beanjamman@yahoo.com> added autosyncing and fixed splice.
COPYRIGHT
Copyright (c) 1999-2002 Roland Giersig. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.