NAME
MLDBM::Sync (BETA) - safe concurrent access to MLDBM databases
SYNOPSIS
use MLDBM::Sync; # this gets the default, SDBM_File
use MLDBM qw(DB_File Storable); # use Storable for serializing
use MLDBM qw(MLDBM::Sync::SDBM_File); # use extended SDBM_File, handles values > 1024 bytes
use Fcntl qw(:DEFAULT); # import O_CREAT and O_RDWR for the tie() calls below
# NORMAL PROTECTED read/write with implicit locks per i/o request
tie %cache, 'MLDBM::Sync' [..other DBM args..] or die $!;
$cache{"AAAA"} = "BBBB";
my $value = $cache{"AAAA"};
# SERIALIZED PROTECTED read/write with explicit lock for both i/o requests
my $sync_dbm_obj = tie %cache, 'MLDBM::Sync', '/tmp/syncdbm', O_CREAT|O_RDWR, 0640;
$sync_dbm_obj->Lock;
$cache{"AAAA"} = "BBBB";
$value = $cache{"AAAA"};
$sync_dbm_obj->UnLock;
DESCRIPTION
This module wraps the MLDBM interface: it handles concurrent access to MLDBM databases with file locking, and flushes i/o explicitly on each lock/unlock. The new Lock()/UnLock() API can be used to serialize requests logically and to improve performance for bundled reads & writes.
my $sync_dbm_obj = tie %cache, 'MLDBM::Sync', '/tmp/syncdbm', O_CREAT|O_RDWR, 0640;
$sync_dbm_obj->Lock;
# ... all accesses to the DBM are LOCK_EX protected, and go to the same file handles ...
$sync_dbm_obj->UnLock;
MLDBM continues to serve as the underlying OO layer that serializes complex data structures to be stored in the databases. See the MLDBM BUGS section for important limitations.
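The file locking described above can be illustrated with core modules only. The following is a minimal sketch of the general technique, not MLDBM::Sync's actual code: it assumes a separate "$file.lock" lock file and a hypothetical helper named with_locked_dbm(), and it ties a plain SDBM_File rather than an MLDBM database.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempdir);
use SDBM_File;

# Hypothetical helper (not part of MLDBM::Sync): run $code while holding an
# exclusive lock on "$file.lock". The DBM is tied and untied inside the lock,
# so its data is flushed before another process can acquire the lock.
sub with_locked_dbm {
    my ($file, $code) = @_;
    open(my $lock_fh, '>>', "$file.lock") or die "open $file.lock: $!";
    flock($lock_fh, LOCK_EX) or die "flock: $!";
    my %dbm;
    tie(%dbm, 'SDBM_File', $file, O_CREAT|O_RDWR, 0640) or die "tie $file: $!";
    my $result = eval { $code->(\%dbm) };
    my $err = $@;
    untie %dbm;                  # flush and close the DBM before unlocking
    flock($lock_fh, LOCK_UN);
    close $lock_fh;
    die $err if $err;
    return $result;
}

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/syncdbm";
with_locked_dbm($file, sub { $_[0]{AAAA} = "BBBB" });
my $value = with_locked_dbm($file, sub { $_[0]{AAAA} });
print "$value\n";                # prints "BBBB"
```

Each call here corresponds to one implicitly locked i/o request. Doing several reads and writes inside a single callback corresponds to the explicit Lock()/UnLock() bundling, since the lock and the tie are paid for only once.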
INSTALL
Like any other CPAN module, either use CPAN.pm (perl -MCPAN -e shell), or get the file MLDBM-Sync-x.xx.tar.gz, unzip, untar and:
perl Makefile.PL
make
make test
make install
New MLDBM::Sync::SDBM_File
SDBM_File, the default used for MLDBM and therefore MLDBM::Sync, has a limit of 1024 bytes for the size of a record.
For small records, SDBM_File is also an order of magnitude faster with MLDBM::Sync than DB_File or GDBM_File, because the tie()/untie() to the dbm is much faster. Therefore, bundled with the MLDBM::Sync release is an MLDBM::Sync::SDBM_File layer which works around this 1024 byte limit. To use it, just:
use MLDBM qw(MLDBM::Sync::SDBM_File);
It works by breaking up the STORE() values into small 128 byte segments, and spreading those segments across many records, creating a virtual record layer. It also uses Compress::Zlib to compress STORED data, reducing the number of these 128 byte records. In benchmarks, 128 byte record segments seemed to be a sweet spot for space/time efficiency, as SDBM_File created very bloated *.pag files for 128+ byte records.
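The segmenting idea can be sketched as follows. This is an illustration only, not the module's segment manager: the key layout (a count stored under the key itself, segments stored under "$key\0$i") and the function names are assumptions, a plain hash stands in for the tied SDBM_File, and the compression step is left out.

```perl
use strict;
use warnings;

my $SEGMENT_SIZE = 128;    # segment size taken from the text above

# Hypothetical layout: $store->{$key} holds the number of segments, and
# $store->{"$key\0$i"} holds the i-th segment of the value.
sub delete_segmented {
    my ($store, $key) = @_;
    my $count = delete $store->{$key};
    return unless defined $count;
    delete $store->{"$key\0$_"} for 0 .. $count - 1;
    return;
}

sub store_segmented {
    my ($store, $key, $value) = @_;
    delete_segmented($store, $key);    # drop segments of any older value
    my @segments;
    for (my $pos = 0; $pos < length $value; $pos += $SEGMENT_SIZE) {
        push @segments, substr($value, $pos, $SEGMENT_SIZE);
    }
    $store->{$key} = scalar @segments;
    $store->{"$key\0$_"} = $segments[$_] for 0 .. $#segments;
    return scalar @segments;
}

sub fetch_segmented {
    my ($store, $key) = @_;
    my $count = $store->{$key};
    return unless defined $count;
    return join '', map { $store->{"$key\0$_"} } 0 .. $count - 1;
}

my %store;                              # stands in for the tied DBM hash
my $value = "x" x 300;
my $n = store_segmented(\%store, "big", $value);
print "segments: $n\n";                 # 300 bytes -> 3 segments
print "round trip ok\n" if fetch_segmented(\%store, "big") eq $value;
```

Because every stored record stays at or below 128 bytes, no single record approaches the 1024 byte limit, at the cost of one extra record per value for the segment count.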
BENCHMARKS
In the distribution ./bench directory is a bench_sync.pl script that can benchmark using the various DBMs with MLDBM::Sync.
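For readers without the distribution at hand, the kind of measurement reported below can be approximated with a few lines of core Perl. This is a rough sketch and not the contents of bench_sync.pl: the function name bench_write_read() is hypothetical, only plain SDBM_File is exercised, and the tie()/untie() per record is an assumption meant to mimic one locked i/o request per access.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT);
use File::Temp qw(tempdir);
use SDBM_File;
use Time::HiRes ();

# Hypothetical benchmark loop: tie, write, read back, and untie once per
# record, so the tie()/untie() cost is part of the measured time.
sub bench_write_read {
    my ($file, $count, $size) = @_;
    my $data  = "x" x $size;
    my $ok    = 0;
    my $start = Time::HiRes::time();
    for my $i (1 .. $count) {
        my %dbm;
        tie(%dbm, 'SDBM_File', $file, O_CREAT|O_RDWR, 0640) or die "tie $file: $!";
        $dbm{"key$i"} = $data;
        $ok++ if $dbm{"key$i"} eq $data;
        untie %dbm;
    }
    return ($ok, Time::HiRes::time() - $start);
}

my $dir = tempdir(CLEANUP => 1);
my ($ok, $elapsed) = bench_write_read("$dir/bench", 100, 50);
printf "Time for %d write/read's for SDBM_File %.2f seconds\n", $ok, $elapsed;
```

Timings from such a loop depend heavily on the machine and file system, so they are only comparable to each other, not to the figures listed in this section.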
The MLDBM::Sync::SDBM_File DBM is special because it uses SDBM_File for fast small inserts, but slows down linearly with the size of the data being inserted and read, with the speed matching that of GDBM_File & DB_File somewhere around 20,000 bytes.
So for DBM key/value pairs up to 10,000 bytes, you are likely better off with MLDBM::Sync::SDBM_File if you can afford the extra space it uses. At 20,000 bytes, time is a wash, and disk space is greater, so you might as well use DB_File or GDBM_File.
Note that MLDBM::Sync::SDBM_File is ALPHA as of 2/27/2001.
The results for a dual 450 linux 2.2.14, with an ext2 file system (blocksize 4096) mounted async on a SCSI disk, were as follows:
=== INSERT OF 50 BYTE RECORDS ===
  Time for 100 write/read's for  SDBM_File                 0.12 seconds    12288 bytes
  Time for 100 write/read's for  MLDBM::Sync::SDBM_File    0.14 seconds    12288 bytes
  Time for 100 write/read's for  GDBM_File                 2.07 seconds    18066 bytes
  Time for 100 write/read's for  DB_File                   2.48 seconds    20480 bytes
=== INSERT OF 500 BYTE RECORDS ===
  Time for 100 write/read's for  SDBM_File                 0.21 seconds   658432 bytes
  Time for 100 write/read's for  MLDBM::Sync::SDBM_File    0.51 seconds   135168 bytes
  Time for 100 write/read's for  GDBM_File                 2.29 seconds    63472 bytes
  Time for 100 write/read's for  DB_File                   2.44 seconds   114688 bytes
=== INSERT OF 5000 BYTE RECORDS ===
 (skipping test for SDBM_File 1024 byte limit)
  Time for 100 write/read's for  MLDBM::Sync::SDBM_File    1.30 seconds  2101248 bytes
  Time for 100 write/read's for  GDBM_File                 2.55 seconds   832400 bytes
  Time for 100 write/read's for  DB_File                   3.27 seconds   839680 bytes
=== INSERT OF 20000 BYTE RECORDS ===
 (skipping test for SDBM_File 1024 byte limit)
  Time for 100 write/read's for  MLDBM::Sync::SDBM_File    4.54 seconds 13162496 bytes
  Time for 100 write/read's for  GDBM_File                 5.39 seconds  2063912 bytes
  Time for 100 write/read's for  DB_File                   4.79 seconds  2068480 bytes
=== INSERT OF 50000 BYTE RECORDS ===
 (skipping test for SDBM_File 1024 byte limit)
  Time for 100 write/read's for  MLDBM::Sync::SDBM_File   12.29 seconds 16717824 bytes
  Time for 100 write/read's for  GDBM_File                 9.10 seconds  5337944 bytes
  Time for 100 write/read's for  DB_File                  11.97 seconds  5345280 bytes
WARNINGS
MLDBM::Sync is in BETA. As of 2/27/2001 I have been using it in development for months, and have been using its techniques for years in Apache::ASP $Session & $Application data storage. Future releases of Apache::ASP will use MLDBM::Sync for its Apache::ASP::State base implementation instead of MLDBM.
MLDBM::Sync::SDBM_File is ALPHA quality. Databases created while using it may not be compatible with future releases if the segment manager code or support for compression changes.
TODO
Production testing.
AUTHORS
Copyright (c) 2001 Joshua Chamas, Chamas Enterprises Inc. All rights reserved. Sponsored by development on NodeWorks http://www.nodeworks.com
SEE ALSO
MLDBM(3), SDBM_File(3), DB_File(3), GDBM_File(3)