NAME
CDB_File - Perl extension for access to cdb databases
SYNOPSIS
use CDB_File;
tie %h, 'CDB_File', 'file.cdb' or die "tie failed: $!\n";
$t = new CDB_File ('t.cdb', 't.tmp') or die ...;
$t->insert('key', 'value');
$t->finish;
CDB_File::create %t, $file, "$file.$$";
or
use CDB_File 'create';
create %t, $file, "$file.$$";
DESCRIPTION
CDB_File is a module which provides a Perl interface to Dan Bernstein's cdb package:
cdb is a fast, reliable, lightweight package for creating and
reading constant databases.
After the tie shown above, accesses to %h will refer to the cdb file file.cdb, as described in "tie" in perlfunc.
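As a minimal sketch of read access through the tied hash, the following builds a small cdb file and then reads it back. The temporary directory, the file name, and the sample keys are assumptions made for illustration only.

```perl
use strict;
use warnings;
use CDB_File;
use File::Temp qw(tempdir);

# Build a small cdb file to read from (names and data are illustrative).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/file.cdb";
my %src  = (one => 'uno', two => 'dos');
CDB_File::create(%src, $file, "$file.$$") or die "create failed: $!\n";

# Tie a hash to the file and read from it like any other hash.
tie my %h, 'CDB_File', $file or die "tie failed: $!\n";
my $value  = $h{one};                   # fetch a value by key
my $has    = exists $h{two}   ? 1 : 0;  # key present in the file
my $absent = exists $h{three} ? 1 : 0;  # key not present in the file
print "one => $value\n";
untie %h;
```

Only read operations are used here; the tied hash cannot be modified.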
A cdb file is created in three steps. First, call new CDB_File ($final, $tmp), where $final is the name of the database to be created and $tmp is the name of a temporary file which can be atomically renamed to $final. Second, call the insert method once for each (key, value) pair. Finally, call the finish method to complete the creation and renaming of the cdb file.
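The three steps can be sketched as follows. The temporary directory, the file names, and the sample pairs are hypothetical; the final check simply illustrates that the temporary file has been renamed to the final name once finish returns.

```perl
use strict;
use warnings;
use CDB_File;
use File::Temp qw(tempdir);

# Illustrative locations: $tmp sits next to $final so the rename stays on one filesystem.
my $dir   = tempdir(CLEANUP => 1);
my $final = "$dir/data.cdb";
my $tmp   = "$final.$$";

my @pairs = ([apple => 'red'], [banana => 'yellow']);

# Step 1: open the temporary file for writing.
my $cdb = CDB_File->new($final, $tmp) or die "new CDB_File failed: $!\n";

# Step 2: insert each (key, value) pair.
$cdb->insert(@$_) for @pairs;

# Step 3: complete the file and rename it into place.
$cdb->finish or die "finish failed: $!\n";

my $final_exists = -e $final ? 1 : 0;
my $tmp_exists   = -e $tmp   ? 1 : 0;
print "final exists: $final_exists, temp exists: $tmp_exists\n";
```

Placing $tmp in the same directory as $final is a common way to make sure the rename can be atomic.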
A simpler interface to cdb file creation is provided by CDB_File::create %t, $final, $tmp. This creates a cdb file named $final containing the contents of %t. As before, $tmp must name a temporary file which can be atomically renamed to $final. CDB_File::create may be imported.
EXAMPLES
These are all complete programs.
1. Convert a Berkeley DB (B-tree) database to cdb format.
use CDB_File;
use DB_File;
tie %h, 'DB_File', $ARGV[0], O_RDONLY, undef, $DB_BTREE or
die "$0: can't tie to $ARGV[0]: $!\n";
CDB_File::create %h, $ARGV[1], "$ARGV[1].$$" or
die "$0: can't create cdb: $!\n";
2. Convert a flat file to cdb format. In this example, the flat file consists of one key per line, separated by a colon from the value. Blank lines and lines beginning with # are skipped.
use CDB_File;
$cdb = new CDB_File("data.cdb", "data.$$") or
die "$0: new CDB_File failed: $!\n";
while (<>) {
next if /^$/ or /^#/;
chomp;
($k, $v) = split /:/, $_, 2;
if (defined $v) {
$cdb->insert($k, $v);
} else {
warn "bogus line: $_\n";
}
}
$cdb->finish or die "$0: CDB_File finish failed: $!\n";
3. Perl version of cdbdump.
use CDB_File;
tie %data, 'CDB_File', $ARGV[0] or
die "$0: can't tie to $ARGV[0]: $!\n";
while (($k, $v) = each %data) {
print '+', length $k, ',', length $v, ":$k->$v\n";
}
print "\n";
4. Although a cdb file is constant, you can simulate updating it in Perl. This is an expensive operation, as you have to create a new database and copy into it everything that's unchanged from the old database. (As compensation, the update does not affect database readers. The old database is available to them until the moment the new one is finished.)
use CDB_File;
$file = 'data.cdb';
$new = new CDB_File($file, "$file.$$") or
die "$0: new CDB_File failed: $!\n";
# Add the new values; remember which keys we've seen.
while (<>) {
chomp;
($k, $v) = split;
$new->insert($k, $v);
$seen{$k} = 1;
}
# Add any old values that haven't been replaced.
tie %old, 'CDB_File', $file or die "$0: can't tie to $file: $!\n";
while (($k, $v) = each %old) {
$new->insert($k, $v) unless $seen{$k};
}
$new->finish or die "$0: CDB_File finish failed: $!\n";
REPEATED KEYS
Most users can ignore this section.
A cdb file can contain repeated keys. If the insert method is called more than once with the same key during the creation of a cdb file, that key will be repeated.
Here's an example.
$cdb = new CDB_File ("$file.cdb", "$file.$$") or die ...;
$cdb->insert('cat', 'gato');
$cdb->insert('cat', 'chat');
$cdb->finish;
Normally, any attempt to access a key retrieves the first value stored under that key. This code snippet always prints gato.
$catref = tie %catalogue, 'CDB_File', "$file.cdb" or die ...;
print "$catalogue{cat}";
However, all the usual ways of iterating over a hash (keys, values, and each) do the Right Thing, even in the presence of repeated keys. This code snippet prints cat cat gato chat.
print join(' ', keys %catalogue, values %catalogue);
And these two both print cat:gato cat:chat, although the second is more efficient.
foreach $key (keys %catalogue) {
print "$key:$catalogue{$key} ";
}
while (($key, $val) = each %catalogue) {
print "$key:$val ";
}
The multi_get method retrieves all the values associated with a key. It returns a reference to an array containing all the values. This code prints gato chat.
print "@{$catref->multi_get('cat')}";
RETURN VALUES
The routines tie, new, and finish return undef if the attempted operation failed; $! contains the reason for failure.
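A minimal sketch of checking a return value without dying: it tries to tie to a file that does not exist and captures $! for reporting. The path is hypothetical and is chosen only because it is guaranteed to be missing.

```perl
use strict;
use warnings;
use CDB_File;
use File::Temp qw(tempdir);

# A path inside a fresh temporary directory, so the file cannot exist.
my $dir     = tempdir(CLEANUP => 1);
my $missing = "$dir/no-such-file.cdb";

my %h;
my $tied  = tie %h, 'CDB_File', $missing;

# Copy $! immediately; later system calls may overwrite it.
my $error = $tied ? '' : "$!";
print "tie failed: $error\n" unless $tied;
```

The same pattern applies to new and finish: test the return value, then read $! right away.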
DIAGNOSTICS
The following fatal errors may occur. (See "eval" in perlfunc if you want to trap them.)
- Modification of a CDB_File attempted
  You attempted to modify a hash tied to a CDB_File.
- CDB database too large
  You attempted to create a cdb file larger than 4 gigabytes.
- [ Write to | Read of | Seek in ] CDB_File failed: <error string>
  If error string is Protocol error, you tried to use CDB_File to access something that isn't a cdb file. Otherwise a serious OS-level problem occurred; for example, you have run out of disk space.
- Use CDB_File::FIRSTKEY before CDB_File::NEXTKEY
  If you are using the NEXTKEY method directly (I can't think of a reason why you'd want to do this), you need to call FIRSTKEY first.
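The fatal errors listed above can be trapped with a block eval. The sketch below provokes the first one by storing into a tied hash; the file name and sample data are assumptions for illustration.

```perl
use strict;
use warnings;
use CDB_File;
use File::Temp qw(tempdir);

# Create a small cdb file to tie to (illustrative names and data).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/data.cdb";
my %src  = (cat => 'gato');
CDB_File::create(%src, $file, "$file.$$") or die "create failed: $!\n";

tie my %h, 'CDB_File', $file or die "tie failed: $!\n";

# Storing into the tied hash is fatal; eval leaves the message in $@ instead.
my $ok    = eval { $h{dog} = 'perro'; 1 };
my $error = $ok ? '' : $@;
print "trapped: $error" unless $ok;
untie %h;
```

Returning 1 from the eval block makes success distinguishable from failure without relying on $@ alone.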
BUGS
It ain't lightweight after you've plumbed Perl into it.
The Perl interface to cdb imposes the restriction that data must fit into memory.
SEE ALSO
cdb(3).
AUTHOR
Tim Goodwin, <tjg@star.le.ac.uk>, 1997-01-08 - 2000-05-30.