NAME
BerkeleyDB::Tie - Persistent objects using BerkeleyDB
SYNOPSIS
use BerkeleyDB::Tie;
## Example 1
## Create a Hashed database
my $db = new BerkeleyDB::Tie::Hash
home => 'zoo',
filename => 'residents' ;
$db->{Samson} = new Primate ;
$db->{Cornelius} = new Primate ;
$db->{Kaa} = new Reptile ;
## Example 2
## Create a Btree database allowing duplicates and scalar values
my $types = scalars BerkeleyDB::Tie::Btree
home => 'zoo',
filename => 'types',
&duplicatekeys ;
$types->{primate} = 'Samson' ;
$types->{primate} = 'Cornelius' ;
$types->{reptile} = 'Kaa' ;
printf "%s\n", join ' ', $types->recordset('primate') ;
## prints: Samson Cornelius
$types->delete( primate => 'Samson' ) ;
printf "%s\n", join ' ', $types->recordset('primate') ;
## prints: Cornelius
## Example 3
## Create a database of visitors
## Use a table with arbitrary keys
## Track visitors by date/timestamp
$tickets = new BerkeleyDB::Tie::Btree
home => 'zoo',
filename => 'tickets',
&incrementkeys ;
## Alternatively
$tickets = lexical BerkeleyDB::Tie::Btree
home => 'zoo',
filename => 'tickets' ;
$bytime = scalars BerkeleyDB::Tie::Btree
home => 'zoo',
filename => 'ticketsbytime',
&duplicatekeys ;
## Process a new visitor in real time
sub newvisitor {
my $serial = $tickets->nextrecord() ;
my $date = getdate() ; ## Fictional subroutine
my $time = gettime() ; ## Fictional subroutine
$tickets->{$serial} = { @_ } ;
$bytime->{ "$date $time" } = $serial ;
return $serial ;
}
## Get a list of visitors on a certain date
sub showvisitorsbydate {
my $date = shift ;
return $bytime->matchingvalues( $date ) ;
}
DESCRIPTION
BerkeleyDB::Tie is a set of classes that provides simplified constructors, tied access to data, and methods for returning multiple record sets.
Example 1
BerkeleyDB::Tie maintains BerkeleyDB environment references in a package scoped hash keyed on the home argument. The basic BerkeleyDB::Tie constructor arguments define the BerkeleyDB environment and database. When the constructor is called, a previously opened environment is used if available. Otherwise, a new environment is created and is available to future constructor requests.
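The caching behavior described above can be sketched in plain Perl. This is only an illustration under stated assumptions: %environments, $opened, and open_environment are hypothetical names, and a hash reference stands in for the real environment object.

```perl
use strict;
use warnings;

## Hypothetical stand-in for the package scoped hash keyed on home.
my %environments ;
my $opened = 0 ;

## open_environment is an illustrative name, not part of BerkeleyDB::Tie.
## A real implementation would create a BerkeleyDB environment object here.
sub open_environment {
    my ( $home ) = @_ ;
    unless ( exists $environments{$home} ) {
        $opened++ ;
        $environments{$home} = { home => $home } ;   ## placeholder object
    }
    return $environments{$home} ;
}

my $first  = open_environment( 'zoo' ) ;
my $second = open_environment( 'zoo' ) ;        ## reuses the cached entry
my $other  = open_environment( 'aquarium' ) ;   ## creates a new entry

printf "%d environments opened\n", $opened ;
```

Two constructor calls with the same home share one entry, so only two placeholder environments are created for the three calls.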
This version of BerkeleyDB::Tie creates all environment objects as concurrent data stores. Transactional data storage is not currently integrated.
By default, BerkeleyDB::Tie is designed to marshall objects into a database using the Storable module.
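As a rough sketch of what marshalling with Storable involves, the following uses Storable's freeze and thaw functions directly. The Primate class name comes from Example 1; its fields are invented for illustration, and how BerkeleyDB::Tie calls Storable internally is not shown here.

```perl
use strict;
use warnings;
use Storable qw( freeze thaw ) ;

## Primate is the class name from Example 1; the fields are invented.
my $samson = bless { name => 'Samson', diet => 'fruit' }, 'Primate' ;

my $stored = freeze( $samson ) ;   ## byte string usable as a database value
my $copy   = thaw( $stored ) ;     ## reconstructed object, blessed as before

printf "%s is a %s\n", $copy->{name}, ref $copy ;
```

The thawed value is a separate copy of the original object, not an alias to it.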
Example 1 shows a simple application that illustrates both of these features. The constructor call contains the minimum arguments to identify the environment and the database.
These few lines of code are sufficient to add persistent object support to an application.
Example 2
One of BerkeleyDB's most appealing features is support for duplicate keys. This feature enables a programmer to use persistent arrays, where elements can be accessed, added, and deleted without marshalling.
Example 2 uses the scalars constructor which disables the automatic serialization of record access. Otherwise, if the new constructor is used, scalars will be returned as scalar references, regardless of how they are stored.
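One plausible reason for the scalar reference behavior, assuming values pass through Storable, is that Storable serializes references rather than plain scalars. The sketch below shows the round trip using Storable alone; it does not call BerkeleyDB::Tie.

```perl
use strict;
use warnings;
use Storable qw( freeze thaw ) ;

my $name = 'Samson' ;

## Storable serializes references, so a plain scalar is wrapped first.
my $stored = freeze( \$name ) ;
my $value  = thaw( $stored ) ;

printf "%s\n", ref $value ;   ## SCALAR
printf "%s\n", $$value ;      ## Samson
```

A caller has to dereference the result to recover the string, which is the inconvenience the scalars constructor avoids.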
&duplicatekeys is a subroutine that returns a pair of constants as a shortcut. The constants are defined in the BerkeleyDB module.
The recordset method returns a stored list from the database. This method is available to both BerkeleyDB::Tie::Btree and BerkeleyDB::Tie::Hash classes.
The delete method is used to delete an element from the list. Since BerkeleyDB::Tie adheres to the Tie interface, the delete keyword can normally be used to remove stored objects. On databases with duplicate keys, use the delete method instead to avoid indeterminate results.
BerkeleyDB returns the status of a delete operation. This feature can be used to delete an entire list using the following idiom:
while ( ! delete $types->{primate} ) {}
A BerkeleyDB database configured for duplicate keys also allows duplicate key/value pairs. For most one-to-many data sets, key/value pairs should be unique. There are several ways to handle this issue, but none of them is currently implemented.
Commonly, the workaround is to import a retrieved list into a hash structure:
%unique = map { $_ => 1 } $types->recordset('primate') ;
keys %unique ;
However, care should be taken when deleting elements. The delete method for duplicate keys should almost always be invoked using an idiom similar to the one above:
while ( ! $types->delete( primate => 'Samson' ) ) {}
Another source of problems occurs when using the delete method on databases containing objects. In this case, the second argument may refer to an object that does not exactly match the stored value. The following code illustrates this difficulty:
my $cats = new BerkeleyDB::Tie::Btree(
home => 'zoo',
filename => 'cats',
&duplicatekeys,
) ;
my $Felix = new BigCat( dinner => 'antelope' ) ;
$cats->{lion} = $Felix ;
$Felix->{dinner} = 'gazelle' ;
$cats->delete( lion => $Felix ) ; ## fails
This problem also occurs because the results of the marshalling operation differ depending on whether numbers are interpreted as integers, floats, or strings. Thus an object's value may change merely as a result of its context. The following example illustrates the situation:
$weight = '300 lbs.' ;
$weight =~ s/\D//g ;
my $Felix = new BigCat( weight => $weight ) ; ## member as string
$cats->{lion} = $Felix ;
$cats->delete( lion => $Felix ) ## member as integer
if $Felix->{weight} > 200 ; ## fails
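Both difficulties can be reproduced with Storable alone, assuming that delete compares serialized values. The sketch below compares frozen byte strings directly; plain hashes stand in for BigCat objects. Whether merely using a string in a numeric comparison changes its serialized form may depend on the Storable version, so the second case uses an explicit string and an explicit number.

```perl
use strict;
use warnings;
use Storable qw( freeze ) ;

## Case 1: the object changes after it is stored.
my $felix  = { dinner => 'antelope' } ;
my $stored = freeze( $felix ) ;
$felix->{dinner} = 'gazelle' ;
my $changed = freeze( $felix ) ne $stored ;

## Case 2: the same quantity held as a string and as a number.
my $as_string = freeze( { weight => '300' } ) ;
my $as_number = freeze( { weight => 300 } ) ;
my $differs   = $as_string ne $as_number ;

printf "changed after edit: %s\n", $changed ? 'yes' : 'no' ;
printf "string vs number:   %s\n", $differs ? 'yes' : 'no' ;
```

In either case the byte string offered for deletion no longer equals the stored one, so an exact-match delete has nothing to remove.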
Example 3
Example 3 shows a few additional features helpful to developers accustomed to relational databases. These features take advantage of the Btree database capabilities, and are not available to BerkeleyDB::Tie::Hash objects.
The nextrecord method of BerkeleyDB::Tie::Btree returns a new unique key. Each nextrecord call creates a new blank record to avoid race conditions, and returns the new key. This method creates a key by adding 1 to the last record. In order to ensure that the last record contains the highest valued key, use the &incrementkeys argument to the BerkeleyDB::Tie::Btree constructor. The &incrementkeys function is a shortcut that returns a CODE constant that forces numerical Btree sorting.
There is a significant disadvantage to databases created using the &incrementkeys argument. The resulting databases are incompatible with Sleepycat utilities such as db_dump and db_verify. As an alternative, nextrecord can be called as a method from the BerkeleyDB::Tie::Btree::Lexical subclass. This subclass functions identically, but the numerical keys are stored as zero-padded strings. Therefore, a restriction on Lexical subclass databases is that keys must be numerically less than 10,000,000,000.
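The reason zero padding works can be shown in a few lines of plain Perl. The 10-digit width is an assumption inferred from the stated limit of 10,000,000,000, and the derivation of $next only imitates the "add 1 to the last record" rule; it is not the module's code.

```perl
use strict;
use warnings;

my @serials = ( 9, 10, 100, 2 ) ;

## Plain string comparison misorders numeric keys.
my @lexical = sort { $a cmp $b } @serials ;

## Zero padding to 10 digits (an assumed width) makes string order
## match numeric order.
my @padded = sort { $a cmp $b } map { sprintf '%010d', $_ } @serials ;

## A next key can then be derived from the last (highest) record.
my $next = sprintf '%010d', $padded[-1] + 1 ;

print "@lexical\n" ;   ## 10 100 2 9
print "@padded\n" ;    ## 0000000002 0000000009 0000000010 0000000100
print "$next\n" ;      ## 0000000101
```

Because the padded keys are ordinary strings in default sort order, no custom comparison function is required.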
The lexical constructor to the BerkeleyDB::Tie::Btree class is synonymous with the new constructor to the BerkeleyDB::Tie::Btree::Lexical subclass.
BerkeleyDB::Tie also implements another nice BerkeleyDB feature: partial string matching. The methods matchingkeys, matchingvalues, and searchset all return a set of records whose keys begin with a common substring.
For example, if keys are defined with the following format: "2002 Jul 14 15:30", the following data can be returned:
## All records for the year
@annually = $bytime->matchingkeys('2002 ') ;
## All records for the month
@monthly = $bytime->matchingvalues('2002 Jul ') ;
## All records for the day
%daily = $bytime->searchset('2002 Jul 14 ') ;
matchingkeys returns an array of the matching records' keys. matchingvalues returns an array of the matching records' values. The method name matchingvalues may cause confusion: the records are selected by matching keys, but their values are returned.
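The distinction between the two methods can be imitated in plain Perl over an in-memory list. This is only a sketch of the semantics: the records, the serial numbers, and the matching helper are hypothetical, and no database is involved.

```perl
use strict;
use warnings;

## Sample records as [ key, value ] pairs; duplicate keys are allowed.
my @records = (
    [ '2002 Jul 14 15:30', 1001 ],
    [ '2002 Jul 14 15:30', 1002 ],
    [ '2002 Jul 15 09:00', 1003 ],
    [ '2002 Aug 01 12:00', 1004 ],
) ;

## Hypothetical helper: records whose key starts with $prefix.
sub matching {
    my ( $prefix ) = @_ ;
    return grep { index( $_->[0], $prefix ) == 0 } @records ;
}

## Both lists are selected by key; only the returned field differs.
my @keys   = map { $_->[0] } matching( '2002 Jul 14 ' ) ;
my @values = map { $_->[1] } matching( '2002 Jul ' ) ;

print join( ', ', @keys ), "\n" ;
print join( ' ', @values ), "\n" ;
```

The second call returns 1001, 1002, and 1003: values of records whose keys begin with "2002 Jul ", not records whose values match anything.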
searchset returns the matching records as key/value pairs that can populate an associative array as shown. However, using an associative array is pointless if the database contains duplicate keys. The following code is an effective technique for capturing the results of this type of search:
foreach ( $bytime->matchingkeys( '2002 Jul 14', &uniquekeys ) ) {
$daily{ $_ } = [ $bytime->recordset( $_ ) ] ;
}
&uniquekeys returns a constant that is used primarily as an argument to the matchingkeys method to filter duplicate results from the database. When this argument is passed to the searchset method, the values in the returned key/value pairs are record counts. &uniquekeys cannot be used with the matchingvalues method.
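A plain Perl analogue of the filtered results, assuming a list of matched keys that contains duplicates, is a hash of key counts. The sample keys are invented, and this does not call the module.

```perl
use strict;
use warnings;

## Keys as they might be returned without filtering (sample data).
my @matched = (
    '2002 Jul 14 15:30',
    '2002 Jul 14 15:30',
    '2002 Jul 14 16:05',
) ;

## Key => record count, comparable to searchset with &uniquekeys.
my %count ;
$count{ $_ }++ foreach @matched ;

## Distinct keys, comparable to matchingkeys with &uniquekeys.
my @unique = sort keys %count ;

printf "%s: %d\n", $_, $count{ $_ } foreach @unique ;
```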
EXPORT
&duplicatekeys &incrementkeys &uniquepairs &uniquekeys
AUTHOR
Jim Schueler, <jschueler@tqis.com>
SEE ALSO
Storable, BerkeleyDB, http://www.sleepycat.com