NAME

BerkeleyDB::Manager - General purpose BerkeleyDB wrapper

SYNOPSIS

use BerkeleyDB::Manager;

my $m = BerkeleyDB::Manager->new(
	home => Path::Class::Dir->new( ... ), # if you want to use rel paths
	db_class => "BerkeleyDB::Hash", # the default class for new DBs
);

my $db = $m->open_db( file => "foo" ); # defaults

$m->txn_do(sub {
	$db->db_put("foo", "bar");
	die "error!"; # rolls back
});

# fetch all key/value pairs as a Data::Stream::Bulk
my $pairs = $m->cursor_stream( db => $db );

DESCRIPTION

This object provides a convenience wrapper for BerkeleyDB.

ATTRIBUTES

home

The path to pass as -Home to BerkeleyDB::Env->new.

If provided, the file arguments to open_db should be relative paths.

If not provided, BerkeleyDB will use the current working directory for transaction journals, etc.

create

Whether DB_CREATE is passed to Env or instantiate_db by default. Defaults to false.

If create is specified and an alternate log, data or tmp dir is set, a DB_CONFIG configuration file with those parameters will be written, allowing standard Berkeley DB tools to work with the environment home directory.

An existing DB_CONFIG file will not be overwritten, nor will one be written in the current directory if home is not specified.

lock

Whether DB_INIT_LOCK is passed. Defaults to true.

Can be set to false if ALL concurrent instances are readonly.

deadlock_detection

Whether or not lock detection is set. The default is true.

lk_detect

The type of lock detection to use if deadlock_detection is set. Defaults to DB_LOCK_DEFAULT. Additional possible values are DB_LOCK_MAXLOCKS, DB_LOCK_MINLOCKS, DB_LOCK_MINWRITE, DB_LOCK_OLDEST, DB_LOCK_RANDOM, and DB_LOCK_YOUNGEST. See set_lk_detect in the Berkeley DB reference guide.

readonly

Whether DB_RDONLY is passed in the flags. Defaults to false.

transactions

Whether or not to enable transactions.

Defaults to true.

autocommit

Whether or not a top level transaction is automatically created by BerkeleyDB. Defaults to true.

If you turn this off, note that all database handles must be opened inside a transaction, unless transactions are disabled.

auto_checkpoint

When true, txn_checkpoint will be called with checkpoint_kbyte and checkpoint_min every time a top level transaction is committed.

Defaults to true.

checkpoint_kbyte

Passed to txn_checkpoint. txn_checkpoint will write a checkpoint if that many kilobytes of data have been written since the last checkpoint.

Defaults to 20 megabytes. If transactions are committed quickly this value should avoid checkpoints being made too often.

checkpoint_min

Passed to txn_checkpoint. txn_checkpoint will write a checkpoint if the last checkpoint was more than this many minutes ago.

Defaults to 1 minute. If transactions are not committed very often this parameter should balance the large-ish default value for checkpoint_kbyte.
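A minimal sketch of setting the three checkpoint attributes together. The temporary home directory and the specific values (5 MB, 2 minutes) are illustrative assumptions, not recommendations.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Path::Class::Dir;
use BerkeleyDB::Manager;

# Throwaway environment home (assumption); a real application would use a fixed path.
my $home = tempdir( CLEANUP => 1 );

my $m = BerkeleyDB::Manager->new(
	home             => Path::Class::Dir->new($home),
	create           => 1,
	auto_checkpoint  => 1,          # call txn_checkpoint after each top level commit
	checkpoint_kbyte => 5 * 1024,   # value is in kilobytes: 5 MB of written data
	checkpoint_min   => 2,          # or 2 minutes since the last checkpoint
);
```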

recover

If true, DB_REGISTER and DB_RECOVER are enabled in the flags to the env.

This will enable automatic recovery in case of a crash.

See also the db_recover utility, and file:///usr/local/BerkeleyDB/docs/gsg_txn/C/architectrecovery.html#multiprocessrecovery

multiversion

Enables multiversioning concurrency.

See http://www.oracle.com/technology/documentation/berkeley-db/db/gsg_txn/C/isolation.html#snapshot_isolation

snapshot

Whether or not DB_TXN_SNAPSHOT should be passed to txn_begin.

If multiversion is not true, this is a no-op.

Defaults to true.

Using DB_TXN_SNAPSHOT causes copy-on-write multiversion concurrency to be used instead of locking concurrency.

This can improve read responsiveness for applications with long running transactions, by allowing a page to be read even if it is being written to in another transaction since the writer is modifying its own copy of the page.

This is an alternative to enabling reading of uncommitted data, and provides the same read performance while maintaining snapshot isolation, at the cost of more memory.

read_uncomitted

Enables uncommitted reads.

This breaks the I in ACID, since transactions are no longer isolated.

A better approach to increase read performance when there are long running writing transactions is to enable multiversioning.
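The multiversioning alternative recommended above could be configured roughly as follows. This is only a sketch: the temporary home directory is an assumption, and snapshot is spelled out even though it is documented as defaulting to true.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Path::Class::Dir;
use BerkeleyDB::Manager;

# Throwaway environment home (assumption).
my $home = tempdir( CLEANUP => 1 );

my $m = BerkeleyDB::Manager->new(
	home         => Path::Class::Dir->new($home),
	create       => 1,
	multiversion => 1,   # enable multiversion concurrency
	snapshot     => 1,   # pass DB_TXN_SNAPSHOT to txn_begin
	# read_uncomitted is deliberately not enabled, so isolation is kept
);
```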

log_auto_remove

Enables automatic removal of logs.

Normally logs should be removed after being backed up, but if you are not interested in having full snapshot backups for catastrophic recovery scenarios, you can enable this.

See http://www.oracle.com/technology/documentation/berkeley-db/db/ref/transapp/logfile.html.

Defaults to false.

sync

Enables syncing of BDB log writing.

Defaults to true.

If disabled, transaction writing will not be synced. This means that in the event of a crash some successfully committed transactions might still be rolled back during recovery, but the database will still be intact and atomicity is still guaranteed.

This is useful for bulk imports as it can significantly increase performance of smaller transactions.
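A hedged sketch of a bulk import made of many small transactions with sync disabled. The file name import.db and the %records data are hypothetical, and the home directory is a throwaway temporary directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Path::Class::Dir;
use BerkeleyDB::Manager;

# Throwaway environment home (assumption).
my $home = tempdir( CLEANUP => 1 );

my $m = BerkeleyDB::Manager->new(
	home   => Path::Class::Dir->new($home),
	create => 1,
	sync   => 0,   # do not sync log writes on every commit
);

my $db = $m->open_db( file => "import.db" );

# Hypothetical input; each record is written in its own small transaction.
my %records = ( alice => "1", bob => "2", carol => "3" );

for my $key ( sort keys %records ) {
	$m->txn_do(sub {
		$db->db_put( $key, $records{$key} );
	});
}
```

Once the import is finished, a manager created with the default sync setting can be used for normal operation.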

dup

Enables DB_DUP in -Properties, allowing duplicate keys in the db.

Defaults to false.

dupsort

Enables DB_DUPSORT in -Properties.

Defaults to false.

db_class

The default class to use when instantiating new DB objects. Defaults to BerkeleyDB::Btree.

env_flags

Flags to pass to the env. Overrides transactions, create and recover.

db_flags

Flags to pass to instantiate_db. Overrides create and autocommit.

db_properties

Properties to pass to instantiate_db. Overrides dup and dupsort.

open_dbs

The hash of currently open dbs.

chunk_size

See cursor_stream.

Defaults to 500.

METHODS

open_db %args

Fetch a database handle, opening it as necessary.

If name is provided, it is used as the key in open_dbs. Otherwise the file argument from %args is used as the key.

Calls instantiate_db.

close_db $name

Close the DB with the key $name.

get_db $name

Fetch the db specified by $name if it is already open.
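The handle registry described for open_db, close_db and get_db can be exercised along these lines. This is a sketch only; the name users and the file users.db are hypothetical, and the home directory is temporary.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Path::Class::Dir;
use BerkeleyDB::Manager;

# Throwaway environment home (assumption).
my $home = tempdir( CLEANUP => 1 );

my $m = BerkeleyDB::Manager->new(
	home   => Path::Class::Dir->new($home),
	create => 1,
);

# Opens users.db (relative to home) and registers the handle under "users".
my $db = $m->open_db( name => "users", file => "users.db" );

# Looking the name up again yields the handle that is already open.
my $again = $m->get_db("users");

# After closing, the name should no longer refer to an open handle.
$m->close_db("users");
```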

register_db $name, $handle

Registers the DB as open.

instantiate_db %args

Instantiates a new database handle.

file is a required argument.

If class is not provided, the db_class attribute is used instead.

If txn is not provided and the env has transactions enabled, the current transaction, if any, is used. See txn_do.

flags and properties can be overridden manually. If they are not provided build_db_flags and build_db_properties will be used.

instantiate_hash
instantiate_btree

Convenience wrappers for instantiate_db that set class.

build_db_properties %args

Merges argument options into a flag integer.

Default arguments are taken from the dup and dupsort attrs.

build_db_flags %args

Merges argument options into a flag integer.

Default arguments are taken from the autocommit and create attrs.
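The idea of merging options into a flag integer can be illustrated without BerkeleyDB at all. In this sketch the constants, the %defaults hash and sketch_build_flags are all hypothetical stand-ins; real code would use DB_CREATE and DB_AUTO_COMMIT from BerkeleyDB and the object's attributes.

```perl
use strict;
use warnings;

# Hypothetical stand-in values, not the real Berkeley DB constants.
use constant {
	MY_DB_CREATE      => 0x01,
	MY_DB_AUTO_COMMIT => 0x02,
};

# Plays the role of the create and autocommit attributes.
my %defaults = ( create => 0, autocommit => 1 );

# Per-call options override the defaults; enabled options are ORed together.
sub sketch_build_flags {
	my (%args) = @_;
	my %opts = ( %defaults, %args );

	my $flags = 0;
	$flags |= MY_DB_CREATE      if $opts{create};
	$flags |= MY_DB_AUTO_COMMIT if $opts{autocommit};
	return $flags;
}

my $default_flags = sketch_build_flags();                # autocommit only
my $create_flags  = sketch_build_flags( create => 1 );   # create and autocommit
```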

txn_do sub { }

Executes the subroutine in an eval block. Calls txn_commit if the transaction was successful, or txn_rollback if it wasn't.

Transactions are kept on a stack internally.
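The eval, commit or rollback, and stack behaviour can be sketched in plain Perl. Nothing here touches BerkeleyDB: sketch_txn_do, @txn_stack and @log are hypothetical names, and rethrowing the error after a rollback is an assumption of this sketch.

```perl
use strict;
use warnings;

my @txn_stack;   # innermost transaction is last
my @log;         # records what happened, for illustration only

sub sketch_txn_do {
	my ($code) = @_;

	# "begin": push a new transaction onto the stack
	push @txn_stack, { id => scalar(@txn_stack) + 1 };

	my $ok  = eval { $code->(); 1 };
	my $err = $@;

	my $txn = pop @txn_stack;
	if ($ok) {
		push @log, "commit $txn->{id}";
	}
	else {
		push @log, "rollback $txn->{id}";
		die $err;   # propagate the failure to the caller
	}
	return;
}

# The nested transaction fails and is rolled back; the outer body
# handles the error itself, so the outer transaction still commits.
sketch_txn_do(sub {
	eval { sketch_txn_do(sub { die "error!\n" }) };
});
```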

txn_begin

Begin a new transaction.

The new transaction is set as the active transaction for all registered database handles.

If multiversion is enabled, DB_TXN_SNAPSHOT is passed in as well.

txn_commit

Commit the current transaction.

Will die on error.

txn_rollback

Roll back the current transaction.

txn_checkpoint

Calls txn_checkpoint on env with checkpoint_kbyte and checkpoint_min.

This is called automatically by txn_commit if auto_checkpoint is set.

associate %args

Associate secondary with primary, using callback to extract keys.

callback is invoked with the primary DB key and the value on every update to primary, and is expected to return a key (or with recent BerkeleyDB also an array reference of keys) with which to create indexed entries.

Fetching on secondary with a secondary key returns the value from primary.

Fetching with pb_get will also return the primary key.

See the BDB documentation for more details.
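A rough sketch of a secondary index keyed by email address. The file names, the record format and the regular expression are hypothetical, and the argument names primary, secondary and callback are taken from the description above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Path::Class::Dir;
use BerkeleyDB::Manager;

# Throwaway environment home (assumption).
my $home = tempdir( CLEANUP => 1 );

my $m = BerkeleyDB::Manager->new(
	home   => Path::Class::Dir->new($home),
	create => 1,
);

my $primary   = $m->open_db( file => "users.db" );
my $secondary = $m->open_db( file => "users_by_email.db" );

$m->associate(
	primary   => $primary,
	secondary => $secondary,
	callback  => sub {
		my ( $id, $value ) = @_;

		# Extract the secondary key from the primary value.
		my ($email) = $value =~ /email=(\S+)/;
		return $email;
	},
);

my $found;
$m->txn_do(sub {
	$primary->db_put( "user1", 'name=Ann email=ann@example.org' );

	# Fetching on the secondary returns the value stored in the primary.
	$secondary->db_get( 'ann@example.org', $found );
});
```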

all_open_dbs

Returns a list of all the registered databases.

cursor_stream %args

Fetches data from a cursor, returning a Data::Stream::Bulk.

If cursor is not provided but db is, a new cursor will be created.

If callback is provided it will be invoked on the cursor with an accumulator array repeatedly until it returns a false value. For example, to extract triplets from a secondary index, you can use this callback:

my ( $sk, $pk, $v ) = ( '', '', '' ); # to avoid uninitialized warnings from BDB

$m->cursor_stream(
	db => $db,
	callback => sub {
		my ( $cursor, $accumulator ) = @_;

		if ( $cursor->c_pget( $sk, $pk, $v, DB_NEXT ) == 0 ) {
			push @$accumulator, [ $sk, $pk, $v ];
			return 1;
		}

		return; # nothing left
	}
);

If callback is not provided, c_get will be used, returning [ $key, $value ] for each cursor position. flag can be passed, and defaults to DB_NEXT.

chunk_size controls the number of pairs returned in each chunk. If it isn't provided the attribute chunk_size is used instead.

If values or keys is set to a true value then only values or keys will be returned. These two arguments are mutually exclusive.

Lastly, init is an optional callback that is invoked once before each chunk, that can be used to set up the database. The return value is retained until the chunk is finished, so this callback can return a Scope::Guard to perform cleanup.
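A sketch of consuming the returned stream chunk by chunk, assuming the default [ $key, $value ] pairs. The file name letters.db and the sample data are hypothetical; next is the Data::Stream::Bulk method that returns one chunk as an array reference, or undef once the stream is exhausted.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Path::Class::Dir;
use BerkeleyDB::Manager;

# Throwaway environment home (assumption).
my $home = tempdir( CLEANUP => 1 );

my $m = BerkeleyDB::Manager->new(
	home   => Path::Class::Dir->new($home),
	create => 1,
);

my $db = $m->open_db( file => "letters.db" );

my @pairs;
$m->txn_do(sub {
	$db->db_put( $_, uc $_ ) for qw(a b c d e);

	# Ask for small chunks so that several are needed to read everything.
	my $stream = $m->cursor_stream( db => $db, chunk_size => 2 );

	while ( my $chunk = $stream->next ) {
		push @pairs, @$chunk;   # each element is [ $key, $value ]
	}
});
```

The stream is drained inside the transaction here so that the cursor is not used after the transaction has ended.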

dup_cursor_stream %args

A specialization of cursor_stream for fetching duplicate key entries.

Takes the same arguments as cursor_stream, but adds a few more.

key can be passed in to initialize the cursor with DB_SET.

To do manual initialization callback_first can be provided instead.

callback is generated to use DB_NEXT_DUP instead of DB_NEXT, and flag is ignored.

VERSION CONTROL

http://github.com/nothingmuch/berkeleydb-manager

AUTHOR

Yuval Kogman <nothingmuch@woobling.org>

COPYRIGHT

Copyright (c) 2008 Yuval Kogman. All rights reserved
This program is free software; you can redistribute
it and/or modify it under the same terms as Perl itself.