NAME
Genezzo::PushHash::PushHash.pm - an impure virtual class module that defines a "push hash", a hash that generates its own unique key for each value. Values are "pushed" into the hash, similar to pushing into an array.
SYNOPSIS
use Genezzo::PushHash::PushHash;
my %tied_hash = ();
my $tie_val =
tie %tied_hash, 'Genezzo::PushHash::PushHash';
my $newkey = $tie_val->HPush("this is a test");
$tied_hash{$newkey} = "update this entry";
my $getcount = $tie_val->HCount();
DESCRIPTION
While standard Perl hashes are a form of associative array, where the user supplies each key/value pair, a PushHash is more like a multiset which generates its own unique key for each element. The preferred usage is to call the HPush method, which returns the new key, but you can also assign to the PUSH "pseudo key" to generate a new key, e.g.:
$tied_hash{PUSH} = "new value";
Note that the result of the underlying STORE only returns the pushed value, not the new key. Also, expressions like:
my $pushval = $tied_hash{PUSH} = "new value";
can be problematic, since the tie may try to return a FETCH of key "PUSH", which will not work.
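The difference between HPush and the PUSH pseudo key can be sketched with a small self-contained tied class. Demo::PushHash below is a hypothetical toy written for illustration, not the Genezzo implementation; its integer key generator and its rule of silently ignoring unknown user keys are assumptions.

```perl
use strict;
use warnings;

# Hypothetical toy class, NOT the Genezzo implementation.
package Demo::PushHash;

sub TIEHASH {
    my ($class) = @_;
    return bless { data => {}, next => 1 }, $class;
}

# Append a single value and return its generated key.
sub HPush {
    my ($self, $value) = @_;
    my $key = $self->{next}++;
    $self->{data}{$key} = $value;
    return $key;
}

sub STORE {
    my ($self, $key, $value) = @_;
    if ($key eq 'PUSH') {
        # Pseudo key: a new key is generated, but the caller never sees it.
        $self->HPush($value);
        return $value;
    }
    # Assumption: other keys may only update entries that already exist.
    return unless exists $self->{data}{$key};
    return $self->{data}{$key} = $value;
}

sub FETCH  { my ($self, $key) = @_; return $self->{data}{$key} }
sub EXISTS { my ($self, $key) = @_; return exists $self->{data}{$key} }

package main;

my %tied_hash;
my $tie_val = tie %tied_hash, 'Demo::PushHash';

my $key1 = $tie_val->HPush("first");     # generated key is returned directly
$tied_hash{PUSH} = "second";             # generated key is not returned
$tied_hash{$key1} = "first, updated";    # update through the generated key

print "$key1 => $tied_hash{$key1}\n";
print "PUSH stored as a real key? ",
    (exists $tied_hash{PUSH} ? "yes" : "no"), "\n";
```

In this sketch the value pushed through the pseudo key can only be found again by iterating or by knowing how keys are generated, which is why HPush is the preferred interface.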
WHY
Push hashes can be used to restructure code based upon references to anonymous hashes or arrays in order to facilitate the persistent storage of data structures. Also, they are useful for implementing shared data structures or data structures with transactional update semantics, where you would want concurrent access and quick unique key generation. In addition, they can be used to create data structures larger than main memory, or handle cases where multiple keys or key traversal mechanisms get mapped to the same data. In other words, they are similar to SQL database ties or tied DB hashes, but potentially more flexible and extensible.
FUNCTIONS
PushHashes support all standard hash operations, with the exception that you cannot create or insert a user key -- you must push new entries and use the generated key or basic iteration to retrieve your data. They also support two additional methods, HPush and HCount. Note that these methods are associated with the tie value (i.e. the blessed ref for the PushHash class), not the tied hash.
- HPush
HPush returns the new key for each pushed value, or an undef if the append fails. It only accepts a single argument, not a list.
my $newkey = $tie_val->HPush("this is a test");
Note that there is not a corresponding "pop" operation, since the generic PushHash does not define an ordering on the contents of the hash.
- HCount
Returns the count of items in the hash -- equivalent to an array FETCHSIZE, i.e. scalar(@array).
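As a rough illustration of pushing entries and then retrieving them by iteration, the sketch below uses a hypothetical toy class, Demo::IterPushHash, which is not the Genezzo code. The integer keys and the choice to ignore stores to unknown keys are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical toy class, NOT the Genezzo implementation.
package Demo::IterPushHash;

sub TIEHASH {
    my ($class) = @_;
    return bless { data => {}, next => 1 }, $class;
}

# Append a single value and return its generated key.
sub HPush {
    my ($self, $value) = @_;
    my $key = $self->{next}++;
    $self->{data}{$key} = $value;
    return $key;
}

# Cheap here because the toy keeps everything in one in-memory hash.
sub HCount { my ($self) = @_; return scalar keys %{ $self->{data} } }

sub FETCH { my ($self, $key) = @_; return $self->{data}{$key} }

sub STORE {
    my ($self, $key, $value) = @_;
    # Assumption: user-supplied keys that were never generated are ignored.
    return unless exists $self->{data}{$key};
    return $self->{data}{$key} = $value;
}

sub FIRSTKEY {
    my ($self) = @_;
    my $reset = keys %{ $self->{data} };    # reset the internal iterator
    return each %{ $self->{data} };
}

sub NEXTKEY { my ($self) = @_; return each %{ $self->{data} } }

package main;

my %h;
my $tie_val = tie %h, 'Demo::IterPushHash';

$tie_val->HPush($_) for qw(red green blue);
$h{mykey} = "ignored";                      # not a generated key, so no insert

my @values = map { $h{$_} } sort keys %h;   # retrieve by basic iteration
my $count  = $tie_val->HCount();            # method of the tie value

print "count: $count\n";
print "values: @values\n";
```

Both HPush and HCount are called on $tie_val, the object returned by tie, rather than on %h itself.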
Why use a distinctive HPush function versus an array-like PUSH?
HPush is designed to support quick appends of a single value to a push-hash and return the new key, or return an undef if the push fails. The basic perl "push" appends a LIST and returns the new number of elements in the array.
- Ease of obtaining new key value
HPush returns the new key in a single operation, while push returns the size of the array. Unlike an array, the pushhash implementation does not have to generate keys that are simple ascending integers, so returning the number of elements in a hash would require extra operations to obtain the new key. The classic "push" works well for arrays since the number of elements in an array is essentially the offset of the new key.
- Ease of failure detection
If HPush fails, it returns an undef. Push requires an extra calculation to compare the returned count with the previous fetchsize to see if the push succeeded.
- Efficiency
For a standard push, you should be able to determine if it fails by checking the array size before and after the push. However, for many hash implementations, counting all the elements in the data structure may be very expensive. One example is a disk-based persistent hash, where the count may require reading a file to count the entries. For large or complex data structures, returning the local information that an append failed should be much cheaper than calculating the number of valid entries twice.
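The failure-detection argument can be sketched as follows. Demo::BoundedPushHash is a hypothetical toy class, not part of Genezzo; the capacity limit exists only to make a push fail, and the "k"-prefixed keys are an assumption chosen to show that generated keys need not be array offsets.

```perl
use strict;
use warnings;

# Hypothetical toy class, NOT the Genezzo implementation.
package Demo::BoundedPushHash;

sub TIEHASH {
    my ($class, $capacity) = @_;
    return bless { data => {}, next => 1, capacity => $capacity }, $class;
}

# Append a single value; return the new key, or undef if the hash is full.
sub HPush {
    my ($self, $value) = @_;
    return if scalar keys %{ $self->{data} } >= $self->{capacity};
    my $key = "k" . $self->{next}++;
    $self->{data}{$key} = $value;
    return $key;
}

sub FETCH { my ($self, $key) = @_; return $self->{data}{$key} }

package main;

my %h;
my $tie_val = tie %h, 'Demo::BoundedPushHash', 2;   # room for two entries

my @results;
for my $value (qw(alpha beta gamma)) {
    my $key = $tie_val->HPush($value);
    # One defined() test replaces two size counts around the push.
    push @results, defined $key
        ? "stored $value as $key"
        : "push of $value failed";
}

print "$_\n" for @results;
```

The caller learns both the outcome and the new key from a single return value, without counting the entries before and after.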
Why use a distinctive HCount function versus an array-like FETCHSIZE?
The distinction is subtle. An array FETCHSIZE is a cheap operation that just returns the number of elements in the array. HCount is a potentially expensive operation that returns the number of valid data elements in the pushhash. For the example of the disk-based persistent hash, the HCount could involve reading multiple files on disk and special operations to distinguish between valid and deleted data.
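A minimal sketch of why such a count can be costly, assuming a store that marks deleted entries instead of reclaiming them. Demo::TombstonePushHash is a hypothetical in-memory stand-in for a disk-based structure, not the Genezzo implementation.

```perl
use strict;
use warnings;

# Hypothetical toy class, NOT the Genezzo implementation.
package Demo::TombstonePushHash;

sub TIEHASH { my ($class) = @_; return bless { slots => [] }, $class }

# Append a single value; the generated key is the slot index.
sub HPush {
    my ($self, $value) = @_;
    push @{ $self->{slots} }, { value => $value, deleted => 0 };
    return $#{ $self->{slots} };
}

sub DELETE {
    my ($self, $key) = @_;
    my $slot = $self->{slots}[$key];
    return unless $slot && !$slot->{deleted};
    $slot->{deleted} = 1;        # mark as deleted; the slot stays allocated
    return $slot->{value};
}

# Every slot is inspected so that deleted entries can be skipped.
sub HCount {
    my ($self) = @_;
    return scalar grep { !$_->{deleted} } @{ $self->{slots} };
}

package main;

my %h;
my $tie_val = tie %h, 'Demo::TombstonePushHash';

my @keys = map { $tie_val->HPush($_) } qw(a b c d);
delete $h{ $keys[1] };                            # remove "b"

my $allocated = scalar @{ $tie_val->{slots} };    # cheap, like FETCHSIZE
my $valid     = $tie_val->HCount();               # full scan of all slots

print "allocated slots: $allocated, valid entries: $valid\n";
```

The allocated size is available immediately, while the number of valid entries needs a pass over all slots. On disk that pass could mean reading every file that holds entries.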
EXPORT
RADIXPOINT - by default ".". A separator for a multipart key.
AUTHOR
Jeffrey I. Cohen, jcohen@genezzo.com
SEE ALSO
perl(1).
Copyright (c) 2003, 2004 Jeffrey I Cohen. All rights reserved.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
Address bug reports and comments to: jcohen@genezzo.com
For more information, please visit the Genezzo homepage at http://www.genezzo.com