NAME

Genezzo::PushHash::HPHRowBlk.pm - a 90% pure virtual class module that extends the hierarchical "push hash" Genezzo::PushHash::hph with Row/Block methods. These methods facilitate the construction of classes that manipulate data blocks directly, such as index access methods and functions that split rows over multiple blocks.

SYNOPSIS

use Genezzo::PushHash::HPHRowBlk;
use Genezzo::PushHash::hph;

# need more info here!!

DESCRIPTION

Like a standard hierarchical pushhash (hph), an HPHRowBlk is a pushhash built upon a collection of other pushhashes. A push into the top-level hash is routed into one of the bottom hashes. If the bottom hashes are full (the push fails), the top-level pushhash uses the factory method to create or obtain a new pushhash. The HPHRowBlk class is designed to layer on top of hphs built on hash-tied byte block storage, such as Genezzo::Row::RSBlock.
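The routing behavior described above can be illustrated with a small, self-contained sketch. It is a toy model only: the ToyChunk and ToyTop packages, the push_row method, and the "chunk/slot" key format are hypothetical names invented for this example and are not part of the Genezzo::PushHash API.

```perl
use strict;
use warnings;

# Toy bottom-level pushhash that holds at most $capacity values.
# Hypothetical stand-in; not the Genezzo::PushHash interface.
package ToyChunk;

sub new {
    my ($class, $capacity) = @_;
    return bless { capacity => $capacity, rows => [] }, $class;
}

# Returns the slot index on success, or undef when the chunk is full.
sub push_row {
    my ($self, $value) = @_;
    return undef if @{ $self->{rows} } >= $self->{capacity};
    push @{ $self->{rows} }, $value;
    return $#{ $self->{rows} };
}

# Toy top-level pushhash built on a list of chunks.
package ToyTop;

sub new {
    my ($class, $factory) = @_;
    return bless { factory => $factory, chunks => [] }, $class;
}

# Route the push into the current (last) chunk. If that push fails,
# ask the factory for a new chunk and retry once.
sub push_row {
    my ($self, $value) = @_;
    my $chunks = $self->{chunks};
    push @$chunks, $self->{factory}->() unless @$chunks;
    my $slot = $chunks->[-1]->push_row($value);
    unless (defined $slot) {
        push @$chunks, $self->{factory}->();
        $slot = $chunks->[-1]->push_row($value);
    }
    return undef unless defined $slot;
    return join '/', $#$chunks, $slot;    # hypothetical "chunk/slot" key
}

package main;

# Each chunk holds two rows, so five pushes need three chunks.
my $top  = ToyTop->new(sub { ToyChunk->new(2) });
my @keys = map { $top->push_row($_) } qw(a b c d e);
print "@keys\n";    # prints "0/0 0/1 1/0 1/1 2/0"
```

The caller only ever pushes into the top-level object; chunk creation happens behind the factory callback, which is the property the hph design relies on.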

CONCEPTS and INTERNALS - useful for implementors

An hph is constructed of N pushhash "chunks", and the elements of each chunk are referred to as "slices". Typically, one chunk is "current": we push into the current chunk until it fills up, at which point the hph attempts to make a new one.

HPHRowBlk is designed to expose the underlying block mechanism to the uppermost layer of the pushhash. It provides some additional methods: _make_new_block, _get_current_block, and _get_block_and_bce. These provide functionality somewhat similar to _get_current_chunk/_make_new_chunk, but at the block level rather than at the level of individual scalar (packed row) operations. In addition, these methods "short-circuit" the hph tree of pushhashes, making the bottom block operations directly available to the top hph layer. The penultimate layer of the hph stack (see Genezzo::Row::RSFile) must implement the internal block access methods on the bottom pushhash.

_make_new_block

Create a new block in the current chunk and return the block number as a rid.

_get_current_block

Return the block number of the insertion position in the current chunk.

_get_block_and_bce

Return an array of the tied block, the buffer cache element (see Genezzo::BufCa::BufCaElt), and other useful information.

First_Blockno/Next_Blockno

Iterate over all the blocks in the HPHRowBlk push hash.
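A first/next pair of this kind is normally driven by a simple loop. The sketch below shows that pattern against a toy class. The ToyBlockStore package is hypothetical, and the calling convention is an assumption made for illustration (First_Blockno takes no arguments, Next_Blockno takes the previous block number, and both return undef when no block remains); the actual signatures should be checked in the module source.

```perl
use strict;
use warnings;

# Hypothetical stand-in that mimics a First_Blockno/Next_Blockno pair.
# The argument and return conventions here are assumptions.
package ToyBlockStore;

sub new {
    my ($class, @blocknos) = @_;
    return bless { blocknos => [ sort { $a <=> $b } @blocknos ] }, $class;
}

# Lowest block number, or undef if the store is empty.
sub First_Blockno {
    my ($self) = @_;
    return $self->{blocknos}[0];
}

# Smallest block number greater than $prev, or undef at the end.
sub Next_Blockno {
    my ($self, $prev) = @_;
    for my $bno (@{ $self->{blocknos} }) {
        return $bno if $bno > $prev;
    }
    return undef;
}

package main;

my $store = ToyBlockStore->new(3, 0, 7);
my @seen;
for (my $bno = $store->First_Blockno;
     defined $bno;
     $bno = $store->Next_Blockno($bno))
{
    push @seen, $bno;    # visit each block exactly once, in order
}
print "@seen\n";    # prints "0 3 7"
```

Testing with defined rather than truth matters here, because block number 0 is a valid block.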

WHY?

Indexes

Btree indexes are implemented as a tree of data blocks. Tree operations manipulate the blocks directly, bypassing the hph mechanisms that typically isolate the persistent tuple storage from the top layer. See Genezzo::Index::bt3.

Row/Column Splitting

When a packed tuple exceeds the size of an individual block, the row may be split over multiple blocks. The basic semantics of the row contents are understood only at the uppermost layer, which packs and interprets tuple data, while the bottommost layer is solely responsible for storing and accessing scalar byte string data in persistent storage. The HPHRowBlk methods provide handles into the basic block storage so the upper layer can split and reconstruct row data over multiple blocks. See Genezzo::Row::RSTab.
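The division of labor can be sketched in a few lines: the upper layer packs a row into a byte string, cuts it into block-sized pieces, and later concatenates the pieces before unpacking. This is a minimal illustration, not the scheme used by Genezzo::Row::RSTab; the split_row and join_row helpers, the 16-byte block size, and the length-prefixed pack template are all assumptions chosen for the example.

```perl
use strict;
use warnings;

# Cut a packed row into pieces of at most $block_size bytes each.
# Hypothetical helper; the lower layer only ever sees these byte strings.
sub split_row {
    my ($packed, $block_size) = @_;
    die "block size must be positive\n" unless $block_size > 0;
    my @pieces;
    for (my $off = 0; $off < length($packed); $off += $block_size) {
        push @pieces, substr($packed, $off, $block_size);
    }
    return @pieces;
}

# Reassemble the pieces in order into the original packed row.
sub join_row {
    return join '', @_;
}

# Upper layer: pack two columns as length-prefixed strings (33 bytes).
my $packed = pack 'N/a* N/a*', 'alice', 'x' x 20;

# Assumed block size of 16 bytes, so the row needs three pieces.
my @pieces = split_row($packed, 16);
print scalar(@pieces), " pieces\n";    # prints "3 pieces"

# Only after reassembly can the upper layer interpret the columns.
my @cols = unpack 'N/a* N/a*', join_row(@pieces);
print "$cols[0]\n";    # prints "alice"
```

No single piece is meaningful on its own: the second column's length prefix and data straddle a piece boundary, which is why reconstruction has to happen in the layer that knows the row format.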

TODO

fix synopsis

AUTHOR

Jeffrey I. Cohen, jcohen@genezzo.com

SEE ALSO

Genezzo::PushHash::hph, Genezzo::PushHash::PushHash, perl(1).

Copyright (c) 2003, 2004, 2005 Jeffrey I Cohen. All rights reserved.

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA

Address bug reports and comments to: jcohen@genezzo.com

For more information, please visit the Genezzo homepage at http://www.genezzo.com