NAME

FlatFile::DataStore::Tutorial - POD containing in-depth discussion of and tutorials for using FlatFile::DataStore.

(This is still just a stub. See also FlatFile::DataStore::FMTEYEWTK.)

VERSION

Discusses FlatFile::DataStore version 0.16.

SYNOPSYS

man FlatFile::DataStore
man FlatFile::DataStore::Tutorial

or

perldoc FlatFile::DataStore
perldoc FlatFile::DataStore::Tutorial

or

http://search.cpan.org/dist/FlatFile-DataStore/

DESCRIPTION

Overview

This tutorial only contains POD, so don't do this:

use FlatFile::DataStore::Tutorial;  # don't do this

Instead, simply read the POD (as you are doing). Also please read the docs for FlatFile::DataStore, which is essentially the reference manual.

This tutorial/discussion is intended to augment those docs with longer explanations for the design of the module, more usage examples, and other materials that will hopefully help you make better use of it.

DISCUSSION

Overview

FlatFile::DataStore implements a simple flat file data store. When you create (store) a new record, it is appended to the flat file. When you update an existing record, the existing entry in the flat file is flagged as updated, and the updated record is appended to the flat file. When you delete a record, the existing entry is flagged as deleted, and a deleted record is appended to the flat file.

The result is that all versions of a record are retained in the data store, and running a history will return all of them. Another result is that each record in the data store represents a transaction: create, update, or delete.

Data Store Files and Directories

Key files act as an index into the data files. The different versions of the records in the data files act as linked lists:

- the first version of a record links just to it's successor
- a second, third, etc., versions link to their successors and predecessors
- the final (current) version links just to its predecessor

A key file entry always points to the final (current) version of a record. It may have a pointer to a previous version, but it will never have a pointer to a "next" version, because there isn't one.

Each record is stored with a preamble, which is a fixed-length string of fields containing:

- crud indicator        (flag for created, updated, deleted, etc.)
- transaction indicator (flag for created, updated, deleted, etc.)
- transaction number    (incremented when a record is touched)
- date                  (of the "transaction")
- key number            (record sequence number)
- record length         (in bytes)
- user data             (for out-of-band* user-defined data)
- "this" file number    (linked list pointers ...)
- "this" seek position
- "prev" file number
- "prev" seek position
- "next" file number
- "next" seek position

*That is, data about the record not stored in the record.

The formats and sizes of these fixed-length fields may be configured when the data store is first defined, and will determine certain constraints on the size of the data store. For example, if the file number is base-10 and 2 bytes in size, then the data store may have up to 99 data files. And if the seek position is base-10 and 9 bytes in size, then each data file may contain up to 1 Gig of data.

Number bases larger than base-10 (up to base-36 for file numbers and up to base-62 for other numbers) may be used to help shorten the length of the preamble string.

A data store will have the following files:

- uri  file,  contains the uri, which defines the configuration parameters
- obj  file,  contains dump of generic perl object constructed from uri*
- toc  files, contain transaction numbers for each data file
- key  files, contain pointers to every current record version
- data files, contain all the versions of all the records

*('generic' because the object does not include the 'dir' attribute)

If the data store is small, it might have only one toc, key, and/or data file.

If dirlev (see below) is 0 or undefined, the toc, key, or data files will reside at the same level as the url and obj files, e.g.,

- name.uri
- name.obj
- name.toc     (or name.1.toc if C<tocmax> is set)
- name.key     (or name.1.key if C<keymax> is set)
- name.1.data  (the filenum, e.g., 1, is always present)

If dirlev > 0, the directory structure follows this scheme (note that file/dir numbers start with 1):

- dir - name.uri - name.obj - name - toc1 - name.1.toc, - name.2.toc, - etc. - toc2, - etc. - key1 - name.1.key, - name.2.key, - etc. - key2, - etc. - data1 - name.1.data - name.2.data, - etc. - data2, - etc.

If tocmax is not defined, there will never be more than one toc file and so the name will be name.toc instead of name.1.toc.

If keymax is not defined, there will never be more than one key file and so the name will be name.key instead of name.1.key.

Different data stores may coexist in the same top-level directory--they just have to have different names.

To retrieve a record, one must know the data file number and the seek position into that data file, or one must know the record's sequence number (the order it was added to the data store). With a sequence number, the file number and seek position can be looked up in a key file, so these sequence numbers are called "key numbers" or keynum.

Methods support the following actions:

- create
- retrieve
- update
- delete
- history
- iterate (over all transactions in the data files)

Scripts supplied in the distribution perform:

- validation of a data store
- migration of data store records to newly configured data store
- comparison of pre-migration and post-migration data stores

CRUD cases

Create: no previous preamble required or allowed
   - create a record object (with no previous)
   - write the record
   - return the record object
Retrieve:
   - read a data record
   - create a record object (with a preamble, which may become a previous)
   - return the record object
Update: previous preamble required (and it must not have changed)
   - create a record object (with a previous preamble)
   - write the record (updating the previous in the data store)
   - return the record object
Delete: previous preamble required (and it must not have changed)
   - create a record object (with a previous preamble)
   - write the record (updating the previous in the data store)
   - return the record object

defining a data store

designing a program to analyze a data store definition

validating a data store

migrating a data store

interating over the data store transactions