NAME
DS - Data Stream module
SYNOPSIS
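use FindBin qw($Bin);
use IO::Handle;
use DS::Importer::TabFile;
use DS::Transformer::TabStreamWriter;
use DS::Target::Sink;
# Source: read rows from a file located next to this script
my $importer = DS::Importer::TabFile->new( "$Bin/price_index.csv" );
# Transformer: write each row that passes through to STDOUT
my $printer = DS::Transformer::TabStreamWriter->new(
  IO::Handle->new_from_fd(fileno(STDOUT), 'w')
);
$printer->include_header;
# Link source -> transformer -> target, then run the chain
$importer->attach_target( $printer );
$printer->attach_target( DS::Target::Sink->new );
$importer->execute();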
DESCRIPTION
This package provides a framework for writing data processing components that work on typed streams. A typed stream in DS is a stream of hash references in which every hash reference obeys the constraints contained in a type specification.
BASIC CONCEPTS
The DSlib package draws upon a handful of concepts that are introduced here.
Base classes
The base classes in DSlib are:
- DS::Source A source of a data stream. Sometimes just called a "source".
- DS::Target A target of a data stream. Sometimes just called a "target".
- DS::Transformer A source and target mixin that receives a data stream and passes it on (with possible modifications).
- DS::Importer A source that retrieves data from a source outside DS.
Processing chains
A processing chain is a linked list that starts with a source, continues with any number of transformers, and ends with a target. An open processing chain is a chain where the source or the target is missing.
Processing chains work by having the source pass data down the chain until it eventually reaches the target, where the data leaves DSlib's scope. The data is passed by having each component in the chain call the next one, passing the data as a parameter. The only data type supported is hash references.
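The calling pattern described above can be sketched in plain Perl, independently of the DS classes. Everything in this sketch (make_doubler, run_source, the field names) is a hypothetical stand-in chosen for illustration and is not part of the DS API; it only shows a source calling a transformer, which in turn calls a target, with a hash reference as the parameter.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the DS concepts; not the DS API.

# Target: the end of the chain. Here it simply collects the rows it receives.
my @received;
my $target = sub {
    my ($row) = @_;
    push @received, $row;
};

# Transformer: receives a row, adds a field, and calls the next component.
sub make_doubler {
    my ($next) = @_;
    return sub {
        my ($row) = @_;
        # Copy the row so the caller's hash is left unmodified
        my %copy = ( %$row, doubled => $row->{price} * 2 );
        $next->( \%copy );
    };
}

# Source: pushes each row down the chain, one call per row.
sub run_source {
    my ( $next, @rows ) = @_;
    foreach my $row (@rows) {
        $next->($row);
    }
}

run_source( make_doubler($target), { price => 10 }, { price => 20 } );
print join( ",", map { $_->{doubled} } @received ), "\n";    # prints "20,40"
```

In DS itself the links are made with attach_target, as in the SYNOPSIS; the closures here only mirror the order in which components call each other.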
End of stream convention
The data type supported by DS is hash references, but to indicate that there are no more rows in the stream, undef is used as an end-of-stream marker.
It is vital that this marker is passed on by all components in the processing chain, since some components may need to clean up or pass on more rows at this point.
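As an illustration of why the marker must be passed on, here is a minimal sketch in plain Perl of a component that emits one extra row when it sees undef and then forwards the marker. The names make_counter and row_count are hypothetical and are not part of the DS API.

```perl
use strict;
use warnings;

# Hypothetical target that records everything it receives,
# including the undef end-of-stream marker.
my @out;
my $target = sub { push @out, $_[0] };

# Hypothetical transformer: forwards rows unchanged while counting them.
# At end of stream it emits a summary row, then passes the marker on.
sub make_counter {
    my ($next) = @_;
    my $count = 0;
    return sub {
        my ($row) = @_;
        if ( defined $row ) {
            $count++;
            $next->($row);
        }
        else {
            $next->( { row_count => $count } );    # extra row at end of stream
            $next->(undef);                        # forward the marker
        }
    };
}

my $chain = make_counter($target);
foreach my $row ( { id => 1 }, { id => 2 }, undef ) {
    $chain->($row);
}

print "rows before marker: ", scalar(@out) - 1, "\n";     # prints 3
print "summary count: $out[2]{row_count}\n";              # prints 2
```

If make_counter swallowed the undef instead of forwarding it, any component further down the chain would never learn that the stream had ended.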
Type specifications
Any source, target or transformer can have ingoing and outgoing types. These can be used to ensure that the data passed to any target contains at least a specified list of fields; additional fields are allowed.
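The "at least these fields" rule can be illustrated with a small check in plain Perl. This is only a sketch of the idea, assuming a type is nothing more than a list of required field names; the helper missing_fields and the field names are hypothetical and do not reflect how DS represents or validates types.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the required field names that a row lacks.
# Extra fields in the row are ignored, matching the "at least" rule.
sub missing_fields {
    my ( $row, @required ) = @_;
    return grep { !exists $row->{$_} } @required;
}

my @required = qw(date value);

my @missing_ok  = missing_fields( { date => '2004-01', value => 101, note => 'x' }, @required );
my @missing_bad = missing_fields( { date => '2004-01' }, @required );

print "complete row, missing fields: ", scalar(@missing_ok), "\n";    # prints 0
print "incomplete row is missing: @missing_bad\n";                    # prints "value"
```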
APIS SUBJECT TO CHANGE
I have decided to pursue a more general way of writing transformers, which will be available in version 3 of this package. I am certain that some APIs will change in ways that are not backwards compatible.
MISSING DOCUMENTATION
Some classes in this package are still without documentation. Send me an email if you run into trouble or just want something clarified. That may also encourage me to write the missing documentation.
SEE ALSO
DS::Importer::TabFile, DS::Importer::Sth, DS::Transformer, DS::Transformer::Sub, DS::Target::Sink.
AUTHOR
Written by Michael Zedeler.