NAME

ETL::Yertl::Transform - Transform a stream of documents

VERSION

version 0.044

SYNOPSIS

### Simple transform callback
use ETL::Yertl;
use ETL::Yertl::Transform;
use ETL::Yertl::FormatStream;
my $xform = ETL::Yertl::Transform->new(
    transform_doc => sub {
        # Document is in $_ (and is also passed as the second argument)
        return $_;
    },
    source => ETL::Yertl::FormatStream->new_for_stdin,
    destination => ETL::Yertl::FormatStream->new_for_stdout,
);
$xform->run;

### Transform class
package Local::Transform::Dump;
use ETL::Yertl;
use Data::Dumper;
use base 'ETL::Yertl::Transform';
sub transform_doc {
    my ( $self, $doc ) = @_;
    say Dumper $doc;
    return $doc;
}

package main;
use ETL::Yertl;
use ETL::Yertl::FormatStream;
my $xform = Local::Transform::Dump->new(
    source => ETL::Yertl::FormatStream->new_for_stdin,
    destination => ETL::Yertl::FormatStream->new_for_stdout,
);
$xform->run;

DESCRIPTION

This class represents a transformation step in a Yertl stream. A transform reads documents from its source, which is an ETL::Yertl::FormatStream object or another transform, and optionally writes them to a destination ETL::Yertl::FormatStream object. Transforms can be chained to other transforms to build a pipeline of transformations.

Transformations can be simple subroutines or full classes (inheriting from this class).

Transform Object

Create ad-hoc transform objects by passing a transform_doc callback to the constructor. The callback receives two arguments: the transform object and the document to transform. The document is also available in $_. The callback should return the transformed document, whether that is a new document or the same document modified in place.

Transform Class

Create transform classes by inheriting from ETL::Yertl::Transform. Subclasses override the transform_doc method to transform documents. The method behaves exactly like the transform_doc callback: it receives the same arguments, has $_ set to the current document, and returns the transformed document.

Overloaded Operators

Transforms can be chained together using the pipe (|) operator. The result of the expression is the transform on the right side, for continued chaining.

my $xform1 = ETL::Yertl::Transform->new(
    transform_doc => sub { ... },
);
my $xform2 = ETL::Yertl::Transform->new(
    transform_doc => sub { ... },
);
my $xform3 = $xform1 | $xform2 | ETL::Yertl::Transform->new(
    transform_doc => sub { ... },
);

Transforms can receive a source using the << operator with an ETL::Yertl::FormatStream object. The result of the expression is the transform object, for continued chaining.

my $input = ETL::Yertl::FormatStream->new_for_stdin;
my $xform = ETL::Yertl::Transform->new(
    transform_doc => sub { ... },
) << $input;

Transforms can receive a destination using the >> operator with an ETL::Yertl::FormatStream object. The result of the expression is the transform object, for continued chaining.

my $output = ETL::Yertl::FormatStream->new_for_stdout;
my $xform = ETL::Yertl::Transform->new(
    transform_doc => sub { ... },
) >> $output;
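Because << and >> bind more tightly than | in Perl, the three operators can be combined in one expression without parentheses. The sketch below illustrates the return values described above using a toy stand-in class. Local::ToyTransform and its internals are hypothetical and are not how ETL::Yertl::Transform is implemented. In particular, having | store the left side as the source of the right side is an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in class, NOT part of ETL::Yertl
package Local::ToyTransform;

use overload
    '|'  => \&_pipe,
    '<<' => \&_set_source,
    '>>' => \&_set_destination;

sub new {
    my ( $class, %args ) = @_;
    return bless { %args }, $class;
}

# $left | $right: record $left as the source of $right, return $right
sub _pipe {
    my ( $self, $other, $swapped ) = @_;
    my ( $left, $right ) = $swapped ? ( $other, $self ) : ( $self, $other );
    $right->{source} = $left;
    return $right;
}

# $xform << $source: record the source, return the transform itself
sub _set_source {
    my ( $self, $source ) = @_;
    $self->{source} = $source;
    return $self;
}

# $xform >> $destination: record the destination, return the transform itself
sub _set_destination {
    my ( $self, $destination ) = @_;
    $self->{destination} = $destination;
    return $self;
}

package main;

my $first = Local::ToyTransform->new( name => 'first' );
my $last  = Local::ToyTransform->new( name => 'last' );

# Parsed as ( $first << 'input' ) | ( $last >> 'output' )
my $chain = $first << 'input' | $last >> 'output';

print "chain ends at: $chain->{name}\n";          # chain ends at: last
print "its source is: $chain->{source}{name}\n";  # its source is: first
```

If the real class follows the return values documented above, an expression of the same shape with real transforms and ETL::Yertl::FormatStream objects would likewise evaluate to the last transform in the chain.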

METHODS

new

my $xform = ETL::Yertl::Transform->new( %args );

Create a new transform object. %args is a hash with the following keys:

source

The source for documents. Can be an ETL::Yertl::FormatStream or an ETL::Yertl::Transform object. The source does not need to be set at construction time, but the transform cannot do useful work without one.

destination

(optional) An ETL::Yertl::FormatStream object to write the documents to. This can be an intermediate destination or the ultimate destination. The last transform in a stream should have a destination.

transform_doc

A subref to transform the documents read from the source. The subref will receive two arguments: the transform object and the document to transform. It should return the transformed document. The document to transform is also set as $_ for simpler transforms.
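The two ways of reaching the document inside a transform_doc callback can be sketched as follows. This assumes ETL::Yertl is installed and that documents arrive as hash references. The "name" and "seen" fields are hypothetical and only serve the illustration.

```perl
use strict;
use warnings;
use ETL::Yertl::Transform;
use ETL::Yertl::FormatStream;

# Style 1: unpack the arguments explicitly
my $upcase_name = sub {
    my ( $self, $doc ) = @_;
    # "name" is a hypothetical field
    $doc->{name} = uc $doc->{name} if defined $doc->{name};
    return $doc;
};

# Style 2: rely on $_ being set to the current document
my $mark_seen = sub {
    # "seen" is a hypothetical field
    $_->{seen} = 1;
    return $_;
};

# Either callback can be passed as transform_doc
my $xform = ETL::Yertl::Transform->new(
    transform_doc => $upcase_name,
    source        => ETL::Yertl::FormatStream->new_for_stdin,
    destination   => ETL::Yertl::FormatStream->new_for_stdout,
);
```

Both callbacks return the document they were given, modified in place, which matches the contract described above.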

configure

$xform->configure( %args );

Configure this object. Takes the same arguments as the constructor, "new". This method allows updating any of the transform attributes later, so that transforms can be given new sources/destinations.
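As a minimal sketch of deferred setup, a transform can be created with only its callback and given its streams afterwards. This assumes ETL::Yertl is installed and uses only the constructor arguments listed above.

```perl
use strict;
use warnings;
use ETL::Yertl::Transform;
use ETL::Yertl::FormatStream;

# Build the transform first, without any streams attached
my $xform = ETL::Yertl::Transform->new(
    transform_doc => sub {
        my ( $self, $doc ) = @_;
        return $doc;    # pass documents through unchanged
    },
);

# Attach the source and destination later, once they are known
$xform->configure(
    source      => ETL::Yertl::FormatStream->new_for_stdin,
    destination => ETL::Yertl::FormatStream->new_for_stdout,
);
```

After this call the transform has everything it needs, and calling the run method would start processing.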

write

$xform->write( $doc );

Write a document explicitly. This can be used by the transform_doc callback to write documents without needing to return them from the callback.
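One use for write is emitting several output documents for a single input document. The sketch below assumes ETL::Yertl is installed, that each input document is a hash reference with a hypothetical "items" list, and that a callback returning an empty list causes nothing further to be written. That last point is an assumption, not something stated in this documentation.

```perl
use strict;
use warnings;
use ETL::Yertl::Transform;
use ETL::Yertl::FormatStream;

# Write each element of the hypothetical "items" list as its own document
my $split_items = sub {
    my ( $self, $doc ) = @_;
    for my $item ( @{ $doc->{items} || [] } ) {
        $self->write( $item );
    }
    # Assumption: an empty return means no additional document is emitted
    return;
};

my $xform = ETL::Yertl::Transform->new(
    transform_doc => $split_items,
    source        => ETL::Yertl::FormatStream->new_for_stdin,
    destination   => ETL::Yertl::FormatStream->new_for_stdout,
);
```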

run

$xform->run;

Run the transform. This method returns once all data has been read from the source and all data has been written to the destination (if any).

SEE ALSO

ETL::Yertl, ETL::Yertl::FormatStream

AUTHOR

Doug Bell <preaction@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2018 by Doug Bell.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.