NAME
Hadoop::Streaming::Reducer - Simplify writing Hadoop Streaming jobs. Write a map() and reduce() function and let this role handle the streaming interface. The Reducer role provides an iterator over the multiple values for a given key.
VERSION
version 0.100060
SYNOPSIS
  #!/usr/bin/env perl
  package WordCount::Reducer;
  use Moose;
  with qw/Hadoop::Streaming::Reducer/;

  sub reduce {
      my ( $self, $key, $values ) = @_;

      my $count = 0;
      while ( $values->has_next ) {
          $count++;
          $values->next;
      }

      $self->emit( $key => $count );
  }

  package main;
  WordCount::Reducer->run;
Your mapper class must implement map($key, $value) and your reducer class must implement reduce($key, $values), where $values is an iterator over the values collected for $key. Your classes will have emit() and run() methods added via the role.
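For completeness, a matching word-count mapper might look like the sketch below. This is illustrative only: it assumes the companion Hadoop::Streaming::Mapper role from the same distribution, which supplies emit() and run() just as the Reducer role does.

```perl
#!/usr/bin/env perl
package WordCount::Mapper;
use Moose;
with qw/Hadoop::Streaming::Mapper/;

# For each input record, emit a count of 1 per word.
# The shuffle phase groups these by key before reduce() sees them.
sub map {
    my ( $self, $key, $value ) = @_;
    $self->emit( $_ => 1 ) for split /\s+/, $value;
}

package main;
WordCount::Mapper->run;
```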
METHODS
run
Package->run();
This method starts the Hadoop::Streaming::Reducer instance.
After creating a new object instance, it reads from STDIN and calls $object->reduce(), passing in each key and an iterator over that key's values.
Subclasses need only implement reduce() to produce a complete Hadoop Streaming compatible reducer.
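As a hedged example of wiring such scripts into a job, a typical hadoop streaming invocation might look like the following. The jar path, input/output directories, and script names are placeholders, not part of this module.

```shell
hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
    -input   /data/input \
    -output  /data/output \
    -mapper  wordcount_mapper.pl \
    -reducer wordcount_reducer.pl \
    -file    wordcount_mapper.pl \
    -file    wordcount_reducer.pl
```

Because streaming jobs communicate over plain pipes, they can also be exercised locally, simulating the shuffle with sort: `cat input.txt | ./wordcount_mapper.pl | sort | ./wordcount_reducer.pl`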
emit
$object->emit( $key, $value )
This method emits a key,value pair in the format expected by Hadoop::Streaming by calling $self->put(). Errors from put() are caught and reported as warnings.
put
$object->put( $key, $value )
This method emits a key,value pair to STDOUT in the format expected by Hadoop::Streaming. (key\tvalue\n)
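The wire format can be illustrated without the module itself. The standalone sketch below (format_pair is a hypothetical helper, not part of this distribution) shows the tab-separated, newline-terminated line that put() writes, which Hadoop Streaming relies on to re-group keys between the map and reduce phases:

```perl
use strict;
use warnings;

# Format a key/value pair the way Hadoop Streaming expects: key\tvalue\n
sub format_pair {
    my ( $key, $value ) = @_;
    return sprintf "%s\t%s\n", $key, $value;
}

print format_pair( 'apple', 3 );    # emits: apple<TAB>3<NEWLINE>
```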
AUTHORS
andrew grangaard <spazm@cpan.org>
Naoya Ito <naoya@hatena.ne.jp>
COPYRIGHT AND LICENSE
This software is copyright (c) 2010 by Naoya Ito <naoya@hatena.ne.jp>.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.