NAME

Hadoop::Streaming::Mapper - Simplify writing Hadoop Streaming Mapper jobs. Write a map() function and let this role handle the Streaming interface.

VERSION

version 0.101881

SYNOPSIS

#!/usr/bin/env perl

package Wordcount::Mapper;
use Moose;
with 'Hadoop::Streaming::Mapper';

sub map
{
  my ( $self, $line ) = @_;
  $self->emit( $_ => 1 ) for ( split /\s+/, $line );
}

package main;
Wordcount::Mapper->run;

Your mapper class must implement map($line), which receives one line of input per call, as in the example above. A reducer class must likewise implement reduce() (see Hadoop::Streaming::Reducer). The role adds emit(), counter(), status() and run() methods to your class.
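For orientation, Hadoop Streaming itself accepts counter and status updates as specially formatted lines on STDERR. The sketch below shows that line format only. It is not this role's implementation: the helper names format_counter and format_status and the sample group and counter names are hypothetical, and the role's own counter() and status() may take their arguments differently.

```perl
use strict;
use warnings;

# Hypothetical helpers (not part of Hadoop::Streaming::Mapper).
# They build the reporter lines that Hadoop Streaming reads from STDERR.

# Counter update: reporter:counter:<group>,<counter>,<amount>
sub format_counter {
    my ( $group, $counter, $amount ) = @_;
    return "reporter:counter:$group,$counter,$amount\n";
}

# Status update: reporter:status:<message>
sub format_status {
    my ($message) = @_;
    return "reporter:status:$message\n";
}

# Reporter lines go to STDERR; STDOUT is reserved for emitted records.
print STDERR format_counter( 'wordcount', 'lines_seen', 1 );
print STDERR format_status('processing input');
```

Keeping these messages on STDERR matters because anything printed to STDOUT is treated as mapper output.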

METHODS

run

Package->run();

This method starts the Hadoop::Streaming::Mapper instance.

After creating a new object instance, it reads STDIN line by line and calls $object->map($line) once for each line of input. A class that consumes this role need only implement map() to produce a complete Hadoop Streaming compatible mapper.
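The flow described above can be approximated without the role, which may help when reasoning about what a mapper sees. This is a minimal sketch, not the module's source: the package name Sketch::Mapper is hypothetical, it uses a plain blessed hash instead of Moose, and it assumes emit() writes one tab-separated key/value record per line to STDOUT, which is the format Hadoop Streaming expects.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

package Sketch::Mapper;    # hypothetical stand-in for a role-consuming class

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Assumption: one "key<TAB>value" record per line on STDOUT.
sub emit {
    my ( $self, $key, $value ) = @_;
    print "$key\t$value\n";
}

# Same word-count logic as the synopsis; grep drops the empty
# leading field that split produces for lines starting with whitespace.
sub map {
    my ( $self, $line ) = @_;
    $self->emit( $_ => 1 ) for grep { length } split /\s+/, $line;
}

# Create an instance, then hand each chomped input line to map().
sub run {
    my ($class) = @_;
    my $self = $class->new;
    while ( my $line = <STDIN> ) {
        chomp $line;
        $self->map($line);
    }
}

package main;
Sketch::Mapper->run;
```

Because the loop only touches STDIN and STDOUT, a mapper written this way can be exercised locally by piping a text file into the script before submitting it as a Hadoop Streaming job.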

AUTHORS

  • andrew grangaard <spazm@cpan.org>

  • Naoya Ito <naoya@hatena.ne.jp>

COPYRIGHT AND LICENSE

This software is copyright (c) 2010 by Naoya Ito <naoya@hatena.ne.jp>.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.