NAME

Hadoop::IO::SequenceFile - Hadoop / Hive compatible SequenceFile serializer.

VERSION

version 0.002

DESCRIPTION

This class handles serialization of records in Hadoop SequenceFile format.

METHODS

$class->new(%args) -> $inst

Creates and returns a new instance of the SequenceFile serializer.

Supported arguments are:

writer

Either an instance of Hadoop::IO::SequenceFile::HDFSWriter or a coderef. The coderef is called with a single argument: the data to be written to the file. It is called multiple times, once for each chunk of output.

key_class

Name of the Perl package responsible for encoding keys in this file. The default is Hadoop::IO::SequenceFile::BytesWriteable, which is equivalent to what Hive uses by default.

val_class

Name of the Perl package responsible for encoding values in this file. The default is Hadoop::IO::SequenceFile::Text, which is equivalent to what Hive uses by default.
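The writer coderef contract described above can be sketched as follows. This is a minimal illustration, not the module's own example: the $buffer and $writer names are made up for the sketch, and the constructor is only called when the module is installed. key_class and val_class are left out so that the documented defaults apply.

```perl
use strict;
use warnings;

# Collect every chunk the serializer hands to the writer coderef.
my $buffer = '';
my $writer = sub {
    my ($data) = @_;     # single argument: data to be written to the file
    $buffer .= $data;    # called multiple times, so append rather than assign
};

# Only attempt construction when the module is actually installed.
if (eval { require Hadoop::IO::SequenceFile; 1 }) {
    # key_class and val_class are omitted, so the defaults are used.
    my $seq = Hadoop::IO::SequenceFile->new(writer => $writer);
}
```

Appending inside the coderef matters because the serializer may deliver the output in several calls rather than as one block.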

$self->write_header()

Call this after creating a new file and before the first write_record or write_row call. Do not call it if you only want to append to a pre-existing file.

$self->write_record($key, $val)

Writes the next record to the file. $key and $val are encoded using the key_class and val_class passed to the constructor.
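A sketch of the call order for a new file, assuming the module is installed and that plain strings are acceptable keys and values for the default classes. The temporary file and the sample keys and values are illustrative only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Illustrative local output file; a real job would more likely target HDFS.
my ($fh, $path) = tempfile(SUFFIX => '.seq');
binmode $fh;    # SequenceFile data is binary

if (eval { require Hadoop::IO::SequenceFile; 1 }) {
    my $seq = Hadoop::IO::SequenceFile->new(
        writer => sub { print {$fh} $_[0] or die "Write failed: $!" },
    );

    $seq->write_header();    # new file, so the header goes first
    $seq->write_record('key-1', 'first value');
    $seq->write_record('key-2', 'second value');
}

close $fh or die "Cannot close $path: $!";
print "Wrote $path\n";
```

When appending to an existing file, the same sketch would skip the write_header call.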

$self->write_row(@values)

Writes a sequence of fields in a format compatible with LazySimpleSerDe, which Hive uses by default.
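For orientation, the text layout that Hive's LazySimpleSerDe reads with its default settings separates fields with the Ctrl-A character (\x01) and represents NULL as \N. The helper below is a hypothetical illustration of that layout, not part of this module, and it is an assumption that write_row produces exactly these bytes or treats undef this way.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Hadoop::IO::SequenceFile: joins fields
# using Hive's default LazySimpleSerDe field delimiter and NULL marker.
sub lazy_simple_row {
    my @values = @_;
    return join "\x01", map { defined $_ ? $_ : '\N' } @values;
}

my $row = lazy_simple_row('alice', 42, undef);
```

With the module, the corresponding call would be $seq->write_row('alice', 42), made after write_header on a new file.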

AUTHORS

  • Philippe Bruhat

  • Sabbir Ahmed

  • Somesh Malviya

  • Vikentiy Fesunov

COPYRIGHT AND LICENSE

This software is copyright (c) 2023 by Booking.com.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.