NAME
Hadoop::IO::RCFile::Reader
VERSION
version 0.003
SYNOPSIS
    use Hadoop::IO::RCFile::Reader;

    # $webhdfs_client is an already constructed Net::Hadoop::WebHDFS client
    my $table_reader = Hadoop::IO::RCFile::Reader->new(
        directory      => '/user/hive/warehouse/sabbir.db/',
        webhdfs_client => $webhdfs_client,
    );

    while ($table_reader->next()) {
        my $current_row = $table_reader->current_row();
    }
DESCRIPTION
This module decodes an RCFile-based Hive table and reads its rows. It reads the files directly from HDFS, so no partition information is available: only the data stored in the files is returned. Callers must handle partition information themselves.
The documentation about the file format can be found here: https://hive.apache.org/javadocs/r2.1.1/api/org/apache/hadoop/hive/ql/io/RCFile.html
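Because the reader returns only file data, partition values have to be recovered by the caller. The following is a minimal sketch of one way to do that, assuming the table uses the usual Hive layout in which each partition directory is named key=value. The function name partition_values and the example path (including the table name "events") are hypothetical and not part of this module; percent-escaped partition values are not decoded here.

```perl
use strict;
use warnings;

# Extract Hive-style partition columns (key=value path segments) from a path.
# Hypothetical helper, not provided by Hadoop::IO::RCFile::Reader.
sub partition_values {
    my ($path) = @_;
    my %partition;
    for my $segment (split m{/}, $path) {
        # Only segments of the form key=value describe a partition.
        next unless $segment =~ /\A([^=]+)=(.*)\z/;
        $partition{$1} = $2;
    }
    return \%partition;
}

# Hypothetical partition directory; adjust to the real table layout.
my $partition = partition_values(
    '/user/hive/warehouse/sabbir.db/events/year=2023/month=07'
);
printf "%s=%s\n", $_, $partition->{$_} for sort keys %$partition;
```

The resulting hash can be merged into each row read from the files under that directory, since the partition columns are not stored in the file data.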
NAME
Hadoop::IO::RCFile::Reader - Read RCFile-based Hive tables from HDFS through the WebHDFS API
METHODS
new
The constructor. Accepts parameters in key => value format.
directory
Path of the HDFS directory or file to read.
webhdfs_client
A Net::Hadoop::WebHDFS client.
next
Advances the current-row pointer to the next row. It must be called before reading any row; the first call makes the first row the current row. Returns true if the pointer moved to a row, and false when no more rows are available.
current_row
Returns the current row as a reference to an array of column values, ordered from left to right.
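Since current_row returns columns by position only, callers often want to attach names to them. The sketch below shows one possible way to combine next and current_row for that purpose. The helper rows_as_hashes, the StubReader class, and the column names are all hypothetical; StubReader only stands in for a real reader so the example runs without HDFS.

```perl
use strict;
use warnings;

# Turn the positional columns returned by current_row() into named fields.
# $reader is any object with next() and current_row(), such as a
# Hadoop::IO::RCFile::Reader; the column names are supplied by the caller.
sub rows_as_hashes {
    my ($reader, $columns) = @_;
    my @rows;
    while ($reader->next()) {          # must be called before each read
        my $row = $reader->current_row();
        my %record;
        @record{@$columns} = @$row;    # pair names with values by position
        push @rows, \%record;
    }
    return \@rows;
}

# Stand-in reader for demonstration only (hypothetical, not part of the module).
package StubReader {
    sub new {
        my ($class, @rows) = @_;
        return bless { rows => [@rows], current => undef }, $class;
    }
    sub next {
        my ($self) = @_;
        $self->{current} = shift @{ $self->{rows} };
        return defined $self->{current};
    }
    sub current_row { return $_[0]{current} }
}

my $rows = rows_as_hashes(
    StubReader->new([1, 'alice'], [2, 'bob']),
    ['id', 'name'],                    # hypothetical column names
);
print "$_->{id}: $_->{name}\n" for @$rows;
```

With a real table, the StubReader object would be replaced by the reader returned from new, and the column list would come from the Hive table definition. Collecting every row in memory is only suitable for small tables; for larger ones, process each row inside the loop instead.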
AUTHORS
Philippe Bruhat
Sabbir Ahmed
Somesh Malviya
Vikentiy Fesunov
COPYRIGHT AND LICENSE
This software is copyright (c) 2023 by Booking.com.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.