NAME
Hadoop::HDFS::Command - Wrappers for various hadoop hdfs cli commands
VERSION
version 0.001
SYNOPSIS
use Hadoop::HDFS::Command;
my $hdfs = Hadoop::HDFS::Command->new;
my @rv = $hdfs->$command( @command_args );
DESCRIPTION
This is a simple wrapper around the hdfs command line tools, making them easier to call from Perl and to parse their output.
The interface is only partially implemented at the moment (see the implemented wrappers below).
METHODS
new
The constructor. Available attributes are listed below.
cmd_hdfs
Default value is /usr/bin/hdfs. This option needs to be altered if you have the `hdfs` command in some other place.
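For example, to point the wrapper at a non-standard installation (a sketch; the path below is only illustrative):
# Override the default hdfs binary location via the constructor.
# The /opt/hadoop path is an illustrative example.
my $hdfs = Hadoop::HDFS::Command->new(
    cmd_hdfs => '/opt/hadoop/bin/hdfs',
);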
dfs
One of the top-level commands, providing an interface to the sub-commands listed below. The calling convention of the sub-commands is as simple as:
my @rv = $hdfs->dfs( \%options, $sub_command => @subcommand_args );
# options hash is optional
my @rv = $hdfs->dfs( $sub_command => @subcommand_args );
Available options are listed below:
- ignore_fail :Bool
  Global.
- silent :Bool
  Global.
- want_epoch :Bool
  Only used for ls. Converts timestamps to epoch (see the sketch after this list).
- callback :CODE
  Only used for ls. The callback always needs to return true to continue processing; returning false from it will short-circuit the processor.
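The global options can also be combined with want_epoch on a buffered call. A minimal sketch is below; the "path" key used here is an assumption, while "type" and "epoch" are documented in this section:
# Sketch: buffered ls call with the global ignore_fail and want_epoch options set.
# The "path" key is an assumption; "type" and "epoch" are documented above.
my @entries = $hdfs->dfs(
    {
        ignore_fail => 1,
        want_epoch  => 1,
    },
    ls => '/tmp',
);
for my $entry ( @entries ) {
    next if $entry->{type} ne 'file';
    printf "%s modified at %s\n", $entry->{path}, scalar localtime $entry->{epoch};
}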
du
The @subcommand_args can have these defined: -s, -h.
my @rv = $hdfs->dfs( du => @subcommand_args => $hdfs_path );
my @rv = $hdfs->dfs( du => qw( -h -s ) => "/tmp" );
my @rv = $hdfs->dfs(
{
ignore_fail => 1,
silent => 1,
},
du => -s => @hdfs_paths,
);
ls
The @subcommand_args can have these defined: -d, -h, -R.
my @rv = $hdfs->dfs( ls => @subcommand_args => $hdfs_path );
The callback can be used to prevent buffering and to process the result set yourself. The callback always needs to return true to continue processing: if you want to skip some entries but keep iterating, return a true value; a bare return (which is false) will short-circuit the iterator and discard any remaining records.
my %options = (
callback => sub {
# The callback receives a hashref of metadata about the file.
my $file = shift;
if ( $file->{type} eq 'dir' ) {
# do something
}
# skip this one, but continue processing
return 1 if $file->{type} ne 'file';
# do something
return if $something_really_bad_so_end_this_processor;
# continue processing
return 1;
},
# The meta-data passed to the callback will have an "epoch"
# key set when this is true.
want_epoch => 1,
);
# execute the command recursively on the path
$hdfs->dfs( \%options, ls => -R => $hdfs_path );
mv
my @rv = $hdfs->dfs( mv => $hdfs_source_path, $hdfs_dest_path );
put
The @subcommand_args can have these defined: -f, -p, -l.
$hdfs->dfs( put => @subcommand_args, $local_path, $hdfs_path );
# notice the additional "-"
$hdfs->dfs( put => '-f', '-', $hdfs_path, $in_memory_data );
rm
The @subcommand_args can have these defined: -f, -r, -skipTrash.
$hdfs->dfs( rm => @subcommand_args, $hdfs_path );
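As with the other sub-commands, the flags above can be combined with the global options. A sketch (the path is illustrative):
# Sketch: recursively remove an illustrative path, bypassing the trash.
# ignore_fail is the global option documented above; the path is made up.
$hdfs->dfs(
    { ignore_fail => 1 },
    rm => qw( -r -skipTrash ),
    '/tmp/scratch-data',
);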
SEE ALSO
`hdfs dfs -help`.
AUTHOR
Burak Gursoy <burak@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2016 by Burak Gursoy.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.