NAME

IO::Plumbing - pluggable, lazy access to system commands

SYNOPSIS

use IO::Plumbing qw(plumb);

my $command = IO::Plumbing->new
    ( program => "echo",
      args    => [ "Hello,",  "world" ],
    );

# same thing
$command = plumb("echo", args => [qw"Hello, world"]);

$command->execute;  # starts pipeline - still running
if ($command->ok) { # waits for completion
    # success
}

# input plumbing - connects FHs before running
$command->program("cat");
$command->args(["-e", "-n"]);
$command->input("filename");

if ($command->ok) {
    # no plumbing, we just caught it to a buffer
    my $output = $command->terminus->output;
}

# connecting pipelines
$command->output(plumb("od", args => ["-x"]));

# as traditional, we start from the beginning and wait
# on the command at the end of the chain.
$command->execute;

if ($command->terminus->ok) {
    # success.
    print "We got:\n";
    print $command->terminus->output;
}

# other shorthand stuff - moral equivalents of:
#   for reading:    zero null urandom   heredoc
#   for writing:    null full "|gpg -e" var=`CMD`
use IO::Plumbing qw(vent plug prng      bucket    );

# themed import groups!
use IO::Plumbing qw(:tools);   # everything so far

DESCRIPTION

IO::Plumbing is a module designed for writing programs which work a bit like shell scripts; where you have data sources, which are fed into pipelines of small programs, connected to make a larger computing machine.

The intention is that the interface behaves much like modules such as IO::All, which is capable of starting threads with external programs. However, the IO::Plumbing object is stackable, and relatively complex arrangements of filehandles and subprocesses are available.

When you plug two or more of these things together, they won't start running commands immediately - that happens the moment you try to read from the output. So, they are lazy.

FUNCTIONS

THE BASIC PLUMBINGS

These functions all return a new IO::Plumbing object with a different configuration.

plumb(cmdline, arg => value, ...)

Shortcut for making a new IO::Plumbing object. Passing in a cmdline with a space indicates that you want shell de-quoting.

prng()

Shortcut for /dev/urandom or other such locally available source of relatively entropic bit sequences.

When written to, creates a gpg instance that encrypts to the default recipient.

plug()

When read from, always returns end of file, like /dev/null on Unix.

When written to, always returns an error, like /dev/full on Unix. This is slightly different to the filehandle being closed. To use a real closed filehandle, just pass one in to input(), output() or stderr().

bucket( [ $contents ] )

A small (= in-core) spool of data. Returns end of file when the data has been sent. Specifying the contents is enough to do this.

When written to, fills with data as the process writes. In that case, the contents will normally be a pointer to an array or scalar to fill with input records.

Note that you can only pour into one bucket at a time, because the parent process is responsible for filling it. So, remember to use only one bucket at a time until that limitation is sorted out.

vent( [ $generator ] )

When read from, returns a stream of zeros (by default - or supply $generator), like /dev/zero on Unix.

When written to, happily consumes any amount of data without returning an error, like /dev/null on Unix.

hose( [ ... ] )

This represents a filehandle. This class is responsible for plugging into an IO::Plumbing contraption, and giving you a filehandle that you can read from or write to.

Arguments are passed to IO::Plumbing::Hose->new();

METHODS

Many of these methods are object properties.

cwd( $path )

Specify a directory to change to after fork() time. Honoured for code reference blocks, too. Defaults to undef, which does not alter the working directory.

env( [ { KEY => VALUE, ... } ])

Specify the process environment to use in the child. Defaults to undef, which does not alter the environment.

program( [ $path ] )

Specify the program to execute.

args( [ @command ] )

Specify a list of arguments to the command, i.e. what gets passed to @ARGV in the child. Can be a list of strings or an ArrayRef.

all_args()

Primarily of interest to those sub-classing the module, this lets you return something other than what "args" was set to when it comes time to execute.

cmdline("xxx")

As a shortcut to specifying program and args, specify a command line. No shell redirection is yet supported, only basic de-quoting.
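To illustrate what basic de-quoting of a command line amounts to, here is a minimal sketch using the core Text::ParseWords module. This is a stand-in chosen for illustration and is not necessarily how IO::Plumbing parses its cmdline; the helper name split_cmdline is hypothetical.

```perl
use strict;
use warnings;
use Text::ParseWords qw(shellwords);

# Hypothetical helper (not part of IO::Plumbing): split a command line
# into a program and its arguments, honouring quotes but performing no
# redirection or globbing.
sub split_cmdline {
    my ($cmdline) = @_;
    my ($program, @args) = shellwords($cmdline);
    return ($program, \@args);
}

my ($program, $args) = split_cmdline(q{grep -i "hello world" notes.txt});
print "program: $program\n";
print "arg: $_\n" for @$args;
```

The quoted phrase survives as a single argument, which is the behaviour you would want before handing the pieces to program and args.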

code( sub { ... } )

Specify a piece of code to run, instead of executing a program. When the block is finished, the child process will call exit(0).

If both code and an external program are passed, then the code block will be run. It receives the IO::Plumbing object as its first argument and the command line arguments after that.

input( [ $source] [, $weakref ] )

Specify the input source of this command pipe. Defaults to a plug.

If you pass a filehandle in, you might also like to call ->close_on_exec($source) on it to mark it to close when the pipeline executes.

If you pass in another IO::Plumbing object (or something which quacks like one), then that object's output property is automatically set to point back at this object. So, an IO::Plumbing chain is a doubly-linked list. The $weakref flag indicates this is what is happening, and aims to stop these circular references, which might otherwise cause memory leaks.
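The circular-reference problem described above can be sketched with plain hashes and the core Scalar::Util module. The two hashes are stand-ins for pipeline stages, not IO::Plumbing objects, and which of the two links is weakened here is an arbitrary choice made for the illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Two plain hashes stand in for pipeline stages (assumption: these are
# illustrative only and do not mirror IO::Plumbing's internals).
my $upstream   = { name => "producer" };
my $downstream = { name => "consumer" };

# Forward link: an ordinary (strong) reference.
$upstream->{output} = $downstream;

# Back link: weakened, so the pair does not keep each other alive.
$downstream->{input} = $upstream;
weaken($downstream->{input});

my $was_weak = isweak($downstream->{input});
print "back link is weak: ", ($was_weak ? "yes" : "no"), "\n";

# Dropping the last strong reference to the upstream stage frees it,
# and the weak back link becomes undef instead of leaking.
undef $upstream;
print "back link after release: ",
    (defined $downstream->{input} ? "still set" : "undef"), "\n";
```

Without the weaken() call, each hash would hold a strong reference to the other and neither would be freed when the last outside reference went away.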

output( [ $dest] [, $weakref ] )

Specify the output of this command pipe. Defaults to a bucket.

Pass in "|cmdname" as a string for a quick way to make more plumbing.

stderr( [ $dest] [, $weakref ] )

Specify where stderr of this stage goes. Defaults to STDERR of the current process.

connect_plumb( $direction, $number, $plumb, $reverse, $weak )

This is a generic interface to connect any plumb to any slot of the plumbing. The above three methods are shortcuts for invoking this method.

$direction can be undef, 0 or "input" to mean input, anything else means output.

The $reverse parameter refers to which plumbing slot to plumb the other way into. undef or 0 means the first slot, which also conveniently generally does what you wanted.

$weak means to make the reference to $plumb a "weak" reference, and to not try to make a corresponding counter-plumb. This is used to break the infinite loop that might otherwise eventuate and would not normally be passed in by a user of this module.

This example:

$plumb->connect_plumb( input => 0, $plumb2, 1 );

Connects the standard error of $plumb2 to the standard input of $plumb.

has_plumb( $direction, $number )
get_plumb( $direction, $number )

Predicate/accessors for the plumbs at the various slots. Same input as the above.

get_plumb_pair( $direction, $number )

terminus()

Returns the last output object on the "output" chain of this pipeline. Frequently a bucket.

status()

Returns the current status of this piece of plumbing:

Value             Meaning
--------------------------------------------------
COMMAND_ERROR     Not good enough to exec() yet
COMMAND_READY     Got everything we need to run
COMMAND_RUNNING   In progress
COMMAND_DONE      Reaped
COMMAND_LOST      Process went AWOL

ready()
running()
done()

Shortcuts for checking whether the status is COMMAND_READY, COMMAND_RUNNING or COMMAND_DONE respectively.

status_name()

Returns a description of the current status of the process.

pid()

Returns the process ID of the running (or completed) process.

rc()

Returns the current return code of the process (i.e., what $? was set to). If undefined, the program hasn't finished (or hasn't started yet).
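Since the return code has the same layout as $?, it can be decoded in the usual way described in perlvar. The following is a minimal sketch in core Perl; describe_status is a hypothetical helper, and the wording of its messages is an assumption rather than the module's own output.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a wait status laid out like $?.
# The low 7 bits hold the terminating signal, the high byte the exit code.
sub describe_status {
    my ($status) = @_;
    return "not finished" unless defined $status;
    my $signal = $status & 127;
    return "killed by signal $signal" if $signal;
    my $exit = $status >> 8;
    return $exit ? "exited with error code $exit" : "finished normally";
}

# Run a child perl that exits with code 3, then inspect $?.
system($^X, "-e", "exit 3");
my $rc = $?;
print describe_status($rc), "\n";
```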

ok()

Returns true if the program exited cleanly.

error()

Returns a true value if the process returned an error code. Includes in the message whether the program exited cleanly, exited with an error code (and if so what the error code was), as well as whether it was killed by a signal (and what the signal was).

errormsg()

Just like error, except guaranteed to never produce a "use of uninitialised variable" warning by returning "finished normally" if the process ran successfully.

wait()

Waits for this specific piece of plumbing to finish.

name

Returns (or sets) a string descriptor for this piece of plumbing.

Available as the overloaded '""' (stringify) operator.

out_fh( [ $fh ] [ , $close_on_exec ] )

Specify (or return) the filehandle that will become this child process' STDOUT.

err_fh( [ $fh ] )

Specify (or return) the filehandle that will become this child process' STDERR.

has_fd( $num )
get_fd( $num )
set_fd( $num, $fd, [$close_on_exec] )

This is a generic interface to the various *_fh functions. Instead of specifying the filehandle you want to get or set by the name of the method, use the filehandle identifier. When the plumb is executed, filehandles will be connected appropriately.

in_fh( [ $fh ] [ , $close_on_exec ] )

Specify (or return) the filehandle that will become this child process' STDIN.

execute()

Starts this pipeline. Any link can be the starting point for an execute().

close_on_exec($fh [, $fh, ...])

Mark a filehandle that should be closed in the parent process when the pipeline is executed. Note that this is quite a different concept to the OS-level close on exec, which is hinted about at "$^F" in perlvar, which applies to filehandles which are closed in the child process. IO::Plumbing does not alter $^F.

If you are passing raw filehandles in, the module can't guess whether this filehandle is one that should be closed on execution of the pipeline, or whether it's one that as a parent process you intend to feed or read yourself.

With a normal file, that's not a huge problem - just a wasted FD in the parent process. With the input half of a pipe, it means that the other end will not see the filehandle closed when a sub-process closes it, and hence your pipeline will block as the next program waits forever for an end of file.

So long as you always pass IO::Plumbing objects to the input and output methods, you don't need to use this function; when those are converted from objects to filehandles, the temporary filehandles are always marked close on exec.
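The blocking problem described above can be reproduced with nothing but pipe() and fork(). This is a sketch of the underlying Unix behaviour in core Perl, not of IO::Plumbing's own code: the parent has to close its copy of the write end, otherwise the reader never sees end of file.

```perl
use strict;
use warnings;

pipe(my $reader, my $writer) or die "pipe: $!";

my $pid = fork();
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: write one line and exit, closing its copy of the write end.
    close $reader;
    print {$writer} "hello from child\n";
    close $writer;
    exit 0;
}

# Parent: close our own copy of the write end. If this line were left
# out, the read below would wait forever for an end of file, because
# the pipe would still have an open writer (the parent itself).
close $writer;

my @lines = <$reader>;
close $reader;
waitpid($pid, 0);

print "read ", scalar(@lines), " line(s), then EOF\n";
```

Marking a filehandle with close_on_exec asks the module to do the equivalent of that close $writer in the parent when the pipeline starts.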

CLASS METHODS

These may also be called as object methods.

IO::Plumbing->new( $att => $value, [ ... ] )

The constructor is very basic: it calls the bare accessors named by each $att with the corresponding $value, and then calls BUILD.

IO::Plumbing->reap( [ $max ] )

Check for any waiting children and update the RC values of all running plumbing objects, without ever blocking.

$max specifies the maximum number of children to reap at one time.
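Non-blocking reaping of this kind is conventionally built on waitpid() with the WNOHANG flag. The sketch below shows that general technique in core Perl; reap_children is a hypothetical helper and is not claimed to match the module's implementation.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

# Hypothetical helper: collect up to $max exited children without
# blocking. Returns a hash reference of pid => wait status.
sub reap_children {
    my ($max) = @_;
    my %status;
    while (!defined($max) || keys(%status) < $max) {
        my $pid = waitpid(-1, WNOHANG);
        last if $pid <= 0;   # 0: nothing has exited yet; -1: no children
        $status{$pid} = $?;
    }
    return \%status;
}

# Start two short-lived children, then poll until both are reaped.
my @pids;
for my $code (0, 2) {
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    exit $code if $pid == 0;
    push @pids, $pid;
}

my %seen;
while (keys(%seen) < @pids) {
    my $reaped = reap_children(1);
    %seen = (%seen, %$reaped);
    select(undef, undef, undef, 0.01) unless %$reaped;   # brief pause
}
printf "pid %d: exit code %d\n", $_, $seen{$_} >> 8 for @pids;
```

Passing 1 as the limit mirrors the $max argument: each call handles at most one child, so the caller stays responsive between calls.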

SUB-CLASS API

OVERRIDABLE METHODS

default_input

What to use as a default standard input when nothing else is given. Defaults to an IO::Plumbing::Plug (/dev/null). Override this in a sub-class to change this behaviour.

default_output

What to use as a default standard output. Defaults to an IO::Plumbing::Bucket (i.e., a variable buffer).

default_stderr

Default standard error. Defaults to the calling process' STDERR.

needs_fork

Set this to return a true value if this piece of plumbing needs to fork; false otherwise.

needs_pipe( $direction, $number )

This is called when a plumb is about to set up FDs to another one.

fd_shape

This method should return a hash of arrays; it represents which input or output filehandle is connected to which system FD number. The default is:

{ input => [ 0 ], output => [ 1, 2 ] }

fd_num ( $direction, $number )

Functional interface to the above - return the (post-plumbed) FD number of the given direction/slot pair. These arguments are the same as to "connect_plumb".
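As a sketch of how a direction/slot pair maps onto an FD number under the default shape, consider the following. The function fd_for_slot is a hypothetical stand-in written for this illustration, not the module's fd_num; it follows the direction convention documented for connect_plumb.

```perl
use strict;
use warnings;

# The default shape quoted above.
my $shape = { input => [ 0 ], output => [ 1, 2 ] };

# Hypothetical stand-in for fd_num: undef, 0 or "input" select the
# input side; anything else selects the output side.
sub fd_for_slot {
    my ($direction, $number) = @_;
    my $side = (!$direction || $direction eq "input") ? "input" : "output";
    return $shape->{$side}[ $number || 0 ];
}

printf "input slot 0  -> FD %d\n", fd_for_slot("input", 0);
printf "output slot 0 -> FD %d\n", fd_for_slot("output", 0);
printf "output slot 1 -> FD %d\n", fd_for_slot("output", 1);
```

Under this shape, output slot 1 is standard error, which is why a $reverse value of 1 in connect_plumb refers to the other plumb's stderr.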

do_fork

A hook for forking.

connect_hook ( $direction, $number )

A hook that is called once a connection is made.

prefer_code

This is another way to specify the code vs program behaviour of the plumbing; it is used by the default execute() function to decide whether to invoke an external program, or use the supplied code block, if both are provided.

The default is to prefer code on Windows.

DEBUGGING

To get debug information to STDERR about forking and plumbing, set IO_PLUMBING_DEBUG in the environment to 1.

To get further information useful for debugging the IO::Plumbing module, set it to 2 or higher.

AUTHOR AND LICENCE

Copyright 2007, 2008, Sam Vilain. All Rights Reserved. This program is free software; you can use it and/or modify it under the same terms as Perl itself.

BUGS / SUBMISSIONS

This is still currently quite experimental code, so it's quite likely that something straightforward you expect to work doesn't.

In particular, currently this module has not been ported to run under Windows; please e-mail the author if you are interested in adding support for that.

If you find an error, please submit the failure as an addition to the test suite, as a patch. Version control is at:

git://utsl.gen.nz/IO-Plumbing

See the file SubmittingPatches in the distribution for a basic command sequence you can use for this. Feel free to also harass me via https://rt.cpan.org/Ticket/Create.html?Queue=IO%3A%3APlumbing or mail me something other than a patch, but you win points for just submitting a patch in `git-format-patch` format that I can easily apply and work on next time.

To take that to its logical extension, you can expect well written patch series which include test cases and clearly described progressive changes to spur me to release a new version of the module with your great new feature in it, because hopefully I won't have to do any coding for that, just review.