NAME
parpush - Securely transfer a file or directory to a cluster of machines via SSH
SYNOPSIS
# First you need to set up a configuration file or have a
# $HOME/.csshrc file with the clusters defined:
$ cat Cluster
cluster1 = machine1 machine2 machine3
cluster2 = machine2 machine4 machine5
num = 193.140.101.175 193.140.101.246
# or have the .csshrc file associated with cssh.
# Copy 'sourcefile' to the union of cluster1 and cluster2
$ parpush sourcefile cluster1+cluster2:/tmp
# Copy 'sourcefile' to the intersection of cluster1 and cluster2
# i.e. to 'machine2'
$ parpush sourcefile cluster1*cluster2:/tmp
# Copy 'sourcefile' to the machines in cluster1 that don't belong to cluster2
# i.e. to 'machine1' and 'machine3'
$ parpush sourcefile cluster1-cluster2:/tmp
# Copy 'sourcefile' to file 'tfmachine1.txt' in 'machine1'
# and to 'tfmachine3.txt' in 'machine3'. The macro '@=' inside
# a path is replaced by the name of the machine
$ parpush sourcefile cluster1-cluster2:/tmp/tf@=.txt
# A more complicated formula:
$ parpush 'sourcefile' '(cluster1+cluster2)-num:/tmp'
# Though 'machine2' is an alias for 193.140.101.175, parpush doesn't
# consider them equal, so the file will still be transferred to
# 'machine2'
# Several cluster expressions may appear in one command.
# Send 'sourcefile' to machines in cluster1 but not in cluster2
# and store it in /tmp. Send it to machines in cluster2 but not in cluster1
# and store it in the home directory
$ parpush sourcefile cluster1-cluster2:/tmp cluster2-cluster1:
# Copy from remote machine 'machine1' the file 'file.txt' to
# all the machines in 'cluster1' other than 'machine1':
$ parpush machine1:file.txt cluster1-machine1:
# You can also transfer several files from different machines to
# some set of machines:
$ parpush 'machine1:file1.txt machine2:file2.txt' cluster1-machine1-machine2:
# Protect the source with single quotes. Clusters aren't allowed
# in the sourcefile section. A command like this gives an error:
$ parpush 'cluster1:file.txt' machine1+machine2:
# A combination of local and remote files can be sent
$ parpush 'localfile.txt machine4:remote.txt' cluster1-machine1-machine2:
# sends 'localfile.txt' from the local machine and 'remote.txt' from 'machine4'
# to machine3
# Globs can be used in the sourcefile argument:
$ parpush 'file* machine1:*.pl' cluster2-machine2:/tmp
# All the files matching the glob 'file*' in the local machine will be sent.
# Also those in machine1 matching the glob '*.pl'
$ parpush 'machine1:*.pl machine2:dir/' :/tmp
# All the files matching the glob '*.pl' in 'machine1' will be sent
# to the local machine. The directory 'dir/' in machine2 will also be sent
# to '/tmp/dir/' in the local machine.
$ parpush 'machine1:file.txt machine2:file.txt' machine3:/tmp/file_txt.@#
# The macro '@#' stands for the "source machine". Thus, in the example
# above file 'file.txt' in machine1 will be copied to file '/tmp/file_txt.machine1'
# in machine3. File 'file.txt' in machine2 will be copied to
# file '/tmp/file_txt.machine2' in machine3.
$ parpush 'machine1:file.txt machine2:file.txt' :/tmp/file_txt.@#
# Here the destination is the local machine: file 'file.txt' in machine1
# will be copied to file '/tmp/file_txt.machine1' in the local machine.
# File 'file.txt' in machine2 will be copied to
# file '/tmp/file_txt.machine2' in the local machine
INSTALLATION
Install Set::Scalar first. The program cssh (clusterssh) is not needed, but I recommend installing it. The installation then follows the traditional procedure. Issue the usual commands (or use cpan):
perl Makefile.PL
make
make test
make install
SETTING AUTOMATIC AUTHENTICATION
To use this script you have to set up automatic authentication via SSH between the source machine (your machine) and the destination machines. This section explains a simplified procedure.
SSH can authenticate users with public keys. Instead of authenticating the user with a password, the SSH server on the remote machine verifies a challenge signed by the user's private key against its copy of the user's public key. To achieve this automatic SSH authentication you have to:
Generate a key pair using the ssh-keygen utility. For example:
local.machine$ ssh-keygen -t rsa -N ''
The option -t selects the type of key you want to generate. There are three types of keys: rsa1, rsa and dsa. The -N option is followed by the passphrase. The -N '' setting indicates that no passphrase will be used. This is useful when used with key restrictions or when dealing with cron jobs, batch commands and automatic processing, which is the context for which this module was designed. If you still don't like having a private key without a passphrase, provide a passphrase and use ssh-agent to avoid the inconvenience of typing the passphrase each time. ssh-agent is a program you run once per login session and load your keys into. From that moment on, any ssh client will contact ssh-agent and no more passphrase typing will be needed.
By default, your identification will be saved in the file /home/user/.ssh/id_rsa. Your public key will be saved in /home/user/.ssh/id_rsa.pub.
Once you have generated a key pair, you must install the public key on the remote machine. To do so, append the public component of the key in /home/user/.ssh/id_rsa.pub to the file /home/user/.ssh/authorized_keys on the remote machine. If the ssh-copy-id script is available, you can do it using:
local.machine$ ssh-copy-id -i ~/.ssh/id_rsa.pub user@remote.machine
Alternatively you can write the following command:
$ ssh remote.machine "umask 077; cat >> .ssh/authorized_keys" < /home/user/.ssh/id_rsa.pub
The umask command is needed since the SSH server will refuse to read a /home/user/.ssh/authorized_keys file that has loose permissions.
Edit your local configuration file /home/user/.ssh/config (see man ssh_config in UNIX) and create a new section for GRID::Machine connections to that host. Here follows an example:
...
# A new section inside the config file:
# it will be used when writing a command like:
# $ ssh gridyum
Host gridyum
# My username in the remote machine
user my_login_in_the_remote_machine
# The actual name of the machine: by default the one provided in the
# command line
Hostname real.machine.name
# The port to use: by default 22
Port 2048
# The identity pair to use. By default ~/.ssh/id_rsa and ~/.ssh/id_dsa
IdentityFile /home/user/.ssh/yumid
# Useful to detect a broken network
BatchMode yes
# Useful when the home directory is shared across machines,
# to avoid warnings about changed host keys when connecting
# to local host
NoHostAuthenticationForLocalhost yes
# Another section ...
Host another.remote.machine an.alias.for.this.machine
user mylogin_there
...
This way you don't have to specify your login name on the remote machine even if it differs from your login name in the local machine, you don't have to specify the port if it isn't 22, etc. This is the recommended way to work with GRID::Machine. Avoid cluttering the constructor new.
Once the public key is installed on the server you should be able to authenticate using your private key:
$ ssh remote.machine
Linux remote.machine 2.6.15-1-686-smp #2 SMP Mon Mar 6 15:34:50 UTC 2006 i686
Last login: Sat Jul 7 13:34:00 2007 from local.machine
user@remote.machine:~$
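The effect of the umask 077 used in the manual authorized_keys command above can be tried locally without touching a real account. The following is only a sketch: the scratch directory created with mktemp stands in for ~/.ssh, and the key line is a made-up placeholder, not a real key.

```shell
#!/bin/sh
# Sketch: show what 'umask 077' does to a newly created file.
# The scratch directory stands in for ~/.ssh; nothing real is modified.
# (mktemp -d is widely available but not strictly POSIX.)
scratch=$(mktemp -d)

(
  umask 077   # files created from here on get no group/other permissions
  # Hypothetical placeholder line, not a usable public key
  echo 'ssh-rsa AAAA...example user@local.machine' >> "$scratch/authorized_keys"
)

# The first field of 'ls -l' is the permission string
perms=$(ls -l "$scratch/authorized_keys" | awk '{print $1}')
echo "$perms"

rm -rf "$scratch"
```

The printed permission string should begin with -rw-------, that is, read and write access for the owner only, which is the kind of tight permission the SSH server expects on authorized_keys.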
DESCRIPTION
parpush pushes files and directories across sets of remote machines.
Options
Valid options are:
--configfile file : Configuration file
--scpoptions : A string with the options for scp.
The default is no options, or '-r' if
sourcefile is a directory
--program : A string with the name of the program to use for secure copy.
The default is 'scp'
--processes : Maximum number of concurrent processes
--verbose
--xterm : runs cssh to the target machines
--help : this help
--Version
Cluster Syntax
parpush looks for a file named Cluster in the current directory, or for the ~/.csshrc file used by cssh:
$ cat Cluster
cluster1 = machine1 machine2 machine3
cluster2 = machine2 machine4 machine5
num = 193.140.101.175 193.140.101.246
See man cssh to find out how to describe a cluster in the ~/.csshrc file.
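To check which machines a given name stands for, the 'name = machine machine ...' lines of a Cluster file can be read with a few lines of awk. This is a minimal sketch, not part of parpush: cluster_members is a hypothetical helper, and it only understands the Cluster format shown above.

```shell
#!/bin/sh
# Sketch: print the machines of one cluster, one per line, from a file in
# the 'name = machine machine ...' format shown above.
# cluster_members is a hypothetical helper, not a parpush command.
cluster_members() {   # usage: cluster_members FILE NAME
  awk -v name="$2" '
    {
      eq = index($0, "=")
      if (eq == 0) next                 # skip lines without "="
      key = substr($0, 1, eq - 1)
      gsub(/[ \t]/, "", key)            # strip blanks around the name
      if (key != name) next
      n = split(substr($0, eq + 1), hosts, " ")
      for (i = 1; i <= n; i++) print hosts[i]
    }' "$1"
}

# Demo with a temporary copy of the example file
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
cluster1 = machine1 machine2 machine3
cluster2 = machine2 machine4 machine5
num = 193.140.101.175 193.140.101.246
EOF

cluster_members "$cfg" cluster2   # prints machine2, machine4, machine5
rm -f "$cfg"
```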
Cluster Expressions
s + t : union
s * t : intersection
s - t : difference
s % t : symmetric_difference
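These operators behave like ordinary set operations on machine names. As an illustration only, they can be modelled with sort and uniq. The helper names below are hypothetical, and parpush evaluates cluster expressions itself, so this sketch merely predicts which machines an expression selects.

```shell
#!/bin/sh
# Sketch: the four set operators modelled with sort and uniq.
# Each argument is a space-separated list of machine names.
# All helper names are hypothetical; they are not part of parpush.
members() { printf '%s\n' $1 | sort -u; }   # one name per line, no duplicates

set_union()        { { members "$1"; members "$2"; } | sort -u; }
set_intersection() { { members "$1"; members "$2"; } | sort | uniq -d; }
# Listing the second set twice makes every name in it appear at least
# twice, so 'uniq -u' keeps only names found in the first set alone.
set_difference()   { { members "$1"; members "$2"; members "$2"; } | sort | uniq -u; }
set_symdiff()      { { members "$1"; members "$2"; } | sort | uniq -u; }

cluster1='machine1 machine2 machine3'
cluster2='machine2 machine4 machine5'

set_union        "$cluster1" "$cluster2"   # cluster1+cluster2
set_intersection "$cluster1" "$cluster2"   # cluster1*cluster2: machine2
set_difference   "$cluster1" "$cluster2"   # cluster1-cluster2: machine1 machine3
set_symdiff      "$cluster1" "$cluster2"   # cluster1%cluster2
```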
Path Syntax. The @= macro
Inside a path the macro @= stands for the name of the current (destination) machine. Thus, the command:
$ parpush file.txt machine1+machine2:/tmp/@=.txt
copies file.txt to /tmp/machine1.txt in machine1 and to /tmp/machine2.txt in machine2.
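The substitution can be pictured as a plain text replacement applied once per destination machine. The sketch below is an assumption about the visible effect only, not parpush's implementation; expand_target is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: replace every '@=' in a path template with a machine name.
# expand_target is a hypothetical helper; it assumes machine names
# contain no '/' or '&', which sed would treat specially here.
expand_target() {   # usage: expand_target TEMPLATE MACHINE
  printf '%s\n' "$1" | sed "s/@=/$2/g"
}

for m in machine1 machine2; do
  # prints machine1:/tmp/machine1.txt, then machine2:/tmp/machine2.txt
  printf '%s:%s\n' "$m" "$(expand_target '/tmp/@=.txt' "$m")"
done
```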
Path Syntax. The @# macro
Inside a path the macro @# stands for the name of the source machine. Thus, the command:
$ parpush 'machine1:file.txt machine2:file.txt' :/tmp/file_txt.@#
copies file.txt in machine1 to /tmp/file_txt.machine1 in the local machine and the file with the same name in machine2 to the file /tmp/file_txt.machine2.
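Seen from the destination side, each 'machine:path' source yields its own destination name. The following sketch shows that mapping under the assumption that the machine name is the text before the first ':' in the source. expand_source is a hypothetical helper, not something parpush provides.

```shell
#!/bin/sh
# Sketch: compute the destination path for a 'machine:path' source
# when the destination template contains '@#'.
# expand_source is a hypothetical helper for illustration only.
expand_source() {   # usage: expand_source TEMPLATE MACHINE:PATH
  src_machine=${2%%:*}    # text before the first ':'
  printf '%s\n' "$1" | sed "s/@#/$src_machine/g"
}

for src in machine1:file.txt machine2:file.txt; do
  printf '%s -> %s\n' "$src" "$(expand_source '/tmp/file_txt.@#' "$src")"
done
```

Because each source machine produces a different destination name, files with the same name on different machines do not overwrite each other.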
Source Syntax
If your source is a single file or directory, no quoting is needed. If you are going to send several files you must protect them inside single quotes, as in:
$ parpush 'machine1:file1.txt machine2:file2.txt' cluster1-machine1-machine2:
Only individual machine names may prefix a file in the source argument. Cluster names and cluster expressions aren't supported there.
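The rule can be illustrated by classifying each word of a quoted source list as local, remote, or invalid. This is a sketch of the rule as described here, not of parpush's actual checks or error messages: classify_source and the hard-coded cluster list are hypothetical.

```shell
#!/bin/sh
# Sketch: classify each word of a source list as local or remote and
# reject cluster names used as a source prefix.
# classify_source and the 'clusters' list are hypothetical illustrations.
clusters='cluster1 cluster2 num'

classify_source() {   # usage: classify_source WORD
  case "$1" in
    *:*)
      host=${1%%:*}                      # text before the first ':'
      for c in $clusters; do
        if [ "$host" = "$c" ]; then
          echo "error: cluster '$host' not allowed in source"
          return 1
        fi
      done
      echo "remote $host ${1#*:}"
      ;;
    *)
      echo "local $1"
      ;;
  esac
}

for word in localfile.txt machine4:remote.txt cluster1:file.txt; do
  classify_source "$word" || true        # keep going after a rejected word
done
```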
SEE ALSO
Cluster ssh: cssh http://sourceforge.net/projects/clusterssh/
Project C3 http://www.csm.ornl.gov/torc/C3/
AUTHOR
Casiano Rodriguez-Leon <casiano.rodriguez.leon@gmail.com>
COPYRIGHT AND LICENSE
Copyright (C) 2009-2009 by Casiano Rodriguez-Leon
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.