NAME

PDL::FAQ - Frequently asked questions about PDL

DESCRIPTION

This is version 0.4 of the PDL FAQ, a collection of frequently asked questions about PDL - the Perl Data Language.

This FAQ was generated on 29.7.97.

Current maintainer: Christian Soeller (csoelle@sghms.ac.uk).

You can find the latest version of this document at http://www.aao.gov.au/local/www/kgb/perldl/faq.html. This FAQ will be posted monthly to the PDL mailing list perldl@jach.hawaii.edu.

This is still an early version of the PDL FAQ. As such it is almost certainly incomplete and may be unclear in parts. You are explicitly encouraged to let us know about questions which you think should be answered in this document but currently aren't. Similarly, if you think parts of this document are unclear, please let us know. Send your comments to the PDL mailing list at perldl@jach.hawaii.edu (preferably) or to the FAQ maintainer Christian Soeller (csoelle@sghms.ac.uk).

Some questions and answers in this document are related to features of the current beta/alpha versions of PDL. To point this out, these sections are marked with the strings '[!beta!]' or '[!alpha!]'. These reflect development that is under way and will hopefully soon culminate in the release of PDL 2.0. Right now we are quite confident about the stability of the latest alpha versions (1.9XXX), so we suggest you have a look at those versions if you need features which are not yet supported in the released versions (see below for directions on how to get alpha distributions).

New ([+]) and changed ([!]) questions and answers in this release are marked.

GENERAL QUESTIONS

What is PDL ?

PDL stands for Perl Data Language. To say it with the words of Karl Glazebrook, initiator of the PDL project:

The PDL concept is to give standard perl5 the ability
to COMPACTLY store and SPEEDILY manipulate the large
N-dimensional data sets which are the bread and butter
of scientific computing. e.g. $a=$b+$c can add two
2048x2048 images in only a fraction of a second.

It is hoped to eventually provide tons of useful
functionality for scientific and numeric analysis.

For readers familiar with other scientific data evaluation packages it may be helpful to add that PDL is in many respects similar to IDL, MATLAB and similar packages. However, it tries to improve on a number of issues which were perceived (by the authors of PDL) as shortcomings of those existing packages.

Why yet another Data Language ?

There are actually several reasons and everyone should decide for himself which are the most important ones:

  • PDL is "free software". The authors of PDL think that this concept has several advantages: everyone has access to the sources -> better debugging, easily adaptable to your own needs, extensible for your purposes, etc...

  • PDL is based on a powerful and well designed scripting language: Perl. In contrast to other scientific/numeric data analysis languages, PDL builds on the language features of a proven language. It did not grow into existence from scratch, with control structures being defined while features were added during development (a history that has left most existing packages of similar scope with languages that often appear clumsy and badly planned).

  • Using Perl as the basis a PDL programmer has all the powerful features of Perl at his hand, right from the start. This includes regular expressions, associative arrays (hashes), well designed interfaces to the operating system, network, etc. Experience has shown that even in mainly numerically oriented programming it is often extremely handy if you have easy access to powerful semi-numerical or completely non-numerical functionality as well. For example, you might want to offer the results of a complicated computation as a server process to other processes on the network, perhaps directly accepting input from other processes on the network. Using Perl and existing Perl extension packages things like this are no problem at all (and it all will fit into your "PDL script").

  • Extremely easy extensibility and interoperability as PDL is a Perl extension; development support for Perl extensions is an integral part of Perl and there are already numerous extensions to standard Perl freely available on the network.

  • Integral language features of Perl (regular expressions, hashes, object modules) immensely facilitated development and implementation of key concepts of PDL. One of the most striking examples for this point is probably PDL::PP (see below), a code generator/parser/pre-processor that generates PDL functions from concise descriptions.

  • None of the existing DLs follow the Perl language rules, which the authors firmly believe in:

    • TIMTOWTDI: There is more than one way to do it. Minimalist languages are interesting for computer scientists, but for users, a little bit of redundancy makes things wildly easier to cope with and allows individual programming styles - just as people speak in different ways. For many people this will undoubtedly be a reason to avoid PDL ;)

    • Simple things are simple, complicated things possible: Things that are often done should be easy to do in the language, whereas seldom done things shouldn't be too cumbersome.

    All existing languages violate at least one of these rules.

  • As a project for the future PDL should be able to use super computer features, e.g. vector capabilities/parallel processing. This will probably be achieved by having PDL::PP ([!alpha!], see below) generate appropriate code on such architectures to exploit these features.

  • [ fill in your personal 111 favourite reasons here...]

What is PDL good for ?

Just in case you do not yet know what the main features of PDL are and what one could do with them, here is a (necessarily selective) list of key features:

PDL is well suited for matrix computations, general handling of multidimensional data, image processing, general scientific computation, numerical applications. It supports I/O for many popular image and data formats, 1D (line plots), 2D (images) and 3D (volume visualisation, surface plots via OpenGL/MesaGL) graphics display capabilities and implements lots of numerical and semi-numerical algorithms.

[!alpha!] Some of these features (image I/O, 3D graphics (via OpenGL/MesaGL), matrix library) are currently in alpha testing.

What is the connection between PDL and Perl ?

PDL is a Perl5 extension package. As such it needs an existing Perl5 installation (see below) to run. Furthermore, much of PDL is written in perl (+ some core functionality that is written in C). PDL programs are (syntactically) just perl scripts that happen to use some of the functionality implemented by the package "PDL".

What do I need to run PDL on my machine ?

Since PDL is just a Perl package you need first of all an installation of Perl on your machine. As of this writing PDL requires version 5.004 or higher of Perl. More information on where and how to get a Perl installation can be found at the Perl home page http://www.perl.com and at many CPAN sites (if you do not know what CPAN is check the answer to the next question).

Furthermore, you need the PDL package which will be installed as an extension within your Perl installation. See below for directions on how and where to get the latest PDL distribution.

Where do I get it?

PDL is available as a source distribution in the Comprehensive Perl Archive Network, or CPAN. This archive contains not only the PDL distribution but also just about everything else that is Perl-related. CPAN is mirrored by dozens of sites all over the world. The main site is ftp://ftp.funet.fi. You can find a more local CPAN site by getting the file /pub/languages/perl/CPAN/MIRRORS from ftp://ftp.funet.fi. Alternatively, you can point your Web browser at http://www.perl.com and use its CPAN multiplex service. Within CPAN you find the latest released version of PDL in the directory CPAN/modules/by-module/PDL/. Another site that has the latest PDL distribution and the latest beta versions is http://www.aao.gov.au/local/www/kgb/perldl. There are currently no mirrors of this site in other parts of the world. This will hopefully change soon.

What machines does PDL run on, then ?

Ideally, PDL should run on just about every machine for which a port of Perl5 is available that supports XSUBs and the package ExtUtils::MakeMaker. You also need a C compiler on your machine to compile those core routines that are written in C or XS. In practice, you might run into problems if you try to compile PDL on a platform it has never been tested on before. A list of platforms on which PDL has been successfully tested is available at http://www.aao.gov.au/local/www/kgb/perldl/ports.html. If you don't have a compiler you can check if a binary distribution for your platform is available (we haven't yet got round to making binary versions/bundles available but it is definitely on the TODO list) at the PDL home site located at http://www.aao.gov.au/local/www/kgb/perldl.

If you can (or cannot) get PDL working on a new (previously unsupported) platform we would like to hear about it. Please report your success/failure to the PDL mailing list at perldl@jach.hawaii.edu. We will do our best to assist you in porting PDL to a new system.

What do I have to pay to get PDL?

We are delighted to be able to give you the nicest possible answer to a question like this: PDL is *free software* and all sources are publicly available. Still, there are some copyrights to comply with. So please try to be as nice as we (the PDL authors) are and comply with them.

Oh, before you think it is *completely* free: you have to invest some time to pull the distribution from the net, compile and install it and (maybe) read the manuals.

In the future, we hope to be able to supply bundles/binaries for a number of popular architectures. However, as of this writing you will have to work out how and where to compile the package yourself.

Are there other PDL information sources on the internet?

First of all, for all purely Perl-related questions (see above why we often talk about Perl in the PDL FAQ) there are tons of sources on the net. A good point to start is http://www.perl.com.

The PDL home site can be accessed by pointing your web browser to http://www.aao.gov.au/local/www/kgb/perldl. It has tons of goodies for anyone interested in PDL:

  • PDL distributions

  • Online documentation

  • Pointers to an HTML archive of the PDL mailing lists

  • A list of platforms on which PDL has been successfully tested.

  • News about recently added features, ported libraries, etc.

  • Names of the current pumpkin holders for the different PDL modules (if you want to know what that means you had better have a look at the web pages).

If you are interested in PDL in general you can join the PDL mailing list perldl@jach.hawaii.edu. This is a forum to discuss programming issues in PDL, report bugs, seek assistance with PDL related problems, etc. To subscribe, send a message to perldl-request@jach.hawaii.edu containing a string in the following format:

subscribe me@my.email.address

where you should replace the string me@my.email.address with your email address. Past messages can be retrieved in digest format by anonymous ftp from ftp://ftp.jach.hawaii.edu/pub/ukirt/frossie/pdlp/. A searchable archive and a hypertext version of the traffic on this list can be found at http://www.xray.mpe.mpg.de/mailing-lists/perldl/.

If you are interested in all the technical details of the ongoing PDL development you can join the PDL developers mailing list pdl-porters@jach.hawaii.edu. To subscribe, send a message to pdl-porters-request@jach.hawaii.edu containing a string in the following format:

subscribe me@my.email.address

where you should replace the string me@my.email.address with your email address. Past messages can be retrieved in digest format by anonymous ftp from ftp://ftp.jach.hawaii.edu/pub/ukirt/frossie/pdlp/. A searchable archive and a hypertext version of the traffic on this list can be found at http://www.xray.mpe.mpg.de/mailing-lists/pdl-porters/.

Crossposting between these lists should be avoided unless there is a very good reason for doing that.

What is the current version of PDL ?

As of this writing (FAQ version 0.4 of 29.7.97) the latest released version is 1.11. Currently in alpha test is 1.93_03. For those of you who are really audacious (and like to run into bugs) directions on how to get the current alpha versions of the latest "hot" PDL modules can be found at http://www.aao.gov.au/local/www/kgb/perldl/alpha.html.

I am looking for a package to do XXX in PDL. Where shall I look for it?

A good place to start is again http://www.aao.gov.au/local/www/kgb/perldl. We hope to get round to compiling a list of packages that have already been/are in the process of being interfaced to PDL RSN (you know what that means...). This information will be accessible through the PDL home site.

Currently, the main PDL related information source is the PDL mailing list at perldl@jach.hawaii.edu (but see also the question on information sources). It is devoted to information exchange about all general issues related to PDL. If you want to ask a development related question there is the PDL development mailing list pdl-porters@jach.hawaii.edu. Check the question about information sources for subscription directions and locations of archives of past/recent messages.

Before you post your questions to the list(s) make sure

  • that your problem has not already been dealt with in another section of this FAQ.

  • that you have read the manual(s) (RTFM!!).

  • that your problem is not a general perl programming question, in which case you had better check the perl FAQ (available at http://www.perl.com/perl/faq) and/or ask the question in the relevant perl newsgroups/mailing lists.

[!] [!alpha!] There is this great XXX package on the net. Has it already been interfaced to PDL or how can I do it?

Check on PDL's home site http://www.aao.gov.au/local/www/kgb/perldl if the package in question has already been ported/interfaced to PDL. How to interface a new package to PDL is explained in PDL::PP (see below if you don't know what PDL::PP is). Note that people willing to write interfaces for new packages should target them toward the upcoming beta versions since the internals of PDL have changed a lot since the latest released version (1.11).

I want to contribute to the further development of PDL. How can I help?

If you have a certain project in mind you should check if somebody else is already working on it or if you could benefit from existing modules. Do so by posting your planned project to the PDL developers mailing list at pdl-porters@jach.hawaii.edu. To subscribe, send a message to pdl-porters-request@jach.hawaii.edu containing a string in the following format:

subscribe me@my.email.address

where you should replace the string me@my.email.address with your email address. You can also read past and current mails in the searchable hypertext version of the mailing list at http://www.xray.mpe.mpg.de/mailing-lists/pdl-porters/. We are always looking for people to write code and/or documentation ;).

I think I have found a bug in the current version of PDL. What shall I do?

First, make sure that the bug/problem you came across has not already been dealt with somewhere else in this FAQ. Secondly, you can check the searchable archive of the PDL mailing list (see the question on information sources above for its location) to see whether this bug has already been discussed. If you still haven't found any explanations you can post a bug report to perldl@jach.hawaii.edu.

TECHNICAL QUESTIONS

What is perldl?

Sometimes perldl is used as a synonym for PDL. Strictly speaking, however, the name perldl is reserved for the little shell that comes with the PDL distribution and is supposed to be used for the interactive prototyping of PDL scripts. For details check the perldl man page.

I want to access the third element of a pdl but $a[2] doesn't work ?!

See the answer to the next question for why the normal perl array syntax doesn't work for pdls.

The docs say pdls are some kind of array. But why doesn't the perl array syntax work with pdls then ?

Ok, you are right in a way. The docs say that pdls can be thought of as arrays. More specifically, they say (PDL):

I find when using perlDL it is most useful to think of
standard perl @x variables as "lists" of generic "things"
and PDL variables like $x as "arrays" which can be contained
in lists or hashes.

So, while pdls can be thought of as some kind of multi-dimensional array they are not arrays in the perl sense. Rather, from the point of view of perl they are some special class (which is currently implemented as an opaque pointer to some stuff in memory) and therefore need special functions (or 'methods' if you are using the OO version) to access individual elements or a range of elements. The functions/methods to check are at/sec (see PDL manpage) or in the new alpha versions [!alpha!] the powerful slice function and friends (see PDL::Indexing).

Finally, to confuse you completely, you can have perl arrays of pdls, e.g. $spec[3] can refer to a pdl representing, e.g., a spectrum, where $spec[3] is the fourth element of the perl list (or array ;) @spec. This may be confusing but is very useful !
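The distinction can be made concrete with a minimal sketch. It assumes a PDL version that provides both the at and slice methods (slice is described above as an alpha feature); the variable names and values are purely illustrative.

```perl
use strict;
use warnings;
use PDL;

# A one-dimensional pdl with four elements.
my $x = pdl [10, 20, 30, 40];

# at() returns a single element as an ordinary Perl scalar;
# this is the working counterpart of the failing $x[2].
my $third = $x->at(2);

# slice() returns a pdl covering a range of elements (indices 1..2 here).
my $middle = $x->slice('1:2');

# An ordinary Perl array whose entries happen to be pdls.
my @spec = (pdl([1, 2]), pdl([3, 4]), pdl([5, 6]), pdl([7, 8]));
my $fourth = $spec[3];          # Perl array syntax selects the pdl ...
my $value  = $fourth->at(0);    # ... and at() looks inside it.

print "third  = $third\n";
print "middle = $middle\n";
print "value  = $value\n";
```

Note that the square brackets only ever index the Perl array @spec; anything inside a pdl is reached through a method call.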

[!alpha!] How do I get online help for PDL?

This is currently a subject of ongoing development. We hope to be able to come up with an online help feature soon. A refcard and searchable index are planned as well. Support for these features will be built into the perldl shell.

[!alpha!] What on earth is this dataflow stuff ?

Dataflow is an experimental project that you don't need to concern yourself with (it should not interfere with your usual programming). However, if you want to know, have a look at PDL::Dataflow in the current alpha distribution. There are applications which will benefit from this feature (and it is already at work behind the scenes in the alpha versions).

[!alpha!] There is this strange pre-processor package (PDL::PP). Do I have to know about it?

PDL::PP is used to compile very concise definitions into XSUB routines implemented in C that can easily be called from PDL and which automatically support threading, dataflow and other things without you having to worry about it.

For further details check PDL::PP.

Sometimes I am getting these strange results when using inplace operations ?

This question is related to the inplace function. From the documentation (see PDL manpage):

Most functions, e.g. log(), return a result which is
a transformation of their argument. This makes for
good programming practice. However many operations can
be done "in-place" and this may be required when large
arrays are in use and memory is at a premium. For these
circumstances the operator inplace() is provided which
prevents the extra copy and allows the argument to be
modified. e.g.:

$x = log($array);          # $array unaffected
log( inplace($bigarray) ); # $bigarray changed in situ

And also from the doc !!:

Obviously when used with some functions which can
not be applied in situ (e.g. convolve()) unexpected
effects may occur!

Check the list of PDL functions at the end of PDL.pod which points out inplace-safe functions.
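As a small, hedged illustration of the difference, the sketch below uses sqrt rather than log so that the results are easy to check by eye. It assumes that sqrt is among the inplace-safe functions in your PDL version and that the method-call form of inplace is available; the variable names are illustrative.

```perl
use strict;
use warnings;
use PDL;

# Ordinary call: the result is a new pdl, the argument is left alone.
my $array = pdl [4, 9, 16];
my $roots = sqrt($array);

# In-place call: the flagged pdl itself is overwritten with the result.
my $bigarray = pdl [4, 9, 16];
$bigarray->inplace->sqrt;

print "array    = $array\n";
print "roots    = $roots\n";
print "bigarray = $bigarray\n";
```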

[!alpha!] What is this strange usage of the string concatenation operator .= in PDL scripts ?

See next question on assignment in PDL.

[!alpha!] Why are there two different kinds of assignment in PDL ?

This is caused by the fact that currently the assignment operator = allows only restricted overloading. For some purposes of PDL (new indexing features, dataflow) it turned out to be necessary to have more control over the overloading of an assignment operator. Therefore, current alpha versions of PDL use the operator .= for certain types of assignments. For details see the documentation about indexing/threading and dataflow that comes with those versions of PDL.
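The following sketch shows one situation where the two assignments differ. It assumes a PDL version with the slice method and the overloaded .= operator; the variable names are illustrative.

```perl
use strict;
use warnings;
use PDL;

my $x = zeroes(5);                 # five elements, all zero

# A slice refers to elements 1..3 of $x.
my $part = $x->slice('1:3');

# .= assigns *into* the sliced elements, so the parent $x changes too.
$part .= 7;
print "x after .= : $x\n";

# Plain = only rebinds the Perl variable; $x is not touched.
$part = 42;
print "x after =  : $x\n";
```

After the plain assignment $part is simply the Perl number 42 and no longer has any connection to $x.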

[!] What happens when I have several references to the same PDL object in different variables (cloning, etc?) ?

Piddles behave like perl references in many respects. So when you say

$a = pdl [0,1,2,3];
$b = $a;

then both $b and $a point to the same object, e.g. then saying

$b++;

will *not* create a copy of the original piddle but just increment in place, of which you can convince yourself by saying

print $a;
[1 2 3 4]

This should not be mistaken for dataflow which connects several *different* objects so that data changes are propagated between the so linked piddles (though, under certain circumstances, dataflown piddles can share physically the same data).

It is important to keep the "reference nature" of piddles in mind when passing piddles into subroutines. If you modify the input pdls you modify the original argument, not a copy of it. This is different from some other array processing languages but makes for very efficient passing of piddles between subroutines. If you do not want to modify the original argument but rather a copy of it just create a copy explicitly:

sub myfunc {                    # silly example function
   my $pdl = shift;
   if ($pdl->is_inplace)        # modify inplace if *explicitly* requested
      {$pdl->set_inplace(0)}
   else                         # modify a copy by default
      {$pdl = $pdl->copy}
   $pdl->set(0,0);
   return $pdl;
}
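The same point can be checked outside a subroutine. This is a minimal sketch using only the copy, set and at methods; the variable names are illustrative.

```perl
use strict;
use warnings;
use PDL;

my $orig  = pdl [0, 1, 2, 3];
my $alias = $orig;          # second Perl variable, same PDL object
my $copy  = $orig->copy;    # independent object with the same contents

$alias->set(0, 99);         # modifies the shared object

print "orig = $orig\n";     # reflects the change made through $alias
print "copy = $copy\n";     # still holds the original values
```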

[+] [!alpha!] What I/O formats are supported by PDL ?

The current versions of PDL already support quite a number of different I/O formats. However, it is not always obvious which module implements which formats. To help you find the right module for the format you require, here is a short list of the currently supported I/O formats and a hint in which module to find the implementation:

raw format

A home brew fast raw (binary) I/O format for PDL is implemented by the FastRaw module.

a more generic raw format

The FlexRaw module implements generic methods for the input and output of `raw' data arrays. In particular, it is designed to read output from FORTRAN 77 UNFORMATTED files and the low-level C write function, even if the files are compressed or gzipped.

It is possible that the FastRaw functionality will be included in the FlexRaw module at some time in the future.

FITS

FITS I/O is implemented by the wfits/rfits functions in PDL::IO::Misc.

ASCII

ASCII file I/O in various formats can be achieved by using the rcols and rgrep functions, also in PDL::IO::Misc.

image formats (TIFF, GIF, JPEG, etc)

PDL::IO::Pic implements an interface to the netpbm/pbm+ filters to read/write several popular image formats; also supported is output of image sequences as MPEG movies.

NetCDF

On CPAN you can find the PDL-NetCDF module that works with the current released version of PDL 1.11. Some minor modifications are required if you want to use this module with the current alpha versions.

For further details consult the documentation in the individual modules.
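As one hedged example, a round trip through the raw format might look like the sketch below. It assumes that the FastRaw module is installed as PDL::IO::FastRaw and exports writefraw and readfraw; the file name is hypothetical and a temporary directory is used so that nothing is left behind.

```perl
use strict;
use warnings;
use PDL;
use PDL::IO::FastRaw;
use File::Temp qw(tempdir);
use File::Spec;

# Temporary directory that is removed when the script exits.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'example.raw');   # hypothetical name

my $data = sequence(4, 3);       # 4x3 pdl holding 0..11

writefraw($data, $file);         # writes the data plus a small header file
my $back = readfraw($file);      # reads it back into a new pdl

my @dims = $back->dims;
print "dims: @dims\n";
print "round trip ok\n" if all($back == $data);
```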

[+] [!alpha!] What is a null pdl ?

null is a special token for 'empty piddle'. A null pdl can be used to flag to a PDL function that it should create an appropriately sized and typed piddle. Null piddles can be used in places where a PDL function expects an output or temporary argument. Output and temporary arguments are flagged in the signature of a PDL function with the [o] and [t] qualifiers (see next question if you don't know what the signature of a PDL function is). For example, you can invoke the sumover function as follows:

sumover $a, $b=null;

which is equivalent to

$b = sumover $a;

If this seems still a bit murky check PDL::Indexing and PDL::PP for details about calling conventions, the signature and threading (see also below).

[+] [!alpha!] What is the signature of a PDL function ?

The signature of a function is an important concept in PDL. Many (but not all) PDL functions have a signature which specifies the arguments and their (minimal) dimensionality. As an example, look at the signature of the maximum function:

'a(n); [o] b;'

This says that maximum takes two arguments, the first of which is (at least) one-dimensional while the second one is zero-dimensional and an output argument (flagged by the [o] qualifier). If the function is called with pdls of higher dimension the function will be repeatedly called with slices of these pdls of appropriate dimension (this is called threading in PDL).

For details and further explanations consult PDL::Indexing and PDL::PP.
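To make the signature concrete, here is a small sketch that calls maximum and sumover on a two-dimensional pdl. It assumes that both functions are exported by a plain "use PDL" and that sumover accepts a null pdl as its output argument; the data values are illustrative.

```perl
use strict;
use warnings;
use PDL;

# Two rows of three elements: the first dimension has size 3, the second size 2.
my $m = pdl [[1, 5, 3],
             [9, 2, 4]];

# maximum consumes one dimension ('a(n)'), so it is applied once per row
# and the zero-dimensional results are collected into a 2-element pdl.
my $max = $m->maximum;

# The output argument may also be passed explicitly as a null pdl,
# which the function then sizes and types itself.
my $sum = null;
sumover($m, $sum);

print "max per row: $max\n";
print "sum per row: $sum\n";
```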

PDL JARGON

[!alpha!] Oops, what is threading (is PDL a newsreader) ?

In the context of PDL threading has a different meaning from what you would probably normally associate with the term. Here, it denotes a feature of PDL that can be loosely defined as an implicit looping facility. For details check the PDL::Indexing manpage.
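A minimal sketch of this implicit looping, using nothing beyond overloaded arithmetic (the data values are illustrative): adding a one-dimensional pdl to a two-dimensional one applies the addition to every row without an explicit loop.

```perl
use strict;
use warnings;
use PDL;

my $m = pdl [[1, 5, 3],
             [9, 2, 4]];        # two rows of three elements
my $v = pdl [10, 20, 30];       # a single row of three elements

# No explicit loop: $v is added to each row of $m in turn.
my $result = $m + $v;

print $result, "\n";
```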

What is a piddle (;) ?

Well, PDL scalar variables (which are instances of a particular class of perl objects, i.e. blessed thingies (see perlfaq)) are in common PDL parlance often called piddles (for example, check the mailing list archives). Err, clear? If not, simply use the term piddle when you refer to a PDL variable (an instance of a PDL object as you might remember) regardless of what actual data the PDL variable contains.

CHANGES

0.4

  • use of perl5.004 is now required

  • PDL I/O formats

  • piddles behave like perl references

  • null pdls and output arguments

  • signature

0.3

  • questions about pdls and perl array syntax

  • added requirement for C compiler in answer to 'what machines...' question

  • PDL jargon section

  • piddles

0.2

  • upgraded released/alpha version numbers

  • added another WYANDL reason

  • split into perldl/pdl-porters mailing lists

0.1

  • initial revision

BUGS

If you find any inaccuracies in this document (or dysfunctional URLs) please report them to the perldl mailing list perldl@jach.hawaii.edu or to the current FAQ maintainer Christian Soeller (csoelle@sghms.ac.uk).

ACKNOWLEDGEMENTS

Achim Bohnet (ach@mpe.mpg.de) for suggesting CoolHTML as a prettypodder and various other improvements.

AUTHOR & COPYRIGHT

This document emerged from a joint effort of several PDL developers (Karl Glazebrook (kgb@aaocbn1.aao.GOV.AU), Tuomas J. Lukka (lukka@husc.harvard.edu), Christian Soeller (csoelle@sghms.ac.uk)) to compile a list of the most frequently asked questions about PDL with answers. Permission is granted for verbatim copying (and formatting) of this material as part of PDL. Permission is explicitly not granted for distribution in book or any corresponding form. Email the current FAQ maintainer Christian Soeller (csoelle@sghms.ac.uk) or ask on the PDL mailing list perldl@jach.hawaii.edu if you are unclear.