NAME
Inline::Awk - Add awk code to your Perl programs.
SYNOPSIS
Call an awk function from a Perl program:
use Inline AWK;
hello("awk");
__END__
__AWK__
function hello(str) {
print "Hello " str
}
Or, call an entire awk program using the awk() function:
use Inline AWK;
awk(); # operates on @ARGV by default
__END__
__AWK__
# Count the number of lines in a file
END { print NR }
DESCRIPTION
The Inline::Awk module allows you to include awk code in your Perl program. You can call awk functions or entire programs.
Inline::Awk works by converting awk code into Perl code using the a2p utility, which came as standard with Perl up to version 5.20 (from Perl 5.22 it is distributed separately on CPAN as App::a2p). This means that you don't require awk to use the Inline::Awk module.
Here is an example of how you would incorporate some awk functions into a Perl program:
use Inline AWK;
$num = 5;
$str = 'ciao';
print square($num), "\n";
print echo($str), "\n";
print "Now, back to our normal program.\n";
__END__
__AWK__
function square(num) {
return num * num
}
function echo(str) {
return str " " str
}
You can call an awk program via the awk() function. Here is a simple version of the Unix utility wc, which counts the number of lines, words and characters in a file:
use Inline AWK;
awk();
__END__
__AWK__
# Simple minded wc
BEGIN {
file = ARGV[1]
}
{
words += NF
chars += length($0) +1 # +2 in DOS
}
END {
printf("%7d%8d%8d %s\n", NR, words, chars, file)
}
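For comparison, a hand-written Perl equivalent of the awk program above might look like the following. This is only a sketch: it is not the code that a2p generates, and the wc_counts helper is a hypothetical name introduced for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Inline::Awk: returns the line, word
# and character counts for one file, mirroring the awk program above.
sub wc_counts {
    my ($file) = @_;
    my ($lines, $words, $chars) = (0, 0, 0);

    open my $fh, '<', $file or die "Can't open $file: $!";
    while (my $line = <$fh>) {
        chomp $line;
        $lines++;                          # awk: NR
        my @fields = split ' ', $line;     # awk: default field splitting
        $words += @fields;                 # awk: words += NF
        $chars += length($line) + 1;       # awk: length($0) + 1
    }
    close $fh;

    return ($lines, $words, $chars);
}

# Only report when a file name was given on the command line.
if (@ARGV) {
    my $file = $ARGV[0];
    printf "%7d%8d%8d %s\n", wc_counts($file), $file;
}
```

The awk version is shorter because the read loop, the field splitting and the NR counter are all implicit.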
awk()
The awk() function is imported into your Inline::Awk program by default. It allows you to run Inline::Awk code as a program and to pass arguments to it.
Say, for example, that you have an awk program called parsefile.awk that is normally run like this:
awk -f parsefile.awk type=1 example.ini
If you then turned parsefile.awk into an Inline::Awk program (perhaps by using the a2a utility in the distro), you could run the code from within a Perl program as follows:
awk('type=1', 'example.ini');
If you are using -w or warnings in your Perl program, you should quote any literal string in the variable assignment that you pass:
awk('type=ini', 'example.ini'); # gives a warning with -w
awk('type="ini"', 'example.ini'); # no warning
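As background, awk treats command-line operands of the form var=value as variable assignments rather than file names. The sketch below shows one way such arguments could be told apart from file names in plain Perl. It is not how Inline::Awk or a2p handles them: the split_awk_args helper and its variable-name pattern are assumptions made for illustration, and it ignores the fact that awk applies each assignment at the point where it appears among the files.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Inline::Awk: separates awk-style
# "var=value" operands from file names.
sub split_awk_args {
    my (%vars, @files);
    for my $arg (@_) {
        if ($arg =~ /^([A-Za-z_][A-Za-z_0-9]*)=(.*)$/s) {
            my ($name, $value) = ($1, $2);
            $value =~ s/^"(.*)"$/$1/s;    # drop optional surrounding quotes
            $vars{$name} = $value;
        }
        else {
            push @files, $arg;
        }
    }
    return (\%vars, \@files);
}

my ($vars, $files) = split_awk_args('type="ini"', 'example.ini');
print "type = $vars->{type}\n";    # the value with the quotes removed
print "files = @$files\n";         # the remaining file names
```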
The default action of an awk program is to loop over the files that it is passed as arguments. Therefore, the awk() function without arguments is equivalent to inserting the following code into your Perl program:
while (<>) {
# Converted awk code here
}
As usual, the empty diamond operator, <>, will operate on @ARGV, shifting off the elements until it is empty. Therefore, @ARGV will be cleared after you call awk(). However, awk() creates a local copy of any arguments that it receives, so you can avoid clearing @ARGV by passing it as an argument:
awk(@ARGV);
# Do something else with @ARGV in Perl
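The effect of a local copy can be sketched in plain Perl. The count_lines function below is a hypothetical stand-in for awk(), written only to show the technique; it is not taken from the Inline::Awk source.

```perl
use strict;
use warnings;

# Hypothetical stand-in for awk(): counts input lines with the diamond
# operator while leaving the caller's @ARGV untouched.
sub count_lines {
    local @ARGV = @_;    # <> consumes this temporary copy only
    my $count = 0;
    while (defined(my $line = <>)) {
        $count++;
    }
    return $count;
}

# Only run when file names were given on the command line.
if (@ARGV) {
    my $lines = count_lines(@ARGV);
    print "$lines line(s) in: @ARGV\n";    # @ARGV still lists the files
}
```

Because local restores the previous value of @ARGV when the sub returns, the caller can go on to process the same file list.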
An awk program doesn't loop over a file if it contains a BEGIN block only:
use Inline AWK;
awk();
__END__
__AWK__
BEGIN { print "Hello, world!" }
As with all Perl functions, the return value of awk() is the last expression evaluated. This is an unintentional feature but you may find a use for it.
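The implicit return rule can be seen in a small plain-Perl sketch. The line_total sub is a hypothetical example and has nothing to do with the code that a2p generates. Note that perlsub describes the returned value as unspecified when the last statement of a sub is a loop, so the value you get back from awk() may not always be meaningful.

```perl
use strict;
use warnings;

# Hypothetical example: there is no explicit "return", so the sub
# yields the value of its final expression.
sub line_total {
    my $total = 0;
    $total += $_ for @_;
    $total;              # last expression evaluated
}

my $result = line_total(3, 4, 5);
print "$result\n";       # prints 12
```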
If your program only has awk functions and no awk program you can ignore awk(). However, it is still imported into your Perl program.
HOW TO USE Inline::Awk
You can use Inline::Awk in any of the following ways. See also the Inline documentation:
Method 1 (the standard method):
use Inline AWK;
# Call the awk code
__END__
__AWK__
# awk code here
Method 2 (for simple code):
use Inline AWK => "# awk code here";
Method 3 (requires Inline::Files):
use Inline::Files;
use Inline AWK;
# Call the awk code
__AWK__
# awk code here
Note that any of the following use declarations is valid:
use Inline awk;
use Inline Awk;
use Inline AWK;
However, they should be matched by a corresponding data section:
__awk__
__Awk__
__AWK__
HOW Inline::Awk works
Inline::Awk is based on the same framework that underlies all of the Inline:: modules. This is described in detail in the Inline-API document.
Inline::Awk works by filtering awk code through the Perl utility a2p. The a2p utility converts awk code to Perl code using a parser written in C and YACC. Inline::Awk pre- and post-processes the code going through a2p to obtain a result that is as close as possible to the output of a real awk compiler. However, it doesn't always get it completely right; see BUGS.
Nevertheless, Inline::Awk can compile and run 130 of the code examples and programs in "The AWK Programming Language" and produce the same results as awk, mawk or gawk. It can run an additional 20 programs from the book with only minor modifications to the awk code. See the regression test at: http://homepage.eircom.net/~jmcnamara/perl/iawk_regtest_0.03.tar.gz
BUGS
While a2p does a very good job of converting awk code to Perl, it was never intended for the use that Inline::Awk puts it to. Where possible Inline::Awk compensates for the cases where a2p differs from awk. However, you may still encounter bugs or discrepancies. The following sections give some hints on how to work around these, ahem, issues.
String versus numeric context
Awk uses the same comparison operators for both numbers and strings, whereas Perl uses different operators for each. For example, consider the following awk function:
function max(m, n) {
return (m > n ? m : n)
}
There isn't enough information here for a2p to tell if m and n will be numeric or string values, so it defaults to a string comparison and generates a warning (see "THE VOICE OF LARRY"):
sub max {
local($M, $n) = @_;
($M gt $n ? $M : $n); #???
}
This is probably not what was intended.
However, a2p will take into account any previous uses of a variable in a numeric or string context. Therefore, the following modified code:
function max(m, n) {
m += 0;
return (m > n ? m : n)
}
will produce a numeric comparison (although the warning is still generated):
sub max {
local($M, $n) = @_;
$M += 0;
($M > $n ? $M : $n); #???
}
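To see why the choice of operator matters, the following plain-Perl sketch runs both kinds of comparison on the same arguments. The max_str and max_num names are hypothetical and simply mirror the two translations shown above.

```perl
use strict;
use warnings;

# String comparison, as in the first translation above.
sub max_str { my ($m, $n) = @_; return $m gt $n ? $m : $n }

# Numeric comparison, as in the second translation above.
sub max_num { my ($m, $n) = @_; return $m > $n ? $m : $n }

print max_str(10, 9), "\n";    # prints 9: "10" sorts before "9" as a string
print max_num(10, 9), "\n";    # prints 10
```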
Return statements
In an awk function, "return expression" is a valid statement for any valid expression. However, some expressions can cause problems for a2p. For example, the following function would cause a translation failure:
function isnum(n) { return n ~ /^[+-]?[0-9]+$/ } # Fails
The simple workaround for this is to include parentheses around any complex expression in a return statement:
function isnum(n) { return (n ~ /^[+-]?[0-9]+$/) } # Passes
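For reference, the behaviour that a correct translation of this function should have can be written by hand in Perl. This is a sketch for comparison only, not the output of a2p.

```perl
use strict;
use warnings;

# Hand-written equivalent of the awk isnum() function; returns 1 or 0.
sub isnum {
    my ($n) = @_;
    return ($n =~ /^[+-]?[0-9]+$/) ? 1 : 0;
}

print isnum("-42"), "\n";    # prints 1
print isnum("4.2"), "\n";    # prints 0
```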
The ternary operator
The a2p utility can also have problems with the ternary operator exp1 ? exp2 : exp3 when it is used in complex expressions. Therefore, it is best to put parentheses around all ternary conditionals.
printf "Found %d file%s", n, n == 1 ? "": "s" # Fails
printf "Found %d file%s", n, (n == 1 ? "": "s") # Passes
This isn't a problem for simpler assignments.
The module name
Since I have always wanted to write an awk compiler it would have been nice to call the module Inline::Jawk: that is to say, John's awk. However, I was chastened by the BUGS section of the mawk man page where mawk's author Mike Brennan says: "Implementors of the AWK language have shown a consistent lack of imagination when naming their programs."
THE VOICE OF LARRY
The following warning indicates that a2p couldn't determine if a string or numeric context was required at some point in your awk code:
Please check my work on the lines I've marked with "#???".
The operation I've selected may be wrong for the operand types.
See the BUGS section for an explanation of why you should heed this warning.
Due to the nature of the Inline mechanism you will only see this warning the first time that you run your program. This may be a good thing or a bad thing, depending on your point of view.
RATIONALE
The utility of this module is questionable: it doesn't do much more than you can already do with a2p; you can do something similar with Filter::exec "a2p"; and even without a2p it's generally easy to translate awk code to Perl code.
However, I am fond of awk and if nothing else it will give Brian Ingerson an extra bullet point on his Inline languages slide.
Also, Inline::Awk serves as an atonement for Inline::PERL. ;-)
SEE ALSO
Inline.pm, the Inline API and Foo.pm.
"The AWK Programming Language" by Alfred V. Aho, Brian W. Kernighan, and Peter J. Weinberger, Addison-Wesley, 1988. ISBN 0-201-07981-X.
ACKNOWLEDGEMENTS
Thanks to Brian Ingerson for the excellent Inline::C and the Inline framework.
AUTHOR
John McNamara jmcnamara@cpan.org
COPYRIGHT
© MMI, John McNamara.
All Rights Reserved. This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.