NAME

C::Utility - utilities for generating C programs

SYNOPSIS

use C::Utility ':all';

VERSION

This documents C::Utility version 0.012 corresponding to git commit 23840f2d0676c73ceefdde8dff0967d1ff4eeecc released on Sat Sep 1 11:00:14 2018 +0900.

DESCRIPTION

This module contains functions which assist in the automatic generation of C programs. For work with strings, "convert_to_c_string" converts a string into a string with characters correctly escaped for use in a C program. "convert_to_c_string_pc" does the same and also escapes percent signs, so that the result may be used as a format string for printf. "escape_string" escapes double quotes. "valid_c_variable" checks whether a string is a valid C variable name.

The module also contains functions related to line directives. "line_directive" prints a C line directive. "linein" and "lineout" offer a preprocessor and postprocessor to add line numbers to files made from templates.

EXPORTS

All the functions are exported on demand. Nothing is exported by default. An export tag :all exports all the functions.

use C::Utility ':all';

FUNCTIONS

add_lines

my $text = add_lines ($file);

Read $file, and replace strings of the form #line in the file with a C-style line directive using $file. Also add a line directive to the first line of the file. $file must be in the UTF-8 encoding. The line directives are given the full path name of the file using "rel2abs" in File::Spec. The return value is the text of the input file with the line directives added.

I recommend using "linein" in combination with "lineout" rather than this function, which does not properly handle line directives for generated parts of the C file. See "Line numbering" for an explanation.

brute_force_line

brute_force_line ($input_file, $output_file);

Read $input_file, put #line directives on every single line, and write that to $output_file.

I recommend using "linein" in combination with "lineout" rather than this function, which does not properly handle line directives for generated parts of the C file. See "Line numbering" for an explanation.

c_string

Alias for "convert_to_c_string".

c_to_h_name

my $h_file = c_to_h_name ("frog.c");
# $h_file = "frog.h".

Make a .h file name from a .c file name.

This is not a very useful function, and I do not use it anywhere any more.

ch_files

my $hfile = ch_files ($c_file_name);

This makes a .h filename from a .c filename, and backs up both the C and the .h files using File::Versions. See also "c_to_h_name". It dies if the input $c_file_name does not end in .c.

This is not a very useful function, and I use it in only one place.

convert_to_c_string

my $c_string = convert_to_c_string ($perl_string);

This converts a Perl string into a C string by converting backslashes to double backslashes, escaping double quotes with "escape_string", turning newlines into \n characters, and adding double quotes.

For example,

use C::Utility 'convert_to_c_string';
my $string =<<'EOF';
The quick "brown" fox\@farm
jumped %over the lazy dog.
EOF
print convert_to_c_string ($string);

produces output

"The quick \"brown\" fox\@farm\n"
"jumped %over the lazy dog.\n"

(This example is included as fox.pl in the distribution.)

It also removes backslashes from before the @ symbol, so \@ is transformed to @. Newlines within the input string are turned into concatenated strings. Empty inputs are turned into a pair of double quotes, "".
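The steps described above can be sketched in plain Perl as follows. This is a minimal sketch under stated assumptions, not the module's own code: sketch_c_string is a hypothetical name, and the handling of \@ is left out.

```perl
use strict;
use warnings;

# Hypothetical helper sketching the documented steps. It is not the
# module's implementation, and it skips the \@ handling.
sub sketch_c_string
{
    my ($text) = @_;
    # Empty input becomes a pair of double quotes.
    return '""' if length ($text) == 0;
    # Double each backslash, then escape each double quote.
    $text =~ s/\\/\\\\/g;
    $text =~ s/"/\\"/g;
    my $out = '';
    # Quote each input line separately. The C compiler joins adjacent
    # string literals, so the result is still one C string.
    for my $line (split /^/m, $text) {
        if ($line =~ s/\n\z//) {
            $out .= "\"$line\\n\"\n";
        }
        else {
            $out .= "\"$line\"";
        }
    }
    return $out;
}

print sketch_c_string ("say \"hi\"\nC:\\dir\n");
```

Because each input line ends up in its own pair of quotes, the generated text stays readable in the C source while still compiling to a single string.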

convert_to_c_string_pc

my $c_string = convert_to_c_string_pc ($string);

This is similar to "convert_to_c_string", but it also converts the percent character % to a double percent, %%. This is for generating strings which may be used as C format strings without generating an error because of embedded percent characters.

use C::Utility 'convert_to_c_string_pc';
my $string =<<'EOF';
The quick "brown" fox\@farm
jumped %over the lazy dog.
EOF
print convert_to_c_string_pc ($string);

produces output

"The quick \"brown\" fox\@farm\n"
"jumped %%over the lazy dog.\n"

(This example is included as fox-pc.pl in the distribution.)
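The reason for doubling the percent signs can be checked without a C compiler, since Perl's sprintf follows the same %% convention as C's printf. The following is a small sketch; double_percents is a hypothetical helper, not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper: double each percent sign, as described above.
sub double_percents
{
    my ($text) = @_;
    $text =~ s/%/%%/g;
    return $text;
}

my $message = 'jumped %over the lazy dog, 100% sure';
my $format = double_percents ($message);
# With every % doubled, the format contains no conversions, so
# formatting it gives back the original text.
my $printed = sprintf ($format);
print "$printed\n";
```

Without the doubling, the %o in "%over" would be read as an octal conversion with a missing argument.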

escape_string

my $escaped_string = escape_string ($normal_string);

This returns the value of the argument with double quotes " escaped with a backslash.

hash_to_c_file

my $h_file = hash_to_c_file ($c_file_name, \%hash);

This turns a Perl hash into a set of const char * strings and writes them to the C file specified by $c_file_name, and writes matching extern declarations to a header file with the corresponding .h name. For example,

use FindBin '$Bin';
use C::Utility 'hash_to_c_file';
use File::Slurper 'read_text';
my $file = "$Bin/my.c";
my $hfile = hash_to_c_file ($file, { version => '0.01', author => 'Michael Caine' });
print "C file:\n\n";
print read_text ($file);
print "\nHeader file:\n\n";
print read_text ($hfile);
unlink $file, $hfile or die $!;

produces output

C file:

#include "my.h"
const char * author = "Michael Caine";
const char * version = "0.01";

Header file:

#ifndef MY_H
#define MY_H
extern const char * author; /* "Michael Caine" */
extern const char * version; /* "0.01" */
#endif /* MY_H */

(This example is included as michael-caine.pl in the distribution.)

The return value is the name of the header file used.

The keys of the hash are checked with "valid_c_variable", and the routine dies if they are not valid C variable names.

An optional third argument may contain a prefix to add to all the variable names.

For example,

hash_to_c_file ('that.c', {ok => 'yes'}, 'super_');

outputs

const char * super_ok = "yes";

This function is intended as a convenient way to make small C configuration files. I currently do not use this function at all, since it tends to cause about as many problems as it solves.

The behaviour of returning the name of the header file was added in version 0.006.
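The shape of the generated definitions and declarations shown above can be sketched as follows. This is only an illustration under assumptions: sketch_hash_to_c is a hypothetical name, it returns strings instead of writing files, it leaves out the #include line and the #ifndef wrapper, and sorting the keys is a guess based on the example output.

```perl
use strict;
use warnings;

# Hypothetical helper returning the C and header text as strings
# rather than writing files.
sub sketch_hash_to_c
{
    my ($hash, $prefix) = @_;
    $prefix = '' unless defined $prefix;
    my ($c, $h) = ('', '');
    # Sorting the keys keeps the output stable between runs.
    for my $key (sort keys %$hash) {
        my $name = $prefix . $key;
        die "Bad C variable name '$name'"
            unless $name =~ /^[A-Za-z_][A-Za-z_0-9]*$/;
        # Escape backslashes and double quotes in the value.
        (my $value = $hash->{$key}) =~ s/(["\\])/\\$1/g;
        $c .= "const char * $name = \"$value\";\n";
        $h .= "extern const char * $name; /* \"$value\" */\n";
    }
    return ($c, $h);
}

my ($c, $h) = sketch_hash_to_c ({ok => 'yes'}, 'super_');
print $c, $h;
```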

line_directive

This prints a C preprocessor line directive to the file handle $fh. If $fh is a scalar reference, it concatenates the line directive to the end of the scalar. For example,

line_directive ($fh, 42, "file.x");

prints

#line 42 "file.x"

to $fh.

use C::Utility 'line_directive';
my $out = '';
line_directive (\$out, 99, "balloons.c");
print $out;

produces output

#line 99 "balloons.c"

(This example is included as line-directive.pl in the distribution.)

This function is useful if you cannot remember the syntax for line directives. It also checks that your line number is valid. I currently only use this in one place.
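The choice between a file handle and a scalar reference described above can be sketched like this. It is a rough illustration rather than the module's code: sketch_line_directive is a hypothetical name, and the validity rule used here (a positive integer) is an assumption.

```perl
use strict;
use warnings;
use Carp;

# Hypothetical helper, not the module's code. The validity rule (a
# positive integer) is an assumption.
sub sketch_line_directive
{
    my ($out, $number, $file) = @_;
    croak "Bad line number '$number'" unless $number =~ /^[1-9][0-9]*$/;
    my $directive = "#line $number \"$file\"\n";
    if (ref $out eq 'SCALAR') {
        # Scalar reference: append to the string.
        $$out .= $directive;
    }
    else {
        # Otherwise treat it as a file handle.
        print {$out} $directive;
    }
}

my $text = '';
sketch_line_directive (\$text, 42, 'file.x');
sketch_line_directive (\*STDOUT, 7, 'other.x');
print $text;
```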

linein

my $intext = linein ($infile);

Given a file $infile, this opens the file, reads it in, replaces the text #linein in the file with a C line directive referring to the original file, then returns the complete text as its return value. Note that the line number in a line directive refers to the following line, so if the line directive appears on the first line of the file, it should say #line 2, etc.

I use this to read in a template before processing with Template to add the line numbers of the input template. See "Line numbering" for a minimal working example of how.
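The rule that a line directive numbers the line after itself can be sketched as follows. This is an illustration only: sketch_linein is a hypothetical name, it works on text rather than on a file, and it assumes that #linein sits on a line of its own.

```perl
use strict;
use warnings;

# Hypothetical helper: works on text rather than a file, and assumes
# "#linein" sits on a line of its own.
sub sketch_linein
{
    my ($text, $name) = @_;
    my $n = 0;
    my $out = '';
    for my $line (split /^/m, $text) {
        $n++;
        if ($line =~ /^#linein\s*$/) {
            # The directive numbers the line after itself.
            $line = sprintf ("#line %d \"%s\"\n", $n + 1, $name);
        }
        $out .= $line;
    }
    return $out;
}

my $template = <<'EOF';
#linein
/* My great file. */
typedef enum {
EOF
print sketch_linein ($template, 'status.c.tmpl');
```

Here the marker is on line 1 of the template, so the first line printed is #line 2 "status.c.tmpl".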

lineout

lineout ($outtext, $outfile);

Given a C output text $outtext and a file name $outfile, this writes out the text to $outfile, replacing the text #lineout with an appropriate line directive using $outfile as the file name and the lines of the file as the line numbers.

use FindBin '$Bin';
use C::Utility 'lineout';
use File::Slurper 'read_text';
my $file = "$Bin/some.c";
my $c = <<EOF;
static void unknown (int x) { return x; }
#lineout
int main () { return 0; }
EOF
lineout ($c, $file);
print read_text ($file);
unlink $file or die $!;

produces output

static void unknown (int x) { return x; }
#line 3 "/usr/home/ben/projects/c-utility/examples/some.c"
int main () { return 0; }

(This example is included as lineout.pl in the distribution.)

I use this to write text which has been processed by Template to add line numbers to the output. See "Line numbering" for a minimal working example of how.

print_bottom_h_wrapper

print_bottom_h_wrapper ($file_handle, $file_name);

Print the bottom part of an include wrapper for a .h file to $file_handle.

The name of the wrapper comes from "wrapper_name" applied to $file_name.

If $file_handle is a scalar reference, this concatenates the wrapper to the scalar.

I barely use this function and don't consider it to be very useful.

See also "print_top_h_wrapper".

print_include

print_include ($file_handle, $file_name);

Print an #include statement for a .h file named $file_name to $file_handle:

#include "file.h"

I do not currently use this function anywhere, and don't consider it useful.

print_top_h_wrapper

print_top_h_wrapper ($file_handle, $file_name);
# Prints #ifndef wrapper at top of file.

Print the top part of an "include wrapper" for a .h file to $file_handle. For example,

#ifndef MY_FILE
#define MY_FILE

The name of the wrapper comes from "wrapper_name" applied to $file_name. If $file_handle is a scalar reference, this concatenates the wrapper to the scalar.

I barely use this function and don't consider it to be very useful.

See also "print_bottom_h_wrapper".

read_includes

my $includes = read_includes ($file);

Given a C file $file, read the file in and find all lines of the form

#include "some.h"

or

#include <another.h>

and return the list of included files as an array reference. See "$include" in C::Tokenize for the regular expression to match the includes. It skips #include statements within comments using "$comment_re" in C::Tokenize.

This function was added in version 0.008.
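The kind of matching described above can be sketched with simplified regular expressions. Everything here is an assumption for illustration: sketch_read_includes is a hypothetical name, it takes text rather than a file name, its patterns are crude stand-ins for those of C::Tokenize, and returning the names without quotes or angle brackets is a guess.

```perl
use strict;
use warnings;

# Hypothetical helper working on text rather than a file. The regular
# expressions are simplified stand-ins for those of C::Tokenize.
sub sketch_read_includes
{
    my ($text) = @_;
    # Crudely drop /* ... */ comments, so that includes which have
    # been commented out are skipped.
    $text =~ s{/\*.*?\*/}{}gs;
    my @includes;
    while ($text =~ /^[ \t]*#[ \t]*include[ \t]*(?:"([^"\n]+)"|<([^>\n]+)>)/mg) {
        push @includes, defined $1 ? $1 : $2;
    }
    return \@includes;
}

my $source = <<'EOF';
#include "some.h"
/* #include "unused.h" */
#include <another.h>
EOF
print "$_\n" for @{sketch_read_includes ($source)};
```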

remove_quotes

my $unquoted_string = remove_quotes ($string);

This removes the leading and trailing quotes from $string. It also removes the "joining quotes" in composite C strings. Thus input of the form "composite " "C" " string" is converted into composite C string without the quotes.

This function probably should not be in this module at all.
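The conversion of a composite C string described above can be sketched as follows. sketch_remove_quotes is a hypothetical name, and the sketch assumes that the input contains no escaped quotes (\") inside the strings.

```perl
use strict;
use warnings;

# Hypothetical helper. It assumes the input contains no escaped
# double quotes.
sub sketch_remove_quotes
{
    my ($string) = @_;
    # Drop the opening and closing quotes.
    $string =~ s/^\s*"//;
    $string =~ s/"\s*$//;
    # Drop the quote pairs which join adjacent string literals.
    $string =~ s/"\s*"//g;
    return $string;
}

print sketch_remove_quotes ('"composite " "C" " string"'), "\n";
```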

stamp_file

stamp_file ($fh);

Add a stamp to file handle $fh containing the name of the program which created it, and the time of generation.

The name of the C file output can be added as a second argument:

stamp_file ($fh, $name);

If $fh is a scalar reference, the stamp is concatenated to it.

use C::Utility 'stamp_file';
my $out = '';
stamp_file (\$out);
print $out;

produces output

/*
This C file was generated by /usr/home/ben/projects/c-utility/examples/stamp-file.pl at Sat Sep  1 10:30:25 2018.
*/

(This example is included as stamp-file.pl in the distribution.)

This is not a very useful function because of the following nuisance side effect. When using a system based on "make" and "diff" to build your C project, the date and time in the stamp made by stamp_file force unnecessary rebuilds by changing your file each time it is run, even when the actual program contents have not changed. I tend not to use this function any more, and instead use a substitute which doesn't add the time and date to the file.

This function was added in version 0.006.
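A stamp which leaves out the time and date, so that regenerating an unchanged file produces identical output, might look something like the following. This is a guess at the general shape only: sketch_stamp is a hypothetical name and the wording of the stamp is an assumption.

```perl
use strict;
use warnings;

# Hypothetical substitute for stamp_file: it names the generating
# program ($0) but leaves out the time and date.
sub sketch_stamp
{
    my ($out) = @_;
    my $stamp = "/*\nThis C file was generated by $0.\n*/\n";
    if (ref $out eq 'SCALAR') {
        # Scalar reference: append to the string.
        $$out .= $stamp;
    }
    else {
        # Otherwise treat it as a file handle.
        print {$out} $stamp;
    }
}

my $first = '';
my $second = '';
sketch_stamp (\$first);
sketch_stamp (\$second);
# Two runs give the same text, so a diff-based build sees no change.
print "Stamps match\n" if $first eq $second;
```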

valid_c_variable

my $ok = valid_c_variable ($variable_name);

This returns 1 if $variable_name is a valid C variable name, and the undefined value otherwise. It tests two things: first, that the argument contains only the characters allowed in a C variable name, [0-9a-zA-Z_], and second, that the argument is not a C keyword like goto or volatile. The rejected word list contains C99 keywords like _Bool. See "$reserved_re" in C::Tokenize for exactly what is rejected.

This function does not check for overlaps with the POSIX list of disallowed C variable and function names. There is a huge list of things which POSIX forbids, like functions beginning with the three letters str, underscores at the start of a variable, and _t at the end of a type name, and so on. I don't know of any CPAN module or other offering which validates against this giant list of sinful names.
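The two tests described above can be sketched in plain Perl as follows. This is not the module's code: sketch_valid_c_variable is a hypothetical name, the word list is the set of C99 keywords and may differ from what "$reserved_re" in C::Tokenize rejects, and the rejection of a leading digit reflects the C rule for identifiers rather than anything stated above.

```perl
use strict;
use warnings;

# The C99 keywords. The module's own list may differ.
my %keyword = map { $_ => 1 } qw/
    auto break case char const continue default do double else enum
    extern float for goto if inline int long register restrict return
    short signed sizeof static struct switch typedef union unsigned
    void volatile while _Bool _Complex _Imaginary
/;

# Hypothetical helper returning 1 or the undefined value.
sub sketch_valid_c_variable
{
    my ($name) = @_;
    # Allowed characters only. C also forbids a leading digit.
    return unless $name =~ /^[A-Za-z_][A-Za-z_0-9]*$/;
    # Keywords cannot be used as variable names.
    return if $keyword{$name};
    return 1;
}

for my $name (qw/frog_1 goto 1frog my-var/) {
    my $ok = sketch_valid_c_variable ($name) ? 'valid' : 'not valid';
    print "$name: $ok\n";
}
```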

wrapper_name

my $wrapper = wrapper_name ($file_name);

Given a file name, this returns a suitable C preprocessor wrapper name based on the file name. The wrapper name is the uppercase version of the file name with hyphens and dots replaced with underscores.

This does not strip out directory paths from $file_name.

This is not a useful function, and I do not use it anywhere.
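The transformation described above, and its use in an include wrapper, can be sketched as follows. sketch_wrapper_name is a hypothetical name and this is not the module's code.

```perl
use strict;
use warnings;

# Hypothetical helper: uppercase the file name, then replace hyphens
# and dots with underscores.
sub sketch_wrapper_name
{
    my ($file_name) = @_;
    my $wrapper = uc $file_name;
    $wrapper =~ tr/-./_/;
    return $wrapper;
}

my $wrapper = sketch_wrapper_name ('my-file.h');
print "#ifndef $wrapper\n#define $wrapper\n";
print "/* Declarations go here. */\n";
print "#endif /* $wrapper */\n";
```

Since directory separators are left alone, a name like src/my.h would give SRC/MY_H under this sketch, which is not a valid macro name, so it is safer to pass only the base name of the file.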

EXAMPLES

Line numbering

When using "linein" and "lineout" with Template, your source file should look something like this:

#linein
/* My great file. */
typedef enum {
#lineout
[%- FOR status IN statuses %]
[% status %],
[%- END %]
#linein
}
status_t;

This results in an output something like this:

#line 2 "status.c.tmpl"
/* My great file. */
typedef enum {
#line 5 "status.c"
good,
great,
super,
fantastic,
#line 9 "status.c.tmpl"
}
status_t;

when processed as in

use FindBin '$Bin';
use Template;
use C::Utility qw/linein lineout/;
chdir $Bin or die $!;
my %vars = (statuses => [qw/good great super fantastic/]);
my $tt = Template->new (INCLUDE_PATH => '.');
my $textin = linein ('status.c.tmpl');
$tt->process (\$textin, \%vars, \my $textout) or die $tt->error ();
lineout ($textout, 'status.c');

where $tt is an instance of Template.

You need to specify #linein where your template file starts, and #lineout where the automatically generated part starts. This way compiler error messages will send you to the correct line of the file. The functions "add_lines" and "brute_force_line" are not very useful because they don't add the outgoing lines of the form #line 5 "status.c" to the file, and the compiler error messages send you on a wild goose chase if the error is in the generated part of the file.

DEPENDENCIES

Carp

Carp is used to report errors.

C::Tokenize

The regular expressions of C::Tokenize are used in various places.

File::Spec

File::Spec is used to get the base name of the file from the argument to "hash_to_c_file", and to get the absolute name of the file in "add_lines".

File::Versions

File::Versions is used to back up files in "ch_files".

File::Slurper

File::Slurper is used to read and write text files. This has the caveat that in some places this module assumes your C source files use the UTF-8 encoding.

Text::LineNumber

Text::LineNumber is used by "linein" and "lineout". It's not used by "add_lines" or "brute_force_line".

DEFICIENCIES

No POSIX compliance

See the discussion under "valid_c_variable".

Insists on UTF-8 inputs/outputs in some places

See the discussion about "File::Slurper". If this is really a problem, file a bug report.

No trigraph support

This module doesn't support ANSI C trigraphs like ??= for #, for people whose keyboards don't have a # key, so you're going to need a bigger keyboard if you're a trigraph user who insists on using this Perl module.

HISTORY

Most of the functions in this module are for supporting automated C code generators.

C::Utility was on CPAN, but was then deleted between version 0.005 and version 0.006. I don't know of anyone who was using the module, but I decided to restore it to CPAN anyway, since I'm still using and maintaining it, and it might be useful to somebody.

AUTHOR

Ben Bullock, <bkb@cpan.org>

COPYRIGHT & LICENCE

This package and associated files are copyright (C) 2012-2018 Ben Bullock.

You can use, copy, modify and redistribute this package and associated files under the Perl Artistic Licence or the GNU General Public Licence.