NAME

GCC::Builtins - access GCC compiler builtin functions via XS

VERSION

Version 0.06

SYNOPSIS

This module provides Perl access to GCC C compiler builtin functions.

use GCC::Builtins qw/:all/;
# or use GCC::Builtins qw/ ... clz ... /;
my $leading_zeros = GCC::Builtins::clz(10);
# 28

EXPORT

  • uint16_t bswap16(uint16_t)

  • uint32_t bswap32(uint32_t)

  • uint64_t bswap64(uint64_t)

  • int clrsb(int)

  • int clrsbl(long)

  • int clrsbll(long long)

  • int clz(unsigned int)

  • int clzl(unsigned long)

  • int clzll(unsigned long long)

  • int ctz(unsigned int)

  • int ctzl(unsigned long)

  • int ctzll(unsigned long long)

  • int ffs(int)

  • int ffsl(long)

  • int ffsll(long long)

  • double huge_val()

  • float huge_valf()

  • long double huge_vall()

  • double inf()

  • _Decimal128 infd128()

  • _Decimal32 infd32()

  • _Decimal64 infd64()

  • float inff()

  • long double infl()

  • double nan(const char *)

  • float nanf(const char *)

  • long double nanl(const char *)

  • int parity(unsigned int)

  • int parityl(unsigned long)

  • int parityll(unsigned long long)

  • int popcount(unsigned int)

  • int popcountl(unsigned long)

  • int popcountll(unsigned long long)

  • double powi(double,int)

  • float powif(float,int)

  • long double powil(long double,int)

Export tag :all imports all exportable functions, like:

use GCC::Builtins qw/:all/;

SUBROUTINES

uint16_t bswap16(uint16_t)

Returns x with the order of the bytes reversed; for example, 0xaabb becomes 0xbbaa. Byte here always means exactly 8 bits.

uint32_t bswap32(uint32_t)

Similar to __builtin_bswap16, except the argument and return types are 32-bit.

uint64_t bswap64(uint64_t)

Similar to __builtin_bswap32, except the argument and return types are 64-bit.
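The byte reversal described above can be illustrated without the module. The following is a minimal pure-Perl sketch, not the module's implementation; the helper names pp_bswap16 and pp_bswap32 are hypothetical.

```perl
use strict;
use warnings;

# pack with 'n'/'N' writes the value as big-endian bytes; unpack with
# 'v'/'V' reads the same bytes as little-endian, which reverses their order.
sub pp_bswap16 { my ($x) = @_; return unpack('v', pack('n', $x & 0xFFFF)) }
sub pp_bswap32 { my ($x) = @_; return unpack('V', pack('N', $x & 0xFFFFFFFF)) }

printf "0x%04x\n", pp_bswap16(0xaabb);       # 0xbbaa
printf "0x%08x\n", pp_bswap32(0xaabbccdd);   # 0xddccbbaa
```

Swapping twice gives back the original value, which makes a handy sanity check when comparing against the XS functions.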

int clrsb(int)

Returns the number of leading redundant sign bits in x, i.e. the number of bits following the most significant bit that are identical to it. There are no special cases for 0 or other values.

int clrsbl(long)

Similar to __builtin_clrsb, except the argument type is long.

int clrsbll(long long)

Similar to __builtin_clrsb, except the argument type is long long.
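To make the definition of redundant sign bits concrete, here is a pure-Perl sketch. It assumes that C's int is 32 bits wide, and the helper name pp_clrsb32 is hypothetical; it is an illustration, not the module's code.

```perl
use strict;
use warnings;

sub pp_clrsb32 {
    my ($x) = @_;
    my $bits = $x & 0xFFFFFFFF;                  # 32-bit two's-complement pattern
    # Invert negative patterns so that the sign bit is always 0 below.
    $bits ^= 0xFFFFFFFF if $bits & 0x80000000;
    my $count = 0;
    # Count the bits below bit 31 that equal the (now zero) sign bit.
    for (my $bit = 30; $bit >= 0; $bit--) {
        last if $bits & (1 << $bit);
        $count++;
    }
    return $count;
}

print pp_clrsb32(0),  "\n";   # 31
print pp_clrsb32(-1), "\n";   # 31
print pp_clrsb32(1),  "\n";   # 30
```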

int clz(unsigned int)

Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined.

int clzl(unsigned long)

Similar to __builtin_clz, except the argument type is unsigned long.

int clzll(unsigned long long)

Similar to __builtin_clz, except the argument type is unsigned long long.
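As a reference for what clz computes, here is a pure-Perl sketch assuming a 32-bit unsigned int. The helper name pp_clz32 is hypothetical, and returning undef for 0 is a choice made for this sketch only, since the builtin's result is undefined in that case.

```perl
use strict;
use warnings;

sub pp_clz32 {
    my ($x) = @_;
    $x &= 0xFFFFFFFF;
    return if $x == 0;            # undefined for 0; this sketch returns undef
    my $count = 0;
    # Shift left until bit 31 is set, counting the shifts.
    until ($x & 0x80000000) { $x <<= 1; $count++ }
    return $count;
}

print pp_clz32(10), "\n";   # 28
```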

int ctz(unsigned int)

Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined.

int ctzl(unsigned long)

Similar to __builtin_ctz, except the argument type is unsigned long.

int ctzll(unsigned long long)

Similar to __builtin_ctz, except the argument type is unsigned long long.
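The trailing-zero count can be sketched in pure Perl in the same way, again assuming a 32-bit unsigned int. The helper name pp_ctz32 is hypothetical, and returning undef for 0 is this sketch's way of flagging the undefined case.

```perl
use strict;
use warnings;

sub pp_ctz32 {
    my ($x) = @_;
    $x &= 0xFFFFFFFF;
    return if $x == 0;            # undefined for 0; this sketch returns undef
    my $count = 0;
    # Shift right until bit 0 is set, counting the shifts.
    until ($x & 1) { $x >>= 1; $count++ }
    return $count;
}

print pp_ctz32(8),  "\n";   # 3
print pp_ctz32(10), "\n";   # 1
```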

int ffs(int)

Returns one plus the index of the least significant 1-bit of x, or if x is zero, returns zero.

int ffsl(long)

Similar to __builtin_ffs, except the argument type is long.

int ffsll(long long)

Similar to __builtin_ffs, except the argument type is long long.
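Unlike ctz, ffs is defined for zero and its result is one-based. A pure-Perl sketch, assuming a 32-bit int and using the hypothetical helper name pp_ffs32:

```perl
use strict;
use warnings;

sub pp_ffs32 {
    my ($x) = @_;
    $x &= 0xFFFFFFFF;
    return 0 if $x == 0;          # zero has no set bit
    my $position = 1;             # positions are one-based
    until ($x & 1) { $x >>= 1; $position++ }
    return $position;
}

print pp_ffs32(0), "\n";   # 0
print pp_ffs32(1), "\n";   # 1
print pp_ffs32(8), "\n";   # 4
```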

double huge_val()

Returns a positive infinity, if supported by the floating-point format, else DBL_MAX. This function is suitable for implementing the ISO C macro HUGE_VAL.

float huge_valf()

Similar to __builtin_huge_val, except the return type is float.

long double huge_vall()

Similar to __builtin_huge_val, except the return type is long double.

double inf()

Similar to __builtin_huge_val, except a warning is generated if the target floating-point format does not support infinities.

_Decimal128 infd128()

Similar to __builtin_inf, except the return type is _Decimal128.

_Decimal32 infd32()

Similar to __builtin_inf, except the return type is _Decimal32.

_Decimal64 infd64()

Similar to __builtin_inf, except the return type is _Decimal64.

float inff()

Similar to __builtin_inf, except the return type is float. This function is suitable for implementing the ISO C99 macro INFINITY.

long double infl()

Similar to __builtin_inf, except the return type is long double.
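For comparison, positive infinity can also be produced in plain Perl by overflowing a floating-point computation. This sketch assumes Perl's floating-point type follows IEEE 754 (true on most platforms); it does not use the module, and the variable names are illustrative.

```perl
use strict;
use warnings;

# 9**9**9 is far beyond the largest finite double, so it overflows
# to positive infinity on IEEE 754 platforms (an assumption of this sketch).
my $inf     = 9**9**9;
my $dbl_max = 1.7976931348623157e308;   # DBL_MAX for an IEEE 754 double

print "larger than DBL_MAX\n"      if $inf > $dbl_max;
print "unchanged by arithmetic\n"  if $inf == $inf + 1;
```

The same two checks can be applied to the value returned by inf() or huge_val() to confirm that the XS wrapper hands infinity back to Perl intact.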

double nan(const char *)

This is an implementation of the ISO C99 function nan.

float nanf(const char *)

Similar to __builtin_nan, except the return type is float.

long double nanl(const char *)

Similar to __builtin_nan, except the return type is long double.
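A NaN can be recognised in Perl by the fact that it does not compare equal to itself. The sketch below builds one in plain Perl, assuming IEEE 754 floating point; it does not call the module.

```perl
use strict;
use warnings;

my $inf = 9**9**9;       # overflows to infinity (IEEE 754 assumed)
my $nan = $inf - $inf;   # infinity minus infinity is not a number

print "NaN is not equal to itself\n" if $nan != $nan;
```

The same self-comparison is a simple way to check the value returned by nan(), nanf() or nanl().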

int parity(unsigned int)

Returns the parity of x, i.e. the number of 1-bits in x modulo 2.

int parityl(unsigned long)

Similar to __builtin_parity, except the argument type is unsigned long.

int parityll(unsigned long long)

Similar to __builtin_parity, except the argument type is unsigned long long.
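Parity can be computed by repeatedly folding the value onto itself with XOR. This is a pure-Perl sketch for a 32-bit unsigned int, with the hypothetical helper name pp_parity32; it shows the definition and is not the module's code.

```perl
use strict;
use warnings;

sub pp_parity32 {
    my ($x) = @_;
    $x &= 0xFFFFFFFF;
    # Each fold XORs the upper half of the remaining bits onto the lower half,
    # so bit 0 ends up holding the XOR of all 32 bits.
    $x ^= $x >> 16;
    $x ^= $x >> 8;
    $x ^= $x >> 4;
    $x ^= $x >> 2;
    $x ^= $x >> 1;
    return $x & 1;
}

print pp_parity32(7), "\n";   # 1 (three 1-bits)
print pp_parity32(3), "\n";   # 0 (two 1-bits)
```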

int popcount(unsigned int)

Returns the number of 1-bits in x.

int popcountl(unsigned long)

Similar to __builtin_popcount, except the argument type is unsigned long.

int popcountll(unsigned long long)

Similar to __builtin_popcount, except the argument type is unsigned long long.
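Perl has a compact idiom for counting set bits, based on the checksum feature of unpack. The sketch below applies it to a 32-bit value; the helper name pp_popcount32 is hypothetical.

```perl
use strict;
use warnings;

sub pp_popcount32 {
    my ($x) = @_;
    # '%32b*' sums the individual bits of the packed string.
    return unpack('%32b*', pack('N', $x & 0xFFFFFFFF));
}

print pp_popcount32(10), "\n";   # 2
```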

double powi(double,int)

Returns the first argument raised to the power of the second. Unlike the pow function, no guarantees about precision and rounding are made.

float powif(float,int)

Similar to __builtin_powi, except the argument and return types are float.

long double powil(long double,int)

Similar to __builtin_powi, except the argument and return types are long double.
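Raising a number to an integer power is commonly done by repeated squaring. The following pure-Perl sketch shows that approach; it is an assumption for illustration and not a statement about how GCC implements powi. The helper name pp_powi is hypothetical.

```perl
use strict;
use warnings;

sub pp_powi {
    my ($base, $exponent) = @_;
    my $negative = $exponent < 0;
    $exponent = -$exponent if $negative;
    my $result = 1.0;
    # Square the base for every bit of the exponent and multiply it
    # into the result whenever that bit is set.
    while ($exponent > 0) {
        $result *= $base if $exponent & 1;
        $base   *= $base;
        $exponent >>= 1;
    }
    return $negative ? 1.0 / $result : $result;
}

print pp_powi(2, 10), "\n";   # 1024
print pp_powi(2, -2), "\n";   # 0.25
```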

UPDATING THE LIST OF FUNCTIONS

The list of functions was extracted from https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html using the script sbin/build-gcc-builtins-package.pl. This script is part of the distribution but it is not installed on the host system. The page above is the HTML documentation of these functions. I found it easier to parse this page than to parse GCC header files, mainly because the latter contain macros and typedefs which I could not parse without the help of the C pre-processor.

And so the list of provided functions may not be perfect. Certainly some functions are missing, simply because they do not make sense when called from Perl, for example FUNCTION(), LINE() etc. Some others are missing because they have exotic data types for their arguments and/or return values which I did not know how to implement in Perl. Others resulted in missing symbols; perhaps they need a higher C standard (adjusted via the CFLAGS in Makefile.PL).

If you need another builtin function to be supported, please raise an issue. Please make sure you provide me with a way to include this function: what CFLAGS it needs and how to typemap its return type and arguments. Please also provide a test script for it (similar to those found in the t/ directory).

An easy way to experiment is to use cpanm (provided by App::cpanminus) to fetch and unpack the distribution and then open a shell at the distribution directory:

cpanm --look GCC::Builtins

and then

sbin/build-gcc-builtins-package.sh
sbin/build-gcc-builtins-package.pl
perl Makefile.PL && make all && make test

Note that lib/GCC/Builtins.pm, lib/GCC/Builtins.xs and typemap are auto-generated by the above scripts. Do not edit them. Edit sbin/build-gcc-builtins-package.pl instead.

ALTERNATIVES

The BENCHMARKS section below suggests that a 100% performance gain awaits users who call the functions of GCC::Builtins rather than implementing them in pure Perl.

However, you can still harvest those gains by coding critical sections of your Perl code in assembly via Inline::C. Assembly can be run from within a C program with the GNU C Compiler (GCC), which offers the asm volatile(...) functionality.

I have outlined how in a post in the thread https://perlmonks.org/?node_id=11158279, over at the PerlMonks Monastery.

Here is the relevant code:

use Inline C;

use strict;
use warnings;

# Assembly code via Inline::C to return the
# 1. number of leading zeros of the input integer
# 2. a number with only one bit set, at the position of the MSSB
#
# by bliako
# for https://perlmonks.org/?node_id=11158279
# 21/03/2024

my $z = 17;
my $res = mssb($z);
print "Leading zeros for $z : ".$res->[0]."\n";
print "MSSB for $z : ".sprintf("%032b\n", $res->[1])."\n";
# result:
# Leading zeros for 17 : 27
# MSSB for 17 : 00000000000000000000000000010000

__END__
__C__
#include <stdio.h>

AV * mssb(int input){
    int num_leading_zeros;
    int mssb;
    asm volatile(
    /* note: lzcnt inp, out
     mov src, dst
     add what, dst
     # set bit of value in dst at zero-based bitposition:
     btsl bitposition, dst (it modifies dst)
    */
    "lzcnt %[input], %[num_leading_zeros]  \n\t\
     mov $32, %%eax                        \n\t\
     sub %[num_leading_zeros], %%eax       \n\t\
     sub $1, %%eax                         \n\t\
     xor %[mssb], %[mssb]                  \n\t\
     bts %%eax, %[mssb]                    \n\t\
    "
    /* outputs */
     : [num_leading_zeros] "=r" (num_leading_zeros)
     , [mssb] "=r" (mssb)
    /* inputs */
     : [input] "mr"  (input)
    /* clobbers: we are messing with these registers: */
     : "eax"
    );

    // return an arrayref of the two outputs
    AV* ret = newAV();
    sv_2mortal((SV*)ret);
    av_push(ret, newSViv(num_leading_zeros));
    av_push(ret, newSViv(mssb));

    return ret;
}

You can also inline assembly in your Perl code with Inline::ASM.

Be advised that GCC builtins also end up as assembly code. In fact, the lzcnt instruction used above is what GCC typically emits for clz() on x86 CPUs that support it. So, inline assembly and GCC::Builtins should yield, more or less, the same performance gain.

TESTING

For each exported sub there is a corresponding auto-generated test file. The test goes as far as loading the library and calling the function from Perl.

However, there may be errors in the expected results because they were not verified against a C test program.

BENCHMARKS

Counting leading zeros (clz) is used to benchmark the GCC builtin __builtin_clz() against a pure Perl implementation suggested by Perl Monk coldr3ality in the discussion at https://perlmonks.org/?node_id=11158279.

clz(), operating on the binary representation of a number, counts the zeros starting from the most significant end until it finds the first bit set (to 1). This essentially gives the zero-based index, counted from the most significant end, of the most significant bit set to 1.
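Counting from the least significant end instead, the index of the most significant set bit is the width minus one minus the leading-zero count. The sketch below works this out for the two values used elsewhere in this document (10 with 28 leading zeros, 17 with 27), assuming a 32-bit unsigned int. The helper name msb_from_clz is hypothetical.

```perl
use strict;
use warnings;

my $WIDTH = 32;   # assumed width of C's unsigned int, in bits

# Convert a leading-zero count into the index (from the least significant
# end) and the value of the most significant set bit.
sub msb_from_clz {
    my ($clz) = @_;
    my $index = $WIDTH - 1 - $clz;
    return ($index, 1 << $index);
}

for my $case ([10, 28], [17, 27]) {
    my ($x, $clz) = @$case;
    my ($index, $value) = msb_from_clz($clz);
    print "x=$x clz=$clz msb_index=$index msb_value=$value\n";
}
# x=10 clz=28 msb_index=3 msb_value=8
# x=17 clz=27 msb_index=4 msb_value=16
```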

The benchmarks favour the GCC builtin __builtin_clz() which is about twice as fast as the pure Perl implementation.

The benchmarks can be run with make benchmarks. An easy way to let Perl fetch and unpack the distribution for you is to use cpanm to open a shell:

cpanm --look GCC::Builtins

and then

perl Makefile.PL && make all && make test && make benchmarks

The following benchmark results indicate that the use of GCC::Builtins (clz() in this case) yields more than 100% performance gain over equivalent pure Perl code:

Benchmark: timing 50000000 iterations of  clz/xs, clz/pp-ugly...
    clz/xs: 3.92331 wallclock secs ( 3.92 usr +  0.00 sys =  3.92 CPU) @ 12755102.04/s (n=50000000)
clz/pp-ugly: 8.24574 wallclock secs ( 8.23 usr +  0.00 sys =  8.23 CPU) @ 6075334.14/s (n=50000000)
                  Rate clz/pp-ugly      clz/xs
clz/pp-ugly  6075334/s          --        -52%
 clz/xs     12755102/s        110%          --
KEY:
 clz/xs : calling GCC builtin clz() via XS from Perl
 clz/pp-ugly : as suggested by coldr3ality (see https://perlmonks.org/?node_id=11158279)

Benchmark: timing 50000000 iterations of  clzl/xs, clzl/pp-ugly...
   clzl/xs: 3.84597 wallclock secs ( 3.84 usr +  0.00 sys =  3.84 CPU) @ 13020833.33/s (n=50000000)
clzl/pp-ugly: 8.44006 wallclock secs ( 8.43 usr +  0.00 sys =  8.43 CPU) @ 5931198.10/s (n=50000000)
                   Rate clzl/pp-ugly      clzl/xs
clzl/pp-ugly  5931198/s           --         -54%
 clzl/xs     13020833/s         120%           --
KEY:
 clzl/xs : calling GCC builtin clzl() via XS from Perl
 clzl/pp-ugly : as suggested by coldr3ality (see https://perlmonks.org/?node_id=11158279)

So, it pays to use this module if performance is an issue.
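A comparison of this kind can be set up with the core Benchmark module. The following is a minimal sketch and not the distribution's benchmark script: pp_clz32 is a hypothetical pure-Perl stand-in (it is not coldr3ality's implementation), and the XS variant is added only if GCC::Builtins can be loaded.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical pure-Perl stand-in for a 32-bit leading-zero count.
sub pp_clz32 {
    my ($x) = @_;
    $x &= 0xFFFFFFFF;
    return if $x == 0;
    my $count = 0;
    until ($x & 0x80000000) { $x <<= 1; $count++ }
    return $count;
}

my %variants = ('clz/pp' => sub { pp_clz32(10) });

# Add the XS variant only when the module is installed.
if (eval { require GCC::Builtins; 1 }) {
    $variants{'clz/xs'} = sub { GCC::Builtins::clz(10) };
}

# A small iteration count keeps the sketch quick; raise it for stable numbers.
cmpthese(200_000, \%variants);
```

Timings will differ from the figures above, since both the pure-Perl code and the iteration count are different.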

CAVEATS

If you observe weird return results or core dumps, it is very likely that the fault is mine, made while compiling the XS typemap. The typemap file in the distribution was compiled by me to translate C's data types into Perl's, and for some of them I am not sure what the right type is. For example, is C's uint_fast16_t equivalent to Perl's T_UV? How about C's long double mapping to Perl's T_DOUBLE and unsigned long long to T_U_LONG?

Please report any corrections.
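When investigating such a mismatch, it helps to know how wide the relevant C types and Perl's own numeric types are on your build. The core Config module records these sizes; the sketch below simply prints a few of them and makes no claim about which typemap entry is the right one.

```perl
use strict;
use warnings;
use Config;

# Sizes (in bytes) of some C types and of Perl's integer (IV/UV) and
# floating-point (NV) types, as recorded when this perl was built.
for my $key (qw(intsize longsize longlongsize ivsize uvsize
                doublesize longdblsize nvsize)) {
    my $bytes = $Config{$key};
    printf "%-13s %s\n", $key, defined $bytes ? "$bytes bytes" : 'not available';
}
```

For example, if longlongsize is larger than ivsize, an unsigned long long result cannot fit in a Perl integer on that build; likewise, a long double is wider than Perl's NV when longdblsize exceeds nvsize.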

Note that lib/GCC/Builtins.pm, lib/GCC/Builtins.xs and typemap are auto-generated by the scripts described in UPDATING THE LIST OF FUNCTIONS. Do not edit them. Edit sbin/build-gcc-builtins-package.pl instead.

AUTHOR

Andreas Hadjiprocopis, <bliako ta cpan.org / andreashad2 ta gmail.com>

BUGS

Please report any bugs or feature requests to bug-gcc-builtins at rt.cpan.org, or through the web interface at https://rt.cpan.org/NoAuth/ReportBug.html?Queue=GCC-Builtins. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

perldoc GCC::Builtins

You can also look for information at:

ACKNOWLEDGEMENTS

  • This module was started by this discussion at PerlMonks:

    Most Significant Set Bit

  • Hackers of Free Software.

  • GNU and the Free Software Foundation, providers of GNU Compiler Collection.

HUGS

!Almaz!

LICENSE AND COPYRIGHT

This software is Copyright (c) 2024 by Andreas Hadjiprocopis.

This is free software, licensed under:

The Artistic License 2.0 (GPL Compatible)
