NAME
GCC::Builtins - access GCC compiler builtin functions via XS
VERSION
Version 0.06
SYNOPSIS
This module provides Perl access to GCC C compiler builtin functions.
use GCC::Builtins qw/:all/;
# or use GCC::Builtins qw/ ... clz ... /;
my $leading_zeros = GCC::Builtins::clz(10);
# 28
EXPORT
uint16_t bswap16(uint16_t)
uint32_t bswap32(uint32_t)
uint64_t bswap64(uint64_t)
int clrsb(int)
int clrsbl(long)
int clrsbll(long long)
int clz(unsigned int)
int clzl(unsigned long)
int clzll(unsigned long long)
int ctz(unsigned int)
int ctzl(unsigned long)
int ctzll(unsigned long long)
int ffs(int)
int ffsl(long)
int ffsll(long long)
double huge_val()
float huge_valf()
long double huge_vall()
double inf()
_Decimal128 infd128()
_Decimal32 infd32()
_Decimal64 infd64()
float inff()
long double infl()
double nan(const char *)
float nanf(const char *)
long double nanl(const char *)
int parity(unsigned int)
int parityl(unsigned long)
int parityll(unsigned long long)
int popcount(unsigned int)
int popcountl(unsigned long)
int popcountll(unsigned long long)
double powi(double,int)
float powif(float,int)
long double powil(long double,int)
Export tag :all
imports all exportable functions, like:
use GCC::Builtins qw/:all/;
SUBROUTINES
uint16_t bswap16(uint16_t)
Returns x with the order of the bytes reversed; for example, 0xaabb becomes 0xbbaa. Byte here always means exactly 8 bits.
uint32_t bswap32(uint32_t)
Similar to __builtin_bswap16, except the argument and return types are 32-bit.
uint64_t bswap64(uint64_t)
Similar to __builtin_bswap32, except the argument and return types are 64-bit.
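The byte reversal described for the bswap family can be sketched directly in C. This is a minimal illustration, assuming a GCC-compatible compiler; the wrapper names swap16, swap32 and swap64 are hypothetical and only the __builtin_bswap* calls come from GCC.

```c
#include <stdint.h>

/* Hypothetical wrappers around the GCC byte-swap builtins. */
uint16_t swap16(uint16_t x) { return __builtin_bswap16(x); }
uint32_t swap32(uint32_t x) { return __builtin_bswap32(x); }
uint64_t swap64(uint64_t x) { return __builtin_bswap64(x); }
```

For instance, swap16(0xaabb) gives 0xbbaa, matching the example in the description above.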
int clrsb(int)
Returns the number of leading redundant sign bits in x, i.e. the number of bits following the most significant bit that are identical to it. There are no special cases for 0 or other values.
int clrsbl(long)
Similar to __builtin_clrsb, except the argument type is long.
int clrsbll(long long)
Similar to __builtin_clrsb, except the argument type is long long.
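To make "leading redundant sign bits" concrete, here is a small C sketch that counts them with a plain loop and compares against the builtin. The function names are hypothetical; the loop is only a reference written for this illustration, not how GCC implements the builtin.

```c
#include <limits.h>

/* Reference loop: count the bits below the sign bit that equal the sign bit. */
int clrsb_reference(int x) {
    const int width = (int)(sizeof(int) * CHAR_BIT);
    unsigned int u = (unsigned int)x;
    unsigned int sign = (u >> (width - 1)) & 1u;
    int count = 0;
    for (int bit = width - 2; bit >= 0; bit--) {
        if (((u >> bit) & 1u) != sign) break;
        count++;
    }
    return count;
}

/* Thin wrapper around the GCC builtin, for comparison. */
int clrsb_builtin(int x) { return __builtin_clrsb(x); }
```

Both 0 and -1 yield the width of int minus one, which is what "no special cases" means in practice.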
int clz(unsigned int)
Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined.
int clzl(unsigned long)
Similar to __builtin_clz, except the argument type is unsigned long.
int clzll(unsigned long long)
Similar to __builtin_clz, except the argument type is unsigned long long.
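Because the result for 0 is undefined, callers may want to guard that case themselves. The following C sketch shows one possible guard; the name safe_clz and the choice of returning the full bit width for 0 are assumptions made for this illustration, not part of GCC or of this module.

```c
#include <limits.h>

/* Hypothetical guard: returns the bit width of unsigned int for x == 0,
 * otherwise defers to the GCC builtin. */
int safe_clz(unsigned int x) {
    if (x == 0) {
        return (int)(sizeof(unsigned int) * CHAR_BIT);
    }
    return __builtin_clz(x);
}
```

With a 32-bit unsigned int, safe_clz(10) is 28, the same value shown in the SYNOPSIS.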
int ctz(unsigned int)
Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined.
int ctzl(unsigned long)
Similar to __builtin_ctz, except the argument type is unsigned long.
int ctzll(unsigned long long)
Similar to __builtin_ctz, except the argument type is unsigned long long.
int ffs(int)
Returns one plus the index of the least significant 1-bit of x, or if x is zero, returns zero.
int ffsl(long)
Similar to __builtin_ffs, except the argument type is long.
int ffsll(long long)
Similar to __builtin_ffs, except the argument type is long long.
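The ctz and ffs families are closely related: for a non-zero argument, ffs is ctz plus one, and ffs additionally defines the result for zero. A minimal C sketch of that relationship, with hypothetical function names:

```c
/* Hypothetical helper: ffs expressed through ctz, with the zero case handled
 * explicitly because __builtin_ctz(0) is undefined. */
int ffs_via_ctz(int x) {
    if (x == 0) {
        return 0;
    }
    return __builtin_ctz((unsigned int)x) + 1;
}

/* Thin wrapper around the GCC builtin, for comparison. */
int ffs_builtin(int x) { return __builtin_ffs(x); }
```

For example, 8 is binary 1000, so ctz gives 3 and ffs gives 4.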
double huge_val()
Returns a positive infinity, if supported by the floating-point format, else DBL_MAX. This function is suitable for implementing the ISO C macro HUGE_VAL.
float huge_valf()
Similar to __builtin_huge_val, except the return type is float.
long double huge_vall()
Similar to __builtin_huge_val, except the return type is long double.
double inf()
Similar to __builtin_huge_val, except a warning is generated if the target floating-point format does not support infinities.
_Decimal128 infd128()
Similar to __builtin_inf, except the return type is _Decimal128.
_Decimal32 infd32()
Similar to __builtin_inf, except the return type is _Decimal32.
_Decimal64 infd64()
Similar to __builtin_inf, except the return type is _Decimal64.
float inff()
Similar to __builtin_inf, except the return type is float. This function is suitable for implementing the ISO C99 macro INFINITY.
long double infl()
Similar to __builtin_inf, except the return type is long double.
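The infinity-returning builtins can be checked in C without any math library, by comparing against the largest finite value. This sketch assumes a floating-point format that supports infinities (such as IEEE 754); the function names are hypothetical.

```c
#include <float.h>

/* Thin wrappers around the GCC builtins. */
double demo_huge_val(void) { return __builtin_huge_val(); }
double demo_inf(void)      { return __builtin_inf(); }
float  demo_inff(void)     { return __builtin_inff(); }

/* A value larger than DBL_MAX can only be positive infinity. */
int is_positive_infinity(double x) { return x > DBL_MAX; }
```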
double nan(const char *)
This is an implementation of the ISO C99 function nan.
float nanf(const char *)
Similar to __builtin_nan, except the return type is float.
long double nanl(const char *)
Similar to __builtin_nan, except the return type is long double.
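A NaN is the only floating-point value that compares unequal to itself, which gives a simple way to check the result of the nan builtins in C. This is a sketch with hypothetical function names; it assumes the code is compiled without options such as -ffast-math that let the compiler assume NaNs never occur.

```c
/* Thin wrapper: a quiet NaN with an empty payload string. */
double demo_nan(void) { return __builtin_nan(""); }

/* A NaN never compares equal to itself. */
int is_nan(double x) { return x != x; }
```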
int parity(unsigned int)
Returns the parity of x, i.e. the number of 1-bits in x modulo 2.
int parityl(unsigned long)
Similar to __builtin_parity, except the argument type is unsigned long.
int parityll(unsigned long long)
Similar to __builtin_parity, except the argument type is unsigned long long.
int popcount(unsigned int)
Returns the number of 1-bits in x.
int popcountl(unsigned long)
Similar to __builtin_popcount, except the argument type is unsigned long.
int popcountll(unsigned long long)
Similar to __builtin_popcount, except the argument type is unsigned long long.
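Parity and popcount are related by definition: parity is the population count modulo 2. The C sketch below shows that relationship next to a simple loop-based bit count. The function names are hypothetical and the loop is a reference for illustration only.

```c
/* Reference bit count: clear the lowest set bit until none remain. */
int popcount_reference(unsigned int x) {
    int count = 0;
    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
}

/* Parity expressed through the GCC popcount builtin. */
int parity_via_popcount(unsigned int x) {
    return __builtin_popcount(x) & 1;
}
```

For example, 7 has three 1-bits, so its parity is 1, while 3 has two 1-bits and parity 0.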
double powi(double,int)
Returns the first argument raised to the power of the second. Unlike the pow function no guarantees about precision and rounding are made.
float powif(float,int)
Returns the first argument raised to the power of the second. Unlike the pow function no guarantees about precision and rounding are made.
long double powil(long double,int)
Returns the first argument raised to the power of the second. Unlike the pow function no guarantees about precision and rounding are made.
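Raising a value to an integer power is commonly done by repeated squaring. The C sketch below shows that technique alongside a wrapper for the builtin. It is a reference written for this illustration with hypothetical names; it makes no claim about how GCC computes the result.

```c
/* Reference: exponentiation by squaring for an integer exponent. */
double powi_reference(double base, int exponent) {
    /* Take the magnitude in unsigned arithmetic so INT_MIN does not overflow. */
    unsigned int n = exponent < 0 ? 0u - (unsigned int)exponent
                                  : (unsigned int)exponent;
    double result = 1.0;
    while (n != 0) {
        if (n & 1u) {
            result *= base;
        }
        base *= base;
        n >>= 1;
    }
    return exponent < 0 ? 1.0 / result : result;
}

/* Thin wrapper around the GCC builtin, for comparison. */
double powi_builtin(double base, int exponent) {
    return __builtin_powi(base, exponent);
}
```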
UPDATING THE LIST OF FUNCTIONS
The list of functions was extracted from https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html using the script sbin/build-gcc-builtins-package.pl.
This script is part of the distribution but it is not installed on the host system. The page at the above URL is the HTML documentation of these functions. I found it easier to parse this page than to parse GCC header files, mainly because the latter contain macros and typedefs which I could not parse without the help of the C preprocessor.
As a result, the list of provided functions may not be perfect. Certainly there are some functions missing, simply because they do not make sense when called from Perl, for example FUNCTION(), LINE() etc. Some others are missing because they have exotic data types for function arguments and/or return values which I did not know how to implement in Perl. Others reported missing symbols; perhaps they need a higher C standard (adjusted via the CFLAGS in Makefile.PL).
If you need another builtin function to be supported, please raise an issue. Please make sure you provide me with a way to include this function: what CFLAGS it needs and how to typemap its return type and arguments. Please also provide a test script to test it (similar to those found in the t/ directory).
An easy way to experiment is to use cpanm (provided by App::cpanminus) to fetch and unpack the distribution and then open a shell in the distribution directory:
cpanm --look GCC::Builtins
and then
sbin/build-gcc-builtins-package.sh
sbin/build-gcc-builtins-package.pl
perl Makefile.PL && make all && make test
Note that lib/GCC/Builtins.pm, lib/GCC/Builtins.xs and typemap are auto-generated by the above scripts. Do not edit them. Edit sbin/build-gcc-builtins-package.pl instead.
ALTERNATIVES
The BENCHMARKS section below suggests that a 100% performance gain awaits users who call the functions in GCC::Builtins rather than implementing the equivalents in pure Perl.
However, you can also harvest those gains by coding critical sections of your Perl code in assembly via Inline::C. Assembly can be run from within a C program with the GNU C Compiler (GCC), which offers the asm volatile(...) functionality.
I have outlined how in this post in this thread, over at the PerlMonks Monastery.
Here is the relevant code:
use Inline C;
use strict;
use warnings;
# Assembly code via Inline::C to return the
# 1. number of leading zeros of the input integer
# 2. a number with only one bit set, at the position of the MSSB
#
# by bliako
# for https://perlmonks.org/?node_id=11158279
# 21/03/2024
my $z = 17;
my $res = mssb($z);
print "Leading zeros for $z : ".$res->[0]."\n";
print "MSSB for $z : ".sprintf("%032b\n", $res->[1])."\n";
# result:
# Leading zeros for 17 : 27
# MSSB for 17 : 00000000000000000000000000010000
__END__
__C__
#include <stdio.h>
AV * mssb(int input){
int num_leading_zeros;
int mssb;
asm volatile(
/* note: lzcnt inp, out
mov src, dst
add what, dst
# set bit of value in dst at zero-based bitposition:
btsl bitposition, dst (it modifies dst)
*/
"lzcnt %[input], %[num_leading_zeros] \n\t\
mov $32, %%eax \n\t\
sub %[num_leading_zeros], %%eax \n\t\
sub $1, %%eax \n\t\
xor %[mssb], %[mssb] \n\t\
bts %%eax, %[mssb] \n\t\
"
/* outputs */
: [num_leading_zeros] "=r" (num_leading_zeros)
, [mssb] "=r" (mssb)
/* inputs */
: [input] "mr" (input)
/* clobbers: we are messing with these registers: */
: "eax"
);
// return an arrayref of the two outputs
AV* ret = newAV();
sv_2mortal((SV*)ret);
av_push(ret, newSViv(num_leading_zeros));
av_push(ret, newSViv(mssb));
return ret;
}
You can also inline assembly in your Perl code with Inline::ASM.
Be advised that GCC builtins also end up as assembly code. In fact, the lzcnt instruction used in the above assembly code is essentially how GCC implements clz() on x86 processors that provide it. So, inline assembly and GCC::Builtins should yield, more or less, the same performance gain.
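For comparison with the assembly example above, the same "most significant set bit" value can be sketched in plain C on top of __builtin_clz, leaving instruction selection to the compiler. The function name mssb_mask and the convention of returning 0 for a zero input are assumptions made for this illustration.

```c
#include <limits.h>

/* Hypothetical helper: returns a value with only the most significant
 * set bit of x kept, or 0 when x is 0 (where __builtin_clz is undefined). */
unsigned int mssb_mask(unsigned int x) {
    if (x == 0) {
        return 0;
    }
    const int width = (int)(sizeof(unsigned int) * CHAR_BIT);
    return 1u << (width - 1 - __builtin_clz(x));
}
```

For the input 17 used above, this gives 16 (binary 10000), the same MSSB value the assembly version prints.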
TESTING
For each exported sub there is a corresponding auto-generated test file. The test goes as far as loading the library and calling the function from Perl.
However, there may be errors in the expected results because these were compiled without verifying them against a C test program.
BENCHMARKS
Counting leading zeros (clz) will be used to benchmark the GCC builtin __builtin_clz() against a pure Perl implementation as suggested by Perl Monk coldr3ality in this discussion.
clz() operates on the binary representation of a number and counts the zeros starting from the most significant end until it finds the first bit set (to 1). This essentially gives the zero-based index, counted from the most significant end, of the most significant bit set to 1.
The benchmarks favour the GCC builtin __builtin_clz(), which is about twice as fast as the pure Perl implementation.
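The counting procedure described above can be written as a simple loop in C. This is only a reference sketch with a hypothetical name: it is not the pure Perl implementation used in the benchmark, and it is not how GCC implements the builtin.

```c
#include <limits.h>

/* Reference loop: scan from the most significant bit downwards and count
 * zeros until the first set bit. Returns the full bit width for x == 0. */
int clz_loop(unsigned int x) {
    const int width = (int)(sizeof(unsigned int) * CHAR_BIT);
    int zeros = 0;
    for (int bit = width - 1; bit >= 0; bit--) {
        if ((x >> bit) & 1u) {
            break;
        }
        zeros++;
    }
    return zeros;
}
```

For any non-zero input the loop and __builtin_clz() agree; the loop just does the work one bit at a time.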
The benchmarks can be run with make benchmarks.
An easy way to let Perl fetch and unpack the distribution for you is to use cpanm to open a shell:
cpanm --look GCC::Builtins
and then
perl Makefile.PL && make all && make test && make benchmarks
The following benchmark results indicate that the use of GCC::Builtins (clz() in this case) yields more than 100% performance gain over equivalent pure Perl code:
Benchmark: timing 50000000 iterations of clz/xs, clz/pp-ugly...
clz/xs: 3.92331 wallclock secs ( 3.92 usr + 0.00 sys = 3.92 CPU) @ 12755102.04/s (n=50000000)
clz/pp-ugly: 8.24574 wallclock secs ( 8.23 usr + 0.00 sys = 8.23 CPU) @ 6075334.14/s (n=50000000)
Rate clz/pp-ugly clz/xs
clz/pp-ugly 6075334/s -- -52%
clz/xs 12755102/s 110% --
KEY:
clz/xs : calling GCC builtin clz() via XS from Perl
clz/pp-ugly : as suggested by coldr3ality (see https://perlmonks.org/?node_id=11158279)
Benchmark: timing 50000000 iterations of clzl/xs, clzl/pp-ugly...
clzl/xs: 3.84597 wallclock secs ( 3.84 usr + 0.00 sys = 3.84 CPU) @ 13020833.33/s (n=50000000)
clzl/pp-ugly: 8.44006 wallclock secs ( 8.43 usr + 0.00 sys = 8.43 CPU) @ 5931198.10/s (n=50000000)
Rate clzl/pp-ugly clzl/xs
clzl/pp-ugly 5931198/s -- -54%
clzl/xs 13020833/s 120% --
KEY:
clzl/xs : calling GCC builtin clzl() via XS from Perl
clzl/pp-ugly : as suggested by coldr3ality (see https://perlmonks.org/?node_id=11158279)
So, it pays to use this module if performance is an issue.
CAVEATS
If you observe weird return results or core dumps, it is very likely that the fault is mine, made while compiling the XS typemap. The typemap file in the distribution was compiled by me to translate C's data types into Perl's, and for some of these I am not sure what the right type is. For example, is C's uint_fast16_t equivalent to Perl's T_UV? How about C's long double mapping to Perl's T_DOUBLE and unsigned long long to T_U_LONG?
Please report any corrections.
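One way to probe the typemap questions above on a given platform is to compare the sizes and precisions of the C types involved. The sketch below assumes that T_DOUBLE converts through C's double and T_U_LONG through unsigned long, as in Perl's default typemap; that assumption, and all function names, are mine and should be checked against the typemap actually in use.

```c
#include <float.h>
#include <stdint.h>

/* 1 if long double has no more precision or range than double, so a
 * conversion through double (assumed for T_DOUBLE) would lose nothing. */
int long_double_fits_in_double(void) {
    return LDBL_MANT_DIG <= DBL_MANT_DIG && LDBL_MAX_EXP <= DBL_MAX_EXP;
}

/* 1 if unsigned long long is no wider than unsigned long, so a conversion
 * through unsigned long (assumed for T_U_LONG) would not truncate. */
int ull_fits_in_ulong(void) {
    return sizeof(unsigned long long) <= sizeof(unsigned long);
}

/* 1 if uint_fast16_t is no wider than unsigned long. */
int uint_fast16_fits_in_ulong(void) {
    return sizeof(uint_fast16_t) <= sizeof(unsigned long);
}
```

A result of 0 from any of these checks would suggest that the corresponding mapping can truncate or lose precision on that platform.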
Note that lib/GCC/Builtins.pm, lib/GCC/Builtins.xs and typemap are auto-generated by the scripts described in UPDATING THE LIST OF FUNCTIONS. Do not edit them. Edit sbin/build-gcc-builtins-package.pl instead.
AUTHOR
Andreas Hadjiprocopis, <bliako ta cpan.org / andreashad2 ta gmail.com>
BUGS
Please report any bugs or feature requests to bug-gcc-builtins at rt.cpan.org, or through the web interface at https://rt.cpan.org/NoAuth/ReportBug.html?Queue=GCC-Builtins. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc GCC::Builtins
You can also look for information at:
RT: CPAN's request tracker (report bugs here)
Review this module at PerlMonks
Search CPAN
ACKNOWLEDGEMENTS
This module was started by this discussion at PerlMonks:
Hackers of Free Software.
GNU and the Free Software Foundation, providers of GNU Compiler Collection.
HUGS
!Almaz!
LICENSE AND COPYRIGHT
This software is Copyright (c) 2024 by Andreas Hadjiprocopis.
This is free software, licensed under:
The Artistic License 2.0 (GPL Compatible)