NAME
Data::IEEE754::Tools - Various tools for understanding and manipulating the underlying IEEE-754 representation of floating point values
SYNOPSIS
use Data::IEEE754::Tools qw/:floatingpoint :ulp/;
# return -12.875 as decimal and hexadecimal floating point numbers
to_dec_floatingpoint(-12.875); # -0d1.6093750000000000p+0003
to_hex_floatingpoint(-12.875); # -0x1.9c00000000000p+0003
# shows the smallest value you can add or subtract to 16.16 (ulp = "Unit in the Last Place")
print ulp( 16.16 ); # 3.5527136788005e-015
# toggles the ulp: returns a float that has the ULP of 16.16 toggled
# (if it was a 1, it will be 0, and vice versa);
# running it twice should give the original value
print $t16 = toggle_ulp( 16.16 ); # 16.159999999999997
print $v16 = toggle_ulp( $t16 ); # 16.160000000000000
DESCRIPTION
*** ALPHA RELEASE v0.011_004: trying out a bugfix with CPAN Testers (since the bug doesn't show up on any of my machines or perl versions, but appears throughout the CPAN Testers reports) ***
These tools give access to the underlying IEEE 754 floating-point 64bit representation used by many instances of Perl (see perlguts). They include functions for converting from the 64bit internal representation to a string that shows those bits (either as hexadecimal or binary) and back, functions for converting that encoded value into a more human-readable format to give insight into the meaning of the encoded values, and functions to manipulate the smallest possible change for a given floating-point value (which is the ULP or "Unit in the Last Place").
IEEE 754 Encoding
The IEEE 754 standard describes various floating-point encodings. The double format (`binary64') is a 64-bit base-2 encoding, and corresponds to the usual Perl floating value (NV). The format includes the sign (s), the power of 2 (q), and a significand (aka mantissa; the coefficient, c): value = ((-1)**s) * (c) * (2**q). The (-1)**s term evaluates to the sign of the number, where s=0 means the sign is +1 and s=1 means the sign is -1.
For most numbers, the coefficient is an implied 1 plus an encoded fraction, which is itself encoded as a 52-bit integer divided by an implied 2**52. The range of valid exponents is from -1022 to +1023, which are encoded as an 11bit integer from 1 to 2046 (where exponent_value = exponent_integer - 1023). With an 11bit integer, that leaves two encoded-exponent values (0b000_0000_0000 = 0 - 1023 = -1023 and 0b111_1111_1111 = 2047 - 1023 = +1024), which are used to indicate conditions outside the normal range.
The first special encoded-exponent, 0b000_0000_0000, indicates that the coefficient is 0 plus the encoded fraction, at an exponent of -1022; thus, the floating-point zero is encoded using an encoded-exponent of 0 and an encoded-fraction of 0 ([0 + 0/(2**52)] * [2**-1022] = 0*(2**-1022) = 0); numbers too small to be encoded normally (so-called "denormals" or "subnormals"), whose coefficients lie between 0 and 1 (non-inclusive), are encoded with the same exponent, but have a non-zero encoded-fraction.
The second special encoded-exponent, 0b111_1111_1111, indicates a number that is infinite (too big to represent), or something that is not a number (NAN); infinities are indicated by that special exponent and an encoded-fraction of 0; NAN is indicated by that special exponent and a non-zero encoded-fraction.
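The rules above can be sketched in core Perl. This is an illustration only, not the module's implementation: the helper name ieee754_class is hypothetical, and it assumes a perl new enough (5.10 or later) to accept the "d>" pack template.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::IEEE754::Tools): classify a value
# by looking at its 11-bit encoded exponent and 52-bit encoded fraction.
sub ieee754_class {
    my ($value) = @_;
    my $bits      = unpack 'B64', pack('d>', $value);   # 64 bits, big-endian
    my $exp_enc   = oct('0b' . substr($bits, 1, 11));   # 0 .. 2047
    my $frac_zero = substr($bits, 12) !~ /1/;           # true if all 52 bits are 0

    return $frac_zero ? 'zero'     : 'denormal' if $exp_enc == 0;
    return $frac_zero ? 'infinite' : 'nan'      if $exp_enc == 2047;
    return 'normal';                                    # encoded exponent 1 .. 2046
}

my $inf = 9**9**9;        # overflows to infinity
my $nan = $inf - $inf;    # infinity minus infinity is not a number

printf "%-10s %s\n", $_->[0], ieee754_class($_->[1])
    for [ '0', 0 ], [ '12.875', 12.875 ], [ '2**-1074', 2**-1074 ],
        [ 'inf', $inf ], [ 'nan', $nan ];
```

The sign bit is deliberately ignored here, since the classification depends only on the exponent and fraction fields.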
Justification for the existence of Data::IEEE754::Tools
Data::IEEE754, or the equivalent pack recipe "d>" (see "pack" in perlfunc), do a good job of converting a perl floating value (NV) into the big-endian bytes that encode that value, but they don't help you interpret the value.
Data::Float has a similar suite of tools to Data::IEEE754::Tools, but uses numerical methods rather than accessing the underlying bits. It has been shown that its interpretation function can take an order of magnitude longer than a routine that manipulates the underlying bits to gather the information.
This Data::IEEE754::Tools module combines the two sets of functions, giving access to the raw IEEE 754 encoding, or a stringification of the encoding which interprets the encoding as a sign and a coefficient and a power of 2, or access to the ULP and ULP-manipulating features, all using direct bit manipulation when appropriate.
Compatibility
Data::IEEE754::Tools works with 64bit floating-point representations.
If you have a Perl setup which uses a larger representation (for example, use Config; print $Config{nvsize}; # 16 => 128bit), values reported by this module will be reduced in precision to fit the 64bit representation.
If you have a Perl setup which uses a smaller representation (for example, use Config; print $Config{nvsize}; # 4 => 32bit), the installation will likely fail, because the unit tests were not set up for lower precision inputs. However, forcing the installation might still allow coercion from the smaller Perl NV into a true IEEE 754 double (64bit) floating-point, but there is no guarantee it will work.
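To see which case applies to a given perl, a minimal check along these lines may help. It uses only the core Config module and pack; it assumes a platform where a C double is 8 bytes, which is the usual case.

```perl
use strict;
use warnings;
use Config;

# nvsize is the size in bytes of this perl's native floating value (NV).
my $nv_bits = 8 * $Config{nvsize};

# pack's 'd' template means a native C double, whatever the NV size is.
my $double_bits = 8 * length pack('d', 1);

print "NV is ${nv_bits}bit, packed double is ${double_bits}bit\n";
print "NVs wider than a double will lose precision when packed\n"
    if $nv_bits > $double_bits;
```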
EXPORTABLE FUNCTIONS AND VARIABLES
:raw754
These are the functions to do raw conversion from a floating-point value to a hexadecimal or binary string of the underlying IEEE754 encoded value, and back.
hexstr754_from_double( val )
Converts the floating-point val into a big-endian hexadecimal representation of the underlying IEEE754 encoding.
hexstr754_from_double(12.875); # 4029C00000000000
                               # ^^^
                               # :  ^^^^^^^^^^^^^
                               # :  :
                               # :  `- fraction
                               # :
                               # `- sign+exponent
The first three nibbles (hexadecimal digits) encode the sign and the exponent. The sign is the most significant bit of the three nibbles (so AND the first nibble with 8; if it's true, the number is negative, else it's positive). The remaining 11 bits of the nibbles encode the exponent: convert the 11 bits to decimal, then subtract 1023. If the resulting exponent is -1023, it indicates a zero or denormal value; if the exponent is +1024, it indicates an infinite (Inf) or not-a-number (NaN) value. These generally indicate that the calculation has grown too large to fit in an IEEE754 double (Inf), or that it has attempted an undefined operation, such as subtracting infinity from infinity (NaN).
The final thirteen nibbles are the encoding of the fractional value (usually 1 + thirteennibbles / 16**13, unless it's zero, denormal, infinite, or not a number).
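As a worked example, the decoding steps just described can be applied by hand to the string for 12.875. This sketch uses only core Perl and handles the normal case only (no zero, denormal, Inf, or NaN); the variable names are illustrative.

```perl
use strict;
use warnings;

my $hex = '4029C00000000000';

my $top3     = hex substr($hex, 0, 3);      # sign + exponent: 0x402
my $negative = ($top3 & 0x800) ? 1 : 0;     # same bit as ANDing the first nibble with 8
my $exponent = ($top3 & 0x7FF) - 1023;      # 0x402 = 1026, and 1026 - 1023 = 3

# Add up the 13 fraction nibbles one at a time; nibble k is worth 16**-k.
my $fraction = 0;
my $scale    = 1;
for my $nibble (split //, substr($hex, 3, 13)) {
    $scale    /= 16;
    $fraction += hex($nibble) * $scale;
}

# Normal numbers have an implied 1 in front of the fraction.
my $value = ($negative ? -1 : 1) * (1 + $fraction) * 2**$exponent;
print "$value\n";    # 12.875
```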
Of course, this is easier to decode using the "to_dec_floatingpoint()" function, which interprets the sign, fraction, and exponent for you. (See below for more details.)
to_dec_floatingpoint(12.875); # +0d1.6093750000000000p+0003
                              # ^  ^^^^^^^^^^^^^^^^^^  ^^^^
                              # :  :                   :
                              # :  `- coefficient      `- exponent (power of 2)
                              # :
                              # `- sign
binstr754_from_double( val )
Converts the floating-point val into a big-endian binary representation of the underlying IEEE754 encoding.
binstr754_from_double(12.875); # 0100000000101001110000000000000000000000000000000000000000000000
                               # ^
                               # `- sign
                               #  ^^^^^^^^^^^
                               #  `- exponent
                               #             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                               #             `- fraction
The first bit is the sign, the next 11 are the exponent's encoding, and the final 52 are the fraction's encoding.
hexstr754_to_double( str )
The inverse of hexstr754_from_double(): it takes a string representing the 16 nibbles of the IEEE754 double value, and converts it back to a perl floating-point value.
print hexstr754_to_double('4029C00000000000');
12.875
binstr754_to_double( str )
The inverse of binstr754_from_double(): it takes a string representing the 64 bits of the IEEE754 double value, and converts it back to a perl floating-point value.
print binstr754_to_double('0100000000101001110000000000000000000000000000000000000000000000');
12.875
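For comparison, the same round trips can be approximated with core pack and unpack alone. The four sub names below are hypothetical stand-ins, not the module's functions, and the sketch does no input validation; it assumes a perl that accepts the "d>" template.

```perl
use strict;
use warnings;

# Hypothetical core-only stand-ins for the :raw754 conversions.
sub double_to_hex { my ($v) = @_; return uc unpack('H16', pack('d>', $v)); }
sub hex_to_double { my ($s) = @_; return unpack('d>', pack('H16', $s)); }
sub double_to_bin { my ($v) = @_; return unpack('B64', pack('d>', $v)); }
sub bin_to_double { my ($s) = @_; return unpack('d>', pack('B64', $s)); }

my $hex = double_to_hex(12.875);
my $bin = double_to_bin(12.875);

print "$hex\n";                      # 4029C00000000000
print hex_to_double($hex), "\n";     # 12.875
print bin_to_double($bin), "\n";     # 12.875
```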
:floatingpoint
to_hex_floatingpoint( value )
to_dec_floatingpoint( value )
Converts value to a hexadecimal or decimal floating-point notation that indicates the sign and the coefficient and the power of two, with the coefficient either in hexadecimal or decimal notation.
to_hex_floatingpoint(-3.9999999999999996) # -0x1.fffffffffffffp+0001
to_dec_floatingpoint(-3.9999999999999996) # -0d1.9999999999999998p+0001
It displays the value as (sign)(0base)(implied).(fraction)p(exponent):
- sign
The sign will be + or -.
- 0base
The 0base will be 0x for hexadecimal, 0d for decimal.
- implied.fraction
The implied.fraction indicates the hexadecimal or decimal equivalent for the coefficient.
implied will be 0 for zero or denormal numbers, 1 for everything else.
fraction will indicate infinities (#INF), signaling not-a-numbers (#SNAN), and quiet not-a-numbers (#QNAN).
implied.fraction will range from decimal 0.0000000000000000 to 0.9999999999999998 for zero thru all the denormals, and from 1.0000000000000000 to 1.9999999999999998 for normal values.
- p
The p introduces the "power" of 2. (It is analogous to the e in 1.0e3 introducing the power of 10 in a standard decimal floating-point notation, but indicates that the exponent is 2**exp instead of 10**exp.)
- exponent
The exponent is the power of 2. It is always a decimal number, whether the coefficient's base is hexadecimal or decimal.
+0d1.5000000000000000p+0010 = 1.5 * (2**10) = 1.5 * 1024.0 = 1536.0.
The exponent can range from -1022 to +1023.
Internally, the IEEE 754 representation uses the encoding of -1023 for zero and denormals; to aid in understanding the actual number, the to_..._floatingpoint() conversions represent them as +0000 for zero, and -1022 for denormals: since denormals are (0+fraction)*(2**min_exp), they are really multiples of 2**-1022, not 2**-1023.
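To make the notation concrete, here is a small sketch that reads such a string back into a number. The name parse_floatingpoint is hypothetical (the module does not document such a function here), and the sketch rejects the #INF, #SNAN, and #QNAN forms. Note that the 16-digit decimal coefficient may not round-trip every double exactly, whereas the hexadecimal form is exact.

```perl
use strict;
use warnings;

# Hypothetical reader for (sign)(0base)(implied).(fraction)p(exponent).
sub parse_floatingpoint {
    my ($str) = @_;
    my ($sign, $base, $implied, $fraction, $exp) =
        $str =~ /^([+-])0([xd])([01])\.([0-9a-fA-F]+)p([+-]\d+)$/
        or die "unrecognized notation: $str\n";

    my $coeff;
    if ($base eq 'd') {
        $coeff = "$implied.$fraction" + 0;      # plain decimal coefficient
    }
    else {
        # Hexadecimal: digit k after the point is worth 16**-k.
        ($coeff, my $scale) = ($implied, 1);
        for my $digit (split //, $fraction) {
            $scale /= 16;
            $coeff += hex($digit) * $scale;
        }
    }
    return ($sign eq '-' ? -1 : 1) * $coeff * 2**$exp;
}

print parse_floatingpoint('+0d1.5000000000000000p+0010'), "\n";   # 1536
print parse_floatingpoint('-0x1.9c00000000000p+0003'), "\n";      # -12.875
```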
:ulp
ulp( val )
Returns the ULP ("Unit in the Last Place") for the given val, which is the smallest number that you can add to or subtract from val and still be able to discern a difference between the original and modified. Under normal (or denormal) circumstances, ulp($val) + $val > $val is true.
If the val is a zero or a denormal, ulp() will return the smallest possible denormal.
Since INF and NAN are not really numbers, ulp() will just return the same val. Because of the way they are handled, ulp($val) + $val > $val no longer makes sense (infinity plus anything is still infinity, and adding NAN to NAN is not numerically defined, so a numerical comparison is meaningless on both).
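One way to arrive at such a value from the encoding is to note that the last of the 52 fraction bits is worth 2**(q-52), where q is the exponent. The following is a sketch under that assumption, with a hypothetical name (ulp_sketch); it is not claimed to be how the module computes ulp().

```perl
use strict;
use warnings;

# Hypothetical sketch: the weight of the last fraction bit of a double.
sub ulp_sketch {
    my ($value) = @_;
    my $bits    = unpack 'B64', pack('d>', $value);
    my $exp_enc = oct('0b' . substr($bits, 1, 11));

    return $value if $exp_enc == 2047;                  # Inf or NaN: hand it back
    my $q = $exp_enc == 0 ? -1022 : $exp_enc - 1023;    # zero and denormals use -1022
    return 2**($q - 52);
}

my $u = ulp_sketch(16.16);
print "$u\n";                                           # 3.5527136788005e-15
print "distinguishable\n" if 16.16 + $u > 16.16;
```

For 16.16 the exponent is 4, so the result is 2**-48, which agrees with the value shown in the SYNOPSIS.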
toggle_ulp( val )
Returns the original val, but with the ULP toggled. In other words, if the ULP bit was a 0, it will return a value with the ULP of 1 (equivalent to adding one ULP to a positive val); if the ULP bit was a 1, it will return a value with the ULP of 0 (equivalent to subtracting one ULP from a positive val). Under normal (or denormal) circumstances, toggle_ulp($val) != $val is true.
Since INF and NAN are not really numbers, toggle_ulp() will just return the same val. Because of the way they are handled, toggle_ulp($val) != $val no longer makes sense.
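The toggling described above amounts to flipping the final bit of the 64-bit encoding. A minimal core-Perl sketch, with a hypothetical name (toggle_last_bit) and no claim to match the module's internals:

```perl
use strict;
use warnings;

# Hypothetical sketch: flip the least significant bit of the encoding.
sub toggle_last_bit {
    my ($value) = @_;
    my $bits = unpack 'B64', pack('d>', $value);
    return $value if substr($bits, 1, 11) eq '1' x 11;      # Inf or NaN: unchanged

    substr($bits, 63, 1) = substr($bits, 63, 1) eq '1' ? '0' : '1';
    return unpack 'd>', pack('B64', $bits);
}

my $t16 = toggle_last_bit(16.16);
my $v16 = toggle_last_bit($t16);

printf "%.15f\n", $t16;
printf "%.15f\n", $v16;
print "round trip ok\n" if $v16 == 16.16;
```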
nextup( value )
Returns the next floating point value numerically greater than value; that is, it adds one ULP. Returns +infinity when value is the highest normal floating-point value. Returns value when value is positive-infinite or NAN; returns the most negative finite floating-point value (the negative value of largest magnitude) when value is negative-infinite.
nextup is an IEEE 754r standard function.
nextdown( value )
Returns the next floating point value numerically lower than value; that is, it subtracts one ULP. Returns -infinity when value is the most negative finite floating-point value (the negative value of largest magnitude). Returns value when value is negative-infinite or NAN; returns the largest positive normal floating-point value when value is positive-infinite.
nextdown is an IEEE 754r standard function.
nextafter( value, direction )
Returns the next floating point value after value in the direction of direction. If the two are identical, return direction; if direction is numerically above value, return nextup(value); if direction is numerically below value, return nextdown(value).
nextafter is an IEEE 754r standard function.
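The behavior of these three functions can be sketched by treating the encoding of the magnitude as an integer and stepping it by one. The names below are hypothetical, the sketch assumes a perl built with 64-bit integer support (needed for the "Q>" template), and returning a NAN direction unchanged is an assumption borrowed from C's nextafter rather than something stated above.

```perl
use strict;
use warnings;

my $INF          = 9**9**9;      # overflows to +infinity
my $MIN_DENORMAL = 2**-1074;     # smallest positive denormal

# Hypothetical sketch of nextup: step the bit pattern of |value| by one.
sub nextup_sketch {
    my ($value) = @_;
    return $value        if $value != $value;     # NAN
    return $value        if $value == $INF;       # +infinity stays put
    return $MIN_DENORMAL if $value == 0;          # covers +0 and -0

    my $n = unpack 'Q>', pack('d>', abs($value)); # encoding of the magnitude
    $n += $value > 0 ? 1 : -1;                    # positive: grow; negative: shrink
    my $magnitude = unpack 'd>', pack('Q>', $n);
    return $value > 0 ? $magnitude : -$magnitude;
}

# nextdown mirrors nextup around zero.
sub nextdown_sketch {
    my ($value) = @_;
    return -nextup_sketch(-$value);
}

sub nextafter_sketch {
    my ($value, $direction) = @_;
    return $direction if $direction != $direction;    # assumption: NAN direction
    return $direction if $value == $direction;
    return $direction > $value ? nextup_sketch($value) : nextdown_sketch($value);
}

printf "%.17g\n", nextup_sketch(1);
printf "%.17g\n", nextdown_sketch(1);
printf "%.17g\n", nextafter_sketch(1, 2);
```

Stepping the magnitude rather than the signed pattern keeps the integer below 2**63, so the arithmetic never depends on how perl handles very large unsigned values.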
:all
Include all of the above.
INSTALLATION
To install this module, use your favorite CPAN client.
For a manual install, type the following:
perl Makefile.PL
make
make test
make install
(On Windows machines, you may need to use "dmake" instead of "make".)
SEE ALSO
What Every Computer Scientist Should Know About Floating-Point Arithmetic
Perlmonks: Integers sometimes turn into Reals after substraction for inspiring me to go down the IEEE754-expansion trail in perl.
Perlmonks: Exploring IEEE754 floating point bit patterns as a resource for how perl interacts with the various "edge cases" (+/-infinity, denormalized numbers, signaling and quiet NaNs (Not-A-Number)).
Data::IEEE754: I really wanted to use this module, but it didn't get me very far down the "Tools" track, and included a lot of overhead modules for its install/test that I didn't want to require for Data::IEEE754::Tools. However, I was inspired by his byteorder-dependent anonymous subs (which were in turn derived from Data::MessagePack::PP); they were more efficient, on a per-call-to-subroutine basis, than my original inclusion of the if(byteorder) in every call to the sub.
Data::Float: Similar to this module, but uses numeric manipulation.
AUTHOR
Peter C. Jones <petercj AT cpan DOT org>
Please report any bugs or feature requests by emailing <bug-Data-IEEE754-Tools AT rt.cpan.org> or thru the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Data-IEEE754-Tools.
COPYRIGHT
Copyright (C) 2016 Peter C. Jones
LICENSE
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.