NAME
Parse::Readelf::Debug::Info - handle readelf's debug info section with a class
SYNOPSIS
use Parse::Readelf::Debug::Info;
my $debug_info = new Parse::Readelf::Debug::Info($executable);
my @item_ids = $debug_info->item_ids('l_object2a');
my @structure_layout1 = $debug_info->structure_layout($item_ids[0]);
my @some_item_ids = $debug_info->item_ids_matching('^var', 'variable');
my @all_item_ids = $debug_info->item_ids_matching('');
my @all_struct_ids = $debug_info->item_ids_matching('', '.*structure.*');
ABSTRACT
Parse::Readelf::Debug::Info parses the output of readelf --debug-dump=info
and stores its interesting details in an object to ease access.
DESCRIPTION
Normally an object of this class is constructed with the file name of an object file to be parsed. Upon construction the file is analysed and all relevant information about its debug info section is stored inside of the object. This information can be accessed afterwards using a bunch of getter methods, see "METHODS" for details.
AT THE MOMENT ONLY INFORMATION REGARDING THE BINARY ARRANGEMENT OF VARIABLES (STRUCTURE LAYOUT) IS SUPPORTED. Other data is ignored for now.
Currently only output for Dwarf versions 2 and 4 is supported. Please contact the author for other versions and provide some example readelf outputs.
EXPORT
Nothing is exported by default as it's normally not needed to modify any of the variables declared in the following export groups:
:all
all of the following groups
:command
- $command
  is the variable holding the command to run readelf to get the information relevant for this module, normally readelf --debug-dump=info.
:config
- $display_nested_items
  is a variable which controls whether nested items (e.g. sub-structures) are displayed only when actually used (e.g. as data type of members of their parent) or always, which might confuse the reader. The default is 0; any other value switches on the unconditional display.
- $re_substructure_filter
  is a regular expression that allows you to cut away the details of all substructures whose type names match the filter. This is useful if you have a bunch of types that you consider so basic that you want to hide their details, e.g. the internal representation of a complex number datatype. The default value of the filter is ^string$, which matches C++ standard strings.
:constants
The following constants can be used to access the elements of the result of the method "structure_layout" (see below).
:fixed_regexps
- $re_section_start
  is the regular expression that recognises the start of the info debug output of readelf.
- $re_section_stop
  is the regular expression that recognises the start of another debug output of readelf.
- $re_unit_offset
  is the regular expression that recognises the first line of a compilation unit in an info debug output of readelf. This line states the offset of the compilation unit itself. So this offset must be a hexadecimal string which will (must) be stored in $1 without any leading 0x. Usually it's 0 for the first unit.
- $re_dwarf_version
  is the regular expression that recognises the Dwarf version line in an info debug output of readelf. The version number must be an integer number which will (must) be stored in $1.
- $re_unit_signature
  is the regular expression that recognises the hexadecimal signature line at the start of a compilation unit in an info debug output of readelf. The signature ID must be a string which will (must) be stored in $1.
- $re_type_offset
  is the regular expression that recognises the type offset line at the start of a compilation unit in an info debug output of readelf. The offset must be a string which will (must) be stored in $1 without any leading 0x.
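The capture contract of these patterns can be illustrated with a small self-contained sketch. The two patterns below are hypothetical stand-ins written for this example, not the module's defaults, and the sample lines are only assumed to resemble typical readelf --debug-dump=info output.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for $re_unit_offset and $re_dwarf_version;
# the module's real default patterns may differ.
my $my_re_unit_offset   = qr/^\s*Compilation Unit \@ offset 0x([0-9a-fA-F]+):/;
my $my_re_dwarf_version = qr/^\s*Version:\s+(\d+)\s*$/;

# Sample lines assumed to resemble the start of a compilation unit.
my @sample = (
    '  Compilation Unit @ offset 0x0:',
    '   Length:        0x8a (32-bit)',
    '   Version:       4',
);

my ($unit_offset, $dwarf_version);
foreach my $line (@sample) {
    if ($line =~ $my_re_unit_offset) {
        $unit_offset = $1;      # hexadecimal string without the leading 0x
    }
    elsif ($line =~ $my_re_dwarf_version) {
        $dwarf_version = $1;    # integer version number
    }
}

print "unit offset: $unit_offset, Dwarf version: $dwarf_version\n";
```

Any replacement pattern assigned to one of the exported variables would have to keep the same capture positions, since the text above requires the value in $1.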
:versioned_regexps
These regular expressions recognise the tags of the item nodes of a readelf debug info output that are supported so far. Each of them is actually a list using the Dwarf version as index:
- @re_item_start
  recognises the start of a new item in the debug info list. $1 is the level, $2 the internal (unique) item ID, $3 the numeric type ID and $4 the type tag.
- @re_bit_offset
  recognises the bit offset tag of an item. $1 will contain the offset.
- @re_bit_size
  recognises the bit size tag of an item. $1 will contain the size.
- @re_byte_size
  recognises the byte size tag of an item. $1 will contain the size.
- @re_comp_dir
  recognises the compilation directory tag of an item. $1 will contain the compilation directory as string.
- @re_const_value
  recognises the const value tag of an item. $1 will contain the value.
- @re_containing_type
  recognises the containing type tag of an item. Either $1 will contain the normal internal item ID or $2 will contain the Dwarf-4 signature of the containing type.
- @re_decl_file
  recognises the declaration file tag of an item. $1 will contain the number of the file name (see Parse::Readelf::Debug::Line).
- @re_decl_line
  recognises the declaration line tag of an item. $1 will contain the line number.
- @re_declaration
  recognises the declaration tag of an item. $1 will usually contain a 1 indicating that it is set.
- @re_encoding
  recognises the encoding tag of an item. $1 will contain the encoding as text.
- @re_external
  recognises the external tag of an item. $1 will usually contain a 1 indicating that it is set.
- @re_language
  recognises the language tag of an item. $1 will contain the language as text.
- @re_linkage_name_tag
  recognises the linkage name tag of an item. $1 will contain the name.
- @re_location
  recognises the data member location tag of an item. $1 will contain the offset.
- @re_member_location
  recognises the data location tag of an item. $1 will contain the hex value (with spaces between each byte).
- @re_name_tag
  recognises the name tag of an item. $1 will contain the name.
- @re_producer
  recognises the producer tag of an item. $1 will contain the producer as string.
- @re_signature_tag
  recognises the signature tag of an item. $1 will contain the leading <0x in case of a signature referring to the same compilation unit, $2 will contain the hexadecimal signature.
- @re_specification
  recognises the specification tag of an item. $1 will contain the internal item ID of the specification.
- @re_type
  recognises the type tag of an item. Either $1 will contain the normal internal item ID or $2 will contain the Dwarf-4 signature of the type.
- @re_upper_bound
  recognises the upper bound tag of a subrange item. $1 will contain the upper bound.
- @re_ignored_attributes
  recognises all attributes that are simply ignored (yet).
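How a list indexed by Dwarf version can be used is sketched below for the item start line. The array @my_re_item_start, its pattern and the helper parse_item_start are hypothetical names invented for this illustration. The module's own patterns may differ, and the sample line is only assumed to resemble readelf output.

```perl
use strict;
use warnings;

# Hypothetical versioned pattern list: the index is the Dwarf version.
# Versions 2 and 4 share one illustrative pattern here.
my @my_re_item_start;
$my_re_item_start[2] = $my_re_item_start[4] =
    qr/^\s*<(\d+)><([0-9a-f]+)>:\s+Abbrev Number:\s+(\d+)\s+\((DW_TAG_\w+)\)/;

# Return the four captures described for @re_item_start as a hash
# reference, or nothing if the line does not start an item.
sub parse_item_start {
    my ($dwarf_version, $line) = @_;
    my $re = $my_re_item_start[$dwarf_version];
    die "unsupported Dwarf version $dwarf_version\n" unless defined $re;
    return unless $line =~ $re;
    return { level => $1, id => $2, type_id => $3, tag => $4 };
}

my $item = parse_item_start(4, ' <1><2d>: Abbrev Number: 2 (DW_TAG_base_type)');
print "level $item->{level}, ID $item->{id}, tag $item->{tag}\n";
```

Leaving the slots of unsupported versions undefined makes it easy to reject them explicitly, as the die statement in the sketch does.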
The last two lists are a bit different; they control what is parsed by this module. They are also arrays using the Dwarf version as index. What is inside each of these arrays is described below:
- @tag_needs_attributes
  holds hashes of the type tags that are processed. Each element points to a list of the absolutely needed attributes for that type of item.
- The second array
  is a list of the type tags (see @re_item_start above) that are currently ignored.
new - get readelf's debug info section into an object
$debug_info = new Parse::Readelf::Debug::Info($file_name,
[$line_info]);
example:
$debug_info1 = new Parse::Readelf::Debug::Info('program');
$line_info = new Parse::Readelf::Debug::Line('module.o');
$debug_info2 = new Parse::Readelf::Debug::Info('module.o',
$line_info);
parameters:
$file_name name of executable or object file
$line_info a L<Parse::Readelf::Debug::Line> object
description:
This method parses the output of C<readelf --debug-dump=info> and
stores its interesting details internally to be accessed later by
getter methods described below.
If no L<Parse::Readelf::Debug::Line> object is passed as second
parameter the method creates one internally as it is needed to
locate the source files.
global variables used:
The method uses all of the variables described above in the
L</"EXPORT"> section.
returns:
The method returns the blessed Parse::Readelf::Debug::Info object
or throws an exception in case of an error.
item_ids - get object ID(s) of (named) item
@item_ids = $debug_info->item_ids($identifier);
example:
@item_ids = $debug_info->item_ids('my_variable');
parameters:
$identifier name of item (e.g. variable name)
description:
This method returns the internal item IDs of all identifiers with
the given name as an array.
returns:
If a name is unique, the method returns an array with exactly one
element. If a name does not exist, it returns an empty array.
Otherwise an array containing the IDs of all matching items is
returned.
item_ids_matching - get object IDs of items matching constraints
@item_ids = $debug_info->item_ids_matching($re_name, [$re_type_tag]);
example:
@some_item_ids = $debug_info->item_ids_matching('^var', 'variable');
@all_item_ids = $debug_info->item_ids_matching('');
@all_structure_ids = $debug_info->item_ids_matching('', '.*structure.*');
parameters:
$re_name regular expression matching name of items
$re_type_tag regular expression matching type tag of items
description:
This method returns an array containing the internal item ID of
all identifiers that match both the regular expression for their
name and their type tags. Note that an empty string will match
any name or type tag, even missing ones. Also note that type tags
in Dwarf 2 always begin with C<DW_TAG_>.
returns:
If a name is unique, the method returns an array with exactly one
element. If a name does not exist, it returns an empty array.
Otherwise an array containing the IDs of all matching items is
returned. The IDs are sorted alphabetically according to their
names.
structure_layout - get structure layout of variable or data type
@structure_layout =
$debug_info->structure_layout($id, [$initial_offset]);
example:
@structure_layout1 =
$debug_info->structure_layout('1a8');
@structure_layout2 =
$debug_info->structure_layout('2f0', 4);
parameters:
$id internal ID of item
$initial_offset offset to be used for the beginning of the layout
description:
This method returns the structure layout of a variable or data
type with the given item ID (which can be found with the method
L<"item_ids"> or L<"item_ids_matching">). For each element of a
structure it returns a sextuple containing (in that order)
I<relative level>, I<name>, I<data type>, I<size>, I<location in
source file> and I<offset> although some of the information might
be missing (which is indicated by an empty string). For bit
fields two additional fields are added: I<bit-size> and
I<bit-offset> (either both are defined or none at all).
I<location in source file> is a triplet. The first two elements
(object ID of module and source number) are needed to get the file
name from
L<Parse::Readelf::Debug::Line::file|Parse::Readelf::Debug::Line/file>.
The third is the line number within the source. If in Dwarf 4 the
last two elements are not provided, they will be replaced by the
fixed string C<signature> and the signature ID of the compilation
unit instead.
Note that named indices for the result are defined in the
L</":constants"> export (see above).
returns:
The method returns an array of the sextuples described above.
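As a rough sketch of how such a result could be consumed, the following self-contained example formats a layout as indented text. It assumes that each entry is an array reference, that the relative level is a small integer starting at 0 and that the location is an array reference holding the triplet. The sample data is hypothetical and is not output of the module. The IDX_ names are local to this example and merely follow the field order documented above.

```perl
use strict;
use warnings;

# Index positions in the order documented for structure_layout.
# These names are local to this sketch.
use constant {
    IDX_LEVEL   => 0, IDX_NAME     => 1, IDX_TYPE   => 2,
    IDX_SIZE    => 3, IDX_LOCATION => 4, IDX_OFFSET => 5,
    IDX_BITSIZE => 6, IDX_BITOFFSET => 7,
};

# Turn layout entries into one indented text line per element.
sub format_layout {
    my @lines;
    foreach my $entry (@_) {
        my $line = sprintf('%s%s: %s, size %s, offset %s',
                           '  ' x $entry->[IDX_LEVEL],
                           @{$entry}[IDX_NAME, IDX_TYPE, IDX_SIZE, IDX_OFFSET]);
        # Bit fields carry two additional elements.
        if (defined $entry->[IDX_BITSIZE]) {
            $line .= sprintf(', bit-size %s, bit-offset %s',
                             @{$entry}[IDX_BITSIZE, IDX_BITOFFSET]);
        }
        push @lines, $line;
    }
    return @lines;
}

# Hypothetical sample data shaped like the described result.
my @layout = (
    [0, 'l_object2a', 'Structure2',   12, [1, 1, 42], 0],
    [1, 'm_count',    'int',           4, [1, 1, 43], 0],
    [1, 'm_flag',     'unsigned int',  4, [1, 1, 44], 4, 1, 31],
);
my @lines = format_layout(@layout);
print "$_\n" foreach @lines;
```

With the real module the argument would be the result of $debug_info->structure_layout($id) instead of the sample array, and the named indices from the :constants export could replace the local IDX_ constants.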
KNOWN BUGS
For references as well as pointers outside of structures the size of the referenced data is shown, not the internal size of the reference itself. This is a feature. (Note that this means that pointers to functions outside of structures always have the size 0.)
Only Dwarf versions 2 and 4 are currently supported. Please contact the author for other versions and provide some example readelf outputs. Without examples support of other versions will not be possible.
This has only been tested in a Unix-like environment, namely Linux and Solaris.
SEE ALSO
Parse::Readelf, Parse::Readelf::Debug::Line and the readelf man page
AUTHOR
Thomas Dorner, <dorner (AT) cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2007-2020 by Thomas Dorner
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.6.1 or, at your option, any later version of Perl 5 you may have available.