[Dwarf-Discuss] decoding of form entries

Michael Eager eager@eagercon.com
Mon Aug 2 18:16:04 GMT 2010


Mathieu Lacage wrote:
> On Mon, Aug 2, 2010 at 18:07, Michael Eager <eager at eagercon.com> wrote:
> 
>>> So, I have a very practical question: if I want to get the value of
>>> this form, I read the reference, that gives me a new location to parse
>>> in the dwarf sections. What is the content of that new location ? Does
>>> it point to an entry (meaning, I am going to find there an abbrev
>>> code) ? If so, how do I know which attribute in this entry is going to
>>> contain the value of my original attribute ? Would it be expected to
>>> be the same attribute with a different non-reference form ?
>> In the most common cases, the "entity" will be another DIE in the
>> same compilation unit, which you will need to parse to find the
>> value of the attribute you are looking for.
> 
> That is what I suspected, but I am not sure I really understand your
> answer, so let me restate my question.
> 
> How do I know which attribute to look for in the new DIE ? Is it
> expected that the new DIE will contain the same attribute I looked at
> originally but with a non-reference form ? For the sake of discussion,
> what about DW_AT_byte_size ?

Yes, generally you would search the referenced DIE to find the same
attribute.  In some cases, like DW_AT_specification, the referenced DIE
supplies multiple attributes rather than a single value.

> This attribute is defined in p9 (dw4) with a link to p75 section 5.1
> (btw, that link in the dw4 pdf does not work while it works fine in the
> dw3 pdf). 

We had a number of issues generating the PDF, especially getting
the links to work correctly.  Seems we missed fixing some of the
link problems.

> That section says:
> "A base type entry has either a DW_AT_byte_size attribute or a
> DW_AT_bit_size attribute whose integer constant value (see Section
> 2.21) is the amount of storage needed to hold a value of the type."
> 
> Ok, so, section 2.21 explains how to read the value:
> 
> "Many debugging information entries allow either a DW_AT_byte_size
> attribute or a DW_AT_bit_size attribute, whose integer constant value
> (see Section 2.19) specifies an amount of storage. The value of the
> DW_AT_byte_size attribute is interpreted in bytes and the value of the
> DW_AT_bit_size attribute is interpreted in bits."
> 
> Which, really, does not tell me how to read the value. But, hey,
> section 2.19 is supposed to tell me, hence, my original email.

Section 2.19 tells you what DW_AT_byte_size means, rather
than how to find the value.  There are several ways that a value
may be represented:  explicitly in the TAG DIE, implicitly in an
abbreviation, or by a reference to another DIE.  They are described
in other parts of the DWARF spec, such as Chapter 7.
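
As a rough sketch of the first step a reader might take, the form code
from the abbreviation can be mapped to an attribute class before deciding
how to obtain the value.  The form codes below are the DWARF 4 encodings
as I recall them from Chapter 7; the table is deliberately incomplete, and
the names DW_FORM_CLASS and form_class are made up for this illustration.

```python
# Partial map from DWARF 4 form codes to attribute classes.
# Only forms relevant to size/bound-style attributes are listed;
# anything else is reported as "unknown" rather than guessed at.
DW_FORM_CLASS = {
    0x0b: "constant",   # DW_FORM_data1
    0x05: "constant",   # DW_FORM_data2
    0x06: "constant",   # DW_FORM_data4
    0x07: "constant",   # DW_FORM_data8
    0x0d: "constant",   # DW_FORM_sdata
    0x0f: "constant",   # DW_FORM_udata
    0x18: "exprloc",    # DW_FORM_exprloc
    0x10: "reference",  # DW_FORM_ref_addr (offset from start of .debug_info)
    0x11: "reference",  # DW_FORM_ref1 (offset within the compilation unit)
    0x12: "reference",  # DW_FORM_ref2
    0x13: "reference",  # DW_FORM_ref4
    0x14: "reference",  # DW_FORM_ref8
    0x15: "reference",  # DW_FORM_ref_udata
    0x20: "reference",  # DW_FORM_ref_sig8 (type signature)
}


def form_class(form):
    """Return the attribute class for a form code, or "unknown"."""
    return DW_FORM_CLASS.get(form, "unknown")
```

A constant is used directly, an exprloc has to be evaluated as a DWARF
expression, and a reference has to be followed to another DIE.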

> In this case, my question is that I really don't see how I would have
> to parse a DW_AT_byte_size attribute encoded with a form of type
> "reference". To make my question crystal clear, if the value of the
> attribute in the original DIE is encoded with a reference form, shall
> I expect the new DIE referenced by this form to contain another
> DW_AT_byte_size attribute that is encoded with a non-reference form ?

Yes, although it is conceivable that the referenced DIE would
also contain a DW_AT_byte_size attribute which is a reference,
as you mention below.

> I guess that the new DIE might be allowed to contain a
> reference-encoded DW_AT_byte_size but that the dw4 spec should ensure
> that, at some point while parsing the chain of referenced DIEs, _one_
> of them will contain a non-reference form version of that
> DW_AT_byte_size attribute.

The DWARF Specification is permissive -- it tells you how to interpret
the DWARF information.  Very few requirements are placed on producers,
primarily that they generate DWARF in the prescribed format.  DWARF doesn't
have any way to ensure that a producer generates DWARF data which makes
sense or which doesn't have errors.  This is a Quality of Implementation
issue.

A DWARF producer which generated a sequence of references not
terminating in a non-reference would probably be described as
having a bug.  Obviously, DWARF cannot require that producers do
not have bugs.

> (note, that I picked this attribute arbitrarily: the same question
> holds for every attribute mentioned in 2.19, that is: DW_AT_allocated,
> DW_AT_associated, DW_AT_bit_offset, DW_AT_bit_size, DW_AT_byte_size,
> DW_AT_count, DW_AT_lower_bound, DW_AT_byte_stride, DW_AT_bit_stride,
> DW_AT_upper_bound)

Same answer applies to all of these attributes.

> I am not trying to nitpick here: I am just reading the spec and trying
> to figure out how to implement a reader, so I would like to know what I
> need to do to implement a robust reader that does not start to barf
> when the compiler generates some weird, unexpected but perfectly valid
> data. They are out there trying to get me :)

You need to code your DWARF reader defensively, so that it either does
not make assumptions about the input data or it validates these assumptions.
When following a reference, you might keep a count of how long the reference
chain is and issue a warning or error if the count is too high.  A different
Quality of Implementation issue would be what that maximum count should be.
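
A minimal sketch of that idea, assuming the DIEs have already been parsed
into a dictionary keyed by offset, with each attribute stored as a
(kind, value) pair where kind is "ref" or "const".  That in-memory layout,
the names resolve_attr and DwarfRefError, and the limit of 16 are all
invented for this example and are not taken from the DWARF spec or from
any existing library.

```python
MAX_REF_DEPTH = 16  # arbitrary; the right limit is a judgment call


class DwarfRefError(Exception):
    """Raised when a reference chain cannot be resolved to a value."""


def resolve_attr(dies, offset, attr, max_depth=MAX_REF_DEPTH):
    """Follow references from the DIE at `offset` until `attr` has a
    non-reference value, validating every step along the way."""
    seen = set()
    for _ in range(max_depth):
        if offset in seen:
            raise DwarfRefError("reference cycle at DIE 0x%x" % offset)
        seen.add(offset)

        die = dies.get(offset)
        if die is None:
            raise DwarfRefError("no DIE at offset 0x%x" % offset)
        if attr not in die:
            raise DwarfRefError("DIE 0x%x has no %s" % (offset, attr))

        kind, value = die[attr]
        if kind != "ref":
            return value      # found a non-reference form
        offset = value        # keep following the chain

    raise DwarfRefError("chain for %s exceeds %d DIEs" % (attr, max_depth))
```

For example, with dies = {0x10: {"DW_AT_byte_size": ("ref", 0x20)},
0x20: {"DW_AT_byte_size": ("const", 4)}}, calling
resolve_attr(dies, 0x10, "DW_AT_byte_size") returns 4.  Cycles, dangling
references, and over-long chains raise DwarfRefError instead of looping.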

-- 
Michael Eager	 eager at eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077



