[Dwarf-Discuss] Location list ranges vs. containing lexical block DW_AT_ranges

Wed Mar 24 20:05:58 GMT 2010

Hi Jakub --

I've made a few comments below.

Jakub Jelinek wrote:
> Hi!
> 
> We've been discussing recently what can a debug info consumer assume about
> location list ranges (or parts thereof) that are outside of containing
> DW_TAG_lexical_block/DW_TAG_inlined_subroutine DW_AT_ranges.
> As this is something that all the producers and consumers should better
> agree on, I'm raising the thing here.
> 
> Consider a lexical block with DW_AT_ranges
> 10 .. 20
> 30 .. 40
> 50 .. 60
> and DW_TAG_variable as child of that DW_TAG_lexical_block.  The compiler
> finds out that the variable (or its value) lives from PC 12 to 22 in
> reg1, from PC 23 to 25 in reg2, from PC 25 to 28 in reg3, from PC 28 to
> 35 again in reg1,  from 48 to 65 in reg2 and from 68 to 95 has its
> value in reg4 - 16.  GCC would emit in this case in .debug_loc
> 12 .. 22 DW_OP_reg1
> 23 .. 25 DW_OP_reg2
> 25 .. 28 DW_OP_reg3
> 28 .. 35 DW_OP_reg1
> 48 .. 65 DW_OP_reg2
> 68 .. 95 DW_OP_breg4 -16 DW_OP_stack_value

It might help if you could sketch out source code that would generate
this situation.

> That is unfortunately very large, and for locations where the variable is
> not in lexical scope the debug info consumer shouldn't be able to tell
> anything about the variable, because it is not in scope.

Debuggers frequently have options to display the values of variables
which are not in scope, even if a programmer cannot write code to
reference the variable.
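
To make that concrete, here is a minimal sketch in Python of a consumer
looking up the variable's location at a given PC, independently of whether
that PC falls inside the lexical block.  The table is the unoptimized
location list quoted above; the names location_at and in_scope are made up
for this illustration and are not taken from any debugger.  Ranges are
treated as half-open (the end address is excluded), as DWARF does.

```python
# Unoptimized location list from the example: (begin, end, expression).
# Each range is half-open: begin is included, end is excluded.
LOC_LIST = [
    (12, 22, "DW_OP_reg1"),
    (23, 25, "DW_OP_reg2"),
    (25, 28, "DW_OP_reg3"),
    (28, 35, "DW_OP_reg1"),
    (48, 65, "DW_OP_reg2"),
    (68, 95, "DW_OP_breg4 -16 DW_OP_stack_value"),
]

# DW_AT_ranges of the containing lexical block, from the example.
SCOPE_RANGES = [(10, 20), (30, 40), (50, 60)]


def location_at(pc, loc_list):
    """Return the location expression covering pc, or None if none does."""
    for begin, end, expr in loc_list:
        if begin <= pc < end:
            return expr
    return None


def in_scope(pc, ranges):
    """Return True if pc lies inside any of the block's ranges."""
    return any(lo <= pc < hi for lo, hi in ranges)


# The two questions are answered separately: PC 23 is out of scope,
# yet the location list still says where the value is stored.
for pc in (15, 23, 70):
    print(pc, in_scope(pc, SCOPE_RANGES), location_at(pc, LOC_LIST))
```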

> Therefore I wrote a GCC patch to crop .debug_loc lists to containing
> lexical scope's DW_AT_ranges (or DW_AT_low_pc..DW_AT_high_pc interval if
> the scope isn't fragmented).  .debug_loc list changed to:
> 12 .. 35 DW_OP_reg1
> 48 .. 65 DW_OP_reg2
> but the GDB people complained that this might be undesirable for watchpoints
> on the variable, where if a watchpoint is added on the variable in the first
> fragment that changes to the variable won't be noticed (resp. will be
> noticed in wrong spots) when the variable is not in the scope.

I agree with the GDB folks.  More to the point, the cropped location list
doesn't represent the generated code accurately.

> So, my question for this list is, is the above a kosher optimization the
> debug info producer can do?  Note that the variable resp. its value
> doesn't really live in reg1 from 12 to 35, but only in that range in
> the spots where it is in scope.  If it is valid, then debug info consumers
> need to mask .debug_loc ranges with .debug_ranges ranges to find out
> where a variable is actually live somewhere (of course for most operations
> except watchpoints this works without any efforts - whenever the
> variable is looked up say on PC 23, it won't be found as it is not in scope,
> so its location list won't even be considered).

I don't think that this is a valid optimization.  The basic premise
of optimization is that the results are the same whether the
optimization is performed or not.  In your example, a debugger
might give different results depending on whether you generated
an optimized location list.  At best, a debugger would say that it
couldn't find the value of a variable which was still live.  At worst,
it would display inaccurate values and perhaps allow the programmer
to inadvertently modify the value of a different variable than the
one intended.
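
As a rough illustration, the Python sketch below compares the unoptimized
list with the cropped one from your patch, PC by PC, and collects the
addresses where the two give different answers.  The helper names are
invented for this example, and the PC interval 0..99 is just a range wide
enough to cover the example.

```python
# Unoptimized list and the cropped list, both from the example above.
# Entries are (begin, end, expression) with half-open ranges.
ORIGINAL = [
    (12, 22, "DW_OP_reg1"),
    (23, 25, "DW_OP_reg2"),
    (25, 28, "DW_OP_reg3"),
    (28, 35, "DW_OP_reg1"),
    (48, 65, "DW_OP_reg2"),
    (68, 95, "DW_OP_breg4 -16 DW_OP_stack_value"),
]
CROPPED = [
    (12, 35, "DW_OP_reg1"),
    (48, 65, "DW_OP_reg2"),
]
SCOPE_RANGES = [(10, 20), (30, 40), (50, 60)]


def location_at(pc, loc_list):
    """Return the location expression covering pc, or None if none does."""
    for begin, end, expr in loc_list:
        if begin <= pc < end:
            return expr
    return None


def in_scope(pc, ranges):
    """Return True if pc lies inside any of the block's ranges."""
    return any(lo <= pc < hi for lo, hi in ranges)


def disagreements(list_a, list_b, pcs):
    """Return (pc, location in a, location in b) wherever the lists differ."""
    result = []
    for pc in pcs:
        a, b = location_at(pc, list_a), location_at(pc, list_b)
        if a != b:
            result.append((pc, a, b))
    return result


diffs = disagreements(ORIGINAL, CROPPED, range(0, 100))
for pc, original, cropped in diffs:
    print(pc, original, cropped)
```

Every disagreement falls outside the block's DW_AT_ranges, which is why a
scope-respecting lookup never notices.  A consumer that evaluates the
variable at PC 23, though, would read reg1 from the cropped list while the
value is actually in reg2.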

I'm very sympathetic about wanting to decrease the size of the debug
information by compressing or optimizing the data in some fashion.
But the optimized data has to carry the same descriptive content as
the original.

> If it is not valid,
> the debug info producer would need to modify its output such that
> either .debug_loc locations are always masked with the containing
> DW_AT_ranges, or perhaps can say the variable is live in between somewhere,
> but only when it is actually live in there.

The producer should generate a location list which describes where the
variable is stored, even if not in scope.  I would be very reluctant to
exclude ranges where the variable is live but not in scope.  A debugger
could conceivably try to answer the question of how a variable got its
current value by reverse execution or inspecting the locations where the
variable was stored.  Or, as mentioned earlier, the debugger would not be
able to correctly instrument the code to implement watchpoints.

> Example of the masking to DW_AT_ranges by the producer would be
> 12 .. 20 DW_OP_reg1
> 30 .. 35 DW_OP_reg1
> 50 .. 60 DW_OP_reg2
> and example of keeping the variable alive in between only when it is
> actually live there would be either the first .debug_loc snippet above (the
> totally unoptimized one), or e.g.
> 12 .. 22 DW_OP_reg1
> 28 .. 35 DW_OP_reg1
> 48 .. 65 DW_OP_reg2
> (i.e. it would just remove location ranges which don't overlap DW_AT_ranges
> at all).  What do other producers resp. consumers do/assume here?
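
For reference, the two alternatives quoted above can be sketched in Python
as follows.  This is only an illustration under the assumption of half-open
ranges; the function names are invented here and this is not a description
of how GCC implements either scheme.

```python
# Unoptimized location list and the block's DW_AT_ranges from the example.
# Entries are (begin, end, expression); all ranges are half-open.
LOC_LIST = [
    (12, 22, "DW_OP_reg1"),
    (23, 25, "DW_OP_reg2"),
    (25, 28, "DW_OP_reg3"),
    (28, 35, "DW_OP_reg1"),
    (48, 65, "DW_OP_reg2"),
    (68, 95, "DW_OP_breg4 -16 DW_OP_stack_value"),
]
SCOPE_RANGES = [(10, 20), (30, 40), (50, 60)]


def mask_to_ranges(loc_list, ranges):
    """Clip every location entry to the scope ranges it intersects."""
    masked = []
    for begin, end, expr in loc_list:
        for lo, hi in ranges:
            clipped_begin, clipped_end = max(begin, lo), min(end, hi)
            if clipped_begin < clipped_end:  # non-empty intersection
                masked.append((clipped_begin, clipped_end, expr))
    return masked


def drop_non_overlapping(loc_list, ranges):
    """Keep entries unchanged, but only those touching some scope range."""
    return [
        (begin, end, expr)
        for begin, end, expr in loc_list
        if any(begin < hi and lo < end for lo, hi in ranges)
    ]


print(mask_to_ranges(LOC_LIST, SCOPE_RANGES))
print(drop_non_overlapping(LOC_LIST, SCOPE_RANGES))
```

Run on the example, the first function yields the three masked entries and
the second yields the three-entry list that only drops ranges with no
overlap at all.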

Location lists and aranges are mutually independent.  The .debug_aranges
section exists to support accelerated access to the descriptive information
in the .debug_info section.  As such, it is not required, and some debuggers
ignore it.

-- 
Michael Eager	 eager at eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077