[Dwarf-Discuss] Location list ranges vs. containing lexical block DW_AT_ranges

Mon Mar 29 09:34:26 GMT 2010

On Sun, Mar 28, 2010 at 07:13:47PM -0400, Ron Brender wrote:
> >That is unfortunately very large, and for locations where the variable is
> >not in lexical scope the debug info consumer shouldn't be able to tell
> >anything about the variable, because it is not in scope.
> 
> Is this example "real", as in something gcc actually generated? Or
> is it contrived for the purposes of discussion here?

The example was made up for the purposes of discussion.
Here is a real-world example (well, a testcase distilled from one)
that led to this:

/* PR debug/43058 */
/* { dg-do compile } */
/* { dg-options "-g -O2" } */

extern void *f1 (void *, void *, void *);
extern void *f2 (const char *, int, int, int, void *(*) ());
extern void *f3 (const char *);
extern void *f4 (void *s);
extern void *f5 (void *);

void test (void)
{
#define X1 f1 (f2 ("a", 1, 0, 0, f5), \
               f4 (({ const char *a = "b"; f3 (a); })), \
               ({ const char *a = "c"; f3 (a); }));
#define X2 X1 X1 X1 X1 X1 X1 X1 X1 X1 X1
#define X3 X2 X2 X2 X2 X2 X2 X2 X2 X2 X2
#define X4 X3 X3 X3 X3 X3 X3 X3 X3 X3 X3
  X4 X4
}

This testcase contains a huge number of a variables, each with a
very short-lived lexical scope.
Let's say the compiled code looks like some prologue, then a huge
series of repeated
        xorl    %edx, %edx
        movl    $1, %esi
        movl    $.LC2, %edi
        call    f2
        movq    %rbp, %rdx
        movq    %rbx, %rsi
        movq    %rax, %rdi
        call    f1
        movl    $.LC0, %edi
        call    f3
        movl    $.LC1, %edi
        movq    %rax, %rbp
        call    f3
        movq    %rax, %rdi
        call    f4
        xorl    %ecx, %ecx
        movl    $f5, %r8d
        movq    %rax, %rbx
snippets (sure, the compiler could try to make a loop out of
the spaghetti code in this testcase, but in the real-world testcase
the function arguments were all slightly different) and then some
epilogue.  The a variables are all optimized out (well, can be),
and therefore GCC attempts to find out which location contains
the value of each variable.  As for DW_AT_ranges of the blocks:
movl $.LC0, %edi and call f3 belong to one of the lexical
blocks, movl $.LC1, %edi and call f3 belong to the other one (this
one has 2 fragments).  The code that finds where each value lives
sees that a lives in %rdi after the movl $.LCN, %edi instruction
until the middle of the call insn (i.e. if the debugger stops on the call
insn, it sees it in the register; if it stops somewhere in f3, it shouldn't,
as %rdi isn't a call-saved register), and has the value .LC0 (resp. .LC1)
everywhere else.  As the code prefers register or memory locations
over other values (the former can be modified, so they are usable for
changing the variables too), without any pruning this leads to a location
list for the first a having 4000 ranges (alternating between register %rdi
and value .LC0), the third a has 3998 ranges, the fifth has 3996 ranges, etc.
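
A rough sketch of that arithmetic, in Python.  This is only a toy model
and not what GCC does internally: the names (loclist_for, SLOTS,
N_SNIPPETS) and the 4-slot layout of each snippet are made up for
illustration.  It assumes 2000 copies of the snippet (X4 X4 expands to
2000 copies of X1) and looks only at the a variables set to .LC0.

```python
# Toy model of the unpruned location list (not GCC's actual algorithm).
# Each repeated snippet is modelled as 4 abstract "slots":
#   slot 0: movl $.LC0, %edi     slot 1: call f3
#   slots 2-3: rest of the snippet, where %rdi no longer holds .LC0
SLOTS = 4            # hypothetical granularity, chosen for illustration
N_SNIPPETS = 2000    # X4 X4 expands to 2 * 10**3 copies of X1


def loclist_for(first_snippet, n_snippets=N_SNIPPETS):
    """Location list for the `a` set to .LC0 in snippet `first_snippet`.

    Without pruning the value keeps being tracked through every later
    snippet: a register entry while %rdi holds .LC0, then a constant
    entry until the next reload of the register.
    """
    end_of_code = n_snippets * SLOTS
    entries = []
    for i in range(first_snippet, n_snippets):
        base = i * SLOTS
        # After the movl until the middle of the call: value is in %rdi.
        entries.append((base + 1, base + 2, "reg %rdi"))
        # From there until after the next movl: only the constant is known.
        next_reload = min(base + SLOTS + 1, end_of_code)
        entries.append((base + 2, next_reload, "value .LC0"))
    return entries


for k in (0, 1, 2):
    print(k, len(loclist_for(k)))
```

Under these assumptions loclist_for(0), loclist_for(1) and loclist_for(2)
have 4000, 3998 and 3996 entries, i.e. the counts for the first, third
and fifth a.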

So IMHO we want to have some kind of pruning.  For lexical scopes not
in loops we could easily just prune ranges after the
last fragment of the lexical block (using basic block dominance to find which
basic blocks come after all the fragments).  For lexical scopes which
have just one fragment it would perhaps be possible to prune everything not
in the range of the lexical scope (when you leave that range, you should
be leaving the lexical scope; if you ever reenter it, it would be because
the scope was in some loop and was entered again).  For fragmented lexical
scopes in loops, especially with aggressive scheduling, it gets harder
to differentiate between DW_AT_ranges gaps that lie in between fragments
of the same lexical block and gaps where control flow left that lexical block
and reentered it again.  Perhaps other compilers track this info; GCC
doesn't (all it records is which lexical scope each instruction belongs to
and the control flow graph, but with the exception of variable length
arrays there is nothing that would help to find the spots where a lexical
scope has been left for good).
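
The simplest case (a scope that is not in a loop) could be sketched
roughly as follows.  This is a simplification, not GCC code: the function
name and the numbers are hypothetical, and it assumes straight-line code,
so that "address is past the last fragment" can stand in for the basic
block dominance check.

```python
def prune_after_last_fragment(loclist, fragments):
    """Drop location-list coverage past the end of the last scope fragment.

    loclist:   list of (low_pc, high_pc, location), half-open ranges
    fragments: list of (low_pc, high_pc) from the block's DW_AT_ranges

    Assumption: straight-line code with no loops, so address order
    stands in for dominance between basic blocks.
    """
    scope_end = max(high for _, high in fragments)
    pruned = []
    for low, high, loc in loclist:
        if low >= scope_end:
            continue  # entirely after the scope was left for good
        pruned.append((low, min(high, scope_end), loc))
    return pruned


# Hypothetical numbers: a scope with fragments [10,14) and [20,24), and a
# variable whose value keeps bouncing between a register and a constant.
loclist = [(11, 12, "reg"), (12, 21, "const"), (21, 22, "reg"),
           (22, 31, "const"), (31, 32, "reg"), (32, 41, "const")]
print(prune_after_last_fragment(loclist, [(10, 14), (20, 24)]))
```

Entries inside the gap between the two fragments are kept untouched; only
what comes after the last fragment is cut off.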

> I ask because it is important to sort out what it means for a
> variable to "have a value" outside its scope.
> 
> Q1: What does it mean for a range to extend beyond the scope of the
> variable?
> 
> This can easily occur when the value is still sitting around in a
> register after the scope is closed. Of course, the value is "dead"
> before or at scope end. The value will continue to be available
> until that register is reused for some other purpose.

Yes.  At least the less aggressive pruning would still say the
variable lives in that register, but wouldn't try to tell the
debug info consumer that after it stopped living in register A it
will be available in register B.

> Q2: What does it mean for a range to begin before the beginning of
> the scope of a variable?
> 
> This is a puzzle. The beginning of a variable range is either the
> beginning of the scope (when the value is undefined) or the result
> of the first assignment, which necessarily occurs inside the scope.
> 
> The only way that I can imagine that the range starts before the scope
> is if the variable is assigned from another variable with a larger
> scope and the two variables happen to be allocated to the same
> register! In this case there is no explicit assignment to begin the
> lifetime--it happens implicitly for "free".

Either that, or e.g. the variable is assigned a constant value at the
beginning of the scope and the entry of the scope doesn't happen in the lowest
pc fragment (the first DW_AT_ranges range) of the DW_TAG_lexical_block.  Then
if the variable has the same value at the end of the first fragment, a
.debug_loc range could say the location of the variable is, say,
DW_OP_lit5 DW_OP_stack_value from, say, the middle of the lowest pc
fragment until the middle of the fragment in which the scope is actually
entered.
{
  int v = 6;
  if (something)
    code1;
  code2;
  v = 8;
  code3;
}
with generated code:
.L4:
code1...
jmp .L5
.L6:
unrelated code
.L1: ! here is this lexical scope actually entered
test something
jne .L4
.L5:
code2...
.L8:
code3...
.L7:
DW_AT_ranges for the lexical block is .L4-.L6 and .L1-.L7, and
v can have the location list
.L4-.L8 DW_OP_lit6 DW_OP_stack_value
.L8-.L7 DW_OP_lit8 DW_OP_stack_value
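
To see what this means for a consumer, here is a small sketch in Python.
The addresses given to the labels are made up, and raw_location /
visible_location are hypothetical helper names, not the API of any real
debugger.  It only illustrates that the location list covers the
unrelated code between .L6 and .L1 while DW_AT_ranges does not.

```python
# Made-up addresses for the labels in the example above.
L4, L6, L1, L5, L8, L7 = 0x00, 0x10, 0x20, 0x30, 0x40, 0x50

SCOPE_RANGES = [(L4, L6), (L1, L7)]      # DW_AT_ranges of the lexical block
V_LOCLIST = [(L4, L8, "DW_OP_lit6 DW_OP_stack_value"),
             (L8, L7, "DW_OP_lit8 DW_OP_stack_value")]


def in_ranges(pc, ranges):
    """True if pc falls into one of the half-open [low, high) ranges."""
    return any(low <= pc < high for low, high in ranges)


def raw_location(pc):
    """What the location list alone says at pc (None if no entry)."""
    for low, high, expr in V_LOCLIST:
        if low <= pc < high:
            return expr
    return None


def visible_location(pc):
    """What a consumer that honours the scope's ranges would report."""
    return raw_location(pc) if in_ranges(pc, SCOPE_RANGES) else None


pc = 0x18  # somewhere in the unrelated code between .L6 and .L1
print(raw_location(pc), visible_location(pc))
```

At pc 0x18 the location list alone yields DW_OP_lit6 DW_OP_stack_value,
but a consumer that checks the scope first reports nothing for v there.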

> Q3: What does it mean for a range to begin *and* end outside of the
> scope of the variable. For example, the
> 
> > 23 .. 25 DW_OP_reg2
> 
> This makes no sense to me. If gcc really generated such an example,
> I question whether there is a gcc bug. If not, then let's remove this
> from the discussion--or demonstrate how it could plausibly arise
> from a reasonable compiler implementation.

As I said earlier, for variables that have really been optimized out,
GCC just records in special debug statements that the variable
has such and such a value at this point (a constant, some non-optimized-out
variable, an arbitrary expression containing constants and non-optimized-out
variables, ...) and then tracks where that value lives.

> >Therefore I wrote a GCC patch to crop .debug_loc lists to containing
> >lexical scope's DW_AT_ranges (or DW_AT_low_pc..DW_AT_high_pc interval if
> >the scope isn't fragmented).  .debug_loc list changed to:
> >12 .. 35 DW_OP_reg1
> >48 .. 65 DW_OP_reg2
> 
> This cannot be correct. Just because the value of the variable is
> available in reg1 in 12..22 and also in 28..35 does not mean that
> the value is available in reg1 in 23..27. But that is exactly what
> this description seems to imply.
> 
> I think I see where you are going--if the part of a range that is
> outside of the containing scope must be (actively) ignored by a
> debugger, then *any* description for what is happening outside that
> scope is acceptable (neither right nor wrong, because irrelevant).

I wanted to discuss whether that is acceptable or not (because that is
of course the shortest possible representation).  As it seems it isn't
acceptable, the question is just whether it is acceptable to use the less
aggressive pruning variant (which never lies about variable locations
outside of DW_AT_ranges, it just will in many cases (only outside of the
ranges) say the variable has been optimized out).  If even that isn't
acceptable, sure, we could try harder to find out where a scope is really
left, not to be reentered again (or, if reentered, with everything undefined):
for (...)
  {
    {
      int v = 6;
      code
    }
    code2
  }
In code2 we surely don't need to track where v lives, as long as
scheduling didn't mix code and code2 instructions (it often does,
so we don't want to track v after the last code instruction).
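
The difference between the two pruning variants can be sketched like
this.  It is an illustration only, not the actual patch: the function
names are hypothetical, the numbers are invented (loosely modelled on
the ones quoted above, but with half-open ranges), and the location list
is assumed to be sorted by address.

```python
def overlaps(entry, fragments):
    """True if the entry's [low, high) range touches any scope fragment."""
    low, high, _ = entry
    return any(low < f_high and f_low < high for f_low, f_high in fragments)


def prune_less_aggressive(loclist, fragments):
    """Drop entries that overlap no fragment; keep the others unchanged.

    Outside the scope this can only turn a known location into
    "optimized out", it never claims a wrong location.
    """
    return [e for e in loclist if overlaps(e, fragments)]


def prune_aggressive(loclist, fragments):
    """Additionally merge neighbouring entries naming the same location.

    The merged entry also spans the gap between fragments, where its
    claim may be false; that is harmless only if consumers ignore
    locations outside of the scope.
    """
    merged = []
    for low, high, loc in prune_less_aggressive(loclist, fragments):
        if merged and merged[-1][2] == loc:
            merged[-1] = (merged[-1][0], high, loc)
        else:
            merged.append((low, high, loc))
    return merged


# Invented example: scope fragments and a location list, half-open ranges.
fragments = [(12, 23), (28, 36), (48, 66)]
loclist = [(12, 23, "DW_OP_reg1"), (23, 26, "DW_OP_reg2"),
           (28, 36, "DW_OP_reg1"), (48, 66, "DW_OP_reg2")]
print(prune_less_aggressive(loclist, fragments))
print(prune_aggressive(loclist, fragments))
```

With these numbers the less aggressive variant keeps three entries and
leaves 23..28 undescribed, while the aggressive one ends up with two
entries, the first of which claims DW_OP_reg1 for the whole of 12..36.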

> 
> >but the GDB people complained that this might be undesirable for watchpoints
> >on the variable, where if a watchpoint is added on the variable in the first
> >fragment that changes to the variable won't be noticed (resp. will be
> >noticed in wrong spots) when the variable is not in the scope.
> 
> How can a change to a variable possibly occur when it is not in
> scope? My answer--it can't! The place that held the value of the
> variable when it was in scope can surely change, but not because the
> variable changed but because some totally irrelevant use of the place
> is occurring.

That was my thinking too.  As the variable can't actually change in the
inter-fragment holes, I saw no reason for the debug info consumer
to care about what happens in there.  There are of course corner cases
(like instructions with multiple side-effects, where each side-effect
comes from a different lexical scope; we can put the instruction
into just one of them and thus need to choose).

> My take is that while the second alternate "should" be viable the
> first is the better description--and the one that perhaps DWARF
> should explicitly and definitively require of producers (if not
> already). Notice that these two have the same representation size
> (in this example) so size is not an advantage either way. It would
> be interesting to see some empirical studies of whether there is a
> non-trivial size advantage either way--I suspect not.

See the mail I've just posted; the difference between all the
variants was big: tracking all values everywhere, keeping correct ranges but
not bothering with ranges that don't overlap any of the fragments, and the
aggressive pruning variant.  But the less aggressive one is certainly fine
for me too.  Of course this is probably partly caused by the way GCC tracks
this (the value based approach).

	Jakub