[Dwarf-Discuss] variable locations - safe use as lvalues

Tue Jan 21 00:15:05 GMT 2020

On 1/20/20 2:20 PM, Frank Ch. Eigler via Dwarf-Discuss wrote:
> Hi -
> 
> I have a question about variable location lists, but not their
> encoding, the use they are suitable for.  The basic debugging scenario
> is just reading variable values, for which this is fine, especially
> when high-quality compilers emit exquisitely detailed data for their
> optimized code.
> 
> But what about writes - as though one could edit the program to insert
> an assignment, and resume?  A whole slew of complications come up:
> 
> - trying to modify a variable permanently, but the compiler only
>    emitted -some- of its locations
> 
> - trying to modify a variable (and only one), but the compiler put two
>    variables into the same location at that PC
> 
> - expressions using that value as input might have already started
>    to be computed, so it may be too late to change it at the PC
>    in question
> 
> - ... and undoubtedly other complications exist!

Interesting question.

Complication 1: The compiler emitted only partial descriptions for the
variable.  This seems to be a quality-of-implementation issue.
There is nothing that a debugger can do if the compiler generates
incomplete or misleading descriptions.  There is also no way for a
debugger to ascertain that the compiler has generated a complete and
accurate description.  Remedy: Fix the compiler.

Complication 2: The compiler reuses variable locations at the same PC.
This seems to be a compiler bug.  While a location (e.g., a register)
can be the location for multiple variables, the live ranges for these
variables should not overlap.  That is, the location lists for
different variables should never name the same location over
intersecting PC ranges.  Presumably, a debugger could check that
location lists do not overlap.
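
As a rough illustration of the overlap check suggested above, here is a
minimal sketch in Python.  It assumes a much-simplified model that is
not how DWARF encodes anything: each location list entry is reduced to
a variable name, a half-open PC range, and a location compared as a
plain string.  The names Entry and find_shared_locations are made up
for this example.  Real location descriptions are expressions, and
partial overlaps (sub-registers, aliased memory) are not handled here.

```python
from collections import namedtuple

# Hypothetical, simplified view of one location list entry:
# variable name, half-open PC range [lo, hi), and a location string.
Entry = namedtuple("Entry", "var lo hi loc")

def find_shared_locations(entries):
    """Return pairs of entries for different variables that name the
    same location over intersecting PC ranges."""
    conflicts = []
    by_loc = {}
    for e in entries:
        by_loc.setdefault(e.loc, []).append(e)
    for group in by_loc.values():
        group.sort(key=lambda e: e.lo)
        for i, a in enumerate(group):
            for b in group[i + 1:]:
                if b.lo >= a.hi:
                    break  # sorted by lo, so no later entry intersects a
                if a.var != b.var:
                    conflicts.append((a, b))
    return conflicts

entries = [
    Entry("x", 0x100, 0x120, "reg1"),
    Entry("y", 0x118, 0x140, "reg1"),  # shares reg1 with x for 0x118..0x120
    Entry("z", 0x120, 0x160, "reg2"),
]
conflicts = find_shared_locations(entries)
for a, b in conflicts:
    print(f"{a.var} and {b.var} both in {a.loc}")
```

A debugger that found such a conflict at the current PC could refuse
the write, or at least warn that a second variable would be changed.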

For complication 3, an example might be

        load  r1, =1
        add   r1, var
PC ==>  store r1, var

Arbitrary additional instructions, belonging to multiple source
statements, might be interspersed among these.

This has two variants:

Complication 3a: That the value of a variable has been fetched for a 
computation before the debugger modifies it.  This is more complicated. 
The live range of the variable is accurate, but its value has been used 
before the current PC.  DWARF does not include descriptions of data flow
or indicate where variables are fetched, so there is no information that
a debugger can use to ensure that a modified value is actually used.

Complication 3b: That a variable's value may be modified after the 
debugger changes it at PC.  This is essentially a race condition.  Both 
the program and the debugger are updating the variable.  Last one wins.
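
To make 3a and 3b concrete, here is a small sketch in Python that
interprets the three-instruction example on a toy machine.  The
instruction semantics are my assumption ("add r1, var" is taken to mean
r1 = r1 + var), and the function run and its parameters are invented
for illustration only.

```python
def run(debugger_write_at=None, new_value=100, var=5):
    """Interpret the three-instruction example on a toy machine.

    debugger_write_at is the index of the instruction before which the
    debugger stores new_value into var (None means no debugger write).
    Returns the final value of var.
    """
    program = [("load", 1), ("add", None), ("store", None)]
    r1 = 0
    for pc, (op, arg) in enumerate(program):
        if pc == debugger_write_at:
            var = new_value      # debugger modifies the memory copy of var
        if op == "load":
            r1 = arg             # load  r1, =1
        elif op == "add":
            r1 += var            # add   r1, var  (var is fetched here)
        elif op == "store":
            var = r1             # store r1, var  (program overwrites var)
    return var

print(run())                      # no debugger write
print(run(debugger_write_at=2))   # write while stopped at the store
print(run(debugger_write_at=1))   # write before the fetch
```

Under these assumptions, a write made while stopped at the store is
lost: the old value was already fetched (3a) and the store then
overwrites the new one (3b).  Only a write made before the add has any
effect on the result.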

 > A debugger cannot currently be told that any particular variable
 > location expression is safe to use as an lvalue, something roughly
 > "exclusive, exhaustive, -O0-equivalent".  I believe most debuggers
 > don't even handle the multiple-locations case for writes at all.  I
 > don't know why - assume complications are rare?  or we have kind of
 > papered over the problem?

There are a lot of issues with a debugger modifying a program while it 
is running.  A debugger can make essentially unbounded changes to the 
program state.  Some of these may work as expected, some may not, and it 
is unclear how a debugger would be able to know the difference.

There might be fewer problems modifying variables which are marked 
volatile.  But var in the example above could be volatile and the same 
issues would occur.

Are these complications rare?  Unclear.  I think that the great majority 
of debugger use is in displaying the value of variables, not in 
modifying them.

Have we papered over the problem?  Probably.  Debugging optimized code 
is difficult, even without trying to change the program state.

 > As a DWARF standard level matter, descriptive rather than prescriptive
 > as it is, what might be the minimum extension required to communicate
 > lvalue-safety to a debugger?  A DW_OP_lvalue_safe assertive wrapper
 > around a real expression?

Conceivably, location lists could be extended to include a flag to say 
that a variable is quiescent for a particular range of PC values and 
that a modification at that time would be persistent.  (A clear 
definition of quiescent would be needed.)
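
Purely as a sketch of how a debugger might consume such a flag, and
not a proposal for an encoding: the Python below models a location
list entry with a hypothetical "quiescent" field.  Nothing like this
field exists in DWARF today, and LocEntry and writable_location are
names invented for the example.  Locations are simplified to strings.

```python
from dataclasses import dataclass

@dataclass
class LocEntry:
    lo: int          # start of PC range (inclusive)
    hi: int          # end of PC range (exclusive)
    loc: str         # simplified location, e.g. "reg1" or "fbreg-16"
    quiescent: bool  # HYPOTHETICAL flag; not part of DWARF

def writable_location(loclist, pc):
    """Return the location to write at pc, or None if a write cannot
    be assumed to persist."""
    for e in loclist:
        if e.lo <= pc < e.hi:
            return e.loc if e.quiescent else None
    return None  # no entry describes the variable at this pc

loclist = [
    LocEntry(0x100, 0x110, "reg1", quiescent=False),    # value in flight
    LocEntry(0x110, 0x140, "fbreg-16", quiescent=True),
]
print(writable_location(loclist, 0x104))  # None: refuse or warn
print(writable_location(loclist, 0x120))  # fbreg-16
```

The debugger would refuse, or warn about, a write wherever this
returns None.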

As Cary notes, a default or bounded location description might be used, 
but I don't believe that either implies that a variable is quiescent (or 
not) over the specified range.

-- 
Michael Eager    eager at eagercon.com
1960 Park Blvd., Palo Alto, CA 94306