[Dwarf-Discuss] PROPOSAL: Constant expressions in location lists

Thu Jan 24 03:07:47 GMT 2008

Your example #1 is precisely the kind of case we have been discussing
(among others).  I believe the two proposals for this still on the table
are DW_OP_value and DW_AT_value.  Both are means to express that some
portion of a target-format value is not available at any location but can
be computed by a DWARF expression (one that yields precise bits to make up
the value in target format).  I don't think either of them has been written
up formally with proposed text for the standard.  (DW_OP_value is the best
name yet for the same meaning that DW_OP_computed_value has in the example
you showed.  DW_AT_value is an alternative way to encode the same
information, in parallel with DW_AT_location rather than embedded in it,
and it is backward-compatible with old consumers that don't know the new op.)

Your example #2 is something I don't think we have discussed directly
before.  That is, situations when it is not possible to reconstruct the
actual value but it is still possible to reconstruct constraints on what
the value might have been that may be useful to the debugger/user.  I do
see the motivation for addressing this situation too, though I find it a
lot less compelling.  Mostly that's because it is a bit hard to see how to
represent very many cases such that a debugger could do much with them.  I
did not get a very clear and precise idea from the formulation you gave.
I take the suggestion to be a way to express "I can't compute the value,
but I can compute the veracity of this particular assertion about the value".  

My skepticism about this notion relates to the general case of what
"value" means here.  As we've established in the responses to Jim Blandy's
contrary suggestion, the values we are talking about here are bit patterns
in target format that make up parts of the representation of source-level
objects.  In the simple cases like unsplit integer variables, a location
expression describes only one piece, and it is very simple to associate
what it describes with the interesting source-level concept to tell the
user about.  In those cases, an arithmetic assertion about the value is
easy to express and easy for the debugger to make useful to humans.  In
the general case, it seems far less feasible.

The way I would formulate what I take to be your idea is e.g.:

	DW_OP_lit0 DW_OP_breg5 DW_OP_gt DW_OP_value_test DW_OP_piece(4)
	DW_OP_reg2 DW_OP_piece(4)

That says the 8-byte object's second four bytes are in register 2, and its
first four bytes are unavailable, but "register5 > 0" is a condition you
can evaluate to know something about the value of the first four bytes.
But what is it you are supposed to know about it?  How do you figure out
what that means about the source-level value?

The idea that I can get behind a little more is a "range of values" thing, e.g.:

	DW_OP_lit0 DW_OP_breg5 DW_OP_gt DW_OP_bra(4)
		DW_OP_lit0 DW_OP_value DW_OP_skip(7)
		DW_OP_lit1 DW_OP_const4u(0xffffffff) DW_OP_range DW_OP_piece(4)
	DW_OP_reg2 DW_OP_piece(4)

That says the first four bytes have a complex derivation.  If register5 <= 0,
then the value is known to be exactly zero (DW_OP_value fork).  If instead
register5 > 0 (DW_OP_bra triggers), then what we know is that the four bytes
have a bit pattern in the range [1,0xffffffff] (upper and lower bounds popped
off the stack).  For multiple disjoint ranges, there could be multiple
DW_OP_range clauses giving alternatives.
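
To make the control flow concrete, here is a rough Python sketch of how a
consumer might evaluate the first piece.  It is only an illustration under
several assumptions: DW_OP_value and DW_OP_range are the proposed operations
discussed here, not part of the standard; branch targets are list indices
rather than byte offsets; and the names (eval_piece, Exact, Range) are made up
for this sketch.  The register is pushed before the literal so that the usual
DW_OP_gt ordering (second entry compared against top) yields "register5 > 0"
as described in the prose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exact:
    value: int          # piece has exactly this computed value


@dataclass(frozen=True)
class Range:
    low: int            # piece's bit pattern lies in [low, high]
    high: int


def eval_piece(ops, regs):
    """Evaluate one piece's expression; the end of ops stands in for DW_OP_piece."""
    stack, pc, result = [], 0, None
    while pc < len(ops):
        op, *args = ops[pc]
        pc += 1
        if op in ("lit", "const4u"):
            stack.append(args[0])
        elif op == "breg":                 # register contents plus offset
            reg, offset = args
            stack.append(regs[reg] + offset)
        elif op == "gt":                   # second entry > top entry
            top, second = stack.pop(), stack.pop()
            stack.append(1 if second > top else 0)
        elif op == "bra":                  # branch if popped value is nonzero
            if stack.pop() != 0:
                pc = args[0]
        elif op == "skip":                 # unconditional branch
            pc = args[0]
        elif op == "value":                # hypothetical DW_OP_value
            result = Exact(stack.pop())
        elif op == "range":                # hypothetical DW_OP_range
            high, low = stack.pop(), stack.pop()
            result = Range(low, high)
        else:
            raise ValueError(f"unknown op {op!r}")
    return result


FIRST_PIECE = [
    ("breg", 5, 0), ("lit", 0), ("gt",), ("bra", 7),   # register5 > 0 ?
    ("lit", 0), ("value",), ("skip", 10),              # no: exactly zero
    ("lit", 1), ("const4u", 0xFFFFFFFF), ("range",),   # yes: [1, 0xffffffff]
]

print(eval_piece(FIRST_PIECE, {5: 0}))
print(eval_piece(FIRST_PIECE, {5: 7}))
```

With register 5 holding zero or a negative number this takes the DW_OP_value
fork; with a positive register 5 it takes the DW_OP_range fork.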

I made this example complex to bring in a new wrinkle that your suggestion
made me think of.  This is an issue that already exists (in theory) just with
location vs value, even before considering the "fuzzy value" ideas.  That is,
when it's not just that separate PC ranges differ in whether it's a location
that's available, or a value that's available, or neither.  Here, a dynamic
program value (an expression computing from a machine register) decides
whether there is a computed exact value or a known range of possible values.
In the example above I addressed this with a subtle variation on how we've
described DW_OP_value in the past.  That is, here DW_OP_value is not just a
trailing marker of a special kind of expression.  Rather it is a proper
operation, executed or not by the expression's dynamic control flow, that
says "complete the piece now as having the top of stack as computed value".

To allow this kind of dynamic selection, it would also make sense to have a
DW_OP_no_value or some such, to indicate the situation where no information
at all is available.  In static cases, that situation is represented by an
empty expression before DW_OP_piece (or a whole expression that is empty or
just omitted completely).  DW_OP_value on stack underflow seems like a bad
thing to specify, so DW_OP_no_value, or perhaps DW_OP_unavailable, makes sense.
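
A small Python sketch of the distinction I have in mind, purely as an
illustration: the operation names "value" and "no_value" stand for the
proposed DW_OP_value and DW_OP_no_value (neither is in the standard), and
complete_piece, UNAVAILABLE and StackUnderflow are hypothetical names invented
for this sketch.  The point is that an empty stack at DW_OP_value is treated
as a malformed expression, while "nothing is known" gets its own explicit
operation.

```python
class StackUnderflow(Exception):
    """Raised when an operation needs a stack entry that is not there."""


# Sentinel meaning "no information at all is available for this piece".
UNAVAILABLE = object()


def complete_piece(op, stack):
    """Finish the current piece according to the terminating operation."""
    if op == "no_value":
        # Explicit marker: the producer says nothing is known here.
        return UNAVAILABLE
    if op == "value":
        if not stack:
            # Underflow is an error, not a quiet way to say "unavailable".
            raise StackUnderflow("DW_OP_value with an empty stack")
        return stack[-1]
    raise ValueError(f"unknown op {op!r}")
```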

All of that is quite rough.  But it seems like it might be worth considering
the dynamic-selection cases, contemplating how a compiler would produce these
things, and deciding whether it's worth being able to express that.

I do also think that the "fuzzy values" area has some merit.  I think ideas
based on ranges of bit values are probably more useful than ideas based on
describing predicates of some sort.  It's just a lot easier to imagine how a
debugger, putting together the target-format value and then working out how
to show it to a user, could find coherent things to say when it's looking at
a representation of "runs of known bits, runs of unknown bits, and runs of
bits in known ranges".  Maybe I would find a predicate-based idea as
compelling if it were thoroughly fleshed out and explained to me.
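
As a rough illustration of that kind of representation, here is a Python
sketch of what a debugger might hold after assembling the pieces.  Everything
in it is an assumption for the sake of the example: the class names (Known,
Unknown, InRange), the byte-granular runs, and the display format are
invented here and are not proposed text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Known:
    nbytes: int
    value: int          # exact bit pattern of this run


@dataclass(frozen=True)
class Unknown:
    nbytes: int         # nothing is known about this run


@dataclass(frozen=True)
class InRange:
    nbytes: int
    low: int            # bit pattern lies in [low, high]
    high: int


def describe(runs):
    """Render the runs of an object, in order, as one line of hex text."""
    parts = []
    for run in runs:
        width = run.nbytes * 2              # hex digits per run
        if isinstance(run, Known):
            parts.append(f"{run.value:0{width}x}")
        elif isinstance(run, Unknown):
            parts.append("?" * width)
        else:
            parts.append(f"[{run.low:0{width}x}..{run.high:0{width}x}]")
    return " ".join(parts)


# The 8-byte object from the earlier example when register5 > 0 and
# register 2 happens to hold 0x2a.
print(describe([InRange(4, 1, 0xFFFFFFFF), Known(4, 0x2A)]))
```

A debugger could fall back to this sort of raw display when it cannot map
the runs onto anything meaningful at the source level.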

Thanks,
Roland