[Dwarf-Discuss] PROPOSAL: Constant expressions in location lists

Mon Jan 7 01:26:04 GMT 2008

On Jan 6, 2008 9:50 AM, Michael Eager <eager at eagercon.com> wrote:
> Jim Blandy wrote:
>
> > It would seem much cleaner to simply represent source language
> > expressions as DIE trees.  Every source-level debugger already has the
> > machinery to evaluate those.  Their semantics are as well-defined as
> > the language itself.
>
> Except for the evaluation of DWARF expressions, there are no semantics
> defined in DWARF.  It's purely descriptive.
>
> I don't know any way to describe source language expressions in DWARF.
> DIEs can not describe source expressions, nor is there any way to
> "evaluate" a DIE, whatever that may mean.
>
> DWARF describes the mapping between source language features and
> the resulting generated code.  None of the source language semantics
> is described by DWARF.

My idea was so out in the weeds that Michael misunderstood it and
Roland assumed I couldn't possibly have meant that.  :) The smart
thing to say is obviously, "Uh, never mind."  But let me try again.

The problem here is that DWARF needs to be able to describe how to
compute values of variables that have been optimized out but could be
computed from the state of the program (at least some of the time).
There's no necessary restriction on what types of values a compiler
might be able to optimize out; in the most general case, the values to
be computed could be structures, large floating-point types, etc.  So
DWARF needs to be able to describe how to compute values with
arbitrary source types.

DWARF does have DWARF expressions, but their semantics were
deliberately restricted to computing address-sized values, to keep the
specification simple and focussed.  We're now considering using DWARF
expressions to compute (potentially) every type of value expressible
in the source language.
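
To make the restriction concrete, here is a rough sketch of a DWARF-expression-style stack machine. The opcode names DW_OP_constu, DW_OP_plus and DW_OP_mul are real; the function name, the tuple encoding of the operations and the 64-bit address size are assumptions made for illustration only. Every stack entry is an address-sized integer, so there is no room for a double or a structure.

```python
# Illustrative sketch only, not a real consumer's evaluator.
ADDR_BITS = 64                 # assumption: 64-bit target addresses
MASK = (1 << ADDR_BITS) - 1    # every stack entry is truncated to this width


def eval_dwarf_expr(ops):
    """Evaluate a list of (opcode, *operands) tuples; return the top of stack."""
    stack = []
    for op, *args in ops:
        if op == "DW_OP_constu":
            stack.append(args[0] & MASK)
        elif op == "DW_OP_plus":
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & MASK)
        elif op == "DW_OP_mul":
            b, a = stack.pop(), stack.pop()
            stack.append((a * b) & MASK)
        else:
            raise ValueError(f"opcode not handled in this sketch: {op}")
    return stack[-1]


product = eval_dwarf_expr([("DW_OP_constu", 6), ("DW_OP_constu", 7), ("DW_OP_mul",)])
wrapped = eval_dwarf_expr([("DW_OP_constu", MASK), ("DW_OP_constu", 1), ("DW_OP_plus",)])
```

Here `product` is an ordinary integer result and `wrapped` shows the arithmetic wrapping at the address size. Anything wider or differently typed has to be encoded by hand on top of these integers.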

Following this approach, if the compiler had optimized out a
floating-point value, but knew how to recompute it if desired, and
wished to express this in DWARF, the compiler would need to produce a
DWARF expression that would yield the appropriate floating-point
value, in target format.

As Roland says, this can be done.  We can have massive collections of
DWARF expressions invoking each other with DW_OP_call* to carry out
IEEE double-precision floating-point arithmetic on the DWARF
expression stack.  And as Roland says, if the compiler finds this to
be too much trouble, then it needn't produce the information at all;
we're no worse off than we are now.

My thought was that the easiest way to have a debugger compute values
of potentially arbitrary type would be to use the evaluator for
source-language expressions that every source-level debugging tool
already has.  The debuggers have already implemented (say) IEEE
floating-point arithmetic, if that's what the target uses.  We would
extend DWARF with tags representing source language expression
operators (*, +, []) and literals ("foo", 1.0, 'c'), and have
compilers emit DIE trees representing the expression the debugger
should compute to rematerialize the value of the missing variable.
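
As a rough sketch of what I mean, suppose the optimized-out variable could be recomputed as `scale * samples[i] + 1.0`. The tag names below (other than the general DW_TAG_expr_* idea), the `Die` class and the `evaluate` function are all hypothetical; nothing like them exists in DWARF today. The point is only that the consumer walks the tree with arithmetic it already implements.

```python
from dataclasses import dataclass, field


@dataclass
class Die:
    """Hypothetical expression DIE: a tag, an optional value, child operands."""
    tag: str
    value: object = None
    children: list = field(default_factory=list)


def evaluate(die, env):
    """Evaluate an expression DIE tree; env maps variable names to live values."""
    if die.tag == "DW_TAG_expr_literal":
        return die.value
    if die.tag == "DW_TAG_expr_variable":
        return env[die.value]
    operands = [evaluate(child, env) for child in die.children]
    if die.tag == "DW_TAG_expr_add":
        return operands[0] + operands[1]
    if die.tag == "DW_TAG_expr_multiply":
        return operands[0] * operands[1]
    if die.tag == "DW_TAG_expr_index":
        return operands[0][operands[1]]
    raise ValueError(f"tag not handled in this sketch: {die.tag}")


# Tree for: scale * samples[i] + 1.0
tree = Die("DW_TAG_expr_add", children=[
    Die("DW_TAG_expr_multiply", children=[
        Die("DW_TAG_expr_variable", "scale"),
        Die("DW_TAG_expr_index", children=[
            Die("DW_TAG_expr_variable", "samples"),
            Die("DW_TAG_expr_variable", "i"),
        ]),
    ]),
    Die("DW_TAG_expr_literal", 1.0),
])

result = evaluate(tree, {"scale": 2.0, "samples": [0.5, 1.5], "i": 1})
```

A real consumer would evaluate with the target's types and the source language's rules rather than Python's; the sketch only shows the shape of the data the producer would emit.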

At first blush, this might seem like a horrible bloat of the DWARF
spec: would the spec have to describe (say) C's integral type
promotion rules for binary operators, the proper handling of 0 when it
appears as the second or third operand of a ?: expression, and so on?

But I don't think that, at least, is a problem.  It would be
sufficient for the DWARF standard to say, "Some attributes refer to
source-language expressions.  DWARF can encode source-language
expressions for languages X, Y, and Z.  For language X, here is how we
represent source expressions in DWARF:", and then list the
correspondence of X operators and DWARF tags.  There's no need for the
DWARF spec to describe the semantics of the operators at all: as long
as DWARF could faithfully convey the expression from producer to
consumer, along with the environment in which it was to be evaluated,
it would be up to the consumer to properly evaluate the expression.
As I say, most consumers already have evaluators for source-language
expressions anyway, and have dealt with the subtleties; we're just
adding a new way of representing such expressions.

At the moment, the DWARF spec uses generic tags like
DW_TAG_pointer_type for every supported language's pointer type, and
it's left to the producers and consumers how exactly to fit the
concepts in the language specification onto the set of tags provided
by DWARF.  At times this process has needed some clarification, but
it's worked well enough.  To carry this approach over to expressions,
DWARF would specify tags like DW_TAG_expr_multiply and
DW_TAG_expr_dereference, describing each clearly enough to allow the
authors of Pascal consumers and producers to understand that the
latter tag is the right thing to use for Pascal's '^' operator, and to
allow a C developer to recognize it as unary '*'.
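
A small sketch of that correspondence, with the caveat that the tag names, the table and the `render` helper are hypothetical and only illustrate the idea: one generic tag, and each language's producers and consumers agree on which of their own operators it stands for.

```python
# Hypothetical per-language spelling of generic expression tags.
SPELLING = {
    "C": {
        "DW_TAG_expr_dereference": "*{0}",
        "DW_TAG_expr_multiply": "{0} * {1}",
    },
    "Pascal": {
        "DW_TAG_expr_dereference": "{0}^",
        "DW_TAG_expr_multiply": "{0} * {1}",
    },
}


def render(language, tag, *operands):
    """Show how a consumer for `language` would spell the operator for `tag`."""
    return SPELLING[language][tag].format(*operands)


c_text = render("C", "DW_TAG_expr_dereference", "p")
pascal_text = render("Pascal", "DW_TAG_expr_dereference", "p")
```

The same tag renders as unary '*' for the C consumer and as postfix '^' for the Pascal one, which is the same division of labor DW_TAG_pointer_type already relies on.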

Perhaps this is "far out in left field".  But using DWARF expressions
for this would be a pretty serious extension of their
responsibilities, and in the non-trivial cases would result in exactly
the sort of ridiculous complexity (I think implementing floating-point
in DWARF expressions is ridiculous) that you'd expect from that.  We
would, in practice if not in the letter of the spec, be restricting
DWARF to supporting the easy cases.