[Dwarf-Discuss] PROPOSAL: Constant expressions in location lists

Sun Jan 6 16:22:14 GMT 2008

> In that proposal, I use DW_OP_value as the operator name.

I like that name.  There's also something to be said for
DW_OP_val_expression given DW_CFA_val_expression.

> Frank suggested an alternate representation, that might be less
> intrusive and more backward-compatible.  Something along the lines of
> DW_AT_value, that could hold a loclistptr, but in which the
> expressions would represent values, rather than locations.  

I see the benefit of that approach, in that it only adds something new
an old consumer can ignore entirely, and not a new DWARF opcode value
that it doesn't know how to skip over in a location expression.

> (We can't just reuse DW_AT_const_value, because adding loclistptr to its
> available classes would change the meaning of DW_FORM_data[48] :-(

Indeed, it might be nice if we'd had DW_FORM_locp a la strp from the beginning.

The apparent reason for some people's preference for DW_OP_constant_value
is its simplicity for new consumers dealing with the constant-value case.
If you wanted to get that in the compatibly-parallel style of DW_AT_value,
you could add a DW_AT_const_value_loclist (since DW_AT_const_value can't
compatibly be overloaded, as you say) whose list entries are not actually
expressions but are blocks of constant data bytes.  (This doesn't really
appeal to me, especially since it doesn't handle pieces.  But that didn't
seem to be much on the minds of those proposing DW_OP_constant_value in the
first place either.  Both this and DW_OP_constant_value are better than
DW_OP_value/DW_AT_value for handling very large constant blocks.)

> When both DW_AT_value and DW_AT_location are present, they would add,
> rather than override each other, just like overlapping ranges in
> DW_AT_location add up.  

At first blush I found this suggestion a more onerous and unnatural way to
represent it than DW_OP_value.  It does make it harder for a human to read.
But, given overlapping location list entries where some of the multiple
applicable expressions are composite, the letter of the existing spec
permits one composite to have a zero-expressions (unavailable) piece
covering a range of the whole for which another composite has a nonempty
expression (available location).  So it could be read that a robust
consumer already ought to be merging the alternate location expressions and
then taking the "best" alternative for each byte when subsetting.  For a
consumer organized that way, it is not much of a stretch to merge alternate
location expressions and/or value expressions and/or constant value blocks.
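
To make the "merge, then take the best alternative for each byte" reading concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the names `Piece`, `byte_map` and `merge` are made up for this note, a location is just a string, and "best" is taken to mean "the first alternative that has any location for that byte", which is one possible policy and not something the spec defines.

```python
from typing import List, Optional, Tuple

# A composite is a list of (location, size) pieces.  A location of None
# models an empty expression before DW_OP_piece, i.e. "not available here".
Piece = Tuple[Optional[str], int]
ByteLoc = Optional[Tuple[str, int]]


def byte_map(composite: List[Piece]) -> List[ByteLoc]:
    """Expand a composite into one entry per byte: (location, offset) or None."""
    out: List[ByteLoc] = []
    for loc, size in composite:
        for i in range(size):
            out.append(None if loc is None else (loc, i))
    return out


def merge(composites: List[List[Piece]]) -> List[ByteLoc]:
    """For each byte, keep the first alternative that says where it lives."""
    maps = [byte_map(c) for c in composites]
    width = max((len(m) for m in maps), default=0)
    merged: List[ByteLoc] = [None] * width
    for m in maps:
        for i, entry in enumerate(m):
            if merged[i] is None:
                merged[i] = entry
    return merged


# Two overlapping alternatives, each missing a different half.
a = [(None, 4), ("reg1", 4)]
b = [("reg0", 4), (None, 4)]
merged = merge([a, b])
```

With these two alternatives, the first four bytes come from `reg0` and the last four from `reg1`. A consumer that also understood value expressions or constant blocks could presumably feed them through the same per-byte map.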

> This reminds me of one doubt I have.  Consider a variable that
> represents a struct-typed object.  If the compiler scalarizes it, and
> each member becomes an independent implementation variable, must the
> compiler emit a separate location list entry for the entire source
> variable every time any of the members change location, or can it use
> separate entries with DW_OP_nop for all but one of the pieces, and
> rely on overlapping ranges for the debugger to put it all together?

DW_OP_nop is not used for this.  AFAICT an expression containing just
DW_OP_nop is invalid by the letter of the spec.  What I think you mean is
an empty location expression before a DW_OP_piece, which is what is
specified to mean "this piece not available here".  (I don't know what
DW_OP_nop is used for at all, in fact.  The only thing I can think of is
adjusting the alignment before a DW_OP_addr so the address after it would
be naturally aligned for relocation or something.)

> Could a representation such as:
> 
>   [L0,L1) [nop] DW_OP_piece 4 DW_OP_reg1 DW_OP_piece 4
>   [L1,L2) DW_OP_reg0 DW_OP_piece 4 DW_OP_reg1 DW_OP_piece 4
>   [L2,L3) DW_OP_reg0 DW_OP_piece 4 [nop] DW_OP_piece 4
> 
> be shortened to:
> 
>   [L0,L2) [nop] DW_OP_piece 4 DW_OP_reg1 DW_OP_piece 4
>   [L1,L3) DW_OP_reg0 DW_OP_piece 4 [nop] DW_OP_piece 4

(Here I'm reading [nop] as nothing at all.)  As I said above, one can
indeed interpret the specification of overlapping location list entries
as expecting you to cope with this.  But as always DWARF just says what
it says and it doesn't say how smart the producers or consumers have to
be (it's a format spec, not a compiler-debugger interoperability
contract).  So fall back on Postel's Law.  I'll endeavor to write a
liberal consumer able to do the best possible with the shortest list.
But you wouldn't be a conservative producer if you emitted the latter
in place of the former.

Existing consumers I'm familiar with simply pick the first entry in the
location list matching the PC, and have abandoned any other location
expression but that one before they even consider slicing a composite.
So at L1 they will lose access to one of the two pieces (call them x
and y) if you use the shorter list.
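
A small sketch of the difference, in Python, using the shortened list from the quoted example. The addresses and the names `SHORT_LIST`, `first_match` and `all_matches` are invented for illustration, and the expressions are kept as plain text rather than decoded. This is not how any particular debugger is written.

```python
from typing import List, Optional, Tuple

# (low_pc, high_pc, expression text); ranges are half-open [low, high).
Entry = Tuple[int, int, str]

L0, L1, L2, L3 = 0x10, 0x20, 0x30, 0x40  # made-up addresses

# The shortened list, with [nop] read as nothing at all.
SHORT_LIST: List[Entry] = [
    (L0, L2, "DW_OP_piece 4 DW_OP_reg1 DW_OP_piece 4"),
    (L1, L3, "DW_OP_reg0 DW_OP_piece 4 DW_OP_piece 4"),
]


def first_match(loclist: List[Entry], pc: int) -> Optional[str]:
    """What a simple consumer does: stop at the first entry covering pc."""
    for low, high, expr in loclist:
        if low <= pc < high:
            return expr
    return None


def all_matches(loclist: List[Entry], pc: int) -> List[str]:
    """What a merging consumer needs: every entry covering pc."""
    return [expr for low, high, expr in loclist if low <= pc < high]


simple = first_match(SHORT_LIST, L1)
liberal = all_matches(SHORT_LIST, L1)
```

At L1 the simple consumer sees only the first entry, so the first piece looks unavailable even though the second entry says it is in reg0. The liberal consumer gets both entries and can combine them.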

> BTW, It might have been useful to have accepted negative operands to
> DW_OP_piece to cover these cases, as in:
> 
>   DW_OP_reg0 DW_OP_piece 4 DW_OP_piece -4 DW_OP_fbreg -60 DW_OP_piece 4

That's more compactly expressed by a variant of piece that takes two
parameters, size and offset (either overlap backwards, or absolute
offset from the start of this composite), as in:

	DW_OP_reg0 DW_OP_piece 4 DW_OP_fbreg -60 DW_OP_abs_piece 4 0

> But I'm not sure about the wisdom of introducing such explicit
> backward-incompatible overlap mechanisms, if empty DW_OP_pieces can be
> taken from overlapping ranges.  It can be more compact, for sure, but
> it amounts to a lot of complexity and redundancy as well.

It doesn't have to be too complex if you pick one new composition
operator that covers all the cases, and e.g. say that an expression
using the new one cannot intermix it with piece and bit_piece.  

Every new opcode is an incompatible addition.  An old consumer cannot
even parse the rest of the location expression, so it must reject the
whole expression.  (It cannot e.g. skip this piece from a composite.)
This is the big disadvantage of adding new expression opcodes vs adding
new attributes and tags, which old consumers just don't notice.  The
070426.1 addition is somewhat mitigated in this regard by being
specified as a singleton operator in an expression anyway.  If it were
used in a composite it would still cause an old consumer to reject the
whole composite.  But its simple use is basically as a special escape in
location list entries.
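
The parsing problem is easy to see in a toy decoder. The following Python sketch knows only four opcodes (the encodings for DW_OP_reg0, DW_OP_reg1, DW_OP_fbreg and DW_OP_piece are the standard ones). The table `KNOWN` and the functions `read_leb128` and `decode` are names made up for this illustration. The byte 0xe0 (DW_OP_lo_user) stands in for any opcode an old consumer has never heard of.

```python
from typing import List, Tuple

# Operand layouts for the few opcodes this sketch knows.
# "u" = one ULEB128 operand, "s" = one SLEB128 operand.
KNOWN = {
    0x50: ("DW_OP_reg0", ""),
    0x51: ("DW_OP_reg1", ""),
    0x91: ("DW_OP_fbreg", "s"),
    0x93: ("DW_OP_piece", "u"),
}


def read_leb128(data: bytes, pos: int, signed: bool) -> Tuple[int, int]:
    """Decode one LEB128 value starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    if signed and byte & 0x40:
        result -= 1 << shift  # sign-extend
    return result, pos


def decode(expr: bytes) -> List[Tuple[str, List[int]]]:
    """Decode a whole expression, or fail on the first unknown opcode."""
    ops = []
    pos = 0
    while pos < len(expr):
        opcode = expr[pos]
        pos += 1
        if opcode not in KNOWN:
            # Without knowing the operand layout there is no way to find
            # the next opcode, so the whole expression has to be rejected.
            raise ValueError(f"unknown opcode 0x{opcode:02x}: cannot skip it")
        name, layout = KNOWN[opcode]
        operands = []
        for kind in layout:
            value, pos = read_leb128(expr, pos, signed=(kind == "s"))
            operands.append(value)
        ops.append((name, operands))
    return ops


# DW_OP_reg0 DW_OP_piece 4 DW_OP_fbreg -60 DW_OP_piece 4
ok = decode(bytes([0x50, 0x93, 0x04, 0x91, 0x44, 0x93, 0x04]))
```

Replacing the DW_OP_fbreg with an opcode missing from `KNOWN` makes `decode` raise before it ever reaches the final DW_OP_piece, which is the "reject the whole composite" behaviour described above.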

The location list overlap semantics have the nice feature that a
simple-minded consumer just doesn't notice there is something complex
going on (only uses the first match), rather than in a complex case
suddenly hitting a fancy operator it's not sophisticated enough even to
see how to ignore.  I like that aspect of the DW_AT_value approach too.

> http://dwarfstd.org/ShowIssue.php?issue=070512.1&type=closed appears
> to indicate that DW_OP_implicit_value, that you mentioned in your
> e-mail, was actually accepted in a far more limited way, that only
> supports literal constant values.

That was the original proposal, to which mine was the counter.  I can't
tell whether it's called DW_OP_implicit_value or DW_OP_constant_value,
since both are used in the "accepted" proposal text.  (DW_OP_const_value
would be the most consistent name given what it does and the precedent
of DW_AT_const_value.)

I've since come to dislike this proposal for reasons much more boring
than the rest of this discussion.  It is unlike all other DWARF
operations, which take 0-2 scalar parameters, while the new one takes a
block.  This makes it ill-suited to existing consumer APIs that decode
DWARF operation encodings into internal form.  
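
To illustrate the API problem, here is a sketch in Python of the kind of fixed-shape record such a consumer might decode into. The `Op` class, its field names and the `shoehorn_block` helper are hypothetical and not taken from any particular library. The workaround shown (length in one slot, offset in the other) is only one conceivable way to squeeze a block in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical decoded form: an opcode plus at most two scalar operands."""
    atom: int
    number: int = 0
    number2: int = 0


# Existing operations fit this shape without trouble.
piece = Op(atom=0x93, number=4)                # DW_OP_piece 4
bregx = Op(atom=0x92, number=3, number2=-60)   # DW_OP_bregx 3 -60


def shoehorn_block(atom: int, block: bytes, block_offset: int) -> Op:
    """One possible workaround for a block operand: record the block's
    length and its offset in the raw section, not the bytes themselves."""
    return Op(atom=atom, number=len(block), number2=block_offset)


# 0xe0 (DW_OP_lo_user) is a stand-in for the new block-taking opcode.
blk = shoehorn_block(0xE0, b"\x01\x02\x03", 17)
```

The caller then has to go back to the raw bytes to get the constant, which is exactly the sort of special case the 0-2 scalar convention otherwise avoids.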

Thanks,
Roland