[Dwarf-discuss] Enhancement: Expression Operation Vendor Extensibility Opcode

Ben Woodard woodard@redhat.com
Mon Mar 27 17:32:49 GMT 2023


I'm sorry Scott, I did not intend to hijack your proposal. In essence, I 
was saying that I support the registry part of your proposal below. 
That has been one of the long-standing requests from the tool developers 
that I work with.

On 3/24/23 13:21, Linder, Scott via Dwarf-discuss wrote:
> Background
> ==========
>
> The vendor extension encoding space for DWARF expression operations
> accommodates only 32 unique operations. In practice, the lack of a central
> registry and a desire for backwards compatibility means vendor extensions are
> never retired, even when standard versions are accepted into DWARF proper. This
> has produced a situation where the effective encoding space available for new
> vendor extensions is miniscule today.

I understand the desire for each particular vendor opcode to have a 
semantic meaning that is fixed throughout time. This simplifies some 
areas of the DWARF parsers because they do not have to disambiguate the 
semantic meaning of a particular constant based upon the version of the 
DWARF that is being processed. This is particularly helpful when you are 
handling an object that was linked together from CUs that 
were produced with different versions of the standard. This is not only 
true for DWARF vendor expression operations, it is true of the broader 
DWARF standard.

In essence, up to this point a DWARF5-enabled parser has been capable 
of handling DWARF{4,3} and possibly even DWARF2 without modification. I'm 
not really aware of any breakages where interpreting an older version of 
DWARF requires being aware of the DWARF version number.

Does this continue to be a goal for DWARF6? Whether or not it is, we 
should make this explicit.

If not, it would provide an opportunity for tool developers to begin 
building the DWARF-version-sensitive parsers necessary to handle the 
cases where CUs within an object have different versions of DWARF. An 
explicit statement of intent would also give the vendor compiler 
developers an opportunity to decide that, with DWARF6, they could begin 
to reuse or recycle vendor extensions which are no longer needed for 
whatever reason.
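
To make the consumer side concrete, here is a minimal sketch of what a 
version-sensitive lookup might look like if opcode values were recycled. 
The table contents and the names VENDOR_OPS_BY_VERSION and vendor_op_name 
are purely hypothetical; only the 0xe0 value (DW_OP_lo_user) comes from 
the standard.

```python
# Hypothetical per-version tables. The same opcode value, 0xE0
# (DW_OP_lo_user), is assumed to mean different things in different
# DWARF versions. The operation names are made up for illustration.
VENDOR_OPS_BY_VERSION = {
    5: {0xE0: "VENDOR_op_legacy"},
    6: {0xE0: "VENDOR_op_recycled"},
}


def vendor_op_name(cu_version, opcode):
    """Resolve a vendor opcode using the version of the enclosing CU."""
    table = VENDOR_OPS_BY_VERSION.get(cu_version)
    if table is None:
        raise ValueError("no vendor table for DWARF version %d" % cu_version)
    if opcode not in table:
        raise LookupError("unknown vendor opcode 0x%02x" % opcode)
    return table[opcode]


# Two CUs in one object, produced against different DWARF versions.
for version in (5, 6):
    print(version, vendor_op_name(version, 0xE0))
```

The point is only that the CU version has to be threaded through to every 
opcode lookup, which today's parsers generally do not need to do.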

I personally feel that, since DWARF has been around for about 30 years, 
it is kind of due for a flag day allowing accumulated cruft to be purged. 
The producers and consumers just need to all move together, and DWARF6 
may be a good time to do this.

> To expand this encoding space we propose defining one DWARF operation in the
> official encoding space which acts as a "prefix" for vendor extensions. It is
> followed by a ULEB128 encoded vendor extension opcode, which is then followed
> by the operands of the corresponding vendor extension operation.
>
> This scheme opens up an infinite encoding space for arbitrary vendor
> extensions, and in practical terms is no less compact than if a fixed-size
> encoding were chosen, as was done for DW_LNS_extended_op. That is to say, when
> compared with an alternative scheme which encodes the opcode with a single
> unsigned byte: for the first 127 opcodes our approach is indistinguishable from
> the alternative scheme; for the next 128 opcodes it requires one more byte than
> that alternative scheme; and after 255 opcodes the alternative scheme is
> exhausted.
>
> Since vendor extension operations can have arbitrary semantics, the consumer
> must understand them to be able to continue evaluating the expression. The only
> use for a size operand would be for a consumer that only needs to print the
> expression. Omitting a size operand makes the operation encoding more compact,
> and this was deemed more important than the limited printing use case.
> Therefore no ULEB128 size operand is present to provide the number of bytes of
> following operands, unlike DW_LNS_extended_op.
>
> A centralized registry of vendor extension opcodes which are in use, maintained
> on the dwarfstd.org website or another suitable location, could also be
> implemented as a part of this proposal. This would remove the need for vendors
> to coordinate allocation themselves, and make it simpler to use more than one
> vendor extension at a time. As there is support for an infinite number of
> opcodes, the registration process could involve very limited review, and would
> therefore pose a minimal burden to the maintainer of such a registry.
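
The size argument in the quoted text can be checked with a small ULEB128 
encoder. This is only a sketch: encode_uleb128 and prefixed_size are 
names I made up, and the single prefix byte stands in for the proposed 
opcode, whose value is not yet assigned.

```python
def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low seven bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte, high bit clear
            return bytes(out)


def prefixed_size(vendor_opcode):
    """Bytes used by the one-byte prefix plus the ULEB128 vendor opcode."""
    return 1 + len(encode_uleb128(vendor_opcode))


# Opcode 0 is reserved, so 1..127 are the first 127 usable opcodes.
for opcode in (1, 127, 128, 255, 256):
    print(opcode, prefixed_size(opcode))
```

A fixed one-byte vendor opcode would always cost two bytes including the 
prefix and would stop at 255. The ULEB128 form costs the same two bytes 
for 1..127, three bytes for 128..255, and keeps going after that.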

My only real concern with this proposal is that I believe vendor-specific 
extensions should be kept to a bare minimum. They make the job of 
consumers much more difficult, especially when tools need to support 
multiple producers.

I would say from a consumer perspective, I would prefer keeping the 
range of vendor extensions smaller, possibly forcing a flag day to 
facilitate reuse of old disused encodings, and making the process of 
standardizing needed new behaviors easier and more frequent. I believe 
that there will always need to be encoding space to prototype features. 
However, I believe that the community is best served by having new 
features regularly be incorporated into the public standard rather than 
facilitating a potential balkanization within the vendor extension area 
of the standard.

I believe that the updates to the standard version of DWARF have been 
too infrequent to keep pace with evolving technology such as GPUs and 
the challenges presented by rapidly evolving language standards like 
C++, and new classes of consumers like performance tools and ABI 
checkers. Furthermore, the bar has been set too high for these changes 
and, consequently, vendor extensions have been the relief valve to allow 
technology to move forward. My hope is that with a more open process, 
which is more accepting of the need for change, and some overdue 
housekeeping, the need for greatly expanded encoding space would not be 
quite as pressing.

-ben

> Proposal
> ========
>
> 1) In Section 2.5.1.7, p38, add a new code at the end of the list:
>
>      3. DW_OP_user
>          The DW_OP_user opcode encodes a vendor extension operation. It has at
>          least one operand: a ULEB128 constant identifying a vendor extension
>          operation. The remaining operands are defined by the vendor extension.
>          The vendor extension opcode 0 is reserved and cannot be used by any
>          vendor extension.
>
>          <i>The DW_OP_user encoding space can be understood to supplement the
>          space defined by DW_OP_lo_user and DW_OP_hi_user that is allocated by
>          the standard for the same purpose.</i>
>
> 2) In Section 7.7.1, p226, add a new row to table 7.9:
>
>      DW_OP_user  |  TBD  |  1+  | ULEB128 vendor extension opcode, followed by
>                  |       |      | vendor-extension-defined operands
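
For illustration, here is a rough sketch of how a consumer might decode 
the operands that follow the proposed DW_OP_user byte. The VENDOR_OPS 
table, its entries, and the assumption that every vendor operand is a 
ULEB128 are all hypothetical; the proposal leaves operand layout to each 
vendor. The decoder starts just past the prefix byte, since its value is 
still TBD.

```python
def decode_uleb128(data, offset):
    """Decode one ULEB128 value; return (value, offset past it)."""
    result = 0
    shift = 0
    while True:
        if offset >= len(data):
            raise ValueError("truncated ULEB128")
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, offset
        shift += 7


# Hypothetical registry: vendor opcode -> (name, count of ULEB128 operands).
VENDOR_OPS = {
    1: ("EXAMPLE_op_a", 0),
    2: ("EXAMPLE_op_b", 2),
}


def decode_user_op(data, offset):
    """Decode a vendor operation starting just after the DW_OP_user byte."""
    opcode, offset = decode_uleb128(data, offset)
    if opcode == 0:
        raise ValueError("vendor extension opcode 0 is reserved")
    if opcode not in VENDOR_OPS:
        # There is no size operand, so an unknown operation cannot be
        # skipped; decoding of the expression has to stop here.
        raise LookupError("unknown vendor extension opcode %d" % opcode)
    name, operand_count = VENDOR_OPS[opcode]
    operands = []
    for _ in range(operand_count):
        value, offset = decode_uleb128(data, offset)
        operands.append(value)
    return name, operands, offset


print(decode_user_op(bytes([0x02, 0x05, 0x80, 0x01]), 0))
```

The LookupError branch is where the missing size operand shows up: a 
consumer that does not know the vendor opcode cannot even print the rest 
of the expression, which is the trade-off the proposal accepts.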


