[Dwarf-discuss] Proposal: `DW_LNS_indirect_line`

Cary Coutant ccoutant@gmail.com
Tue Jul 2 02:56:50 GMT 2024


Added as Issue 240626.1 <https://dwarfstd.org/issues/240626.1.html>.

-cary


On Wed, Jun 26, 2024 at 4:06 PM Matthew Lugg via Dwarf-discuss <
dwarf-discuss@lists.dwarfstd.org> wrote:

> # Add DW_LNS_indirect_line - update `line` to absolute value stored indirectly
>
> ## Background
>
> In many source languages, features such as templates and generics allow many
> program-counter addresses, arbitrarily far apart, to correspond to the same
> source line. An incremental compiler must update the line number program
> whenever source lines move within a file. Ideally, moving a source line that
> corresponds to a large number of distinct program-counter addresses would
> require updating only one line number value in the DWARF information. For
> this to hold, the regions of the line number program corresponding to each
> such address must refer to the line number of the source construct not
> directly, but through an indirect reference. This allows one line number
> value stored in the binary to be shared across arbitrarily many entries in
> the line number matrix.
>
> This is not currently possible: all modifications to the `line` register are
> given by relative offsets, and all of these offsets are directly included in
> the instruction (or implicit in the case of a special opcode).
>
> ## Overview
>
> Introduce new fields to the line number program header,
> `indirect_lines_length` (ULEB128) and `indirect_lines` (opaque block of bytes
> containing ULEB128 values). The `indirect_lines_length` field is the length
> in bytes of the `indirect_lines` field, rather than the number of elements.
> Introduce a new standard opcode to the line number program,
> `DW_LNS_indirect_line`. This opcode takes a single ULEB128 operand, which
> represents a byte offset into the `indirect_lines` field stored in the
> header. The effect of this instruction is to set the `line` register to the
> ULEB128 value stored at the given byte offset into `indirect_lines`. Note
> that `indirect_lines` is not itself validated to be a valid sequence of
> ULEB128 values; decoding only occurs when `DW_LNS_indirect_line` is used.
> This allows an incremental compiler to pre-allocate a large amount of padding
> space in `indirect_lines` to fill in later as needed.
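
A minimal Python sketch of the opcode's effect as described above, not a reference implementation. The names `read_uleb128` and `apply_indirect_line`, the dictionary used for the state machine, and the 0xFF padding bytes are all assumptions made for illustration; the proposal fixes none of them.

```python
def read_uleb128(buf, offset):
    """Decode one ULEB128 value from buf starting at offset.

    Returns (value, offset just past the decoded value).
    """
    value = 0
    shift = 0
    while True:
        if offset >= len(buf):
            raise ValueError("ULEB128 value runs past end of buffer")
        byte = buf[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, offset
        shift += 7


def apply_indirect_line(state, indirect_lines, operand):
    """Hypothetical handler: set `line` from the slot at byte offset `operand`."""
    state["line"], _ = read_uleb128(indirect_lines, operand)


# 42 stored at offset 0, 300 stored at offset 3. The two 0xFF bytes in
# between are padding (an arbitrary choice here) and are never decoded.
indirect_lines = bytearray([0x2A, 0xFF, 0xFF, 0xAC, 0x02])

state = {"line": 1}
apply_indirect_line(state, indirect_lines, 3)
line_before = state["line"]

# Patch the one shared slot in place (300 becomes 301). Every instruction
# that references offset 3 now yields the new line number.
indirect_lines[3:5] = bytes([0xAD, 0x02])
apply_indirect_line(state, indirect_lines, 3)
line_after = state["line"]
```

Only the bytes reached through an operand are ever interpreted, which is what lets a producer leave unused space in the block.
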
>
> Note that an incremental compiler would not necessarily wish to use
> variable-length integers to represent this information, since certain changes
> of line numbers could cause a line number which was previously encoded using
> 1 byte to now require 2. However, since the stored values need not be densely
> packed, an implementation is free to reserve as much space as is necessary
> for each entry. For instance, the downstream Zig compiler (which is the
> original motivator for this proposal) may choose to reserve 4 or 5 bytes for
> each line number, as line numbers in Zig source files cannot exceed 1<<32.
> The use of ULEB128 allows the compiler to make an appropriate decision here
> instead of codifying such a restriction into the DWARF specification.
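
One way a producer might reserve a fixed number of bytes per entry is to emit a padded (non-minimal) ULEB128 encoding. The sketch below assumes consumers accept such padded encodings; `encode_uleb128_padded` and `decode_uleb128` are hypothetical helper names, not part of the proposal. A 4-byte slot holds 28 bits of value and a 5-byte slot holds 35 bits.

```python
def encode_uleb128_padded(value, width):
    """Encode value as ULEB128 occupying exactly `width` bytes.

    Every byte except the last carries the continuation bit, so zero
    high-order groups act as padding and the slot size never changes.
    """
    if value < 0 or value >> (7 * width):
        raise ValueError("value does not fit in the reserved width")
    out = bytearray()
    for i in range(width):
        group = (value >> (7 * i)) & 0x7F
        if i < width - 1:
            group |= 0x80  # more bytes follow
        out.append(group)
    return bytes(out)


def decode_uleb128(buf, offset=0):
    """Decode one ULEB128 value starting at offset and return it."""
    value = 0
    shift = 0
    while True:
        byte = buf[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value
        shift += 7


# Two 5-byte slots; rewriting a slot never moves its neighbours.
table = bytearray(encode_uleb128_padded(10, 5) + encode_uleb128_padded(20, 5))
table[5:10] = encode_uleb128_padded(100000, 5)
```

Because each slot keeps its width, the byte offsets used as `DW_LNS_indirect_line` operands stay valid when a line number grows.
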
>
> ## Proposed Changes
>
> Pages and line numbers are given for the 2024-06-16 working draft of DWARF
> Version 6, which is the latest draft at the time of writing.
>
> 6.2.4 (pg 163; line 27)
>
> 21. indirect_lines_length (ULEB128)
>      The length in bytes of the data stored in the `indirect_lines` field.
> 22. indirect_lines (block containing ULEB128 entries)
>      A collection of line numbers, each stored as a ULEB128 integer. These
>      values are referenced by DW_LNS_indirect_line instructions to modify the
>      state of the line number information state machine.
>
>      The data stored in this field is not checked to be a valid sequence of
>      ULEB128 entries. The contained data may include padding bytes or
>      otherwise invalid data. As such, it is expected that bytes of this field
>      be accessed only when a DW_LNS_indirect_line instruction references them.
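
A sketch of how a consumer might read the two proposed header fields, assuming it has already parsed the header up to the point where field 21 begins. The function name `parse_indirect_lines` and the sample bytes are hypothetical. The block is sliced out by its byte length and deliberately left undecoded.

```python
def read_uleb128(buf, offset):
    """Decode one ULEB128 value; return (value, offset past the value)."""
    value = 0
    shift = 0
    while True:
        if offset >= len(buf):
            raise ValueError("ULEB128 value runs past end of buffer")
        byte = buf[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, offset
        shift += 7


def parse_indirect_lines(header, offset):
    """Read indirect_lines_length, then take that many raw bytes.

    Returns (indirect_lines block, offset just past the block). The
    block contents are not validated as ULEB128 values.
    """
    length, offset = read_uleb128(header, offset)
    end = offset + length
    if end > len(header):
        raise ValueError("indirect_lines extends past end of header")
    return header[offset:end], end


# Length 5, then five raw bytes (two of them padding), then unrelated data.
header_tail = bytes([0x05, 0x2A, 0xFF, 0xFF, 0xAC, 0x02, 0x99])
block, end = parse_indirect_lines(header_tail, 0)
```
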
>
> 6.2.5.2 (pg 170; line 23)
>
> 14. DW_LNS_indirect_line
>      The DW_LNS_indirect_line opcode takes a single unsigned LEB128 operand.
>      This operand is interpreted as a byte offset into the `indirect_lines`
>      field of the line number program header. An unsigned LEB128 value is
>      read from `indirect_lines` at the given offset, and this value is stored
>      into the state machine's `line` register.
>
> 7.22 (pg 246; table 7.25)
>
>  Opcode name          | Value
> ----------------------+-------
>  ...                  |  ...
>  DW_LNS_indirect_line | 0x0d
>