[Dwarf-Discuss] More on DW_AT_str_offset_base debug_str_offsets.dwo confusion

David Blaikie dblaikie@gmail.com
Tue Sep 1 18:51:19 GMT 2020


On Tue, Sep 1, 2020 at 10:24 AM David Anderson <davea42 at linuxmail.org>
wrote:
>
> On 8/31/20 8:39 PM, David Blaikie wrote:
> > On Mon, Aug 31, 2020 at 8:22 PM David Anderson
> > <davea42 at linuxmail.org> wrote:
> >
> >     On 8/31/20 1:03 PM, David Blaikie wrote:
> >     > I'd rather go with LLVM's existing interpretation - that strx
> >     > encodings used in .dwo do not attempt to use str_offsets in the
> >     > skeleton. But I wouldn't mind adding a str_offsets_base to the
> >     > split full unit to make it clear - this would be consistent with
> >     > rnglists, I think? (I think, in theory a rnglistx in a .dwo with
> >     > a split full unit without a rnglists_base would use the
> >     > rnglists_base (and .debug_rnglists non-dwo) in the executable,
> >     > but if the split full unit has a rnglists_base, then the
> >     > rnglistx in the split full unit use that base to find rnglists
> >     > in debug_rnglists.dwo - arguably I'd say we might as well say
> >     > the same thing about loclists, too, for consistency, though I
> >     > don't have any use for skeleton location lists right now)
> >
> >     It seems to me that rnglists base and loclists_base in Split Full
> >     always reference the data in .debug_rnglists/.debug_loclists
> >
> >     3.1.3  Split Full Compilation Unit Entries
> >     The following attributes are not part of a split full compilation
> >     unit entry but instead are inherited (if present) from the
> >     corresponding skeleton compilation unit: DW_AT_low_pc,
> >     DW_AT_high_pc, DW_AT_ranges, DW_AT_stmt_list, DW_AT_comp_dir,
> >     DW_AT_str_offsets_base, DW_AT_addr_base and DW_AT_rnglists_base.
> >
> >
> > Hmm... yeah. I guess LLVM implements rnglistx /rnglist_base the same
> > as strx/str_offsets_base. Where it assumes that any *x encoding refers
> > to entities in the .dwo, even in the absence of a
> > rnglists_base/str_offsets_base in the split full unit. I had thought
> > we'd implemented it to emit a rnglists_base in the split full unit,
> > which would've been in contrast to the str_offsets_base - so my
> > mistake/apologies for the previous description.
> Still confused.
>
> Let's say skeleton A is in object file OB.
> And OB.dwp contains the split-full CU DIE.
> Let's say non-empty .debug_rnglists and .debug_rnglists.dwo exist.
>
> The compiler could create the rnglists for A in *either* OB or OB.dwp.

It sounds like you might be talking specifically/only about the CU-level
ranges (in the phrasing "rnglists for A")? Not about ranges attached to,
say, a lexical_block or inlined_subroutine? Is that the case?

> And could pick and choose, for each split-able Compilation Unit,
> which place to put rnglists
> independently of all other CUs.

FWIW, I'm not objecting to the DWARF spec's requirement that the CU-level
ranges must go in the skeleton CU (though I wouldn't've minded if that was
a "quality of implementation" thing - some producers might want to scrape
those extra few bytes out of the skeleton at the cost of consumers needing
to do the indirection/read the dwo/dwp to find the CU's ranges)

> Meaning both OB and OB.dwp could have rnglists, but only
> one of them has the rnglists entry for any given CU.
>
> How do we know which .debug_rnglists section to look at
> given Skeleton A and split-full A?
> Which does the DW_AT_rnglists_base apply to?

The way I think of it - reading some parts of the spec and ignoring others,
and the way it's implemented in LLVM based on my thinking (model (1) in my
previous email) - any rnglistx encoding used in the split full unit (on the
CU DIE or any child DIEs) would be resolved into debug_rnglists.dwo - no
matter the presence/absence of a rnglists_base on the skeleton CU DIE (and
there would never be a rnglists_base on the split full CU DIE). If the
skeleton unit used a rnglistx encoding for anything, it would need a
rnglists_base and the rnglistx would be resolved relative to that.

I guess a few things I'd say:
  I don't think I'd ever want to suggest that a rnglistx on a skeleton DIE
should refer to rnglists.dwo (if that's the case, just move the
rnglistx-encoded attribute into the split full unit DIE, since it's useless
on the skeleton by itself). If you have a rnglistx in the skeleton unit,
you must have a rnglists_base on that skeleton DIE.
  I also think it's important that a unit be able to have references from
the skeleton unit to rnglists non-dwo, and references from the split full
unit to rnglists.dwo. That's less important for the split full unit DIE
itself (though perhaps someone has some other rnglistx-encoded extension
attribute that they would like to put there), but certainly needed for the
children of that DIE. The question is just how to support both of those.
  Either we assume all *x encodings refer within their own unit. From the
previous comment this is, to me, already definitely true for the skeleton
unit, so that only leaves the split full unit: all *x encodings in the
split full unit then refer to its rnglists.dwo/loclists.dwo/
str_offsets.dwo.
  Or we say they refer to whichever of the nearest units has the first
*_base attribute. If the split full unit has a *_base, then resolve the
split unit & children's *x encodings relative to that, using .dwo sections.
If the split full unit does not have a *_base, but the skeleton unit does,
then resolve *x encodings relative to that and use non-.dwo sections. If
neither has a *_base, reject *x encodings.
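
To make the difference between the two rules concrete, here's a rough C++
sketch of the lookup order for a rnglistx. It's only an illustration of the
rules as described above, not LLVM's or libdwarf's actual code; every name
in it (Unit, Source, resolveInSkeleton, resolveInSplitModel1,
resolveInSplitModel2) is made up for the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration only; not any real library's API.
enum class Source {
    Rnglists,     // .debug_rnglists in the object/executable
    RnglistsDwo,  // .debug_rnglists.dwo in the .dwo/.dwp
    Invalid       // no usable base: reject the rnglistx
};

struct Unit {
    bool has_rnglists_base;       // is DW_AT_rnglists_base on the unit DIE?
    std::uint64_t rnglists_base;  // its value, meaningful only if present
};

// A rnglistx on the skeleton unit: the same answer under both models.
// It needs a rnglists_base on the skeleton and never refers to the .dwo.
inline Source resolveInSkeleton(const Unit &skeleton) {
    return skeleton.has_rnglists_base ? Source::Rnglists : Source::Invalid;
}

// Model 1: *x encodings refer within their own unit, so a rnglistx in the
// split full unit always lands in .debug_rnglists.dwo, base or no base.
inline Source resolveInSplitModel1(const Unit &, const Unit &) {
    return Source::RnglistsDwo;
}

// Model 2: the nearest unit with a *_base attribute decides.
inline Source resolveInSplitModel2(const Unit &splitFull,
                                   const Unit &skeleton) {
    if (splitFull.has_rnglists_base) return Source::RnglistsDwo;
    if (skeleton.has_rnglists_base) return Source::Rnglists;
    return Source::Invalid;
}

// A unit with a rnglists_base on the skeleton only: the two models
// disagree about where a rnglistx in the split full unit points.
inline void checkBaseOnSkeletonOnly() {
    const Unit skeleton{true, 0xc};
    const Unit splitFull{false, 0};
    assert(resolveInSplitModel1(splitFull, skeleton) == Source::RnglistsDwo);
    assert(resolveInSplitModel2(splitFull, skeleton) == Source::Rnglists);
}
```

The last function is the interesting case: same DWARF, two different
sections depending on which rule the consumer follows.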

> If one violated the standard and put DW_AT_rnglists_base
> into the CU die that has the rnglists (skeleton or split-full) it would
> then be known where to read the rnglists.

Well, could put it on both still, I think - rnglists_base on the skeleton
CU for a rnglistx on the skeleton CU DIE. rnglists_base on the split full
unit DIE so that

Perhaps I can work some examples to remove confusion. (assume this all
extends to loclistx and strx encodings/bases too)

A small file with both CU ranges and lexical_block ranges:
x.cpp:
void f();
// extra function, compile with -ffunction-sections to force CU-level ranges
void f1() { }
void f2() {
  {
    int i = 7; // forces the lexical_block to be emitted
    f();
    f();
  }
  f(); // manually modified LLVM IR to interleave this call
       // between the previous two f() calls to force the
       // use of ranges on the lexical_block
}

summarized DWARF:

Model 1:
As implemented in LLVM, rnglistx in the split full unit can only reference
rnglists.dwo, without any need for rnglists_base anywhere. (Here it's in a
child DIE, but I think it'd be nice, though not practically important
immediately that I know of, if it was supported on the split full unit DIE
as well.) rnglistx on the skeleton CU must have a rnglists_base on the
skeleton unit (or it's invalid DWARF) and is resolved relative to that.
x.o:

.debug_info contents:

0x00000000: Compile Unit: length = 0x0000002c, format = DWARF32, version =
0x0005, unit_type = DW_UT_skeleton, abbr_offset = 0x0000, addr_size = 0x08,
DWO_id = 0xbd73af57879f76d0 (next unit at 0x00000030)


DW_TAG_skeleton_unit [1]

  DW_AT_low_pc [DW_FORM_addr]       (0x0000000000000000)

  DW_AT_ranges [DW_FORM_rnglistx]   (indexed (0x0) rangelist = 0x00000010

    [0x0000000000000000, 0x0000000000000001) ".text.f1"

    [0x0000000000000000, 0x000000000000001d) ".text.f2")

  DW_AT_rnglists_base [DW_FORM_sec_offset]  (0x0000000c)
  ...


.debug_rnglists contents:

0x00000000: range list header: length = 0x00000013, format = DWARF32,
version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count =
0x00000001

offsets: [

0x00000004 => 0x00000010

]

ranges:

0x00000010: [DW_RLE_startx_length]:  0x0000000000000000, 0x0000000000000001

0x00000013: [DW_RLE_startx_length]:  0x0000000000000001, 0x000000000000001d

0x00000016: [DW_RLE_end_of_list  ]
x.dwo:

.debug_info.dwo contents:

0x00000000: Compile Unit: length = 0x00000056, format = DWARF32, version =
0x0005, unit_type = DW_UT_split_compile, abbr_offset = 0x0000, addr_size =
0x08, DWO_id = 0xbd73af57879f76d0 (next unit at 0x0000005a)


DW_TAG_compile_unit
  ...

  DW_TAG_subprogram [3] *

    DW_AT_low_pc [DW_FORM_addrx]    (indexed (00000001) address =
<unresolved>)

    DW_AT_high_pc [DW_FORM_data4]   (0x0000001d)

    DW_AT_name [DW_FORM_strx1]      (indexed (00000002) string = "f2")


  DW_TAG_lexical_block [4] *

    DW_AT_ranges [DW_FORM_rnglistx]       (indexed (0x0) rangelist =
0x00000010

      [0x0000000000000008, 0x0000000000000011)

      [0x0000000000000018, 0x000000000000001e))


.debug_rnglists.dwo contents:

0x00000000: range list header: length = 0x00000015, format = DWARF32,
version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count =
0x00000001

offsets: [

0x00000004 => 0x00000010

]

ranges:

0x00000010: [DW_RLE_base_addressx]:  0x0000000000000001

0x00000012: [DW_RLE_offset_pair  ]:  0x0000000000000007, 0x0000000000000010

0x00000015: [DW_RLE_offset_pair  ]:  0x0000000000000017, 0x000000000000001d

0x00000018: [DW_RLE_end_of_list  ]
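
For anyone following the offsets in the dump above, here's a rough C++
sketch of the arithmetic that turns a rnglistx index into a section offset.
It assumes DWARF32 and little-endian data, and that with no rnglists_base
the base is the first byte after the 12-byte (0xc) header, which is how the
numbers above line up: base 0xc plus offset entry 0x4 gives 0x10. The
function names and the byte array are mine, transcribed from the
.debug_rnglists.dwo dump; none of this is LLVM's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// DWARF32 v5 range list table header: unit_length (4) + version (2) +
// address_size (1) + segment_selector_size (1) + offset_entry_count (4).
constexpr std::uint64_t kDwarf32RnglistsHeaderSize = 12;

// Read a 4-byte little-endian value (byte order is an assumption here).
inline std::uint32_t readU32LE(const std::vector<std::uint8_t> &s,
                               std::size_t pos) {
    assert(pos + 4 <= s.size());
    return static_cast<std::uint32_t>(s[pos]) |
           static_cast<std::uint32_t>(s[pos + 1]) << 8 |
           static_cast<std::uint32_t>(s[pos + 2]) << 16 |
           static_cast<std::uint32_t>(s[pos + 3]) << 24;
}

// Offset of range list `index`, given the base (the position of the first
// offset entry). Offset entries are relative to that same base.
inline std::uint64_t rnglistxToOffset(const std::vector<std::uint8_t> &section,
                                      std::uint64_t base,
                                      std::uint64_t index) {
    const std::size_t entry = static_cast<std::size_t>(base + 4 * index);
    return base + readU32LE(section, entry);
}

// The .debug_rnglists.dwo contribution from the dump, byte for byte.
inline std::vector<std::uint8_t> exampleRnglistsDwo() {
    return {
        0x15, 0x00, 0x00, 0x00,  // length = 0x15
        0x05, 0x00,              // version = 5
        0x08,                    // addr_size
        0x00,                    // seg_size
        0x01, 0x00, 0x00, 0x00,  // offset_entry_count = 1
        0x04, 0x00, 0x00, 0x00,  // offsets[0] = 4 (at 0xc)
        0x01, 0x01,              // 0x10: DW_RLE_base_addressx 1
        0x04, 0x07, 0x10,        // 0x12: DW_RLE_offset_pair 0x7, 0x10
        0x04, 0x17, 0x1d,        // 0x15: DW_RLE_offset_pair 0x17, 0x1d
        0x00                     // 0x18: DW_RLE_end_of_list
    };
}
```

Calling rnglistxToOffset(exampleRnglistsDwo(), kDwarf32RnglistsHeaderSize, 0)
gives 0x10, matching the "0x00000004 => 0x00000010" line in the dump.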

Model 2:
If we used model (2) to interpret the above DWARF as-is, then the
lexical_block rnglistx would refer to rnglists non-dwo. ([0, 1) in
".text.f1", [0, 1d) in ".text.f2") - so to represent the information LLVM
intended, it would need to produce different DWARF: either adding a
rnglists_base to the split full unit DIE:
x.o:

.debug_info contents:

0x00000000: Compile Unit: length = 0x0000002c, format = DWARF32, version =
0x0005, unit_type = DW_UT_skeleton, abbr_offset = 0x0000, addr_size = 0x08,
DWO_id = 0xbd73af57879f76d0 (next unit at 0x00000030)


DW_TAG_skeleton_unit [1]

  DW_AT_low_pc [DW_FORM_addr]       (0x0000000000000000)

  DW_AT_ranges [DW_FORM_rnglistx]   (indexed (0x0) rangelist = 0x00000010

    [0x0000000000000000, 0x0000000000000001) ".text.f1"

    [0x0000000000000000, 0x000000000000001d) ".text.f2")

  DW_AT_rnglists_base [DW_FORM_sec_offset]  (0x0000000c) // relocatable
  ...


.debug_rnglists contents:

0x00000000: range list header: length = 0x00000013, format = DWARF32,
version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count =
0x00000001

offsets: [

0x00000004 => 0x00000010

]

ranges:

0x00000010: [DW_RLE_startx_length]:  0x0000000000000000, 0x0000000000000001

0x00000013: [DW_RLE_startx_length]:  0x0000000000000001, 0x000000000000001d

0x00000016: [DW_RLE_end_of_list  ]
x.dwo:

.debug_info.dwo contents:

0x00000000: Compile Unit: length = 0x00000056, format = DWARF32, version =
0x0005, unit_type = DW_UT_split_compile, abbr_offset = 0x0000, addr_size =
0x08, DWO_id = 0xbd73af57879f76d0 (next unit at 0x0000005a)


DW_TAG_compile_unit

  DW_AT_rnglists_base [DW_FORM_sec_offset]  (0x0000000c) // not
relocatable, absolute in .dwo, relative to the cu_index in .dwp

  ...

  DW_TAG_subprogram [3] *

    DW_AT_low_pc [DW_FORM_addrx]    (indexed (00000001) address =
<unresolved>)

    DW_AT_high_pc [DW_FORM_data4]   (0x0000001d)

    DW_AT_name [DW_FORM_strx1]      (indexed (00000002) string = "f2")


  DW_TAG_lexical_block [4] *

    DW_AT_ranges [DW_FORM_rnglistx]       (indexed (0x0) rangelist =
0x00000010

      [0x0000000000000008, 0x0000000000000011)

      [0x0000000000000018, 0x000000000000001e))


.debug_rnglists.dwo contents:

0x00000000: range list header: length = 0x00000015, format = DWARF32,
version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count =
0x00000001

offsets: [

0x00000004 => 0x00000010

]

ranges:

0x00000010: [DW_RLE_base_addressx]:  0x0000000000000001

0x00000012: [DW_RLE_offset_pair  ]:  0x0000000000000007, 0x0000000000000010

0x00000015: [DW_RLE_offset_pair  ]:  0x0000000000000017, 0x000000000000001d

0x00000018: [DW_RLE_end_of_list  ]
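
To make the range list entries in these dumps concrete, here's a rough C++
sketch of decoding the three entry kinds that appear (DW_RLE_base_addressx,
DW_RLE_startx_length, DW_RLE_offset_pair) plus the terminator. It's a
simplified illustration, not LLVM's decoder: the function names are made
up, other DW_RLE_* kinds are rejected, and the address table passed in
stands in for .debug_addr. Since the addresses in the .o are unresolved,
any concrete address used with it (say 0x1000 for index 1) is a
hypothetical value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

using Range = std::pair<std::uint64_t, std::uint64_t>;  // [first, second)

// DWARF v5 range list entry kinds used in the dumps.
constexpr std::uint8_t DW_RLE_end_of_list = 0x00;
constexpr std::uint8_t DW_RLE_base_addressx = 0x01;
constexpr std::uint8_t DW_RLE_startx_length = 0x03;
constexpr std::uint8_t DW_RLE_offset_pair = 0x04;

// Read one ULEB128 value starting at pos and advance pos past it.
inline std::uint64_t readULEB128(const std::vector<std::uint8_t> &data,
                                 std::size_t &pos) {
    std::uint64_t value = 0;
    unsigned shift = 0;
    for (;;) {
        if (pos >= data.size()) throw std::runtime_error("truncated ULEB128");
        const std::uint8_t byte = data[pos++];
        assert(shift < 64 && "value too wide for this sketch");
        value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        if ((byte & 0x80) == 0) return value;
        shift += 7;
    }
}

// Decode one range list starting at pos. `addrs` stands in for the unit's
// .debug_addr entries, indexed by the addrx operands.
inline std::vector<Range> decodeRangeList(
        const std::vector<std::uint8_t> &data, std::size_t pos,
        const std::vector<std::uint64_t> &addrs) {
    std::vector<Range> out;
    std::uint64_t base = 0;
    for (;;) {
        if (pos >= data.size()) throw std::runtime_error("unterminated list");
        const std::uint8_t kind = data[pos++];
        switch (kind) {
        case DW_RLE_end_of_list:
            return out;
        case DW_RLE_base_addressx:
            base = addrs.at(static_cast<std::size_t>(readULEB128(data, pos)));
            break;
        case DW_RLE_startx_length: {
            const std::uint64_t index = readULEB128(data, pos);
            const std::uint64_t length = readULEB128(data, pos);
            const std::uint64_t start = addrs.at(static_cast<std::size_t>(index));
            out.emplace_back(start, start + length);
            break;
        }
        case DW_RLE_offset_pair: {
            const std::uint64_t lo = readULEB128(data, pos);
            const std::uint64_t hi = readULEB128(data, pos);
            out.emplace_back(base + lo, base + hi);
            break;
        }
        default:
            throw std::runtime_error("entry kind not handled in this sketch");
        }
    }
}
```

With a hypothetical address of 0x1000 at index 1, the lexical_block list
(base_addressx 1, then offset pairs 0x7/0x10 and 0x17/0x1d) decodes to
[0x1007, 0x1010) and [0x1017, 0x101d).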

Alternative use of Model 2:
You could reference a rnglist in the object/executable, rather than using
rnglists.dwo:
x.o:

.debug_info contents:

0x00000000: Compile Unit: length = 0x0000002c, format = DWARF32, version =
0x0005, unit_type = DW_UT_skeleton, abbr_offset = 0x0000, addr_size = 0x08,
DWO_id = 0xbd73af57879f76d0 (next unit at 0x00000030)


DW_TAG_skeleton_unit [1]

  DW_AT_low_pc [DW_FORM_addr]       (0x0000000000000000)

  DW_AT_ranges [DW_FORM_rnglistx]   (indexed (0x0) rangelist = 0x00000010

    [0x0000000000000000, 0x0000000000000001) ".text.f1"

    [0x0000000000000000, 0x000000000000001d) ".text.f2")

  DW_AT_rnglists_base [DW_FORM_sec_offset]  (0x0000000c) // relocatable
  ...


.debug_rnglists contents:

0x00000000: range list header: length = ..., format = DWARF32, version =
0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count = 0x00000002

offsets: [

0x00000004 => 0x00000010
0x00000008 => 0x00000017

]

ranges:

0x00000010: [DW_RLE_startx_length]:  0x0000000000000000, 0x0000000000000001

0x00000013: [DW_RLE_startx_length]:  0x0000000000000001, 0x000000000000001d

0x00000016: [DW_RLE_end_of_list  ]

0x00000017: [DW_RLE_base_addressx]:  0x0000000000000001

0x00000019: [DW_RLE_offset_pair  ]:  0x0000000000000007, 0x0000000000000010

0x0000001c: [DW_RLE_offset_pair  ]:  0x0000000000000017, 0x000000000000001d

0x0000001f: [DW_RLE_end_of_list  ]
x.dwo:

.debug_info.dwo contents:

0x00000000: Compile Unit: length = 0x00000056, format = DWARF32, version =
0x0005, unit_type = DW_UT_split_compile, abbr_offset = 0x0000, addr_size =
0x08, DWO_id = 0xbd73af57879f76d0 (next unit at 0x0000005a)


DW_TAG_compile_unit
  ...

  DW_TAG_subprogram [3] *

    DW_AT_low_pc [DW_FORM_addrx]    (indexed (00000001) address =
<unresolved>)

    DW_AT_high_pc [DW_FORM_data4]   (0x0000001d)

    DW_AT_name [DW_FORM_strx1]      (indexed (00000002) string = "f2")


  DW_TAG_lexical_block [4] *

    DW_AT_ranges [DW_FORM_rnglistx]       (indexed (0x1) rangelist =
0x00000017

      [0x0000000000000008, 0x0000000000000011)

      [0x0000000000000018, 0x000000000000001e))

> Confused, still.
> DavidA
>
>
>
> --
> Despite all appearances, your boss is a thinking, feeling, human being.
>