[Dwarf-Discuss] address pool + offset representation

Wed Jul 26 23:40:39 GMT 2017

On Wed, Jul 26, 2017 at 4:27 PM Robinson, Paul <paul.robinson at sony.com>
wrote:

> >> Well... why not just use ranges, in that case?  .debug_rnglists is
> >> already tuned to reduce relocations.  A range list with only one
> >> entry is the same as a contiguous range, and DW_RLE_offset_pair is
> >> basically the same as using constant low_pc and high_pc.
> >
> > Fair point/question (I was going to say I wasn't sure if gdb supported
> > rnglists yet, but grepping the source it looks like it probably does -
> > though LLVM can't/doesn't produce them yet, it would be a good thing,
> > even aside from this particular issue).
>
> The new range lists are on my list of DWARF 5 features to do in LLVM.
>
> > Certainly would be a thing to prototype/a valid representation that
> > would reduce the number of address relocations/size of object files
> > (under split DWARF at least).
> >
> > But I imagine that representation might be a bit larger overall/more
> > expensive to parse?
>
> Well, if you want to minimize relocations, you can have:
> - DW_AT_ranges with a LEB index into the rnglists section;
>

That's necessarily a ULEB? (Reading the spec, I don't see an equivalent of
addrx1/2/3/4 for rnglistx.)

I don't really know just how much LLDB cares about fixed-size forms/DIEs,
but rumor has it it's important to some degree, so I continue to have a
slight preference towards fixed size representations (or at least having
the option to do so, even if there are variable length forms too - as with
addrx).

> - one 4-byte entry in the rnglists header to point to the actual
>   range list; (or 8-byte entry in DWARF-64 format);
> - the range list itself which would be:
>   o DW_RLE_offset_pair (1 byte entry code plus 2 LEBs)
>   o DW_RLE_end_of_list (1 byte entry code)
>   So 2 bytes plus 2 LEBs. This assumes using the unit's base address.
>   If we're not using the unit's base address, we add 1 byte and 1 LEB
>   to select the base address in the .debug_addr section.
> Grand total is 6 bytes plus 3 LEBs (or 7 bytes and 4 LEBs if we can't
> use the unit's base address).
>

Seems like a lot - if it's not possible to do a fixed size representation,
at that point I wouldn't mind a form that was two ULEBs (addrx + offset).
(At least, I think the offset, computed as a label difference in the
assembler, can be emitted as a ULEB; I believe there's assembler syntax
for a label difference as a ULEB.)
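
To put rough numbers on the comparison, here is a minimal sketch that tallies
bytes for the range-list layout Paul describes against a two-ULEB
(addrx + offset) form. The two-ULEB form is hypothetical (it is not in
DWARF 5), and the function names `rnglist_cost` and `addrx_offset_cost` are
made up for illustration. Only `uleb128` is the standard unsigned LEB128
encoding.

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def rnglist_cost(rnglist_index, start_off, end_off,
                 offset_size=4, base_addr_index=None):
    """Bytes for one single-range list, following the tally in the quoted mail."""
    size = len(uleb128(rnglist_index))   # DW_AT_ranges index into the rnglists section
    size += offset_size                  # offset entry: 4 bytes, or 8 in DWARF-64
    if base_addr_index is not None:
        # 1-byte entry code plus 1 LEB to select a base in .debug_addr
        size += 1 + len(uleb128(base_addr_index))
    # DW_RLE_offset_pair: 1-byte entry code plus 2 LEBs
    size += 1 + len(uleb128(start_off)) + len(uleb128(end_off))
    size += 1                            # DW_RLE_end_of_list
    return size


def addrx_offset_cost(addr_index, offset):
    """Bytes for the hypothetical two-ULEB (addrx + offset) low_pc form."""
    return len(uleb128(addr_index)) + len(uleb128(offset))


print(rnglist_cost(0, 0x10, 0x40))                     # 9
print(rnglist_cost(0, 0x10, 0x40, base_addr_index=0))  # 11
print(addrx_offset_cost(0, 0x10))                      # 2
```

Note this isn't quite apples to apples: the range list carries both ends of
the range, while the two-ULEB form only covers low_pc, so a constant high_pc
would add one more value on that side.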

> Versus DW_AT_low_pc, one address-sized value and a relocation, or
> one index into .debug_addr and the same relocation.
>
> Or DW_AT_low_pc-as-constant, which is likely the smallest but does
> have the drawback that to compute this DIE's low_pc you need to
> find the parent's low_pc (or the nearest parent that has a low_pc).
>

*nod* Wouldn't be /too/ hard to keep track of.
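
Roughly what I have in mind, as a sketch only: carry the nearest enclosing
low_pc down the walk and resolve constant-class low_pcs against it. The `Die`
type, its field names, and `resolve_low_pcs` are all invented for
illustration and don't correspond to any real consumer's data structures.
Low_pc-as-constant is itself the proposal under discussion, not something in
the spec.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Die:
    name: str
    abs_low_pc: Optional[int] = None  # ordinary address-class low_pc
    rel_low_pc: Optional[int] = None  # hypothetical constant-class low_pc
    children: List["Die"] = field(default_factory=list)


def resolve_low_pcs(die: Die, inherited: Optional[int] = None,
                    out: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Walk the tree once, tracking the nearest enclosing resolved low_pc."""
    if out is None:
        out = {}
    if die.abs_low_pc is not None:
        current = die.abs_low_pc
    elif die.rel_low_pc is not None:
        if inherited is None:
            raise ValueError(f"{die.name}: constant low_pc with no enclosing low_pc")
        current = inherited + die.rel_low_pc
    else:
        current = None  # no low_pc here; children keep the inherited base
    if current is not None:
        out[die.name] = current
        inherited = current
    for child in die.children:
        resolve_low_pcs(child, inherited, out)
    return out


cu = Die("cu", abs_low_pc=0x1000, children=[
    Die("f", rel_low_pc=0x10, children=[Die("block", rel_low_pc=0x4)]),
    Die("ns", children=[Die("g", rel_low_pc=0x80)]),
])
for name, pc in resolve_low_pcs(cu).items():
    print(name, hex(pc))
```

The only extra state is one inherited address per level of nesting, which is
why I don't think the bookkeeping is much of a burden if you're walking the
DIEs anyway.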

> I *really* don't think a two-operand kind of FORM will ever fly.
>

Oh? Why's that? I realized from the start that it wasn't ideal - but mostly
because of wanting fixed-size forms, and thus getting a combinatorial
explosion of forms. If fixed-size forms aren't a priority/aren't possible
in the other representations anyway (& aren't done for rnglistx, for
example) - a two-ULEB form doesn't seem totally crazy to me.

> So the range-list approach is a bit bigger, but one .debug_addr entry
> can act as the base for everything in the section so essentially
> relocation costs are eliminated (both size in .o files and linker
> execution time).
>

Right - that's the goal (& eventually fold the addr pool into the line
table - since it'll contain all the same addresses/relocations over again,
it'd be nice to have them literally /once/).

> It might also be cost-effective to pre-parse the rnglists section,
> could easily be done in parallel with reading other info sections.
> The low_pc solutions really don't feel like they could be done in
> parallel with anything else.
>

Maybe - either you can't do it in parallel because there are non-fixed-size
forms, in which case you have to walk through them anyway (& so you pay a
little extra time to track the current low_pc as you do the walk) or you
have fixed size forms and use them to skip over DIEs - in which case in
parallel you could have other threads reading those skipped-over DIEs to
find/compute low_pcs (& a merge step to resolve which one is the base of
which other one, etc). Doubt anyone would go to quite such lengths, though.

- Dave

>
> --paulr
>
>