[Dwarf-Discuss] debug_aranges use and overhead

Thu Mar 11 20:48:28 GMT 2021

On Thu, Mar 11, 2021 at 5:48 AM <paul.robinson at sony.com> wrote:

> Hopefully not to side-track things too much... maybe wants its own
> thread, if there's more to debate here.
>

Yeah, how about we spin it off into another thread (done here).

> >> For the case you suggested where it would be useful to keep the range
> >> list for the CU in the .o file, I think .debug_aranges is what you're
> >> looking for.
> >
> > aranges has been off by default in LLVM for a while - it adds a lot of
> > overhead (doesn't have all the nice rnglist encodings for instance -
> > nor can it use debug_addr, and if it did it'd still be duplicate with
> > the CU ranges wherever they were).
>
> Did you want to file an issue to improve how .debug_aranges works?
>

I don't currently understand the value it provides, and I at least don't
have a use case for it, so I'm not sure I'd be the best person to
advocate/drive that work.

> Complaining that it duplicates CU ranges is missing the point, though;
> it's an index, like .debug_names, of course it duplicates other info.
> If you want to suggest an improved index, like we did with .debug_names,
> that would be great too.
>

.debug_names is quite different though - it collects information from
across the DIE tree - information that is expensive to otherwise gather
(walking the whole DIE tree).

.debug_aranges is not like that for most producers (producers that do
include the address ranges on the CU DIE) - the data is readily available
immediately on the CU. That does involve reading some of .debug_abbrev, and
interpreting a handful of attributes - but at least for the use cases I'm
aware of, that overhead isn't worth the size increase.
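
To make the size concern concrete, here is a minimal Python sketch that encodes the same three ranges twice: once as a version 2 .debug_aranges set (full-width address/length tuples) and once as a DWARFv5 range list body using DW_RLE_base_addressx plus DW_RLE_offset_pair. The function names, the sample addresses, and the choice of 32-bit DWARF with 8-byte little-endian addresses are assumptions made for illustration; the sketch leaves out the .debug_rnglists header, the shared .debug_addr slot, and any relocation cost in a .o file.

```python
import struct

# DWARFv5 range list entry kinds used below.
DW_RLE_end_of_list = 0x00
DW_RLE_base_addressx = 0x01
DW_RLE_offset_pair = 0x04


def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def aranges_set(cu_offset, ranges, addr_size=8):
    """Build one 32-bit DWARF .debug_aranges set (version 2)."""
    fmt = {4: "<I", 8: "<Q"}[addr_size]
    # version, debug_info_offset, address_size, segment_selector_size
    header = struct.pack("<HIBB", 2, cu_offset, addr_size, 0)
    # The first tuple starts at a multiple of 2 * addr_size from the
    # beginning of the set (the 4-byte unit_length counts toward that).
    pad = -(4 + len(header)) % (2 * addr_size)
    body = header + b"\0" * pad
    for start, length in ranges:
        body += struct.pack(fmt, start) + struct.pack(fmt, length)
    body += b"\0" * (2 * addr_size)  # terminating (0, 0) tuple
    return struct.pack("<I", len(body)) + body


def rnglist_body(base_index, base_addr, ranges):
    """Encode ranges as offset pairs relative to a .debug_addr entry."""
    out = bytes([DW_RLE_base_addressx]) + uleb128(base_index)
    for start, length in ranges:
        lo = start - base_addr
        out += bytes([DW_RLE_offset_pair]) + uleb128(lo) + uleb128(lo + length)
    return out + bytes([DW_RLE_end_of_list])


# Hypothetical (start, length) ranges for one CU.
sample = [(0x401000, 0x120), (0x401200, 0x40), (0x402000, 0x800)]
print(len(aranges_set(0, sample)), len(rnglist_body(0, 0x401000, sample)))
```

For this sample the aranges set comes to 80 bytes and the range list body to 17 bytes. Real ratios depend on how many ranges a CU has and how far apart they are.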

Do you have numbers on the benefits of .debug_aranges compared to parsing
the ranges from CU DIEs?

(one possible issue: the CU doesn't /have/ to contain low/high/ranges if
its child DIEs contain addresses - having that as a guarantee, or some
preferred way of encoding zero length (high/low of 0 would be acceptable, I
guess) would be nice & make it cheap to skip over CUs that don't have any
address ranges)

Roughly, a modern debug_aranges to me would look something like:

<length>
<version>
<CU sec_offset>
<addr_base>
<rnglist sec_offset>

So it could fully re-use the rnglist encoding. If this was going to be as
compact as possible, it'd need to be configurable which encodings it uses
(ranges vs. high/low, addrx vs. addr). At that point it'd probably look like
a small DIE with an inline abbrev (similar to the way DWARFv5 encodes the
file and directory entries now, and how debug_names is self-describing),
which looks to me a lot like parsing the CU DIEs.