[Dwarf-Discuss] debug_aranges use and overhead

David Blaikie dblaikie at gmail.com
Thu Feb 24 12:44:27 PST 2022


Hey Samy - curious if you ever happened to end up getting further details
here.

On Fri, Apr 9, 2021 at 1:05 PM Samy Al Bahra <sbahra at repnop.org> wrote:

> Thanks for the detailed response David.
>
> On Fri, Apr 9, 2021 at 2:52 PM David Blaikie <dblaikie at gmail.com> wrote:
>
>> I'm not suggesting scanning all of .debug_info - only the CU DIE for
>> DW_AT_ranges or high/low_pc, then skip to the next CU DIE (via the
>> unit header's next unit offset).
>>
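A rough sketch of the "read the unit header, then skip to the next unit" walk described above, assuming the raw .debug_info bytes are already in memory. The function name and the little-endian default are illustrative assumptions only; a real consumer would go on to decode the CU DIE's abbreviation and attributes at each offset.

```python
import struct

def iter_unit_offsets(debug_info: bytes, little_endian: bool = True):
    """Yield (unit_offset, version, next_unit_offset) for each unit header.

    Only the initial length and version fields are read; the unit body is
    skipped entirely, which is what keeps the walk cheap.
    """
    end = "<" if little_endian else ">"
    pos = 0
    while pos + 4 <= len(debug_info):
        (length,) = struct.unpack_from(end + "I", debug_info, pos)
        header = 4
        if length == 0xFFFFFFFF:
            # 64-bit DWARF: escape value followed by an 8-byte length.
            (length,) = struct.unpack_from(end + "Q", debug_info, pos + 4)
            header = 12
        (version,) = struct.unpack_from(end + "H", debug_info, pos + header)
        next_pos = pos + header + length
        yield pos, version, next_pos
        pos = next_pos
```

The length field counts the bytes that follow it, so the next unit starts at the current offset plus the size of the length field plus the length value.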
>
>> It sounded like CU ranges couldn't be used to build such an index at
>> all/that your code used quite a different strategy in the absence of
>> aranges? (rather than building the index from the CU ranges - somewhat
>> slower I'm sure, but I wouldn't've thought (& am trying to understand
>> if it is/why) so fundamentally slower that it wouldn't be the next
>> fallback rather than skipping the index entirely or employing some
>> more fundamentally different approach)
>
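For illustration, an address-to-CU index built purely from CU-level ranges might look like the sketch below. It assumes the per-CU ranges have already been decoded into half-open (low, high) pairs and do not overlap across CUs; `build_index` and `lookup` are hypothetical names, not taken from any existing tool.

```python
import bisect

def build_index(cu_ranges):
    """Build a sorted lookup table from (cu_offset, [(low, high), ...]) pairs.

    Ranges are half-open [low, high); empty ranges are dropped.
    """
    entries = sorted(
        (low, high, cu)
        for cu, ranges in cu_ranges
        for low, high in ranges
        if low < high
    )
    starts = [entry[0] for entry in entries]
    return starts, entries

def lookup(index, addr):
    """Return the CU offset covering addr, or None if no range covers it."""
    starts, entries = index
    i = bisect.bisect_right(starts, addr) - 1
    if i >= 0:
        low, high, cu = entries[i]
        if low <= addr < high:
            return cu
    return None
```

Once built, each lookup is a binary search, so the cost difference compared with .debug_aranges is in gathering the ranges, not in querying them.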
>
> This is still significantly less dense than aranges, involves more disk
> I/O and memory pressure. Let me see what optimizations I can implement here
> and get back to you with the results / what I came up with. This would be a
> better basis for apples to apples comparison.
>
>
>>
>> If you mean building ranges from all the DIEs deep inside a CU - yeah,
>> that's going to be fundamentally slower in a bunch of ways that maybe
>> I could see that would necessitate a totally different approach/that
>> the index wouldn't make sense anymore (though I'd still like to
>> understand it) - but I'm especially curious about the case where the
>> CU DIE itself does have comprehensive address range information.
>>
>
> Will report back on this.
>
>
>>
>> - Dave
>>
>> >
>> >>
>> >>
>> >>>
>> >>> (+ complexities Greg mentions later in the thread). In cases where we
>> >>> lack this, we use our own persistent cache which introduces unnecessary
>> >>> complexity. Now I am considering going as far as adding a multi-threaded
>> >>> indexer for cases where a persistent cache / build system modifications
>> >>> aren't an option (work to begin in the next week or two).
>> >>>
>> >>> .debug_aranges would provide a lot of value to our users.
>> >>>
>> >>> On Thu, Mar 11, 2021 at 3:48 PM David Blaikie via Dwarf-Discuss
>> >>> <dwarf-discuss at lists.dwarfstd.org> wrote:
>> >>>>
>> >>>> On Thu, Mar 11, 2021 at 5:48 AM <paul.robinson at sony.com> wrote:
>> >>>>>
>> >>>>> Hopefully not to side-track things too much... maybe wants its own
>> >>>>> thread, if there's more to debate here.
>> >>>>
>> >>>>
>> >>>> Yeah, how about we spin it off into another thread (done here)
>> >>>>
>> >>>>>
>> >>>>> >> For the case you suggested where it would be useful to keep the
>> >>>>> >> range list for the CU in the .o file, I think .debug_aranges is
>> >>>>> >> what you're looking for.
>> >>>>> >
>> >>>>> > aranges has been off by default in LLVM for a while - it adds a
>> >>>>> > lot of overhead (doesn't have all the nice rnglist encodings for
>> >>>>> > instance - nor can it use debug_addr, and if it did it'd still be
>> >>>>> > duplicate with the CU ranges wherever they were).
>> >>>>>
>> >>>>> Did you want to file an issue to improve how .debug_aranges works?
>> >>>>
>> >>>>
>> >>>> I don't currently understand the value it provides, and I at least
>> >>>> don't have a use case for it, so I'm not sure I'd be the best person
>> >>>> to advocate/drive that work.
>> >>>>
>> >>>>> Complaining that it duplicates CU ranges is missing the point,
>> >>>>> though; it's an index, like .debug_names, of course it duplicates
>> >>>>> other info. If you want to suggest an improved index, like we did
>> >>>>> with .debug_names, that would be great too.
>> >>>>
>> >>>>
>> >>>> .debug_names is quite different though - it collects information
>> >>>> from across the DIE tree - information that is expensive to otherwise
>> >>>> gather (walking the whole DIE tree).
>> >>>>
>> >>>> .debug_aranges is not like that for most producers (producers that
>> >>>> do include the address ranges on the CU DIE) - the data is readily
>> >>>> available immediately on the CU. That does involve reading some of
>> >>>> .debug_abbrev, and interpreting a handful of attributes - but at least
>> >>>> for the use cases I'm aware of, that overhead isn't worth the size
>> >>>> increase.
>> >>>>
>> >>>> Do you have numbers on the benefits of .debug_aranges compared to
>> >>>> parsing the ranges from CU DIEs?
>> >>>>
>> >>>> (one possible issue: the CU doesn't /have/ to contain
>> >>>> low/high/ranges if its children DIEs contain addresses - having that
>> >>>> as a guarantee, or some preferred way of encoding zero length
>> >>>> (high/low of 0 would be acceptable, I guess) would be nice & make it
>> >>>> cheap to skip over CUs that don't have any address ranges)
>> >>>>
>> >>>> Roughly, a modern debug_aranges to me would look something like:
>> >>>>
>> >>>> <length>
>> >>>> <version>
>> >>>> <CU sec_offset>
>> >>>> <addr_base>
>> >>>> <rnglist sec_offset>
>> >>>>
>> >>>> So it could fully re-use the rnglist encoding. If this was going to
>> >>>> be as compact as possible, it'd need to be configurable which
>> >>>> encodings it uses - ranges vs. high/low, addrx vs. addr - at which
>> >>>> point it'd probably look like a small DIE with an inline abbrev
>> >>>> (similar to the way DWARFv5 encodes the file and directory entries
>> >>>> now, and how debug_names is self-describing) - at which point it
>> >>>> looks to me a lot like parsing the CU DIEs.
>> >>>>
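To make the five-field layout proposed above concrete, here is a minimal sketch that packs and unpacks one such entry. The field widths are not specified in the proposal; the 32-bit offsets, 2-byte version, and little-endian byte order below are assumptions chosen for illustration, and the function names are hypothetical.

```python
import struct

# Assumed layout (not from any standard): 4-byte length, 2-byte version,
# then three 4-byte section offsets, little-endian, no padding.
_ENTRY = struct.Struct("<IHIII")

def pack_entry(version, cu_offset, addr_base, rnglist_offset):
    """Encode one entry; length counts the bytes after the length field."""
    length = _ENTRY.size - 4
    return _ENTRY.pack(length, version, cu_offset, addr_base, rnglist_offset)

def unpack_entry(buf, pos=0):
    """Decode one entry and report where the next entry would start."""
    length, version, cu_offset, addr_base, rnglist_offset = _ENTRY.unpack_from(buf, pos)
    return {
        "version": version,
        "cu_offset": cu_offset,
        "addr_base": addr_base,
        "rnglist_offset": rnglist_offset,
        "next": pos + 4 + length,
    }
```

With fixed widths the entry is trivially cheap to read, but it cannot choose between encodings, which is the configurability concern raised in the paragraph above.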
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Samy Al Bahra [http://repnop.org]
>> >
>> >
>> >
>> > --
>> > Samy Al Bahra [http://repnop.org]
>>
>
>
> --
> Samy Al Bahra [http://repnop.org]
>