[Dwarf-Discuss] debug_aranges use and overhead

Fri Apr 9 18:52:27 GMT 2021

On Fri, Apr 9, 2021 at 11:13 AM Samy Al Bahra <sbahra at repnop.org> wrote:
>
> Responses inline.
>
> On Fri, Mar 19, 2021 at 9:59 PM David Blaikie <dblaikie at gmail.com> wrote:
>>
>> On Fri, Mar 19, 2021 at 9:34 AM Samy Al Bahra <sbahra at repnop.org> wrote:
>
>
> [...]
>
>>>
>>> This is quite old (excuse the formatting) but numbers are here: https://engineering.backtrace.io/2014-09-15-bt-lightweight-backtrace-tool/ , search for "Chromium".  This is something other debuggers can take advantage of if they run in a non-interactive / batch mode (think bulk processing of millions - billions of dumps a month)
>>
>>
>> "This is something... " - what is "this" you're referring to there? Lazy loading? Yeah, for sure. Why do you restrict/suggest that a highly lazy approach would only be suitable for non-interactive/batch execution?
>
>
> This is quite old, this = blog post.
>
> This is something other debuggers can take advantage of: Lazy loading is more effective for automated analysis tools than interactive debuggers which more often than not don't benefit from lazy evaluation if folks are expecting auto-complete for types, variables, etc... Of course, it is still useful for non-blocking loads of debug data especially if you implement job cancellation (allow commands to be executed concurrently while loading is being completed).
>
> [...]
>
>>
>>
>>>
>>> I'm also happy to run benchmarks for you with and without .debug_aranges on top of our debugger if it'll be useful.
>>
>>
>> Yeah, I'd certainly be curious if you have a chance! Though it may depend a bit on what your implementation does in the absence of .debug_aranges.
>
>
> I'll get back to you on this shortly!
>
>>
>>
>>>
>>> One of the crucial optimizations we made is incremental indexing on top of .debug_aranges based on PC values
>>
>>
>> Could you explain that in more detail - and why that approach can't be used with CU ranges?
>
>
> .debug_aranges is significantly smaller and faster to load than scanning all of .debug_info.

I'm not suggesting scanning all of .debug_info - only the CU DIE for
DW_AT_ranges or high/low_pc, then skip to the next CU DIE (via the
unit header's next unit offset).
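
To make the "skip to the next CU DIE" step concrete, here is a minimal
Python sketch, not taken from any debugger mentioned in this thread. It
walks .debug_info one unit at a time using only the unit_length field.
The names iter_units and endian are made up for illustration, and it
assumes the section bytes are already in memory. Decoding the CU DIE's
DW_AT_low_pc/DW_AT_high_pc/DW_AT_ranges would also need the
abbreviation table, which is left out here.

```python
import struct


def iter_units(debug_info: bytes, endian: str = "<"):
    """Yield (unit_offset, version, next_unit_offset) for each unit.

    Hypothetical helper: only the initial length and version fields are
    read, so every DIE below the unit header is skipped untouched.
    """
    offset = 0
    while offset < len(debug_info):
        (length,) = struct.unpack_from(endian + "I", debug_info, offset)
        if length == 0xFFFFFFFF:
            # 64-bit DWARF: the real length is in the following 8 bytes.
            (length,) = struct.unpack_from(endian + "Q", debug_info, offset + 4)
            length_size = 12
        else:
            length_size = 4
        # The 2-byte version follows the length in every DWARF version.
        (version,) = struct.unpack_from(endian + "H", debug_info, offset + length_size)
        # unit_length counts the bytes after the length field itself.
        next_offset = offset + length_size + length
        yield offset, version, next_offset
        offset = next_offset
```

A reader would decode just the first DIE at each yielded offset and then
jump straight to next_unit_offset.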

It sounded like CU ranges couldn't be used to build such an index at
all, or that your code uses quite a different strategy in the absence
of aranges, rather than building the index from the CU ranges?
Building it from the CU ranges would be somewhat slower, I'm sure, but
I wouldn't have thought it so fundamentally slower that it wouldn't be
the next fallback, ahead of skipping the index entirely or employing
some fundamentally different approach. I'm trying to understand
whether it is that much slower, and if so why.
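
As a rough illustration of that fallback, the sketch below builds a
PC-keyed index from per-CU ranges and looks up the CU for a given PC by
binary search. It is only an assumption about what such an index could
look like, not a description of the indexer discussed above. The names
build_pc_index and find_cu are hypothetical, high_pc is treated as
exclusive, and ranges are assumed not to overlap.

```python
import bisect


def build_pc_index(cu_ranges):
    """Sort (low_pc, high_pc, cu_offset) triples by low_pc.

    Empty ranges (high_pc <= low_pc) are dropped so that CUs without
    code never match a lookup.
    """
    index = sorted(r for r in cu_ranges if r[1] > r[0])
    starts = [low for low, _, _ in index]
    return starts, index


def find_cu(pc, starts, index):
    """Return the cu_offset whose range contains pc, or None."""
    # Rightmost range whose low_pc is <= pc.
    i = bisect.bisect_right(starts, pc) - 1
    if i >= 0:
        low, high, cu_offset = index[i]
        if low <= pc < high:
            return cu_offset
    return None


# Example: two CUs with code, one CU with an empty range.
starts, index = build_pc_index(
    [(0x3000, 0x3800, 0x40), (0x1000, 0x2000, 0x0), (0x5000, 0x5000, 0x80)]
)
print(find_cu(0x3400, starts, index))
```

The same two functions work whether the triples come from CU DIEs or
from .debug_aranges; only the cost of producing the input differs.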

If you mean building ranges from all the DIEs deep inside a CU - yeah,
that's going to be fundamentally slower in a bunch of ways, and I could
see that necessitating a totally different approach, or the index not
making sense anymore (though I'd still like to understand it). But I'm
especially curious about the case where the CU DIE itself does have
comprehensive address range information.

- Dave

>
>>
>>
>>>
>>> (+ complexities Greg mentions later in the thread). In cases where we lack this, we use our own persistent cache which introduces unnecessary complexity. Now I am considering going as far as adding a multi-threaded indexer for cases where a persistent cache / build system modifications aren't an option (work to begin in the next week or two).
>>>
>>> .debug_aranges would provide a lot of value to our users.
>>>
>>> On Thu, Mar 11, 2021 at 3:48 PM David Blaikie via Dwarf-Discuss <dwarf-discuss at lists.dwarfstd.org> wrote:
>>>>
>>>> On Thu, Mar 11, 2021 at 5:48 AM <paul.robinson at sony.com> wrote:
>>>>>
>>>>> Hopefully not to side-track things too much... maybe wants its own
>>>>> thread, if there's more to debate here.
>>>>
>>>>
>>>> Yeah, how about we spin it off into another thread (done here)
>>>>
>>>>>
>>>>> >> For the case you suggested where it would be useful to keep the range
>>>>> >> list for the CU in the .o file, I think .debug_aranges is what you're
>>>>> >> looking for.
>>>>> >
>>>>> > aranges has been off by default in LLVM for a while - it adds a lot of
>>>>> > overhead (doesn't have all the nice rnglist encodings for instance -
>>>>> > nor can it use debug_addr, and if it did it'd still be duplicate with
>>>>> > the CU ranges wherever they were).
>>>>>
>>>>> Did you want to file an issue to improve how .debug_aranges works?
>>>>
>>>>
>>>> I don't currently understand the value it provides, and I at least don't have a use case for it, so I'm not sure I'd be the best person to advocate/drive that work.
>>>>
>>>>> Complaining that it duplicates CU ranges is missing the point, though;
>>>>> it's an index, like .debug_names, of course it duplicates other info.
>>>>> If you want to suggest an improved index, like we did with .debug_names,
>>>>> that would be great too.
>>>>
>>>>
>>>> .debug_names is quite different though - it collects information from across the DIE tree - information that is expensive to otherwise gather (walking the whole DIE tree).
>>>>
>>>> .debug_aranges is not like that for most producers (producers that do include the address ranges on the CU DIE) - the data is readily available immediately on the CU. That does involve reading some of .debug_abbrev, and interpreting a handful of attributes - but at least for the use cases I'm aware of, that overhead isn't worth the size increase.
>>>>
>>>> Do you have numbers on the benefits of .debug_aranges compared to parsing the ranges from CU DIEs?
>>>>
>>>> (one possible issue: the CU doesn't /have/ to contain low/high/ranges if its children DIEs contain addresses - having that as a guarantee, or some preferred way of encoding zero length (high/low of 0 would be acceptable, I guess) would be nice & make it cheap to skip over CUs that don't have any address ranges)
>>>>
>>>> Roughly, a modern debug_aranges to me would look something like:
>>>>
>>>> <length>
>>>> <version>
>>>> <CU sec_offset>
>>>> <addr_base>
>>>> <rnglist sec_offset>
>>>>
>>>> So it could fully re-use the rnglist encoding. If this was going to be as compact as possible, it'd need to be configurable which encodings it uses - ranges vs. high/low, addrx vs. addr - at which point it'd probably look like a small DIE with an inline abbrev (similar to the way DWARFv5 encodes the file and directory entries now, and how debug_names is self-describing) - at which point it looks to me a lot like parsing the CU DIEs.
>>>>
>>>
>>>
>>>
>>> --
>>> Samy Al Bahra [http://repnop.org]
>
>
>
> --
> Samy Al Bahra [http://repnop.org]