[Dwarf-Discuss] debug_aranges use and overhead

Sat Mar 20 01:59:16 GMT 2021

On Fri, Mar 19, 2021 at 9:34 AM Samy Al Bahra <sbahra at repnop.org> wrote:

> Hi David,
>
> Sorry I'm a bit late to the game.
>

No worries at all - appreciate any/all perspectives/data here, for sure!

> On the value of having .debug_aranges and the performance impact:
>
> Our debugger was designed for performance and does end to end lazy
> evaluation (down to the DIE).
>

Nice! (There are certainly aspects of LLVM's DWARF parsing I'd love to move
towards more of a lazy model like this.)

> This is quite old (excuse the formatting) but numbers are here:
> https://engineering.backtrace.io/2014-09-15-bt-lightweight-backtrace-tool/
> , search for "Chromium".  This is something other debuggers can take
> advantage of if they run in a non-interactive / batch mode (think bulk
> processing of millions - billions of dumps a month)
>

"This is something... " - what is "this" you're referring to there? Lazy
loading? Yeah, for sure. Why do you restrict/suggest that a highly lazy
approach would only be suitable for non-interactive/batch execution?

> and is generally useful when folks are iterating in development (fast
> feedback for crashes while having some background indexing work going on).
>

If it's useful in non-interactive/batch and iterative - is there a use case
you're suggesting such lazy evaluation isn't applicable to?

> I'm also happy to run benchmarks for you with and without .debug_aranges
> on top of our debugger if it'll be useful.
>

Yeah, I'd certainly be curious if you have a chance! Though it may depend a
bit on what your implementation does in the absence of .debug_aranges (see
below).

> One of the crucial optimizations we made is incremental indexing on top of
> .debug_aranges based on PC values
>

Could you explain that in more detail - and why that approach can't be used
with CU ranges?

> (+ complexities Greg mentions later in the thread). In cases where we lack
> this, we use our own persistent cache which introduces unnecessary
> complexity. Now I am considering going as far as adding a multi-threaded
> indexer for cases where a persistent cache / build system modifications
> aren't an option (work to begin in the next week or two).
>
> .debug_aranges would provide a lot of value to our users.
>
> On Thu, Mar 11, 2021 at 3:48 PM David Blaikie via Dwarf-Discuss <
> dwarf-discuss at lists.dwarfstd.org> wrote:
>
>> On Thu, Mar 11, 2021 at 5:48 AM <paul.robinson at sony.com> wrote:
>>
>>> Hopefully not to side-track things too much... maybe wants its own
>>> thread, if there's more to debate here.
>>>
>>
>> Yeah, how about we spin it off into another thread (done here)
>>
>>
>>> >> For the case you suggested where it would be useful to keep the range
>>> >> list for the CU in the .o file, I think .debug_aranges is what you're
>>> >> looking for.
>>> >
>>> > aranges has been off by default in LLVM for a while - it adds a lot of
>>> > overhead (doesn't have all the nice rnglist encodings for instance -
>>> > nor can it use debug_addr, and if it did it'd still be duplicate with
>>> > the CU ranges wherever they were).
>>>
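For reference on the overhead point above, here is a minimal Python sketch
of decoding one set of the existing .debug_aranges format (32-bit DWARF,
version 2 header, no segment selectors). The function name and the sample
bytes are illustrative assumptions, not taken from any real tool. Each range
is stored as a fixed-width (address, length) pair, so on a 64-bit target
every range costs 16 bytes, with none of the compact rnglist encodings.

```python
import struct

def parse_aranges_set(data, offset=0):
    """Decode one 32-bit-format .debug_aranges set starting at `offset`.

    Returns (cu_offset, [(start, length), ...], next_set_offset).
    """
    # Header: unit_length, version, debug_info_offset,
    # address_size, segment_selector_size (12 bytes, little-endian here).
    unit_length, version, cu_offset, addr_size, seg_size = struct.unpack_from(
        "<IHIBB", data, offset)
    if version != 2 or seg_size != 0 or addr_size not in (4, 8):
        raise ValueError("unsupported .debug_aranges set in this sketch")
    end = offset + 4 + unit_length
    tuple_size = 2 * addr_size
    # The header is padded so the first tuple starts at a multiple of the
    # tuple size (assumed here to be measured from the start of the set).
    header_size = 12
    pos = offset + -(-header_size // tuple_size) * tuple_size
    fmt = "<QQ" if addr_size == 8 else "<II"
    ranges = []
    while pos + tuple_size <= end:
        start, length = struct.unpack_from(fmt, data, pos)
        pos += tuple_size
        if start == 0 and length == 0:  # (0, 0) terminates the set
            break
        ranges.append((start, length))
    return cu_offset, ranges, end

# Hypothetical sample: one CU at offset 0 with two ranges, 8-byte addresses.
_body = struct.pack("<HIBB", 2, 0, 8, 0) + b"\x00" * 4
_body += struct.pack("<QQ", 0x1000, 0x20)
_body += struct.pack("<QQ", 0x2000, 0x10)
_body += struct.pack("<QQ", 0, 0)
sample_set = struct.pack("<I", len(_body)) + _body
```

Calling parse_aranges_set(sample_set) yields the CU offset, the two
(address, length) pairs, and the offset of the next set.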
>>> Did you want to file an issue to improve how .debug_aranges works?
>>>
>>
>> I don't currently understand the value it provides, and I at least don't
>> have a use case for it, so I'm not sure I'd be the best person to
>> advocate/drive that work.
>>
>>> Complaining that it duplicates CU ranges is missing the point, though;
>>> it's an index, like .debug_names, of course it duplicates other info.
>>> If you want to suggest an improved index, like we did with .debug_names,
>>> that would be great too.
>>>
>>
>> .debug_names is quite different though - it collects information from
>> across the DIE tree - information that is expensive to otherwise gather
>> (walking the whole DIE tree).
>>
>> .debug_aranges is not like that for most producers (producers that do
>> include the address ranges on the CU DIE) - the data is readily available
>> immediately on the CU. That does involve reading some of .debug_abbrev, and
>> interpreting a handful of attributes - but at least for the use cases I'm
>> aware of, that overhead isn't worth the size increase.
>>
>> Do you have numbers on the benefits of .debug_aranges compared to parsing
>> the ranges from CU DIEs?
>>
>> (one possible issue: the CU doesn't /have/ to contain low/high/ranges if
>> its children DIEs contain addresses - having that as a guarantee, or some
>> preferred way of encoding zero length (high/low of 0 would be acceptable, I
>> guess) would be nice & make it cheap to skip over CUs that don't have any
>> address ranges)
>>
>> Roughly, a modern debug_aranges to me would look something like:
>>
>> <length>
>> <version>
>> <CU sec_offset>
>> <addr_base>
>> <rnglist sec_offset>
>>
>> So it could fully re-use the rnglist encoding. If this was going to be as
>> compact as possible, it'd need to be configurable which encodings it uses -
>> ranges vs. high/low, addrx vs. addr - at which point it'd probably look like a
>> small DIE with an inline abbrev (similar to the way DWARFv5 encodes the
>> file and directory entries now, and how debug_names is self-describing) -
>> at which point it looks to me a lot like parsing the CU DIEs.
>>
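The five-field layout sketched above could be modelled roughly as follows.
This is only an illustration in Python: the field widths (4-byte length,
2-byte version, 4-byte offsets, little-endian), the version value, and the
names ModernArangesEntry, encode_entry and decode_entry are all assumptions,
since the proposal does not specify them.

```python
import struct
from typing import NamedTuple

class ModernArangesEntry(NamedTuple):
    """One per-CU record of the hypothetical layout (widths are assumed)."""
    version: int
    cu_offset: int       # offset of the CU in .debug_info
    addr_base: int       # base into .debug_addr for addrx-style encodings
    rnglist_offset: int  # offset of the CU's range list in .debug_rnglists

_PAYLOAD = "<HIII"  # version + three 4-byte section offsets

def encode_entry(entry):
    """Serialize an entry, prefixed by the length of what follows."""
    payload = struct.pack(_PAYLOAD, *entry)
    return struct.pack("<I", len(payload)) + payload

def decode_entry(data, offset=0):
    """Return (entry, offset of the next entry)."""
    (length,) = struct.unpack_from("<I", data, offset)
    fields = struct.unpack_from(_PAYLOAD, data, offset + 4)
    return ModernArangesEntry(*fields), offset + 4 + length

# Hypothetical values, for illustration only.
example = ModernArangesEntry(version=6, cu_offset=0x0,
                             addr_base=0x8, rnglist_offset=0xc)
encoded = encode_entry(example)
```

Under these assumed widths each CU costs a fixed 18 bytes in the index no
matter how many ranges it has, because the ranges themselves stay in the
shared rnglist encoding rather than being duplicated.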
>>
>
>
> --
> Samy Al Bahra [http://repnop.org]
>