[Dwarf-discuss] performance tools and inverted location lists

Kyle Huey khuey@pernos.co
Fri Jun 16 15:51:02 GMT 2023


On Fri, Jun 16, 2023 at 8:28 AM Ben Woodard via Dwarf-discuss <
dwarf-discuss@lists.dwarfstd.org> wrote:

> I was looking at two-level line tables
> https://dwarfstd.org/issues/140906.1.html and can see how it could
> improve things there. One thing that I noticed about it was that it was
> based on some prior art done in HP-UX.
>
> A nut that I have been trying to crack for a few years but haven't really
> figured out comes from the performance tools and binary analysis people, and
> I could even make use of it for my ABI work. They would really like something
> that I've been calling "inverted location lists". Location lists are
> basically a function (yes I know it is technically a relation but let's not
> quibble about that right now):
>
> f( PC, "variable") -> location description
>
> This totally makes sense in the context of a debugger where the context of
> the lookup includes the PC, and the user inputs the symbolic name of the
> variable. The point that I try to make is DWARF is not just for debuggers
> anymore. A much broader range of consumers is emerging.
>
> In the context of performance tools, they need:
>
> f( PC, <concrete> location) -> symbolic name
>
> The reason why I specify "concrete" location is because it is never going
> to be a literal constant or an implicit or a DWARF expression or any of the
> other things along those lines. It is going to be an address or a register,
> things that have concrete existence.
>

lldb has a version of this function implemented in the functions
StackFrame::GuessValueForAddress/GuessValueForRegisterAndOffset.
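
To make the two lookups concrete, here is a minimal sketch of the forward
function and a brute-force inverse over the same data. It is not lldb's
implementation and not a proposed DWARF encoding; the LocEntry type, the
("reg", n) / ("addr", n) location tuples, the register numbers and the
variable names are all assumptions made up for illustration. The inverse
returns a list because, as Ben notes, this is really a relation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical, simplified "concrete" location: ("reg", number) or ("addr", value).
Location = Tuple[str, int]


@dataclass(frozen=True)
class LocEntry:
    low_pc: int          # first PC covered (inclusive)
    high_pc: int         # end of the range (exclusive)
    location: Location   # where the variable lives over that range


# Hypothetical container: variable name -> its location list.
LocLists = Dict[str, List[LocEntry]]


def lookup_location(loclists: LocLists, pc: int, name: str) -> Optional[Location]:
    """Forward lookup: f(PC, "variable") -> location."""
    for entry in loclists.get(name, []):
        if entry.low_pc <= pc < entry.high_pc:
            return entry.location
    return None


def lookup_names(loclists: LocLists, pc: int, location: Location) -> List[str]:
    """Inverse lookup: f(PC, concrete location) -> symbolic names.

    Scans every entry of every variable, so each query costs time
    proportional to the total size of all location lists.
    """
    return sorted(
        name
        for name, entries in loclists.items()
        if any(e.low_pc <= pc < e.high_pc and e.location == location for e in entries)
    )


# Made-up example data; the register numbers carry no ABI meaning.
loclists: LocLists = {
    "i": [LocEntry(0x1000, 0x1010, ("reg", 3)), LocEntry(0x1010, 0x1020, ("reg", 12))],
    "n": [LocEntry(0x1010, 0x1020, ("reg", 3))],
    "sum": [LocEntry(0x1008, 0x1020, ("reg", 0))],
}
```

With this data, asking what is in ("reg", 3) gives "i" at PC 0x1004 and "n"
at PC 0x1014, since the register is reused. The linear scan per query is the
cost a precomputed inverted table would be meant to avoid.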

- Kyle

> Here are the use cases for this that have come up:
>
>    - For performance tools, they have some piece of code which is not
>    performing as it should. The tools show them that they are getting a large
>    number of cache misses between this PC and another PC. They can disassemble
>    the machine code and they see a handful of instructions. To be able to
>    figure out what is wrong they need to associate the operands of these
>    instructions to what they refer to at the source level. Disassembling
>    optimized machine code and then associating the data accesses
>    within it back to constructs within the source language really puts this
>    level of analysis in the domain of experts and must be done by hand.
>    Because of the limited information currently provided by DWARF it is not
>    something that can be coded into a tool.
>    - Within ABI analysis there are a few cases where compilation options
>    can change the calling convention of a function enough that it
>    can create an ABI mismatch. The simple example that illustrates the point
>    is SSE. You can pass vector operands by value in vector registers. However,
>    if you compile a program without SSE then they are passed by reference. So in
>    the ABI testing program, if I take the locations of the formal parameters
>    at the call site and then use an inverted location list to look up what
>    variables exist at those locations in the called function, if they don't
>    match then I have an ABI problem.
>    - One binary analysis use case is quite similar to the ABI use case.
>    They see a call instruction, and they can look up the target of the call to
>    find the function prototype. From that they can figure out where the
>    parameters to the function call are at the time of the call. However, they
>    want to know, in the calling scope, the symbolic name in the source language
>    for the parameter in a particular location at the time of the call. Simple
>    example: find all the examples of "dlopen" calls - what library name is
>    being opened, is it something that we can determine from static analysis or
>    is it something read from a config file...
>
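
The ABI use case above can be sketched as a comparison between where the
caller places each parameter and what an inverted lookup at the callee's
entry point reports for that location. This is only an illustration under
stated assumptions: the abi_mismatches function, the ("reg", n) and
("mem_via_reg", n) location tuples and the register numbers are hypothetical,
and the sketch assumes each location holds at most one parameter at entry.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical location tuples, e.g. ("reg", 17) for a value held in a
# register, or ("mem_via_reg", 5) for a value reached through a pointer in
# a register. The numbers are made up and carry no ABI meaning.
Location = Tuple[str, int]


def abi_mismatches(
    caller_arg_locs: Dict[str, Location],
    callee_entry_locs: Dict[str, Location],
) -> List[Tuple[str, Location, Optional[str]]]:
    """Return (parameter, caller location, name the callee has there).

    caller_arg_locs:   parameter name -> where the caller puts it at the call.
    callee_entry_locs: parameter name -> where the callee expects it at entry.
    """
    # Inverted view of the callee at its entry PC: location -> parameter name.
    inverted = {loc: name for name, loc in callee_entry_locs.items()}
    return [
        (name, loc, inverted.get(loc))
        for name, loc in caller_arg_locs.items()
        if inverted.get(loc) != name
    ]


# Caller built with vector support passes "v" by value in a vector register;
# callee built without it expects "v" by reference (all values hypothetical).
caller = {"n": ("reg", 4), "v": ("reg", 17)}
callee = {"n": ("reg", 4), "v": ("mem_via_reg", 5)}
problems = abi_mismatches(caller, callee)
```

Here problems holds a single entry for "v": the callee has no parameter at the
location the caller used, which is the mismatch the check is meant to flag.
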
> What all three of us have tried doing with limited success is taking the
> location list data and then inverting the mappings. The problem is that
> this is far from complete and generating that data is very computationally
> expensive.
>
> I've looked for prior art to see if someone had come up with a solution to
> this but I haven't found any. The allusion to the HP-UX two level line maps
> is something that I had never come across before. Since the combined
> experience of this group is much deeper than what I have, does anybody know
> of any prior art that I can use as a basis for a DWARF feature enhancement
> issue?
>
> -ben