[Dwarf-discuss] performance tools and inverted location lists

Fri Jun 16 15:28:45 GMT 2023

I was looking at 2 level location tables 
https://dwarfstd.org/issues/140906.1.html and can see how it could 
improve things there. One thing that I noticed about it was that it was 
based on some prior art done in HP-UX.

A nut that I have been trying to crack for a few years, but haven't 
really figured out, comes from the performance tools and binary analysis 
people, and I could even make use of it for my ABI work. They would 
really like something that I've been calling "inverted location lists". 
Location lists are basically a function (yes I know it is technically a 
relation but let's not quibble about that right now):

    f( PC, "variable") -> location description

This totally makes sense in the context of a debugger where the context 
of the lookup includes the PC, and the user inputs the symbolic name of 
the variable. The point that I try to make is DWARF is not just for 
debuggers anymore. A much broader range of consumers is emerging.

In the context of performance tools, they need:

    f( PC, <concrete> location) -> symbolic name

The reason why I specify "concrete" location is because it is never 
going to be a literal constant or an implicit or a DWARF expression or 
any of the other things along those lines. It is going to be an address 
or a register, things that have concrete existence.
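
To make the two signatures concrete, here is a minimal sketch of both
lookups over a toy table. Everything in it is an assumption made for
illustration: the Entry record, the variable names, the PC ranges and
the "reg:"/"fbreg:" location strings are invented and are not DWARF
encodings. Both functions return lists, since the mapping is really a
relation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entry:
    """One toy location-list entry: `name` lives at `loc` for lo <= PC < hi."""
    name: str
    lo: int
    hi: int   # exclusive upper bound
    loc: str  # concrete locations only, e.g. "reg:rdi" or "fbreg:-8"

# Hypothetical entries; names, ranges and location strings are made up.
TABLE = [
    Entry("count", 0x1000, 0x1010, "reg:rdi"),
    Entry("count", 0x1010, 0x1040, "fbreg:-8"),
    Entry("buf",   0x1008, 0x1040, "reg:rsi"),
    Entry("i",     0x1010, 0x1030, "reg:rdi"),
]

def forward(pc, name):
    """Debugger direction: f(PC, "variable") -> location descriptions."""
    return [e.loc for e in TABLE if e.name == name and e.lo <= pc < e.hi]

def inverse(pc, loc):
    """Tool direction: f(PC, <concrete> location) -> symbolic names."""
    return [e.name for e in TABLE if e.loc == loc and e.lo <= pc < e.hi]
```

Note that the same register answers to "count" at one PC and to "i" at
another, which is why the PC has to be part of the inverse lookup too.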

Here are the use cases for this that have come up:

  * For performance tools, they have some piece of code which is not
    performing as it should. The tools show them that they are getting a
    large number of cache misses between this PC and another PC. They
    can disassemble the machine code and they see a handful of
    instructions. To be able to figure out what is wrong they need to
    associate the operands of these instructions to what they refer to
    at the source level. Disassembling optimized machine code and then
    associating the data accesses within it back to constructs in the
    source language puts this level of analysis in the domain of
    experts, and it must be done by hand. Because of the limited
    information currently provided by DWARF, it is not something that
    can be coded into a tool.
  * Within ABI analysis there are a few cases where compilation options
    can change the calling convention of a function enough to create an
    ABI mismatch. The simple example that illustrates the point is SSE.
    You can pass vector operands by value in vector registers. However,
    if you compile a program without SSE then they are passed by
    reference. So in the ABI testing program, I take the locations of
    the formal parameters at the call site and then use an inverted
    location list to look up what variables exist at those locations in
    the called function. If they don't match then I have an ABI problem.
  * One binary analysis use case is quite similar to the ABI use case.
    They see a call instruction and can look up the target of the call
    to find the function prototype. From that they can figure out where
    the parameters to the function call are at the time of the call.
    However, they want to know, in the calling scope, what symbolic name
    in the source language corresponds to the parameter in a particular
    location at the time of the call. Simple example: find all the
    examples of
    "dlopen" calls - what library name is being opened, is it something
    that we can determine from static analysis or is it something read
    from a config file...
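
The ABI use case above can be sketched as a comparison between two
small maps. This is only an illustration under stated assumptions: the
function name abi_mismatches, the parameter names and the register
placements are hypothetical and are not taken from any real ABI
document, and the by-reference case is simplified to "the parameter is
found at a different location".

```python
def abi_mismatches(call_site_locs, callee_entry_locs):
    """Compare caller-side argument locations with the callee's view.

    call_site_locs:    {parameter name: concrete location used by the caller}
    callee_entry_locs: {concrete location: parameter name at callee entry},
                       i.e. the result of an inverted lookup at the entry PC
    Returns the sorted names of parameters that do not line up.
    """
    bad = []
    for name, loc in call_site_locs.items():
        if callee_entry_locs.get(loc) != name:
            bad.append(name)
    return sorted(bad)

# Hypothetical placements, for illustration only.
caller = {"v": "reg:xmm0", "n": "reg:rdi"}              # caller built with SSE
callee_sse = {"reg:xmm0": "v", "reg:rdi": "n"}          # callee built with SSE
callee_no_sse = {"reg:rdi": "v", "reg:rsi": "n"}        # callee built without SSE

print(abi_mismatches(caller, callee_sse))      # []
print(abi_mismatches(caller, callee_no_sse))   # ['n', 'v']
```

The callee-side map is exactly the piece that is hard to get today: it
is what an inverted location list would provide at the function entry
PC.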

What all three of us have tried doing, with limited success, is taking the 
location list data and then inverting the mappings. The problem is that 
the result is far from complete, and generating that data is very 
computationally expensive.
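
For readers who have not tried it, the inversion can be sketched
roughly as follows. This is not the code any of us actually used. It
assumes the location list entries have already been decoded into
(name, lo, hi, loc) tuples with an exclusive hi, and that entries whose
location is a constant, an implicit value or a general DWARF expression
were dropped beforehand, which is one reason the result is incomplete.
The function names are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

def build_inverted_index(entries):
    """Group entries by concrete location.

    entries: iterable of (name, lo, hi, loc) tuples, hi exclusive.
    Returns {loc: list of (lo, hi, name) sorted by lo}.
    Building this touches every entry of every variable once, then sorts.
    """
    index = defaultdict(list)
    for name, lo, hi, loc in entries:
        index[loc].append((lo, hi, name))
    for ranges in index.values():
        ranges.sort()
    return dict(index)

def names_at(index, pc, loc):
    """Return the names recorded at concrete location `loc` for this PC."""
    ranges = index.get(loc, [])
    # Only ranges that start at or before pc can contain it.
    end = bisect_right(ranges, (pc, float("inf"), ""))
    return [name for lo, hi, name in ranges[:end] if pc < hi]

# Hypothetical decoded entries.
entries = [
    ("count", 0x1000, 0x1010, "reg:rdi"),
    ("i",     0x1010, 0x1030, "reg:rdi"),
    ("buf",   0x1008, 0x1040, "reg:rsi"),
]
index = build_inverted_index(entries)
```

With this index, names_at(index, 0x1010, "reg:rdi") yields ["i"]. The
lookup itself is cheap; the cost is in decoding and walking every
location list in the program to build the index in the first place.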

I've looked for prior art to see if someone had come up with a solution 
to this but I haven't found any. The allusion to the HP-UX two level 
line maps is something that I had never come across before. Since the 
combined experience of this group is much deeper than what I have, does 
anybody know of any prior art that I can use as a basis for a DWARF 
feature enhancement issue?

-ben
