[Dwarf-Discuss] string reduction techniques

Greg Clayton clayborg@gmail.com
Tue Nov 2 02:13:58 GMT 2021


LLDB also uses mangled names. The clang compiler is our expression parser; it always tries to resolve symbols during compilation/JIT, and it supplies mangled names when looking up the functions it needs as it JITs code. It is nice to be able to do quick name lookups using these mangled names to find the address of a function. That being said, we could work around it. I'm not sure how easy that would be, though: different mangled names can demangle to the same name with some loss of information, and it is important to find the right in-charge or not-in-charge constructor when the compiler asks for a specific symbol by its mangled name. We have more uses of mangled names, but most of them relate to parsing the symbol tables, so removing them from DWARF wouldn't affect those areas.
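
To make the loss of information concrete, here is a minimal sketch, assuming an Itanium C++ ABI toolchain (GCC or clang on Linux/macOS) where abi::__cxa_demangle is available. The helper name "demangle" is just for illustration. The complete-object (C1) and base-object (C2) constructors of a class "foo" have distinct mangled names but demangle to the same string:

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Illustrative helper: demangle an Itanium-ABI name.
// Returns an empty string if the input cannot be demangled.
std::string demangle(const char *mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle returns a malloc'd string.
    char *raw = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string out = (status == 0 && raw != nullptr) ? raw : "";
    std::free(raw);  // free(nullptr) is a no-op
    return out;
}
```

Calling demangle("_ZN3fooC1Ev") and demangle("_ZN3fooC2Ev") both give "foo::foo()", so a lookup keyed only on the demangled name cannot tell the two constructor variants apart.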

I wonder if there is a way to have a DW_AT_partial_linkage_name that relies on the decl context of a DIE. For example, a class "foo" in the global namespace could have a DW_AT_partial_linkage_name with the value "_Z3foo". A DW_TAG_subprogram that is a child of this "foo" class could have another partial linkage name, "3bari", that could be combined with the parent's "_Z3foo" for a function like:

void foo::bar(int);

Since many mangled names start with the same prefix, it might help reduce the string table size.
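
A rough sketch of how a consumer might put such fragments back together. Everything here is hypothetical: DW_AT_partial_linkage_name is not an existing attribute, and the PartialName struct and assemble_linkage_name function are made up for illustration. One wrinkle: the Itanium mangling of foo::bar(int) is "_ZN3foo3barEi", so nested names need the "N"..."E" wrapper and the pieces cannot be a plain concatenation. The sketch therefore assumes each DIE stores only its own length-prefixed name ("3foo", "3bar"), with the parameter encoding ("i") kept separately:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the fragments a consumer would collect by walking
// from a DW_TAG_subprogram up through its decl context.
struct PartialName {
    std::vector<std::string> scopes;  // enclosing contexts, outermost first, e.g. {"3foo"}
    std::string function;             // the function's own fragment, e.g. "3bar"
    std::string params;               // encoded parameter types, e.g. "i" for (int)
};

// Rebuild an Itanium-style linkage name from the fragments.
// Covers only plain nested names: no templates, substitutions,
// cv-qualifiers, or special names such as constructors.
std::string assemble_linkage_name(const PartialName &p) {
    assert(!p.function.empty());
    std::string out = "_Z";
    if (p.scopes.empty()) {
        out += p.function;            // unscoped function, e.g. _Z3bari
    } else {
        out += "N";                   // nested-name opener
        for (const auto &scope : p.scopes) out += scope;
        out += p.function;
        out += "E";                   // nested-name terminator
    }
    out += p.params;
    return out;
}
```

With scopes {"3foo"}, function "3bar" and params "i", this yields "_ZN3foo3barEi". The "3foo" fragment would be stored once on the class DIE and shared by every member function, which is where any string table savings would come from.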


> On Nov 1, 2021, at 6:52 PM, Daniel Berlin via Dwarf-Discuss <dwarf-discuss at lists.dwarfstd.org> wrote:
> 
> Finally, a question I know the answer to!
> 
> It brings us all the way back to when I was the C++ maintainer for GDB, which is the most ancient of history.  
> Unfortunately, this is a trip to a horrible place.
> I actually spent a lot of time trying to make it so we didn't need linkage names, because, even then, they took up a *lot* of space.
> 
> On Mon, Nov 1, 2021 at 8:35 PM Cary Coutant via Dwarf-Discuss <dwarf-discuss at lists.dwarfstd.org <mailto:dwarf-discuss at lists.dwarfstd.org>> wrote:
> >> I can't be sure about this exponential growth.  I don't have the data to back it
> >> up.  But I will say, when we created DWARF64, I was skeptical that it would be
> >> needed during my career.  And yet here we are...
> >
> > Yep, still got mixed feelings about DWARF64 - partly the pieces that we're seeing with the need for some solutions for mixed DWARF32/64, etc, make it feel like maybe it's got a bit of "settling in" to do. And I'm still rather hopeful we might be able to reduce the overheads enough to avoid widespread use of DWARF64 - but it's not a sure thing by any means.
> 
> Agreed. I'd like to explore as many avenues as we can to eliminate the
> need for DWARF64.
> 
> 
> >> Honestly, I've never been sure why gcc generates DW_AT_linkage_name.  Our
> >> debugger almost never uses it.  (There is one use to detect "GNU indirect"
> >> functions.)  I wonder if it would be possible to avoid them if you provided
> >> enough info about the template parameters, if the debugger had its own name
> >> mangler.  I had to write one for our debugger a couple years ago, and it
> >> definitely was a persnickety beast.  But doable with enough information.  Mind
> >> you, I'm not sure there is enough information to do it perfectly with the state
> >> of DWARF & gcc right now.
> >
> > Yeah, that was/is certainly my first pass - the way I've done the DW_AT_name one is to have a feature in clang that produces the short name "t1" but then also embeds the template argument list in the name (like this: "_STNt1|<int>") - then llvm-dwarfdump will detect this prefix, split up the name, rebuild the original name as it would if it'd been given only the simple name ("t1") and compare it to the one from clang. Then I can run this over large programs and check everything round-trips correctly & in clang, classify any names we can't roundtrip so they get emitted in full rather than shortened.
> > We could do something similar with linkage names - good to know there's some prior art in your work there.
> >
> > I wouldn't be averse to considering what it'd take to make DWARF robust enough to always roundtrip simple and linkage names in this way - I don't think it'd take a /lot/ of extra DWARF content.
> 
> Fuzzy memory here, but as I recall, GCC didn't generate linkage names
> (or only did in some very specific cases) until the LTO folks
> convinced us they needed it in order to relate profile data back to
> the source. Perhaps if we came up with a better way of doing that, we
> could eliminate the linkage names.
> 
> No, see, that's a mildly reasonable answer.
> If you go far enough back, the linkage names exist for a few reasons:
> 1. Because the debug info wasn't always good enough, and so GDB used to demangle the linkage names and parse them using a hacked up C++-ish parser for type info.
> 2. Even when it didn't, it decoded linkage names to detect things like destructors/constructors, etc.
> 3. Because it used it to do remangling properly and try to generate method signatures to look up (and for #1).
> 4. Because it was used to do symbol lookup in the ELF/etc. symbol tables for static things/etc.
> 5. Because it saved space in STABS to do #1 (they predate DWARF by far).
> 
> If you check out the gdb source code, circa 2001, and search for things like check_stub_method, and follow all the things it calls (like gdb_mangle_name), you can learn the history of linkage names (and probably throw up in your mouth a little).
> If you do a case-insensitive search for things like "physname" and "phys_name", you'll see all the places it used to use the linkage names.
> I spent a lot of time abstracting out things like the constructor/destructor name testing, vptr name finding, etc, so that someone later might have a chance to get rid of linkage names (it was also necessary because of the gcc 2.95->3.0 ABI change).