[Dwarf-Discuss] Including the DWO name in the CU hash

David Blaikie dblaikie@gmail.com
Fri May 19 22:51:18 GMT 2017


Some context:

1) A little while ago, I added the dwo_name to the DWO CU to improve
diagnostic quality on CU hash collisions during a dwp action. Previously
dwp could only report which input files (possibly other DWP files) contained
the colliding CUs, which could take a lot of manual work to trace back to
the original input DWO files. Having the original DWO names in the
diagnostic makes that relatively easy to track down.
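
To make point 1 concrete, here is a rough Python sketch of the kind of
check involved. It is not the actual dwp code: DwoCU, describe,
find_collisions and the message wording are made up for illustration. It
only shows why having the dwo_name helps when the CUs arrive via
intermediate DWP files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DwoCU:
    dwo_id: int      # 64-bit CU hash (the DWO ID)
    dwo_name: str    # original .dwo name recorded in the CU, "" if absent
    container: str   # file the CU is being read from (.dwo or .dwp)


def describe(cu):
    # Prefer the original DWO name; fall back to the container alone.
    if cu.dwo_name:
        return f"'{cu.dwo_name}' (from '{cu.container}')"
    return f"'{cu.container}'"


def find_collisions(cus):
    """Return one message per CU whose dwo_id was already seen."""
    first_seen = {}
    messages = []
    for cu in cus:
        if cu.dwo_id in first_seen:
            prev = first_seen[cu.dwo_id]
            messages.append(
                f"duplicate DWO ID {cu.dwo_id:#018x} in {describe(prev)} "
                f"and {describe(cu)}"
            )
        else:
            first_seen[cu.dwo_id] = cu
    return messages
```

Without dwo_name, the message can only name the two containers (say, two
partial DWP files), and finding the original DWO files is left to the user.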

2) LLVM's new ThinLTO presents a high chance of duplicate DWO CUs, because
it creates effectively "new" CUs containing a stripped down version of an
existing CU, with only a handful of functions that may be relevant to
optimizing some other CU. (Imagine two CUs both using a single inline
function from a 3rd CU: the 3rd CU's inline function and the basic CU
itself are imported into the compilation steps of the other two CUs, so in
the end you get two DWO files, each with two CUs, where one CU contains
only an abstract definition of the inline function.)

My initial thinking here was that I could cross-pollinate the CU hash from
each CU within a single compilation, since the primary CU would have enough
uniqueness (hash all the CUs, then cross-hash them).
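
One possible reading of "hash all the CUs, then cross-hash them", as a
Python sketch. This is only an illustration under assumptions: hash64 is a
stand-in (truncated MD5), not the hash the compiler actually uses, and
cross_hash is a hypothetical name. Each CU's final ID is derived from its
own hash plus the hashes of every CU in the same compilation.

```python
import hashlib


def hash64(data):
    # Stand-in 64-bit hash (first 8 bytes of MD5); an assumption for
    # illustration, not the real CU hashing scheme.
    return int.from_bytes(hashlib.md5(data).digest()[:8], "little")


def cross_hash(cu_contents):
    """Return one ID per CU, each depending on all CUs in the compilation."""
    # Step 1: hash every CU on its own.
    own = [hash64(c) for c in cu_contents]
    # Step 2: concatenate all per-CU hashes, in emission order.
    combined = b"".join(h.to_bytes(8, "little") for h in own)
    # Step 3: mix each CU's own hash with the combined value.
    return [hash64(h.to_bytes(8, "little") + combined) for h in own]
```

With this, the same imported CU gets a different ID in each compilation,
because the primary CU it sits next to is different.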

But then I realized the CUs should already be unique because they include
the dwo_name which will be different between the two stripped down CU
clones. But the dwo_name isn't included in the hash - so I prototyped
including it & it does what you'd expect.
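
A minimal Python sketch of the idea, not the prototype itself: cu_hash is a
hypothetical helper, the truncated MD5 is a stand-in for the real hash, and
the byte strings stand in for the CU contents. It just shows that two
otherwise identical clones stop colliding once the dwo_name is mixed in.

```python
import hashlib


def cu_hash(cu_content, dwo_name=None):
    """64-bit stand-in hash of a CU, optionally mixing in its dwo_name."""
    h = hashlib.md5(cu_content)
    if dwo_name is not None:
        # Separator byte keeps the content and the name from running together.
        h.update(b"\0" + dwo_name.encode("utf-8"))
    return int.from_bytes(h.digest()[:8], "little")


# The same stripped down CU imported into two different compilations.
clone = b"abstract definition of inline function f"

same_without_name = cu_hash(clone) == cu_hash(clone)
same_with_name = cu_hash(clone, "a.dwo") == cu_hash(clone, "b.dwo")
```

Here same_without_name is true (the clones collide on content alone), while
same_with_name is false for any reasonable hash.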

Extra wrinkle: Once the dwo_name is in the hash, then it defeats my
original motivation for including it in the DWO CU in the first place: such
CUs will never collide, so the name would never be useful for diagnostic
quality.

Should I drop the dwo_name from the DWO CU and manually/explicitly include
it in the hash? Does cross pollination sound better? Should I only do
either of these when dealing with more than one CU in a DWO? (In that case
the diagnostic improvement would still be valid. It catches some
interesting cases, but nothing /very/ interesting like major bugs; DWO ID
collisions have some false positives too, which hashing the dwo_id would
fix; and the mechanism wasn't built for bug catching in any case.)