[Dwarf-Discuss] multiple entry points for inlined subroutines?

Fri Aug 11 16:57:27 GMT 2017

I'm unfamiliar with the term "view number". Is that a GCC-specific extension?

-- adrian

> On Aug 11, 2017, at 6:25 AM, Alexandre Oliva <aoliva at redhat.com> wrote:
> 
> I've been working on generating more precise entry points for inlined
> subroutines in GCC.
> 
> One of the goals of this was to provide the debug info consumer with a
> view number, in addition to the address (more on that later), but
> working on this, I found out that in several cases a single inlining of
> a function ends up having some basic blocks replicated, including those
> holding the entry point, and we have no way to represent the multiple
> addresses.  This may often occur when a loop containing an inlined
> function is unrolled, but it doesn't actually require unrolling: other
> simpler CFG transformations that duplicate blocks in whole or in part
> are enough to get some inlined entry points (but not necessarily entire
> inlined functions) duplicated.
> 
> When an inlined function is duplicated as a whole, I guess it is
> reasonable to represent it as multiple inlinings of the same function,
> but this is not always the case: I have observed cases in which little
> more than the entry point got duplicated into two separate branches.
> However, even in the unrolling case, it is often the case that all
> instances end up sharing the same automatic variables, except when the
> intent is to have separate per-iteration variables for e.g. swing modulo
> scheduling or other forms of iteration pipelining.  Such variable
> replication comes with its own set of challenges, that I'm not focusing
> on for the time being.  For now, I'm thinking of duplication of code
> within the same subprogram without the introduction of additional copies
> of variables.
> 
> In this scenario, each inlined function comes with its own set of
> lexical blocks and local variables, even if unrolling and whatnot ends
> up creating multiple copies of each use of such variables.
> 
> Ideally, I'd like to inform debug information consumers about all
> inlined entry points, even when multiple such entry points are
> associated with the same inlined instance of the function.  But AFAICT
> DW_AT_entry_pc can only hold one address or offset.
> 
> I've considered several representations:
> 
> - multiple DW_AT_entry_pc attributes in the same DIE -> not permitted
> (section 2.2)
> 
> - multiple lexical blocks with abstract origin pointing to the abstract
> function, each with a separate entry point -> no good, the lexical block
> tree will be misrepresented or replicated
> 
> - using an exprloc form, composing an array of entry addresses with
> multiple DW_OP_piece expressions -> not a very compact representation,
> and hardly a natural extension
> 
> - abuse range lists, having DW_AT_entry_pc reference one such list, with
> two entry points per range entry -> not an unreasonable extension, but
> readers might be confused if ranges are reversed, or if we have an odd
> number of entry points
> 
> - likewise, but having a single entry point per range entry, wasting the
> other address -> no problem with odd counts, but wasteful
> 
> - allow an address list form (addrptr class?) to be used in
> DW_AT_entry_pc, with some convention to terminate the list -> this would
> work AFAICT.  Is there any reason to not propose this for consideration
> in DWARF6?
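
To make the size trade-offs among the last three options concrete, here is a rough sketch. It assumes 8-byte addresses, a DWARF 4-style range list in which every entry is a pair of addresses and the list ends with a pair of zeros, and a hypothetical address list ended by a single zero address. That terminator is only one possible convention, not something the proposal specifies, and the function names are made up for illustration.

```python
ADDR_SIZE = 8  # assumed target address size, in bytes


def packed_pairs_size(n):
    """Range list abused to carry two entry points per (begin, end) entry."""
    entries = (n + 1) // 2  # an odd count leaves one address slot unused
    return (entries + 1) * 2 * ADDR_SIZE  # +1 entry for the (0, 0) terminator


def one_per_entry_size(n):
    """Range list with one entry point per entry; the other address is wasted."""
    return (n + 1) * 2 * ADDR_SIZE  # +1 entry for the (0, 0) terminator


def address_list_size(n):
    """Hypothetical address list ended by a single zero address (assumption)."""
    return (n + 1) * ADDR_SIZE


for n in (1, 2, 3):
    print(n, packed_pairs_size(n), one_per_entry_size(n), address_list_size(n))
```

Under these assumptions the plain address list is never larger than either range-list variant, and it has no special case for odd counts.
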
> 
> Any other thoughts on the issue of representing multiple entry points
> for an inlined subroutine, or even for lexical blocks or regular
> subprograms?
> 
> 
> 
> On to adding view numbers to addresses.
> 
> I don't think it's just location list addresses and inline entry points
> that could gain in precision being augmented with view numbers.  So far,
> I have proposed an extension to record view numbers in location lists,
> and I'm now introducing a GNU extended attribute to hold the view number
> associated with a DW_AT_entry_pc in the same DIE.
> 
> 
> One issue I'm running into is that view numbers are often computed by
> the assembler, and encoded as uleb128 numbers.  GCC, however, wants to
> compute the sizes and offsets of DIEs itself.  All existing sdata- and
> udata-encoded attributes ever emitted by GCC are ones that it can
> compute as constants itself; not so when it comes to view numbers.
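
The size problem follows from how ULEB128 works: the number of bytes emitted depends on the value. A minimal sketch of the standard encoding is below; the function name is my own and this is not GCC's or the assembler's code.

```python
def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    if value < 0:
        raise ValueError("ULEB128 encodes unsigned values only")
    out = bytearray()
    while True:
        byte = value & 0x7F  # low seven bits of what remains
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)  # high bit clear: last byte
            return bytes(out)


for view in (0, 127, 128, 624485):
    print(view, len(encode_uleb128(view)))
```

A view number of 127 fits in one byte while 128 needs two, so a compiler that does not know the value cannot know the size of the attribute, nor the offsets of anything that follows it.
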
> 
> Anyway, keeping DIE sizes compiler-computed constants is an issue I've
> run into, and it becomes a larger concern once I start considering
> proposing an attribute class that holds an address and a view.
> 
> Aside from the issue of compiler-computed DIE sizes and offsets, some
> other lists that hold addresses want to be aligned, to simplify address
> relocations and whatnot.  Once we add view numbers encoded as uleb128,
> any hope for alignment is gone.  We might represent view numbers at the
> same alignment, to keep them in the same sections, but that would likely
> be wasteful.
> 
> Using a single indirect representation for both the address and the
> view number, or a separate indirect reference to each, all have their
> downsides, and even for long lists of such pairs, referencing a list of
> addresses (good) and a list of views (erhm) so that corresponding
> entries of each list are combined into tuples leaves something to be
> desired.
> 
> I'd appreciate any thoughts on how to introduce compactly-represented
> view numbers in most places where code addresses or address ranges could
> be used.  Have these difficulties been run into before?  Any advice to
> share?
> 
> 
> Thanks in advance,
> 
> -- 
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer