[Dwarf-Discuss] Segment selectors for Harvard architectures
Todd Allen
todd.allen@concurrent-rt.com
Mon Mar 23 21:42:00 GMT 2020
Paul,
I haven't needed to contend with this issue. But as I was looking over the
standard, this was my initial gut reaction too: use the segment selectors. To
me, this use really does look like a characteristic of the target
architecture; after all, you started the discussion with "Harvard architectures".
DWARF does permit architectures to specify aspects of their DWARF description,
after all. I can't recall it ever being done *formally*, but it's been done
informally for every architecture that uses DWARF. At a bare minimum, register
encodings. And usually you have to root around in somebody else's source code
to find it.
This one has a slightly higher chance of breaking a consumer, if that consumer
was written not to tolerate the segment selectors. But I think it would be fair
to put any such blame on the consumer in that case. If the consumer doesn't die
with a SIGSEGV, then it might ignore the segments. And then it would be no
worse off than now.
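
To make the "tolerant consumer" idea concrete, here is a rough sketch in Python
of decoding the tuples of one .debug_aranges set. It is an illustration under
assumptions, not code from any real consumer: the function name, the
HARVARD_SPACES table, and the little-endian default are all made up for this
example, and the 0=code, 1=data mapping is only the scheme proposed in this
thread, not anything in the DWARF standard. It also assumes the caller has
already parsed the set header (to get address_size and segment_selector_size)
and skipped any padding before the first tuple.

```python
# Hypothetical mapping taken from the proposal in this thread; not part of DWARF.
HARVARD_SPACES = {0: "code", 1: "data"}


def decode_arange_tuples(data, address_size, segment_selector_size,
                         byteorder="little"):
    """Decode (space, address, length) tuples up to the all-zero terminator.

    `data` is assumed to start at the first tuple of a single set.
    """
    tuple_size = segment_selector_size + 2 * address_size
    ranges = []
    for off in range(0, len(data) - tuple_size + 1, tuple_size):
        pos = off
        # With a selector size of 0 this slice is empty and decodes to 0.
        segment = int.from_bytes(data[pos:pos + segment_selector_size], byteorder)
        pos += segment_selector_size
        address = int.from_bytes(data[pos:pos + address_size], byteorder)
        pos += address_size
        length = int.from_bytes(data[pos:pos + address_size], byteorder)
        if segment == 0 and address == 0 and length == 0:
            break  # terminator entry
        # No selector present: the consumer cannot tell code from data,
        # which is the situation today.
        space = HARVARD_SPACES.get(segment) if segment_selector_size else None
        ranges.append((space, address, length))
    return ranges
```

A consumer written this way uses the selector when the producer emits one and
otherwise reports the address space as unknown, so it is no worse off than now.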
On Thu, Mar 19, 2020 at 06:05:16PM +0000, Dwarf Discussion wrote:
> This recently came up in the LLVM project. Harvard architectures
> put code and data into separate address spaces, but those spaces
> are not explicit; instructions that load/store memory implicitly
> use the data space, while things like taking a function address or
> doing indirect branches will implicitly use the code space. This
> doubles the effective size of memory without consuming an address
> bit, as well as having other secondary benefits like not allowing
> self-modifying code.
>
> Nearly all of the DWARF information does not need to distinguish
> between code and data address spaces, because it's easy to derive that
> from context. Addresses in the line table or a range list will be
> code addresses; in .debug_info, addresses of code elements will be
> code addresses, while variables will be data addresses. And so on.
>
> This only seems to break down in the .debug_aranges section, which
> records both data and code addresses without any context to let a
> consumer know which is what. In a flat-address architecture, no
> distinction is needed; in a segmented architecture, there will be
> a segment selector as part of any address, and that includes the
> .debug_aranges section. What about for Harvard architectures?
>
> What I suggested in the LLVM project is that .debug_aranges would
> have a 1-byte segment selector and use some trivial scheme such as
> 0=code, 1=data to distinguish what kind of address it is. Other
> DWARF sections wouldn't need a selector because they can all use
> context to figure it out; this avoids the size overhead of using
> segment selectors everywhere else.
>
> Pavel Labath pointed out that this seems inconsistent and might
> make consumers unhappy; segment selectors are described as a
> characteristic of the target architecture, so having them in one
> place and not others might look suspicious. IMO it's a reasonable
> "permissive" use of the existing DWARF structures, but it seemed
> worth asking here.
>
> Does this (segment selector only in .debug_aranges) sound okay?
> Should there be non-normative text or a wiki description of this?
> Do we want to codify the 0=code 1=data use of segment selectors
> for all Harvard architectures (that don't otherwise have explicit
> segments) so that this doesn't have to be set by ABI committees?
>
> I'm willing to write up whatever needs writing up, either as a
> proposal or as a wiki entry.
>
> Thanks,
> --paulr
--
Todd Allen
Concurrent Real-Time