[Dwarf-Discuss] Segment selectors for the range list table.

David Blaikie dblaikie@gmail.com
Thu Jul 16 19:20:46 GMT 2020


Anyone know of any existing DWARF producers that use segmented addressing?
Might be interesting to see how they're using the features.

On Thu, Jul 16, 2020 at 9:30 AM Michael Eager <eager at eagercon.com> wrote:

> On 7/15/20 9:49 PM, David Blaikie wrote:
> >
> >
> > On Wed, Jul 15, 2020 at 7:07 PM Michael Eager via Dwarf-Discuss
> > <dwarf-discuss at lists.dwarfstd.org> wrote:
> >
> >     Segmented addresses have been in the DWARF specification since
> >     Version 2 and AFAIK have not been changed since that time.  DWARF V5
> >     did not add any functionality to segmented addresses that was not
> >     present in DWARF V2/3.  At least, there was no intention to do so.
> >     Segmented addresses are described in Section 2.12.
> >
> >     A segmented address maps into a linear address in a
> >     processor-specific fashion.
> >
> >
> > That seems at odds with the non-normative text of 2.12 "In some systems,
> > addresses are specified as offsets within a given segment rather than as
> > locations within a single flat address space."
>
> That means that an x86 address could be represented as offset 0x1234 in
> segment 0x4444, which would translate to 0x44440+0x1234=0x45674.  Note
> that x86 permits aliases, so that offset 0x0124 in segment 0x4555 is the
> same address.
>
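A minimal sketch of the translation described above, assuming 8086 real-mode rules (linear = segment * 16 + offset, wrapped to 20 bits). The function name real_mode_linear is hypothetical and only illustrates the arithmetic; it is not something DWARF defines.

```python
def real_mode_linear(segment: int, offset: int) -> int:
    """Translate a real-mode segment:offset pair to a linear address.

    Assumption: 8086-style addressing, where the 16-bit segment value is
    shifted left by 4 bits (multiplied by 16), added to the offset, and
    the sum is truncated to 20 bits.
    """
    return ((segment << 4) + offset) & 0xFFFFF


# The two pairs from the message alias the same linear address.
a = real_mode_linear(0x4444, 0x1234)
b = real_mode_linear(0x4555, 0x0124)
print(hex(a), hex(b), a == b)  # 0x45674 0x45674 True
```

Because many segment:offset pairs map to one linear address, a consumer that compares addresses would presumably need to normalize them this way first.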
> DWARF sometimes uses wording which is intended to generalize a concept.
> Conceivably, another architecture could use the same DWARF attribute in
> a similar way.  That's why the non-normative text says "some systems"
> rather than specifically referencing x86.  But we do have that as the
> only example listed in the table.
>
> > And also would be confusing to me - if there is a contiguous linear
> > address space, why would DWARF need to specify the use of a segment
> > selector, and why do some references to addresses allow the inclusion of
> > a segment selector and some don't? Why not just always use the
> > non-segmented address description for DWARF?
>
> So that a compiler can generate segment address values and offsets
> independently.  An x86 code generator may not know what segment it is
> generating code in.
>

But parts of the format (such as debug_ranges - sort of, and debug_line)
assume the DWARF producer does know the absolute address. It seems
inconsistent to me.

Perhaps it's more like Paul was postulating - that the spec assumes code is
in a code segment/doesn't need to be clarified. (but that gets a bit
confused in debug_aranges - if it only is meant to contain code (not data),
why does it need a segment selector - and also in the DIEs - if code is
always in a known/assumable segment then why can you vary segment for
low_pc/high_pc/ranges?)


>
> AFAIK, all addresses can be segmented addresses, except in the line
> table where it isn't needed.
>
> Perhaps we should have (long ago) required flat/linear addresses for x86
> instead of segmented addresses.
>

What's the line table's segment_selector_size (in the DWARFv5 header) for?
(this sort of agrees and disagrees with you - it's there, but it's not used
in any part of the debug_line format that I can see)

What I'm confused by is how you can use segmented addressing on a
per-DIE-subtree basis (eg: one subprogram can specify one segment, and
another subprogram can specify a different segment - the spec seems to be
pretty clear that this is intentionally supported (says that it applies to
high_pc/low_pc/ranges, and that it delegates to the nearest parent DIE's
DW_AT_segment)) but it doesn't seem like you can have a range list (in v4
or v5 (except via the addrx encodings)) that can have different segments
for different subranges within a single range list. So how would you
describe the ranges of a CU that had subprograms in distinct segments? And
if you can't describe that - then why does the spec go out of its way to
explicitly allow it with the wording about segment overrides, etc.
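To make the per-subtree rule concrete, here is a minimal sketch of the "nearest parent DW_AT_segment" lookup as I read 2.12. The Die class and effective_segment function are hypothetical, and DW_AT_segment is modeled as a plain integer for brevity (in the spec its value is a location description).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Die:
    """Hypothetical, heavily simplified stand-in for a DWARF DIE."""
    tag: str
    attrs: dict = field(default_factory=dict)
    parent: Optional["Die"] = None


def effective_segment(die: Die) -> Optional[int]:
    """Walk from the DIE up through its parents and return the first
    DW_AT_segment found, or None for a flat address space."""
    node = die
    while node is not None:
        if "DW_AT_segment" in node.attrs:
            return node.attrs["DW_AT_segment"]
        node = node.parent
    return None


cu = Die("DW_TAG_compile_unit")
f1 = Die("DW_TAG_subprogram", {"DW_AT_segment": 1, "DW_AT_low_pc": 0x100}, cu)
f2 = Die("DW_TAG_subprogram", {"DW_AT_segment": 2, "DW_AT_low_pc": 0x100}, cu)
block = Die("DW_TAG_lexical_block", {"DW_AT_low_pc": 0x110}, f2)

print(effective_segment(f1), effective_segment(block), effective_segment(cu))
# 1 2 None
```

In this model the CU itself resolves to no segment, which is the gap in question: a CU-level range list covering f1 and f2 has nowhere to record that its entries belong to different segments.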


>
> > & I don't find any mention of this idea that some addresses are absolute
> > and some are segment-relative in 2.12 - it does say that "If none of the
> > entries in the chain of parents for this entry back to its containing
> > compilation unit entry have DW_AT_segment attributes, then the entry is
> > assumed to exist within a flat address space." - as though a flat (I
> > assume this is synonymous with "linear"?) address space is distinct from
> > the segmented address space being discussed otherwise?
>
> Flat address space == linear address space.
>
> From a certain perspective, x86 memory space is broken up into 64K
> 16-byte segments mapped onto a 1M linear address range.

> >     AFAIK, only the Intel 8086 and descendants have this
> >     functionality.  (It's a many to one mapping in the 8086
> >     implementation, but that's a problem for a bygone era.)  There's a
> >     reference to i386 memory models in Table 2.7.
> >
> >     DWARF assumes a linear address space.  A segmented address maps to a
> >     specific address in this linear address space.  The entries in
> >     DW_AT_ranges for subprograms with different segment addresses would
> >     usually be referenced by their address in the linear address space.
> >     If DW_AT_ranges has a DW_AT_segment, this is an indication that the
> >     debugger is to perform the processor-specific computation to
> >     translate the segment-address pair to the linear address.
> >
> >     There is no need to do anything with segments in the line table,
> >     since the line table contains addresses in the linear address space.
> >
> >     There is some (perhaps considerable) confusion in terminology in the
> >     x86 world, because the x86 has multiple segment registers which on
> >     other processors would be called base registers.  The values in these
> >     registers reference memory segments and are added to whatever offset
> >     is contained in the program to generate an address.  These segment
> >     registers, and the memory segments which they point to, are NOT the
> >     segments represented by DW_AT_segment.
> >
> >     Re "reading the segment selector" and "addrx encoding":  The
> >     addresses in DWARF DIEs are static, not dynamic.  There is no
> >     register+offset encoding, and processor registers are not read to
> >     determine where a subprogram is in memory.
> >
> >
> > Sorry, I don't quite follow the connections between all those statements.
>
> Perhaps I didn't understand your comments about "reading the segment
> selector" and "addrx encoding".
>
> TL;DR:
> DW_AT_segment was designed to describe x86 memory model addresses:
> https://en.wikipedia.org/wiki/Intel_Memory_Model.
>
> Possibly other architectures can use it, but I'm not familiar with any
> that do.
>
>
> > 2.17 says that if a DIE has a DW_AT_high_pc and DW_AT_segment, then the
> > high_pc is relative to the specified segment. That's a bit redundant if
> > high_pc uses FORM_addrx, because the address in the address pool can
> > specify its own segment, but a producer could choose which way to go
> > there. (presumably if the AT_segment is there, you should interpret the
> > addrx high_pc relative to that segment - assuming debug_addr has no
> > segment selector in it - or perhaps it should go the other way and
> > ignore the local AT_segment and only rely on whatever segment is in
> > debug_addr)
>
> DW_FORM_addrx (and the .debug_addr section) were introduced in DWARF V5
> to allow compression of DW_FORM_addr addresses.  DW_AT_segment is
> intended to describe an (x86) address in the form that the processor
> uses.  The first is one of many different compression schemes in DWARF,
> the second is part of an architectural description.
>

debug_addr supports segment selectors - in the debug_addr header it has a
field for "segment selector size" and the entries in the address list are
"segment/address pairs.".

So now there's two ways a segment selector for an address could be
specified - if you had a DW_TAG_subprogram with a DW_AT_low_pc using addrx
into a debug_addr with a non-zero segment selector and the subprogram also
had a DW_AT_segment, wonder which one's meant to win.

Though mostly my point was: since debug_addr entries can have segment
selectors, debug_rnglists can have different segments for different
subranges within a single range list. But without that (either using
direct addresses, or in v4 debug_ranges) you couldn't vary segment across a
single range list. And though the debug_rnglists header does have a segment
selector size in it, it doesn't seem to use it anywhere in its format
(similarly, debug_loclists and debug_line v5 have a segment selector size,
but don't seem to use it?).
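For illustration, a minimal sketch of reading a debug_addr table whose header carries a non-zero segment selector size. It assumes 32-bit DWARF, little-endian data, a header of unit_length, version, address_size and segment_selector_size, and that each entry stores the selector before the address. The function parse_debug_addr and the sample bytes are hypothetical, and what a selector value means is left to the target.

```python
import struct


def parse_debug_addr(data: bytes):
    """Return a list of (segment, address) pairs from one debug_addr table.

    Assumptions: 32-bit DWARF (4-byte unit_length), little-endian, and
    selector-then-address ordering within each entry. segment is None
    when segment_selector_size is 0.
    """
    unit_length, version, address_size, seg_size = struct.unpack_from(
        "<IHBB", data, 0
    )
    end = 4 + unit_length  # unit_length does not count its own 4 bytes
    pos = 8                # first entry follows the 8-byte header
    entries = []
    while pos < end:
        segment = None
        if seg_size:
            segment = int.from_bytes(data[pos:pos + seg_size], "little")
            pos += seg_size
        address = int.from_bytes(data[pos:pos + address_size], "little")
        pos += address_size
        entries.append((segment, address))
    return entries


# Hypothetical table: version 5, 4-byte addresses, 2-byte selectors,
# with the same offset 0x1000 in two different segments.
table = (
    struct.pack("<IHBB", 16, 5, 4, 2)
    + struct.pack("<HI", 1, 0x1000)
    + struct.pack("<HI", 2, 0x1000)
)
print(parse_debug_addr(table))  # [(1, 4096), (2, 4096)]
```

A range list using the RLE_*x encodings could then refer to entries 0 and 1 and so mix segments in one list, which the direct-address encodings have no way to express.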


>
> >     On 7/15/20 4:31 PM, David Blaikie via Dwarf-Discuss wrote:
> >      > Looking at how segment selectors work:
> >      >
> >      > DW_AT_segment: Applies to a DIE subtree, including any ranges,
> >     high/low
> >      > pc, locations, labels, etc
> >      > debug_range/loc (v4 and below): Doesn't seem to allow specifying
> >     segment
> >      > variation - inherits from the segment given on the nearest parent
> >     DIE
> >      > that refers to the entry
> >      > debug_rnglist/loclist (v5): includes segment selector size in the
> >      > header, but doesn't seem to use it - segment selection via
> >     addresses in
> >      > the address pool (RLE/LLE_*x encodings) would allow fine-grained
> >     segment
> >      > selection, but direct address forms don't seem to allow segment
> >      > selection ("This operand is the same size as used in
> >      > DW_FORM_addr.")
> >      > debug_addr: segment_size in header, then list of {segment
> >     selector, address}
> >      > debug_aranges: segment_size in header, then the list contains
> >      > triples of {segment selector, start address, length}
> >      > debug_line: v5 encodes the address and segment selector size in
> the
> >      > header, but I'm not sure if/how it's used. The DW_LNE_set_address
> >      > operation says:
> >      > "The DW_LNE_set_address opcode takes a single relocatable address
> >     as an
> >      > operand. The size of the operand is the size of an address on the
> >     target
> >      > machine. It sets the address register to the value given by the
> >      > relocatable address and sets the op_index register to 0." -
> doesn't
> >      > sound like it's reading the segment selector there.
> >      >
> >      > So... I don't think DWARFv5 made anything worse - if anything it
> did
> >      > enable /a/ way to use fine grained segment selectors in range
> >     lists and
> >      > location lists that doesn't appear, to me, to have been provided
> >     before.
> >      > (it could be needed if you had some functions in some segment and
> >     some
> >      > functions in another segment (which could be represented at the
> >      > subprogram DIE level - DW_AT_segment 1 on one DW_TAG_subprogram,
> >      > DW_AT_segment 2 on another DW_TAG_subprogram - but how would you
> >      > represent the DW_AT_ranges for this CU (in DWARFv4, or in DWARFv5
> >      > without using addrx encodings)? I don't know how, because I don't
> >      > think debug_ranges could describe one range list entry as being
> >      > from one segment, and another range list entry as being in another
> >      > segment - they would all be in whatever segment was in
> >      > DW_AT_segment on the CU)
> >      >
> >      > does that make sense? Have I missed something about how you could
> >     use
> >      > segment selectors in a debug_loc, debug_ranges, or
> >     loclist/rnglist that
> >      > isn't using an addrx encoding?
> >      >
> >      > On Wed, Jul 15, 2020 at 6:37 AM Robinson, Paul via Dwarf-Discuss
> >      > <dwarf-discuss at lists.dwarfstd.org> wrote:
> >      >
> >      >
> >      >
> >      >      > -----Original Message-----
> >      >      > From: Dwarf-Discuss
> >      >      > <dwarf-discuss-bounces at lists.dwarfstd.org> On Behalf
> >      >      > Of Xing GUO via Dwarf-Discuss
> >      >      > Sent: Tuesday, July 14, 2020 10:39 PM
> >      >      > To: dwarf-discuss at lists.dwarfstd.org
> >      >      > Subject: [Dwarf-Discuss] Segment selectors for the range
> >      >      > list table.
> >      >      >
> >      >      > Hi there,
> >      >      >
> >      >      > The DWARFv5 spec mentioned that there might be segment
> >     selectors in
> >      >      > the range list entries and when the segment_selector_size
> >     is 0, the
> >      >      > segment selectors are omitted from the range list entries.
> >      >     However, it
> >      >      > didn't mention how the segment selector should be encoded
> >     when the
> >      >      > segment_selector_size isn't 0. Can anyone help me figure
> >     it out?
> >      >      > Thanks a lot!
> >      >
> >      >     Hi Xing,
> >      >
> >      >     The segment selectors in the range list would be encoded the
> >     same way
> >      >     as they would be in the main .debug_info section.  Range
> >     lists and
> >      >     location lists are essentially extensions to .debug_info, for
> >     cases
> >      >     where the range or location cannot be represented by simple
> >     DW_AT_*
> >      >     attribute values.
> >      >
> >      >     The specifics of encoding the segment selector would be
> >     whatever is
> >      >     appropriate to the target.  DWARF does not specify these
> details.
> >      >
> >      >     Best Regards,
> >      >     --paulr
> >      >
> >      >
> >      >      >
> >      >      > 7.28 (page 243)
> >      >      > The segment size is given by the segment_selector_size
> >     field of the
> >      >      > header, and the address size is given by the address_size
> >     field
> >      >     of the
> >      >      > header. If the segment_selector_size field in the header
> >     is zero, the
> >      >      > segment selector is omitted from the range list entries.
> >      >      >
> >      >      > --
> >      >      > Cheers,
> >      >      > Xing
> >
> >
> >
> >     --
> >     Michael Eager
> >
>
> --
> Michael Eager
>