[Dwarf-Discuss] Segment selectors for the range list table.
David Blaikie
dblaikie@gmail.com
Thu Jul 16 19:20:46 GMT 2020
Anyone know of any existing DWARF producers that use segmented addressing?
Might be interesting to see how they're using the features.
On Thu, Jul 16, 2020 at 9:30 AM Michael Eager <eager at eagercon.com> wrote:
> On 7/15/20 9:49 PM, David Blaikie wrote:
> >
> >
> > On Wed, Jul 15, 2020 at 7:07 PM Michael Eager via Dwarf-Discuss
> > <dwarf-discuss at lists.dwarfstd.org> wrote:
> >
> > Segmented addresses have been in the DWARF specification since
> > Version 2 and AFAIK have not been changed since that time. DWARF V5
> > did not add any functionality to segmented addresses that was not
> > present in DWARF V2/3. At least, there was no intention to do so.
> > Segmented addresses are described in Section 2.12.
> >
> > A segmented address maps into a linear address in a
> > processor-specific fashion.
> >
> >
> > That seems at odds with the non-normative text of 2.12 "In some systems,
> > addresses are specified as offsets within a given segment rather than as
> > locations within a single flat address space."
>
> That means that an x86 address could be represented as offset 0x1234 in
> segment 0x4444, which would translate to 0x44440+0x1234=0x45674. Note
> that x86 permits aliases, so that offset 0x0124 in segment 0x4555 is the
> same address.
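
The arithmetic in that example can be checked with a few lines of Python. This is only a sketch of 8086 real-mode addressing (segment shifted left by 4 bits, plus offset, wrapped to 20 bits); the function name `linear_address` is made up for illustration and is not part of DWARF or of any debugger API.

```python
def linear_address(segment: int, offset: int) -> int:
    """8086 real-mode translation (assumed model): segment * 16 + offset.

    The result is masked to 20 bits because the 8086 has a 20-bit
    address bus.
    """
    return ((segment << 4) + offset) & 0xFFFFF


# The two segment:offset pairs from the example alias the same location.
a = linear_address(0x4444, 0x1234)
b = linear_address(0x4555, 0x0124)
print(hex(a), hex(b), a == b)
```

Both pairs come out as the same linear address, which is the aliasing the example points at.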
>
> DWARF sometimes uses wording which is intended to generalize a concept.
> Conceivably, another architecture could use the same DWARF attribute in
> a similar way. That's why the non-normative text says "some systems"
> rather than specifically referencing x86. But we do have that as the
> only example listed in the table.
>
> > And also would be confusing to me - if there is a contiguous linear
> > address space, why would DWARF need to specify the use of a segment
> > selector, and why do some references to addresses allow the inclusion of
> > a segment selector and some don't? Why not just always use the
> > non-segmented address description for DWARF?
>
> So that a compiler can generate segment address values and offsets
> independently. An x86 code generator may not know what segment it is
> generating code in.
>
But parts of the format (such as debug_ranges - sort of, and debug_line)
assume the DWARF producer does know the absolute address. It seems
inconsistent to me.
Perhaps it's more like Paul was postulating - that the spec assumes code is
in a code segment, so that doesn't need to be stated. (But that gets a bit
confused in debug_aranges - if it is only meant to contain code (not data),
why does it need a segment selector? - and also in the DIEs - if code is
always in a known/assumable segment then why can you vary the segment for
low_pc/high_pc/ranges?)
>
> AFAIK, all addresses can be segmented addresses, except in the line
> table where it isn't needed.
>
> Perhaps we should have (long ago) required flat/linear addresses for x86
> instead of segmented addresses.
>
What's the line table's segment_selector_size (in the DWARFv5 header) for?
(this sort of agrees and disagrees with you - it's there, but it's not used
in any part of the debug_line format that I can see)
What I'm confused by is how you can use segmented addressing on a
per-DIE-subtree basis (eg: one subprogram can specify one segment, and
another subprogram can specify a different segment). The spec seems to be
pretty clear that this is intentionally supported: it says that
DW_AT_segment applies to high_pc/low_pc/ranges, and that a DIE without one
delegates to the nearest parent DIE's DW_AT_segment. But it doesn't seem
like you can have a range list (in v4 or v5, except via the addrx
encodings) that has different segments for different subranges within a
single range list. So how would you describe the ranges of a CU that had
subprograms in distinct segments? And if you can't describe that, then why
does the spec go out of its way to explicitly allow it with the wording
about segment overrides, etc.?
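
To make the inheritance rule concrete, here is a minimal sketch of resolving a DIE's segment by walking up to the nearest parent that has DW_AT_segment. The `Die` class and `effective_segment` function are hypothetical, and DW_AT_segment is simplified to a plain integer (in real DWARF it is a location description), so this is a model of the rule and not a real DWARF reader.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Die:
    """Toy stand-in for a debugging information entry (hypothetical)."""
    tag: str
    attrs: dict = field(default_factory=dict)
    parent: Optional["Die"] = None


def effective_segment(die: Die) -> Optional[int]:
    """Return the nearest DW_AT_segment on the parent chain.

    None means no DIE up to the compilation unit has DW_AT_segment,
    i.e. the entry is assumed to live in a flat address space.
    """
    node: Optional[Die] = die
    while node is not None:
        if "DW_AT_segment" in node.attrs:
            return node.attrs["DW_AT_segment"]
        node = node.parent
    return None


cu = Die("DW_TAG_compile_unit")
f1 = Die("DW_TAG_subprogram", {"DW_AT_segment": 1, "DW_AT_low_pc": 0x100}, cu)
f2 = Die("DW_TAG_subprogram", {"DW_AT_segment": 2, "DW_AT_low_pc": 0x100}, cu)
inner = Die("DW_TAG_lexical_block", {"DW_AT_low_pc": 0x120}, f2)

print(effective_segment(f1), effective_segment(inner), effective_segment(cu))
```

In this model the CU's own DW_AT_ranges would resolve to at most one segment (whatever is on the CU, or none), even though f1 and f2 sit in different segments, which is the gap being asked about.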
>
> > & I don't find any mention of this idea that some addresses are absolute
> > and some are segment-relative in 2.12 - it does say that "If none of the
> > entries in the chain of parents for this entry back to its containing
> > compilation unit entry have DW_AT_segment attributes, then the entry is
> > assumed to exist within a flat address space." - as though a flat (I
> > assume this is synonymous with "linear"?) address space is distinct from
> > the segmented address space being discussed otherwise?
>
> Flat address space == linear address space.
>
> From a certain perspective, x86 memory space is broken up into 64K
> 16-byte segments mapped onto a 1 MB linear address range.
>
> > AFAIK, only the Intel 8086 and descendants have this
> > functionality. (It's a many to one mapping in the 8086
> > implementation, but that's a problem for a bygone era.) There's a
> > reference to i386 memory models in Table 2.7.
> >
> > DWARF assumes a linear address space. A segmented address maps to a
> > specific address in this linear address space. The entries in
> > DW_AT_ranges for subprograms with different segment addresses would
> > usually be referenced by their address in the linear address space.
> > If DW_AT_ranges has a DW_AT_segment, this is an indication that the
> > debugger is to perform the processor-specific computation to
> > translate the segment-address pair to the linear address.
> >
> > There is no need to do anything with segments in the line table,
> > since the line table contains addresses in the linear address space.
> >
> > There is some (perhaps considerable) confusion in terminology in the
> > x86 world, because the x86 has multiple segment registers which on
> > other processors would be called base registers. The values in these
> > registers reference memory segments and are added to whatever offset
> > is contained in the program to generate an address. These segment
> > registers, and the memory segments which they point to, are NOT the
> > segments represented by DW_AT_segment.
> >
> > Re "reading the segment selector" and "addrx encoding": The
> > addresses in DWARF DIEs are static, not dynamic. There is no
> > register+offset encoding, and processor registers are not read to
> > determine where a subprogram is in memory.
> >
> >
> > Sorry, I don't quite follow the connections between all those statements.
>
> Perhaps I didn't understand your comments about "reading the segment
> selector" and "addrx encoding".
>
> TL;DR:
> DW_AT_segment was designed to describe x86 memory model addresses:
> https://en.wikipedia.org/wiki/Intel_Memory_Model.
>
> Possibly other architectures can use it, but I'm not familiar with any
> that do.
>
>
> > 2.17 says that if a DIE has a DW_AT_high_pc and DW_AT_segment, then the
> > high_pc is relative to the specified segment. That's a bit redundant if
> > high_pc uses FORM_addrx, because the address in the address pool can
> > specify its own segment, but a producer could choose which way to go
> > there. (presumably if the AT_segment is there, you should interpret the
> > addrx high_pc relative to that segment - assuming debug_addr has no
> > segment selector in it - or perhaps it should go the other way and
> > ignore the local AT_segment and only rely on whatever segment is in
> > debug_addr)
>
> DW_FORM_addrx (and the .debug_addr section) were introduced in DWARF V5
> to allow compression of DW_FORM_addr addresses. DW_AT_segment is
> intended to describe an (x86) address in the form that the processor
> uses. The first is one of many different compression schemes in DWARF,
> the second is part of an architectural description.
>
debug_addr supports segment selectors - in the debug_addr header it has a
field for "segment selector size" and the entries in the address list are
"segment/address pairs".
So now there are two ways a segment selector for an address could be
specified - if you had a DW_TAG_subprogram with a DW_AT_low_pc using addrx
into a debug_addr with a non-zero segment selector and the subprogram also
had a DW_AT_segment, I wonder which one is meant to win.
Though mostly my point was: since debug_addr entries can have segment
selectors, debug_rnglists can have different segments for different
subranges within a single range list. But without that (either using
direct addresses, or in v4 debug_ranges) you couldn't vary the segment
across a single range list. And though the debug_rnglists header does have
a segment selector size in it, it doesn't seem to use it anywhere in its
format (similarly, debug_loclists and the v5 debug_line header have a
segment selector size, but don't seem to use it).
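
As a rough illustration of the debug_addr layout described above, the sketch below decodes one address table. It assumes 32-bit DWARF, little-endian data, and the header order I read from the v5 spec (unit_length, version, address_size, segment_selector_size), with each entry stored as selector first, then address. The function name `parse_debug_addr` and the sample bytes are invented for the example.

```python
import struct


def parse_debug_addr(data: bytes):
    """Decode one .debug_addr table (assumed: 32-bit DWARF, little-endian).

    Returns (version, [(segment_or_None, address), ...]).
    """
    # unit_length (4 bytes), version (2), address_size (1),
    # segment_selector_size (1)
    unit_length, version, addr_size, seg_size = struct.unpack_from("<IHBB", data, 0)
    end = 4 + unit_length  # unit_length does not count its own 4 bytes
    pos = 8
    entries = []
    while pos < end:
        # With segment_selector_size == 0 the entry is just an address.
        seg = int.from_bytes(data[pos:pos + seg_size], "little") if seg_size else None
        pos += seg_size
        addr = int.from_bytes(data[pos:pos + addr_size], "little")
        pos += addr_size
        entries.append((seg, addr))
    return version, entries


# Hand-built sample: 2-byte selectors, 4-byte addresses, two entries.
body = struct.pack("<HI", 1, 0x1000) + struct.pack("<HI", 2, 0x1000)
sample = struct.pack("<IHBB", 4 + len(body), 5, 4, 2) + body
print(parse_debug_addr(sample))
```

With entries like these, two addrx-based range list entries could refer to the same offset in two different segments, which the direct-address encodings have no way to express.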
>
> > On 7/15/20 4:31 PM, David Blaikie via Dwarf-Discuss wrote:
> > > Looking at how segment selectors work:
> > >
> > > DW_AT_segment: Applies to a DIE subtree, including any ranges,
> > > high/low pc, locations, labels, etc
> > > debug_ranges/loc (v4 and below): Doesn't seem to allow specifying
> > > segment variation - inherits from the segment given on the nearest
> > > parent DIE that refers to the entry
> > > debug_rnglists/loclists (v5): includes segment selector size in the
> > > header, but doesn't seem to use it - segment selection via addresses
> > > in the address pool (RLE/LLE_*x encodings) would allow fine-grained
> > > segment selection, but direct address forms don't seem to allow
> > > segment selection ("This operand is the same size as used in
> > > DW_FORM_addr.")
> > > debug_addr: segment_size in header, then list of {segment selector,
> > > address}
> > > debug_aranges: segment_size in header, then the list contains
> > > triples of {segment selector, start address, length}
> > > debug_line: v5 encodes the address and segment selector size in the
> > > header, but I'm not sure if/how it's used. The DW_LNE_set_address
> > > operation says:
> > > "The DW_LNE_set_address opcode takes a single relocatable address as
> > > an operand. The size of the operand is the size of an address on the
> > > target machine. It sets the address register to the value given by
> > > the relocatable address and sets the op_index register to 0." -
> > > doesn't sound like it's reading the segment selector there.
> > >
> > > So... I don't think DWARFv5 made anything worse - if anything it did
> > > enable /a/ way to use fine grained segment selectors in range lists
> > > and location lists that doesn't appear, to me, to have been provided
> > > before. (It could be needed if you had some functions in one segment
> > > and some functions in another segment, which could be represented at
> > > the subprogram DIE level - DW_AT_segment 1 on one DW_TAG_subprogram,
> > > DW_AT_segment 2 on another DW_TAG_subprogram - but how would you
> > > represent the DW_AT_ranges for this CU (in DWARFv4, or in DWARFv5
> > > without using addrx encodings)? I don't know how, because I don't
> > > think debug_ranges could describe one range list entry as being from
> > > one segment, and another range list entry as being in another
> > > segment - they would all be in whatever segment was in DW_AT_segment
> > > on the CU.)
> > >
> > > Does that make sense? Have I missed something about how you could use
> > > segment selectors in a debug_loc, debug_ranges, or loclist/rnglist
> > > that isn't using an addrx encoding?
> > >
> > > On Wed, Jul 15, 2020 at 6:37 AM Robinson, Paul via Dwarf-Discuss
> > > <dwarf-discuss at lists.dwarfstd.org> wrote:
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Dwarf-Discuss <dwarf-discuss-bounces at lists.dwarfstd.org>
> > > > On Behalf Of Xing GUO via Dwarf-Discuss
> > > > Sent: Tuesday, July 14, 2020 10:39 PM
> > > > To: dwarf-discuss at lists.dwarfstd.org
> > > > Subject: [Dwarf-Discuss] Segment selectors for the range list table.
> > > >
> > > > Hi there,
> > > >
> > > > The DWARFv5 spec mentioned that there might be segment selectors in
> > > > the range list entries and when the segment_selector_size is 0, the
> > > > segment selectors are omitted from the range list entries. However,
> > > > it didn't mention how the segment selector should be encoded when
> > > > the segment_selector_size isn't 0. Can anyone help me figure it out?
> > > > Thanks a lot!
> > >
> > > Hi Xing,
> > >
> > > The segment selectors in the range list would be encoded the same
> > > way as they would be in the main .debug_info section. Range lists
> > > and location lists are essentially extensions to .debug_info, for
> > > cases where the range or location cannot be represented by simple
> > > DW_AT_* attribute values.
> > >
> > > The specifics of encoding the segment selector would be whatever is
> > > appropriate to the target. DWARF does not specify these details.
> > >
> > > Best Regards,
> > > --paulr
> > >
> > >
> > > >
> > > > 7.28 (page 243)
> > > > The segment size is given by the segment_selector_size field of the
> > > > header, and the address size is given by the address_size field of
> > > > the header. If the segment_selector_size field in the header is
> > > > zero, the segment selector is omitted from the range list entries.
> > > >
> > > > --
> > > > Cheers,
> > > > Xing
> >
> >
> >
> > --
> > Michael Eager
> > _______________________________________________
> > Dwarf-Discuss mailing list
> > Dwarf-Discuss at lists.dwarfstd.org
> > http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
> >
>
> --
> Michael Eager
>