[Dwarf-discuss] ISSUE: tensor types. V3
Ben Woodard
woodard@redhat.com
Mon Apr 24 19:27:32 GMT 2023
On 4/24/23 09:50, Todd Allen via Dwarf-discuss wrote:
> On 4/21/23 16:31, Ben Woodard via Dwarf-discuss wrote:
>>>> Insert the following paragraph between the first paragraph of
>>>> normative text describing DW_TAG_array_type and the second
>>>> paragraph dealing with multidimensional ordering.
>>>>
>>>> --------------------------------------------------------------------
>>>> An array type that refers to a vector or matrix type, shall be
>>>> denoted with DW_AT_tensor whose integer constant, will specify the
>>>> kind of tensor it is. The default type of tensor shall be the kind
>>>> used by the vector registers in the target architecture.
>>>>
>>>> Table 5.4: Tensor attribute values
>>>> ------------------------------------------------------------------
>>>> Name | Meaning
>>>> ------------------------------------------------------------------
>>>> DW_TENSOR_default | Default encoding and semantics used by target
>>>> | architecture's vector registers
>>>> DW_TENSOR_boolean | Boolean vectors map to vector mask registers.
>>>> DW_TENSOR_opencl | OpenCL vector encoding and semantics
>>>> DW_TENSOR_neon | NEON vector encoding and semantics
>>>> DW_TENSOR_sve | SVE vector encoding and semantics
>>>> ------------------------------------------------------------------
>>> As someone who was not sitting in on your debugging GPUs discussions,
>>> this table is baffling. Is it based on the "Vector Operations" table
>>> on the clang LanguageExtensions page you mentioned?
>> Yes
>>> That page is a wall of text, so I might have missed another table,
>>> but these values are a subset of columns from that table.
>>>
>>> 1 of the values here is a source language (opencl), 2 reflect
>>> specific vector registers of one specific architecture (neon & sve),
>>> and I don't even know what boolean is meant to be. Maybe a type that
>>> you would associate with predicate registers? I think this table
>>> needs a lot more explanation.
>> This was something that Pedro pointed out and it was something that I
>> hadn't thought of. The overall justification for this is that these
>> types were semantically different than normal C arrays in several
>> distinct ways. There is this table which explains the differences:
>> https://clang.llvm.org/docs/LanguageExtensions.html#vector-operations
>> The argument is that the semantics of different flavors are different
>> enough that they need to be distinct.
>>
>> I really do not know much of anything about OpenCL style vectors, I
>> wouldn't at all be against folding that constant in because it is
>> something that could be inferred from the source language. I left it in
>> because I thought that there might exist cases where clang compiles
>> some OpenCL code that references some intrinsics written in another
>> language like C/C++ which depends on the semantics of OpenCL vector
>> types.
>>
>> NEON, yeah I think we should drop that one. The current GCC semantics
>> are really Intel's vector semantics. By changing it from "GCC semantics"
>> to "Default encoding and semantics used by target architecture's vector
>> registers" I think we eliminate the need for that.
>>
>> You are correct boolean is for predicate register types. After looking
>> at the calling conventions, these are not passed as types themselves. So
>> for the purpose of this submission, I don't think we need it. I believe
>> that some of the stuff that Tony and the AMD, and intel guys are almost
>> ready to submit has DWARF examples of how to make use of predicate
>> registers in SIMD and SIMT and access variables making use of predicate
>> registers should be sufficient for those.
>>
>> ARM SVE and RISC-V RVV are really weird because of those HW
>> implementation defined vs architecturally defined register and therefore
>> type widths. It has been a couple of compiler generation iterations
>> since I looked at the DWARF for those but when I last looked, the
>> compilers didn't know what to do with those and so they didn't generate
>> usable DWARF. So I feel like there are additional unsolved problems with
>> the SVE and RVV types that will need to be addressed. It is a problem,
>> that I know that I need to look into -- but right now I do not have any
>> "quality of DWARF" user issues pulling it closer to the top of my
>> priority list. The only processor I've seen with SVE is the A64FX used
>> in Fugaku and the HPE Apollo 80's, the Apple M1 and M2 don't have it and
>> I haven't seen any of the newer ARM enterprise CPUs. I don't think there
>> are any chips with RVV yet. Once more users have access to hardware that
>> supports it, I know that it will be more of a problem. I kind of feel
>> like that will be a whole submission in and of itself.
>>
>>
> So you're thinking that "OpenCL vector semantics" ought to be
> determinable from DW_AT_language DW_LANG_OpenCL? Seems reasonable.
>
> DW_TENSOR_boolean: Could it just be determinable from the shape of the
> array? For example:
>
> <BOOL> DW_TAG_base_type
> DW_AT_bit_size : 1
>
> DW_TAG_array_type
> DW_AT_name : predicate_t
> DW_AT_byte_size : 16
> DW_AT_type : <BOOL>
> DW_AT_tensor : yes (encoding TBD)
> DW_TAG_subrange_type
> DW_AT_type : <whatever>
> DW_AT_lower_bound : 0
> DW_AT_upper_bound : 127
>
> NEON/SVE/RVV ought to be determinable by knowing what kind of machine
> the debugger is running on (ARM/RISC-V). Or, for something like
> dwarfdump which might try to read a foreign-architecture ELF file, from
> the ELF header. (Not that dwarfdump specifically is going to care...)
> As for NEON vs. SVE, is there a need to differentiate them? And can it
> not be done by shape of the type?
That one continues to be hard. ARM processors that support SVE also have
NEON registers which, like the Intel SSE, MMX, and AVX vector registers,
are architecturally specified as having a specific number of bits.
Handling those is trivial.
The weird thing about SVE registers (and the same applies to RVV) is
that the number of bits is not architecturally defined and is therefore
unknown at compile time. The size of the registers can even vary from
hardware implementation to hardware implementation. So a simple
processor may only have 128b wide SVE registers while a monster
performance core may have 2048b wide SVE registers. The predicate
registers scale the same way. I think that it can even vary from core to
core within a CPU, sort of like Intel's P-cores vs. E-cores. To even know
how much a loop is vectorized, you need to read a core-specific register
that specifies how wide the vector registers are on this particular
core. Things like induction variables are incremented by the value in
that core-specific register divided by the size of the type being acted
upon. So some of the techniques used to select lanes in DWARF don't
quite work the same way.
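To make the induction variable point concrete, here is a rough Python
sketch. It is only an illustration of the arithmetic, not what any
compiler or debugger actually does, and the function names
(lanes_per_vector, chunk_starts) are made up for this example. The
vector width is passed in as a plain number, standing in for the value
that would be read from the core-specific register at run time.

```python
def lanes_per_vector(vector_bits: int, element_bits: int) -> int:
    """Number of elements of a given size that fit in one vector register."""
    if vector_bits <= 0 or element_bits <= 0:
        raise ValueError("widths must be positive")
    if vector_bits % element_bits != 0:
        raise ValueError("vector width must be a multiple of the element width")
    return vector_bits // element_bits


def chunk_starts(n_elements: int, vector_bits: int, element_bits: int) -> list[int]:
    """Starting index of each vectorized loop iteration.

    The loop step is not a compile-time constant: it is the run-time
    vector width divided by the element size.
    """
    step = lanes_per_vector(vector_bits, element_bits)
    return list(range(0, n_elements, step))


# Same loop over 10 32-bit elements, on two hypothetical cores.
print(chunk_starts(10, 128, 32))   # narrow core: 4 lanes per iteration
print(chunk_starts(10, 2048, 32))  # wide core: 64 lanes per iteration
```

The same source loop therefore runs a different number of iterations
depending on the core, which is why a lane index cannot simply be
derived from a constant baked into the debug info.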
Just to make things even more difficult, when one of these registers is
spilled to memory, such as the stack, its size is unknown at compile
time, so any subsequent spill has to determine how much space it takes
up. Any subsequent offsets need to use DWARF expressions that reference
the width of the vector.
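A small Python sketch of why those offsets cannot be constants. This
assumes, purely for illustration, that spilled vectors are laid out back
to back from some base offset; real frame layouts are up to the compiler
and ABI, and spill_slot_offsets is a hypothetical helper, not part of
any tool.

```python
def spill_slot_offsets(vector_bits: int, n_slots: int, base_offset: int = 0) -> list[int]:
    """Byte offsets of consecutive spilled vector registers.

    vector_bits stands in for the run-time vector width; each slot
    occupies exactly one vector's worth of bytes.
    """
    if vector_bits <= 0 or vector_bits % 8 != 0:
        raise ValueError("vector width must be a positive multiple of 8 bits")
    vector_bytes = vector_bits // 8
    return [base_offset + i * vector_bytes for i in range(n_slots)]


# Three spilled vectors on a 128b implementation vs. a 2048b one.
print(spill_slot_offsets(128, 3))
print(spill_slot_offsets(2048, 3))
```

The offset of the second and third slots depends entirely on the
hardware the code happens to be running on, so a location description
for them has to compute the offset from the vector width rather than
encode it as a literal.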
...and then there is SME, which is like SVE but with matrices rather
than vectors. The mind boggles.
> If all those things are eliminated, then you're back to just needing a
> flag: tensor vs. not-tensor.
>
>> How about:
>>
>> Table 5.4: Tensor attribute values
>> ------------------------------------------------------------------
>> Name | Meaning
>> ------------------------------------------------------------------
>> DW_TENSOR_default | Default encoding and semantics used by target
>> | architecture's vector registers
>> ------------------------------------------------------------------
>>
>> The point is I believe that there are going to be flavors. Can we leave
>> it an enum?
>>
>> Then if SVE, and RVV end up being sufficiently different we have a way
>> to handle them. I also double checked and ARM V9.1 SME is now publicly
>> disclosed so we have at least 3 architectures that I know that have
>> matrix registers but the compiler support hasn't quite caught up yet.
>>
> You argued that it still should be an enum, but with only one "default"
> value defined. And I guess any other values that might be added later
> would be (or at least start as) vendor extensions. It's peculiar, and I
> don't think we have that anywhere else in the standard.
I guess that my point is that I'm fairly certain that SVE and RVV will
need special handling, and when the compilers start handling the matrix
types that the hardware is starting to support, they are going to need
some help as well.
> If it ever became necessary, you can always add a 2nd attribute for it.
> As an example, in our Ada compiler decades ago, we did this for
> DW_AT_artificial. It's just a flag, so either present or not-present.
> We added a 2nd DW_AT_artificial_kind with a whole bunch of different
> enums for the various kinds our compiler generated. The point is you
> still can get there even if DW_AT_tensor is just a flag.
Totally not opposed to that if that is the way that people want to
handle it. My only (admittedly weak) argument against doing it that way
is that there will now be two attributes rather than one, and the
space that takes up. John DelSignore was just dealing with a program
that had 4.9GB of DWARF; it would be nice to keep it as compact as
possible. Of course most of that is likely location lists, template
instantiations, and stuff like that, not the relatively rare cases
where this shows up.
Would this be an acceptable compromise for V4 of my proposal? I drop it
back to just being a flag for the time being. Then in a subsequent
submission (which may or may not be in the DWARF6 cycle -- but hopefully
is in time for DWARF6), if I find it necessary to make a flavor to
support SVE, RVV or SME, then my submission for that will include
changing DW_AT_tensor to require a constant that references an enum
like the one above. If it comes out before DWARF6 is released then
great, we don't have to redefine anything. If it gets bumped to DWARF7
then we add a _kind attribute.
-ben
> Regards,
> Todd
>