[Dwarf-discuss] ISSUE: tensor types. V3

Fri Apr 21 22:31:54 GMT 2023

On 4/21/23 12:56, Todd Allen via Dwarf-discuss wrote:
> I've been playing catch-up on this discussion today.  I was convinced of the
> value early on just based on the need of this information to follow the ABI
> parameter passing rules on certain architectures.
Really -- that is what I care about too. Everything else that I have 
done was just to make it acceptable to everyone else.
>   And I was with you right
> up until this V3 version.  Comments below:
>
> On Thu, Apr 13, 2023 at 11:57:08AM -0700, Dwarf Discussion wrote:
>>     I didn't put back any changes that would allow these tensor types to
>>     appear on the DWARF stack. I feel that particular topic hasn't been
>>     settled yet. The general plan is I will work with Jakub and create some
>>     cases where a compiler could want to put these vector types on the DWARF
>>     stack. Tony Tye and the AMD team believe that the vector types do not need
>>     to be on the stack and believe that all the cases where the debuggers
>>     would want to access elements within the vector can be addressed with
>>     offsetting. IIUC a key point seems to be that they have never seen a case
>>     where an induction variable was embedded in a slot in a vector register,
>>     it always is a scalar. (I am not sure that I fully grokked their argument
>>     -- so please correct me) In the cases where it was, it could still be
>>     accessed as an implicit. Once I've got some examples of how a debugger
>>     might want to put vector types on the DWARF stack, the AMD team can
>>     suggest alternative approaches. I said that I would make a V4 proposal if
>>     the group ultimately comes to a consensus that vector registers are in
>>     fact needed on the stack.
> A proposal to allow vector types on the DWARF expression stack easily could be
> a distinct proposal, although it obviously would have a dependency on this one.
> This seems like a good application of the "keep proposals small" philosophy.
>
>>     Insert the following paragraph between the first paragraph of
>>     normative text describing DW_TAG_array_type and the second paragraph
>>     dealing with multidimensional ordering.
>>
>>         --------------------------------------------------------------------
>>         An array type that refers to a vector or matrix type, shall be
>>         denoted with DW_AT_tensor whose integer constant, will specify the
>>         kind of tensor it is. The default type of tensor shall be the kind
>>         used by the vector registers in the target architecture.
>>
>>             Table 5.4: Tensor attribute values
>>         ------------------------------------------------------------------
>>         Name              | Meaning
>>         ------------------------------------------------------------------
>>         DW_TENSOR_default | Default encoding and semantics used by target
>>                           | architecture's vector registers
>>         DW_TENSOR_boolean | Boolean vectors map to vector mask registers.
>>         DW_TENSOR_opencl  | OpenCL vector encoding and semantics
>>         DW_TENSOR_neon    | NEON vector encoding and semantics
>>         DW_TENSOR_sve     | SVE vector encoding and semantics
>>         ------------------------------------------------------------------
> As someone who was not sitting in on your debugging GPUs discussions, this table
> is baffling.  Is it based on the "Vector Operations" table on the clang
> LanguageExtensions page you mentioned?
Yes
> That page is a wall of text, so I might
> have missed another table, but these values are a subset of columns from that
> table.
>
> 1 of the values here is a source language (opencl), 2 reflect specific vector
> registers of one specific architecture (neon & sve), and I don't even know what
> boolean is meant to be.  Maybe a type that you would associate with predicate
> registers?  I think this table needs a lot more explanation.

This was something that Pedro pointed out and it was something that I 
hadn't thought of. The overall justification for this is that these 
types were semantically different than normal C arrays in several 
distinct ways. There is this table which explains the differences: 
https://clang.llvm.org/docs/LanguageExtensions.html#vector-operations 
The argument is that the semantics of different flavors are different 
enough that they need to be distinct.
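
To make the "semantically different than normal C arrays" point concrete, 
here is a minimal sketch using the GCC/Clang vector_size extension. The 
type name v4si and the helpers add_vectors and first_lane_after_copy are 
made up for illustration, and the sketch assumes a 32-bit int. It only 
shows a few of the differences (whole-value assignment, element-wise 
operators, pass and return by value), not everything in the clang table.

```c
#include <assert.h>

/* GCC/Clang vector extension: four ints packed into one 16-byte value.
   The name v4si is only an illustrative choice. */
typedef int v4si __attribute__((vector_size(16)));

/* Vectors are passed and returned by value, and '+' is element-wise.
   A plain C array parameter would decay to a pointer instead. */
v4si add_vectors(v4si a, v4si b)
{
    return a + b;
}

/* Vectors can be assigned as whole values; int[4] arrays cannot. */
int first_lane_after_copy(v4si src)
{
    v4si copy = src;   /* copies all four lanes */
    copy[0] += 1;      /* modifies only the copy, not src */
    assert(src[0] + 1 == copy[0]);
    return copy[0];
}
```

With a plain int[4], the assignment in first_lane_after_copy would not 
compile and add_vectors could not take or return the array by value, 
which is the kind of difference a debugger evaluating expressions or 
making inferior calls would need to know about.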

I really do not know much of anything about OpenCL style vectors, so I 
wouldn't at all be against folding that constant in, because it is 
something that could be inferred from the source language. I left it in 
because I thought that there might exist cases where clang compiles 
some OpenCL code that references some intrinsics written in another 
language like C/C++ which depend on the semantics of OpenCL vector types.

NEON, yeah I think we should drop that one. The current GCC semantics 
are really Intel's vector semantics. By changing it from "GCC semantics" 
to "Default encoding and semantics used by target architecture's vector 
registers" I think we eliminate the need for that.

You are correct, boolean is for predicate register types. After looking 
at the calling conventions, these are not passed as types themselves. So 
for the purpose of this submission, I don't think we need it. I believe 
that some of the stuff that Tony and the AMD and Intel guys are almost 
ready to submit has DWARF examples of how to make use of predicate 
registers in SIMD and SIMT and how to access variables making use of 
predicate registers. That should be sufficient for those.

ARM SVE and RISC-V RVV are really weird because their register widths, 
and therefore their type widths, are HW implementation defined rather 
than architecturally defined. It has been a couple of compiler generation 
iterations since I looked at the DWARF for those, but when I last looked, 
the compilers didn't know what to do with those and so they didn't 
generate usable DWARF. So I feel like there are additional unsolved 
problems with the SVE and RVV types that will need to be addressed. It is 
a problem that I know I need to look into, but right now I do not have 
any "quality of DWARF" user issues pulling it closer to the top of my 
priority list. The only processor I've seen with SVE is the A64FX used 
in Fugaku and the HPE Apollo 80's. The Apple M1 and M2 don't have it and 
I haven't seen any of the newer ARM enterprise CPUs. I don't think there 
are any chips with RVV yet. Once more users have access to hardware that 
supports it, I know that it will be more of a problem. I kind of feel 
like that will be a whole submission in and of itself.

How about:

            Table 5.4: Tensor attribute values
        ------------------------------------------------------------------
        Name              | Meaning
        ------------------------------------------------------------------
        DW_TENSOR_default | Default encoding and semantics used by target
                          | architecture's vector registers
        ------------------------------------------------------------------

The point is I believe that there are going to be flavors. Can we leave 
it an enum?

Then if SVE and RVV end up being sufficiently different, we have a way 
to handle them. I also double checked and ARM V9.1 SME is now publicly 
disclosed, so we have at least 3 architectures that I know of that have 
matrix registers, but the compiler support hasn't quite caught up yet.
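
A minimal sketch of what "leave it an enum" could look like on the 
consumer side. The numeric value 0x00 is a placeholder, since the 
proposal has not assigned one, and the names dw_tensor_kind and 
dw_tensor_kind_name are hypothetical helpers, not part of any existing 
library.

```c
#include <stddef.h>

/* Placeholder value: the proposal has not assigned a number yet. */
enum dw_tensor_kind {
    DW_TENSOR_default = 0x00
    /* Later flavors (for example for SVE, RVV or SME) would be added
       here with new values if they turn out to need their own kind. */
};

/* Map a DW_AT_tensor value to its name.
   Returns NULL for a value this consumer does not recognize, so the
   caller can choose its own fallback policy. */
const char *dw_tensor_kind_name(unsigned long value)
{
    switch (value) {
    case DW_TENSOR_default:
        return "DW_TENSOR_default";
    default:
        return NULL;
    }
}
```

The point of the sketch is only that a single-entry enum costs almost 
nothing today, while a consumer written this way already has a defined 
path for values it does not know about.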

>
> How do you envision debuggers using this information?  Merely disallowing things
> like operator++, or disallowing casts, or certain flavors of casts?  (Those were
> the differences I spotted in that table.)  This doesn't seem terribly
> compelling.  But if others think it is, maybe this should be broken up into
> distinct features instead of a lumpy enum?

So in essence after thinking about it more, I kind of think that you are 
right and these types are not sufficiently different. I do think that 
SVE and RVV and then SME are going to be sufficiently different that 
they will need a different flavor of tensor type. But like the vectors 
on the DWARF stack, that is a subsequent submission.

I also realized that, for completeness' sake, I should have some 
edits in Chapter 7 saying what class of parameter DW_AT_tensor can 
accept and suggesting a numerical value for the attribute.
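
As a rough sketch of what those Chapter 7 edits would pin down, assuming 
DW_AT_tensor ends up in the constant class (the proposed text says its 
value is an integer constant): the DW_FORM codes below are the DWARF 5 
constant-class forms, but SKETCH_DW_AT_tensor is purely a placeholder 
(it reuses DW_AT_hi_user so the sketch compiles), and struct attr_spec 
and both functions are hypothetical.

```c
#include <stdbool.h>

/* DWARF 5 form codes that belong to the "constant" class. */
enum {
    DW_FORM_data2          = 0x05,
    DW_FORM_data4          = 0x06,
    DW_FORM_data8          = 0x07,
    DW_FORM_data1          = 0x0b,
    DW_FORM_sdata          = 0x0d,
    DW_FORM_udata          = 0x0f,
    DW_FORM_data16         = 0x1e,
    DW_FORM_implicit_const = 0x21
};

/* Placeholder only: no value has been proposed for DW_AT_tensor.
   0x3fff is DW_AT_hi_user, borrowed here just for illustration. */
#define SKETCH_DW_AT_tensor 0x3fffu

/* One (attribute, form) pair as it would appear in an abbreviation. */
struct attr_spec {
    unsigned name;  /* DW_AT_* code */
    unsigned form;  /* DW_FORM_* code */
};

/* True if the form encodes a value of class constant. */
bool form_is_constant_class(unsigned form)
{
    switch (form) {
    case DW_FORM_data1:
    case DW_FORM_data2:
    case DW_FORM_data4:
    case DW_FORM_data8:
    case DW_FORM_data16:
    case DW_FORM_sdata:
    case DW_FORM_udata:
    case DW_FORM_implicit_const:
        return true;
    default:
        return false;
    }
}

/* True if this is the tensor attribute encoded with a constant form. */
bool is_well_formed_tensor_attr(struct attr_spec a)
{
    return a.name == SKETCH_DW_AT_tensor && form_is_constant_class(a.form);
}
```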

Shall I do a V4?

-ben
