[Dwarf-discuss] ISSUE: tensor types. V3

Thu Apr 13 18:57:08 GMT 2023

Here is V3 of what was my vector types proposal.

Changes since V2:

We discussed this extensively in the DWARF for GPUs meeting. Cary 
originally wanted it to be a TAG rather than an attribute on an array 
and quite frankly, I don't care and so my default position is "What Cary 
wants, Cary gets". However, Pedro pointed out LLVM's different flavors of
vector types, which, like the vector types, bubbled up from the target
architecture to the language source through intrinsics. Each of these
vector flavors has slightly different semantics. There is a
really nice table on 
https://clang.llvm.org/docs/LanguageExtensions.html#id15 that summarizes 
the differences. This changed the course of discussion and it seemed 
that the group moved back to making it an attribute on an array. Since 
there are multiple flavors of vector, this led to adding a parameter to 
the attribute that defines the flavor and a table which defines what 
those constants mean.

I brought up the point about matrix registers. Jakub is right that no
compilers currently make use of matrix vector types.
Even AMD's GPUs, which do have intrinsics for matrix operations, end up
implementing them with arrays of vector registers. This area is rapidly
evolving due to its heavy use in HPC and AI. The challenge appears to be 
the compilers haven't supported these operations yet. Cary came up with 
the idea of calling it a "tensor" rather than defining DW_AT_vector and
then later adding DW_AT_matrix. So throughout the entire document, vector
has been changed to tensor.

Markus pointed out a few problems in my V2 version, and I tried to address
them. They were pretty minor and obvious. Markus, please verify that I
did it to your satisfaction; otherwise there will be a V4.

What has not changed since V2:

I didn't put back any changes that would allow these tensor types to 
appear on the DWARF stack. I feel that particular topic hasn't been 
settled yet. The general plan is I will work with Jakub and create some 
cases where a compiler could want to put these vector types on the DWARF 
stack. Tony Tye and the AMD team believe that the vector types do not 
need to be on the stack and believe that all the cases where the 
debuggers would want to access elements within the vector can be 
addressed with offsetting. IIUC a key point seems to be that they have
never seen a case where an induction variable was embedded in a slot in
a vector register; it is always a scalar. (I am not sure that I fully
grokked their argument, so please correct me.) In the cases where it
was, it could still be accessed as an implicit. Once I've got some
examples of how a debugger might want to put vector types on the DWARF 
stack, the AMD team can suggest alternative approaches. I said that I 
would make a V4 proposal if the group ultimately comes to a consensus 
that vector registers are in fact needed on the stack.

As for DWARF consumers, according to Cary, the reason why DWARF 
operations are currently limited to base types is to make it relatively 
easy on the consumers. If vector registers are in fact needed on the 
stack, Zoran is fairly certain that changes that he's preparing to 
enable gdb to support GPUs would also automatically handle vector 
registers on the stack. The problem for gdb would be client-server
operation with gdbserver. The goal with gdbserver has been to keep its 
execution footprint very small. While having huge registers on the DWARF 
stack is reasonable when gdb is operating on the target or on the client 
side, on the server end it may pose a problem. John DelSignore said that 
TotalView has a similar concern because of its client-server
architecture. He did point out though that DWARF expressions are ephemeral.

My impression was that Cary wanted to add this tensor types issue to the
DWARF issue queue for discussion. Once the question of vector registers
on the stack is settled, this proposal can be amended or a new proposal
addressing just that issue can be filed.

------------------------------------------------------------------------
Tensor types

Some languages support vector data types, which are not possible to
represent today in standard DWARF.  A vector is an array of values.
These can be allocated to a SIMD vector register, if available, either
permanently or temporarily, and operations on vectors make use of SIMD
instructions, again if available.

For example, as an extension to C and C++, GCC supports defining
vector data types as described here:

https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html

In this C/C++ extension, vector types are similar to arrays, and you
can index and initialize them similarly, but they have some important
differences.  For example:

- C arrays automatically decay to pointers.  Vector types do not.

- Vector types can be passed by value to functions, and likewise
   functions can return vector types by value.  Neither can be done
   with C arrays.

- Vector types can be used with a subset of normal C operations: +, -,
   *, /, unary minus, ^, |, &, ~, %.  For example, addition is defined as
   the addition of the corresponding elements of the operands.
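
The differences listed above can be seen in a short sketch. It assumes a
compiler that implements the GCC vector extensions (GCC or Clang); the
names v4si, v4si_add and v4si_sum are made up for illustration and do not
come from any header.

```c
#include <assert.h>
#include <stdint.h>

/* GCC vector extension: four 32-bit integers packed into 16 bytes.
   The type name v4si is chosen for this example only. */
typedef int32_t v4si __attribute__((vector_size(16)));

/* The vector_size attribute gives the total size in bytes. */
static_assert(sizeof(v4si) == 16, "v4si should occupy 16 bytes");

/* Vectors are passed and returned by value, which C arrays cannot do. */
v4si v4si_add(v4si a, v4si b)
{
    /* '+' adds corresponding elements of the two operands. */
    return a + b;
}

/* Elements can be read with array-style subscripts. */
int32_t v4si_sum(v4si v)
{
    return v[0] + v[1] + v[2] + v[3];
}
```

A debugger evaluating a source expression such as a + b on two v4si
objects would need to know that the operands are vectors rather than
plain arrays in order to apply the element-wise rule.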

A debugger that supports C/C++ expression evaluation will want to be
able to support these vector operations on vector objects too.

Vector types appear on function prototypes, so they have their own
mangling scheme in the Itanium ABI.

Other vendors have similar C/C++ extensions.  One example is Motorola’s
Altivec C/C++ Language Extensions, which predate GCC's extensions.

To distinguish these vector types from regular C arrays, GCC's DWARF
describes a vector type as an array with the DW_AT_GNU_vector
attribute.

Support for this DWARF extension has been implemented in GDB for well
over a decade.

Other languages have support for vector types, with similar ABI and/or
API implications, and so DW_AT_GNU_vector is also used for languages
beyond C/C++ today.

Clang also supports the GCC vector extensions, and in some cases
describes the vector types in DWARF using the same attribute as
GCC. However, clang also supports many other types of vectors with
somewhat different semantics than those used by GCC.

https://clang.llvm.org/docs/LanguageExtensions.html#vectors-and-extended-vectors

In the process of refining this proposal, we debated whether this
proposal should be a tag like DW_TAG_vector or if it should be an
attribute applied to an array. It was clang's various flavors of
vectors with different semantics which ultimately led to the current
proposal where it is an attribute with an argument that specifies the
flavor of type it is.

During the course of the discussion, it was pointed out that while no
compiler currently supports matrix types, there are several current
processors which have matrix registers that have intrinsics that allow
a programmer to access them. One example is AMD's second generation
CDNA processors and later.

https://gpuopen.com/learn/amd-lab-notes/amd-lab-notes-matrix-cores-readme/

Rather than introducing a second new attribute like DW_AT_matrix, we
thought that calling the attribute DW_AT_tensor was correct and
general enough to embrace the concept.

-------------------------------------------------

In Section 2.2 Attribute Types, DW_AT_tensor shall be added to Table 2.2

--------------------------------------------------------------------
     DW_AT_tensor                | A language tensor type
--------------------------------------------------------------------

The hyperlink in the "Identifies or Specifies" column shall point to
the paragraph added to Section 5.5 below for DW_AT_tensor.

In Section 5.5 Array Type Entries, replace the first paragraph of
non-normative text with:

--------------------------------------------------------------------
     [non-normative] Many languages share the concept of an “array,”
     which is a table of components of identical type. Furthermore,
     many architectures contain vector types which mirror the language
     concept of a short single dimension array but have different
     encoding, a different calling convention and different arithmetic
     and logical operational semantics than the source language
     arrays. Likewise a few architectures are starting to add matrix
     register types with similar variations in encoding and semantics
     from normal source language array types.
--------------------------------------------------------------------

Insert the following paragraph between the first paragraph of
normative text describing DW_TAG_array_type and the second paragraph
dealing with multidimensional ordering.

--------------------------------------------------------------------
     An array type that refers to a vector or matrix type shall be
     denoted with DW_AT_tensor, whose integer constant specifies the
     kind of tensor it is. The default type of tensor shall be the kind
     used by the vector registers in the target architecture.

         Table 5.4: Tensor attribute values
  ------------------------------------------------------------------
     Name              | Meaning
  ------------------------------------------------------------------
     DW_TENSOR_default | Default encoding and semantics used by target
                       | architecture's vector registers
     DW_TENSOR_boolean | Boolean vectors map to vector mask registers.
     DW_TENSOR_opencl  | OpenCL vector encoding and semantics
     DW_TENSOR_neon    | NEON vector encoding and semantics
     DW_TENSOR_sve     | SVE vector encoding and semantics
  ------------------------------------------------------------------

     The width and, when applicable, the number of rows of the type
     shall be specified as array dimensions. The type contained
     within the tensor array type must be a DW_TAG_base_type entry.
--------------------------------------------------------------------

A table shall be added to chapter 7 defining these tensor types and
giving these attributes numerical values. Something like:

--------------------------------------------------------------------
     7.N Tensor types

     The encodings of the constants used in the DW_AT_tensor attribute
     are given in Table 7.N.

         Table 7.N: Tensor type encoding
     ---------------------------
     Name              | Value
     ---------------------------
     DW_TENSOR_default | 0x00
     DW_TENSOR_boolean | 0x01
     DW_TENSOR_opencl  | 0x02
     DW_TENSOR_neon    | 0x03
     DW_TENSOR_sve     | 0x04
     ---------------------------

--------------------------------------------------------------------
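
For consumers, the encodings in Table 7.N could be mirrored in a header
along the following lines. This is only a sketch: the values are the ones
proposed above and are not part of any published DWARF standard, and the
enum name dw_tensor and the helper dw_tensor_name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Proposed DW_AT_tensor constants, copied from Table 7.N of this
   proposal.  They are not assigned in any published DWARF version. */
enum dw_tensor {
    DW_TENSOR_default = 0x00,
    DW_TENSOR_boolean = 0x01,
    DW_TENSOR_opencl  = 0x02,
    DW_TENSOR_neon    = 0x03,
    DW_TENSOR_sve     = 0x04
};

/* Guard against the enum drifting away from the table. */
static_assert(DW_TENSOR_sve == 0x04, "encoding must match Table 7.N");

/* Hypothetical helper: return the constant's name, or NULL when the
   value is not one of the five kinds listed in the table. */
const char *dw_tensor_name(unsigned value)
{
    switch (value) {
    case DW_TENSOR_default: return "DW_TENSOR_default";
    case DW_TENSOR_boolean: return "DW_TENSOR_boolean";
    case DW_TENSOR_opencl:  return "DW_TENSOR_opencl";
    case DW_TENSOR_neon:    return "DW_TENSOR_neon";
    case DW_TENSOR_sve:     return "DW_TENSOR_sve";
    default:                return NULL;
    }
}
```

Returning NULL for unknown values lets a consumer fall back to treating
the entry as an ordinary array if it meets a tensor kind it does not
recognize.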