[Dwarf-discuss] ISSUE: CPU vector types.

Thu Mar 30 20:26:56 GMT 2023

On 2023-03-29 8:55 p.m., Ben Woodard via Dwarf-discuss wrote:
> 
> On 3/28/23 13:17, David Blaikie wrote:
...
>> What DWARF should be used to describe the type of 'a'? And how does
>> this encoding scale to all the other similar intrinsic types?
>>
> As a person who has spent a crazy amount of time doing ABI work and static analysis this is what I would like: (I'm kind of assembling this by cutting and pasting and editing so please excuse minor errors like sizes and ignoring siblings. It is hand written hand-wavy DWARF)
> 
> *Factored out preamble to all of them:*
> [0] base_type            abbrev: 3
>        byte_size            (data1) 4
>        encoding             (data1) float (4)
>        name                 (strp) "float"
> [5]  base_type            abbrev: 3
>        byte_size            (data1) 4
>        encoding             (data1) unsigned (7)
>        name                 (strp) "unsigned int"
> [8]  base_type            abbrev: 3
>        byte_size            (data1) 4
>        encoding             (data1) signed (5)
>        name                 (strp) "int"
> [10] base_type            abbrev: 3
>        byte_size            (data1) 8
>        encoding             (data1) float (4)
>        name                 (strp) "double"
> [15] subprogram           abbrev: 32
>    external             (flag_present) yes
>    name                 (string) "f"
> [20]      formal_parameter     abbrev: 15
>          name                 (string) "a"
>          type                 (ref4) [30]
> 
> *void f( float *a){}*
> [30] pointer_type         abbrev: 5
>        byte_size            (implicit_const) 8
>        type                 (ref4) [0]
> 
> *void f( float a[]){}*
> [30] array_type         abbrev: 5
>        type                 (ref4) [0]
> 

In reality, the "float a[]" case above is not really an array.
'a' is still a pointer.  There is no such thing as passing an array by value in C/C++.  The two cases above
are defining the same function overload.  Note:

$ g++ v.cc -g3 -O0
v.cc:11:6: error: redefinition of ‘void f(float*)’
   11 | void f(float a[]) {}
      |      ^
v.cc:10:6: note: ‘void f(float*)’ previously defined here
   10 | void f(float *a) {}
      |      ^
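
The same adjustment can also be checked at compile time.  A minimal sketch, assuming a C++17
compiler; the names take_ptr, take_arr and same_signature are made up for illustration:

```cpp
#include <type_traits>

// Declarations only; nothing here is ever called.
void take_ptr(float *a);
void take_arr(float a[]);

// 'float a[]' is adjusted to 'float *a', so both declarations have the
// same function type and cannot coexist as distinct overloads.
constexpr bool same_signature =
    std::is_same_v<decltype(take_ptr), decltype(take_arr)>;

static_assert(same_signature, "array parameter is adjusted to a pointer");
```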

I guess you're saying that a consumer would know that it is looking at a C or C++ function, and thus it knows
that even if the argument's type is described as an array, it is really a pointer?

> *void f( float a[4]){}*

Here I believe the '4' must be ignored by the compiler's code generator; at the very least, the compiler
can't really assume that 'a' points to an array with 4 elements.  The '4' is basically just documentation.
Some compilers, such as GCC, use it for warnings, though.
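
To illustrate that the bound is not part of the parameter's type, here is a small sketch (C++17
assumed; param_is_pointer is a hypothetical helper name, not something from the thread):

```cpp
#include <type_traits>

// The declared bound is dropped when the parameter type is adjusted,
// so all of these spell the same function type.
static_assert(std::is_same_v<void(float[4]), void(float[8])>);
static_assert(std::is_same_v<void(float[4]), void(float *)>);

// Inside the function, 'a' is a plain pointer, whatever bound was written.
bool param_is_pointer(float a[4]) {
  return std::is_same_v<decltype(a), float *>;
}
```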

There's another case, one you didn't mention, and it is one that _does_ change ABI, which is:

  void f(float a[static 4]) {} // C only

This tells the compiler that the passed pointer points to an array that has at least 4 elements,
which also implies that you can't pass a NULL pointer.

From N3047, C23 draft, 6.7.6.3 Function declarators, 6:

 "A declaration of a parameter as "array of type" shall be adjusted to "qualified pointer to type", where
 the type qualifiers (if any) are those specified within the [ and ] of the array type derivation. If the
 keyword static also appears within the [ and ] of the array type derivation, then for each call to
 the function, the value of the corresponding actual argument shall provide access to the first element
 of an array with at least as many elements as specified by the size expression."

Note: 'a' is still adjusted to a pointer.

> [30] array_type         abbrev: 5
>        type                 (ref4) [0]
> [40] subrange_type        abbrev: 31
>        upper_bound          (data1) 3
> 
> *void f( float a[s], unsigned s){} // C only*

Nit: this won't compile; you need to flip the parameters so 's' comes first:

 void f(unsigned s, float a[s]) {} // C only

> As you can see, much of what I really would like is to:
> 
>  1. Not have arrays degenerate into pointers when the source code is explicit about this. Consider a linked list vs. an array: both could have the same ABI fingerprint. - I believe that this should be written into the standard as a best practice example. I will write this up and file it.
>  2. When it is an array and the bound is specified, this should also be included in the ABI fingerprint. - I believe that this too should be written into the standard as a best practice example. I will write this up and file it.
>  3. In my ABI work I need a way to disambiguate "typedef float __m128[4]" from "__m128" as it is now defined in xmmintrin.h. The difference is important because in libabigail we do not look at the location of the parameters, we just assume that the platform ABI as implemented by the compiler takes care of that. Thus if arrays didn't degenerate into pointers (introducing the ambiguity between the head of a linked list and an array), then:
> 
> typedef float __m128[4];
> void f( __m128 a){}
> 
> Would stick a pointer to 'a' in a general-purpose register. While:
> 
> #include "xmmintrin.h"
> void f( __m128 a){}
> 
> Would stick 'a' in a vector register because the calling convention is different, and libabigail wouldn't be able to tell the difference.
> 
> We don't process the formal parameter's location, in part because it is hard: we would have to add code to libabigail to process location lists. It is also because the quality of location information from different compilers has been inconsistent. And the purpose of libabigail was not to check how well compilers implement the platform ABI, but to test libraries for ABI compatibility.
> 
> I spent a notable portion of yesterday writing various bits of arguments against Cary's DW_TAG_vector and then throwing them away because they really were not at all convincing even to myself. The only argument that I found convincing to even myself was parsimony. We currently have DW_TAG_array_type and I couldn't come up with how DW_TAG_vector would be different in any way from DW_TAG_array_type + DW_AT_vector. So based on that rather weak argument, I'll say that I really don't care if it is:
> 
> DW_TAG_vector
> 
> or
> 
> DW_TAG_array_type
>   DW_AT_vector
> 
> Tony let me know he's become convinced that they do not need a vector type for their GPU work and are planning to drop their vectors as base type proposal. That leaves my needs around ABI as the only pending concern and that may be handled by Kyle's proposal to make the location of the return value something encoded in the DWARF rather than having to infer it from the platform ABI.
> 
> If we didn't get something like DW_TAG_vector or DW_TAG_array_type + DW_AT_vector, and instead only went with Kyle's proposal specifying the location of the return value, then libabigail would need to be taught to process location information.

My knee-jerk reaction initially was that we should be able to just describe vector registers as arrays, and then
they just happen to be arrays whose location happens to be in a vector register sometimes.

They _are_ mostly arrays, and you can index into them as such, but a crucial difference is that they don't
implicitly decay to pointers, unlike plain C arrays.  You've already pointed that out, but I'm stressing it
because that is the key thing for me.

With:

  typedef float __m128[4];
  __m128 vec;

... this is valid:

  float *ptr = vec;

and thus a debugger such as GDB allows doing such an assignment:

 (gdb) print ptr = vec

however with this:

  typedef float __m128 __attribute__((vector_size(16)));
  __m128 vec;

... this is NOT valid, and compilers reject it:

  float *ptr = vec;

and thus a debugger may want to prohibit doing such an assignment
in its C/C++ expression evaluator as well.
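
The difference between the two typedefs can be spelled out with type traits.  This is only a sketch:
it assumes C++17 and the GCC/Clang vector_size extension, and the names arr4, vec4, array_decays and
vector_decays are invented here to keep both types visible side by side:

```cpp
#include <type_traits>

typedef float arr4[4];                                // plain C array
typedef float vec4 __attribute__((vector_size(16)));  // GCC/Clang vector extension

// An array lvalue converts implicitly to a pointer to its first element...
constexpr bool array_decays = std::is_convertible_v<arr4 &, float *>;
// ...while a vector has no such conversion.
constexpr bool vector_decays = std::is_convertible_v<vec4 &, float *>;

static_assert(array_decays && !vector_decays);

// As parameters: the array is adjusted to 'float *', the vector stays a value.
static_assert(std::is_same_v<void(arr4), void(float *)>);
static_assert(!std::is_same_v<void(vec4), void(float *)>);
```

Both types hold four floats, so the element type and count alone do not tell them apart.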

Similarly, when debugging a C++ program, and the user calls a function
passing in a vector, a debugger should not pick a "float *" overload.  Like:

  #include <stdio.h>

  typedef float __m128 __attribute__((vector_size(16)));
  __m128 vec;
  void func(float *a) { printf("ouch!\n"); }

  int main()
  {
    // func(vec); // can't do this.
  }

however, GDB does pick the float pointer overload:

  (gdb) p func(vec)
  ouch!
  $1 = void
  (gdb) 

Whoops.  That is a bug.  It should have errored out the same way it errors out when
you call the function with some other argument that also shouldn't convert:

  (gdb) p func((int *) 0)
  Cannot resolve function func to any overloaded instance

GDB supports DW_AT_GNU_vector, so it should have been able to reject that.
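
For reference, here is roughly what the compiler's own overload resolution says about such a call,
written as a compile-time check.  A sketch only, assuming C++17 and the GCC/Clang vector extension;
vec4 and the accepts_* names are made up for illustration:

```cpp
#include <type_traits>

typedef float vec4 __attribute__((vector_size(16)));

void func(float *a);  // declaration only; never called here

// Which argument types could be used to call func?
constexpr bool accepts_float_ptr = std::is_invocable_v<decltype(&func), float *>;
constexpr bool accepts_vector = std::is_invocable_v<decltype(&func), vec4 &>;
constexpr bool accepts_int_ptr = std::is_invocable_v<decltype(&func), int *>;

// A vector argument is rejected just like an 'int *' argument.
static_assert(accepts_float_ptr && !accepts_vector && !accepts_int_ptr);
```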

So teaching DWARF to specify the location of the return value is actually orthogonal
here -- we need to be able to distinguish regular C arrays from vector arrays for
other reasons too; specifically, the types are different at the language level.
So vector-ness should indeed be a property of the type.
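
To make that concrete, here is a toy model of the decision a consumer could make from the type DIE
alone.  It is not GDB's code: TypeDie, Kind, classify and converts_to_pointer are hypothetical names,
and the only DWARF facts assumed are the tag values and the existence of the DW_AT_GNU_vector flag:

```cpp
#include <cstdint>

// Tag values from the DWARF standard.
constexpr std::uint16_t DW_TAG_array_type = 0x01;
constexpr std::uint16_t DW_TAG_pointer_type = 0x0f;

// Hypothetical, heavily simplified view of a type DIE.
struct TypeDie {
  std::uint16_t tag;
  bool has_gnu_vector;  // true if DW_AT_GNU_vector is present on the DIE
};

enum class Kind { Pointer, Array, Vector, Other };

constexpr Kind classify(const TypeDie &die) {
  if (die.tag == DW_TAG_pointer_type) return Kind::Pointer;
  if (die.tag == DW_TAG_array_type)
    return die.has_gnu_vector ? Kind::Vector : Kind::Array;
  return Kind::Other;
}

// Only pointers and plain arrays (which decay) may bind to a pointer parameter.
constexpr bool converts_to_pointer(const TypeDie &die) {
  const Kind k = classify(die);
  return k == Kind::Pointer || k == Kind::Array;
}
```

With the vector-ness recorded on the type, the expression evaluator needs no location information
to reject the conversion.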

Pedro Alves