[Dwarf-discuss] ISSUE: CPU vector types.

Fri Mar 31 03:29:05 GMT 2023

-ben

> On Mar 30, 2023, at 1:27 PM, Pedro Alves <alves.ped@gmail.com> wrote:
> 
> On 2023-03-29 8:55 p.m., Ben Woodard via Dwarf-discuss wrote:
>> 
>> On 3/28/23 13:17, David Blaikie wrote:
> ...
>>> What DWARF should be used to describe the type of 'a'? And how does
>>> this encoding scale to all the other similar intrinsic types?
>>> 
>> As a person who has spent a crazy amount of time doing ABI work and static analysis, this is what I would like. (I'm kind of assembling this by cutting, pasting, and editing, so please excuse minor errors like sizes and ignoring siblings. It is hand-written, hand-wavy DWARF.)
>> 
>> *Factored out preamble to all of them:*
>> [0] base_type            abbrev: 3
>>        byte_size            (data1) 4
>>        encoding             (data1) float (4)
>>        name                 (strp) "float"
>> [5]  base_type            abbrev: 3
>>        byte_size            (data1) 4
>>        encoding             (data1) unsigned (7)
>>        name                 (strp) "unsigned int"
>> [8]  base_type            abbrev: 3
>>        byte_size            (data1) 4
>>        encoding             (data1) signed (5)
>>        name                 (strp) "int"
>> [10] base_type            abbrev: 3
>>        byte_size            (data1) 8
>>        encoding             (data1) float (4)
>>        name                 (strp) "double"
>> [15] subprogram           abbrev: 32
>>    external             (flag_present) yes
>>    name                 (string) "f"
>> [20]      formal_parameter     abbrev: 15
>>          name                 (string) "a"
>>          type                 (ref4) [30]
>> 
>> *void f( float *a){}*
>> [30] pointer_type         abbrev: 5
>>        byte_size            (implicit_const) 8
>>        type                 (ref4) [0]
>> 
>> *void f( float a[]){}*
>> [30] array_type         abbrev: 5
>>        type                 (ref4) [0]
>> 
> 
> In reality, the "float a[]" case above is not really an array.
> 'a' is still a pointer.  There is no such thing as passing an array by value in C/C++.  The two cases above
> are defining the same function overload.  Note:
> 
> $ g++ v.cc -g3 -O0
> v.cc:11:6: error: redefinition of ‘void f(float*)’
>   11 | void f(float a[]) {}
>      |      ^
> v.cc:10:6: note: ‘void f(float*)’ previously defined here
>   10 | void f(float *a) {}
>      |      ^
> 
> I guess you're saying that a consumer would know that it is looking at a C or C++ function, and thus it knows
> that even if the argument's type is described as an array, it is really a pointer?

Yeah, basically. I should have put more emphasis on the fact that this is _how_I_would_like_it_to_be_, speaking as a person whose introduction to DWARF was static analysis of ABIs. You have correctly stated how it currently is. I just wish it were different, so that my static analysis tools could tell apart “a pointer to the head of a linked list”, “a C array pointer”, and all the other semantically distinct uses of C pointers.
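
For reference, the current behaviour can be checked with a minimal sketch like the one below, assuming a C11 compiler. The helper names param_is_pointer and local_array_size are made up for illustration and are not part of any proposal in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Two compatible declarations of one function: the array declarator in a
   parameter is adjusted to "pointer to float". */
void f(float *a);
void f(float a[]);

void f(float a[])
{
    /* Inside the function, `a` has type float *, not an array type. */
    static_assert(_Generic(a, float *: 1, default: 0),
                  "array parameter is adjusted to a pointer");
    (void)a;
}

/* Hypothetical helper: returns 1 if the parameter declared as float[4]
   is really a float *. */
int param_is_pointer(float a[4])
{
    return _Generic(a, float *: 1, default: 0);
}

/* Hypothetical helper: a genuine local array keeps its full size. */
size_t local_array_size(void)
{
    float local[4] = {0};
    return sizeof local;
}
```

The bound written in the parameter leaves no trace in the parameter's type, which is the information a static analysis tool would like to recover from the debug info.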

In essence, I was trying to point out the ambiguities that this creates for people like me doing static analysis. One of my favorite catchphrases is “DWARF it ain’t just for debuggers anymore”. We use it for static analysis and performance tools, and they have slightly different needs than debuggers.

For example, debuggers need location information, which is in essence a function that works like:
   f( pc, variable) -> location

Performance tools and certain binary analysis tools really could use something that I call “inverted location lists”, which work like:
    f( pc, location) -> variable or expression.
This would allow us to quickly answer questions like: these instructions cause a huge number of L1 cache misses; which variable accesses in those instructions are causing that? There are a few experts amongst us who can figure that out. However, making automated tools that we can give to less experienced developers so they can figure this out has been remarkably difficult. I’ve tried, and a few other people have tried. The way that location lists work just doesn’t give you all the information that you need to completely reverse the mapping from:
   f( pc, variable) -> location
To:
  f( pc, location) -> variable.
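
To make the two directions concrete, here is a toy model in C. Everything in it is an assumption for illustration: struct loc_entry, lookup_location and lookup_variable are hypothetical names, and each entry is reduced to "variable lives in one register over a pc range". Real location lists hold DWARF expressions, and that is where the simple inversion shown here stops working.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified location-list entry: variable `name`
   lives in register number `reg` while lo <= pc < hi. */
struct loc_entry {
    const char *name;
    uint64_t lo, hi;
    int reg;
};

/* Forward mapping, f(pc, variable) -> location.
   Returns the register number, or -1 if the variable has no location at pc. */
int lookup_location(const struct loc_entry *tab, size_t n,
                    uint64_t pc, const char *name)
{
    assert(tab != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (pc >= tab[i].lo && pc < tab[i].hi &&
            strcmp(tab[i].name, name) == 0)
            return tab[i].reg;
    return -1;
}

/* Inverse mapping, f(pc, location) -> variable.
   Returns the variable name, or NULL if nothing is known to be in reg at pc. */
const char *lookup_variable(const struct loc_entry *tab, size_t n,
                            uint64_t pc, int reg)
{
    assert(tab != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (pc >= tab[i].lo && pc < tab[i].hi && tab[i].reg == reg)
            return tab[i].name;
    return NULL;
}
```

In this toy the inverse is a linear scan because every location is a plain register. Once a location is a computed expression, or a memory access goes through a temporary that no variable owns, the table no longer contains enough to answer the inverse query.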
> 
>> *void f( float a[4]){}*
> 
> Here I believe '4' must be ignored by the compiler's code generator, at least, the compiler can't really
> assume that 'a' points to an array with 4 elements.  The '4' is just basically documentation.  Some
> compilers, such as GCC, use it for warnings, though.
> 
> There's another case, one you didn't mention, and it is one that _does_ change ABI, which is:
> 
>  void f(float a[static 4]) {} // C only

Good one! I’ve largely moved onto C++ and haven’t been watching the C standard as closely.
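
A minimal sketch of that form, assuming a C11 compiler (the function name sum4 is made up for illustration):

```c
#include <assert.h>

/* C only: `static 4` promises that every call passes a pointer to the
   first element of an array with at least 4 elements. */
float sum4(float a[static 4])
{
    /* The parameter is still adjusted to a pointer. */
    static_assert(_Generic(a, float *: 1, default: 0),
                  "parameter is still a pointer");
    return a[0] + a[1] + a[2] + a[3];
}
```

Called as sum4(v) with float v[4], this reads all four elements without a bounds check, relying on the promise made by the declaration.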

> 
> This tells the compiler that the passed pointer points to an array that has at least 4 elements,
> which also implies that you can't pass a NULL pointer.
> 
> From N3047, C23 draft, 6.7.6.3 Function declarators, 6:
> 
> "A declaration of a parameter as "array of type" shall be adjusted to "qualified pointer to type", where
> the type qualifiers (if any) are those specified within the [ and ] of the array type derivation. If the
> keyword static also appears within the [ and ] of the array type derivation, then for each call to
> the function, the value of the corresponding actual argument shall provide access to the first element
> of an array with at least as many elements as specified by the size expression."
> 
> Note: 'a' is still adjusted to a pointer.
> 
>> [30] array_type         abbrev: 5
>>        type                 (ref4) [0]
>> [40] subrange_type        abbrev: 31
>>        upper_bound          (data1) 3
>> 
>> *void f( float a[s], unsigned s){} // C only*
> 
> Nit: this won't compile, need to flip parameters so 's' comes first:
> 
> void f(unsigned s, float a[s]) {} // C only
> 
Doh! I knew that. I swapped the order of the parameters when I was handwriting the DWARF so that I wouldn’t have to change as many things when I cut and pasted, and in the moment, being lazy, I forgot that rule about C VLAs.
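
The corrected ordering, as a small sketch assuming a compiler with C99-style variably modified parameter types (the function name sum is made up for illustration):

```c
#include <assert.h>

/* C only: `s` has to be declared before it appears in the array
   declarator, so it comes first in the parameter list. */
float sum(unsigned s, float a[s])
{
    assert(s == 0 || a != 0);
    float total = 0.0f;
    for (unsigned i = 0; i < s; i++)
        total += a[i];
    return total;
}
```

Here too `a` is adjusted to a plain float pointer; the bound `s` only documents the intended length.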

>> As you can see, much of what I really would like is to:
>> 
>> 1. Not have arrays degenerate into pointers when the source code is explicit about this. Consider a linked list vs. an array: both could have the same ABI fingerprint. - I believe that this should be written into the standard as a best practice example. I will write this up and file it.
>> 2. When it is an array and the bound is specified, this also should be included in the ABI fingerprint. - I believe that this too should be written into the standard as a best practice example. I will write this up and file it.
>> 3. In my ABI work I need a way to disambiguate "typedef float __m128[4]" from "__m128" as it is now defined in xmmintrin.h. The difference is important because in libabigail we do not look at the location of the parameters, we just assume that the platform ABI as implemented by the compiler takes care of that. Thus if arrays didn't degenerate into pointers (introducing the ambiguity between the head of a linked list and an array), then:
>> 
>> typedef float __m128[4];
>> void f( __m128 a){}
>> 
>> Would stick a pointer to a in a general-purpose register. While:
>> 
>> #include "xmmintrin.h"
>> void f( __m128 a){}
>> 
>> Would stick a in a vector register because the calling convention is different, and libabigail wouldn't be able to tell the difference.
>> 
>> We don't process the formal parameter's location, in part because it is hard: we would have to add code to libabigail to process the location list. It is also because the quality of location information from different compilers has been inconsistent. And the purpose of libabigail was not to check how well the compilers implemented the platform ABI but to test libraries for ABI compatibility.
>> 
>> I spent a notable portion of yesterday writing various arguments against Cary's DW_TAG_vector and then throwing them away because they were not at all convincing, even to myself. The only argument that I found convincing was parsimony. We currently have DW_TAG_array and I couldn't come up with how DW_TAG_vector would be different in any way from DW_TAG_array + DW_AT_vector. So based on that rather weak argument, I'll say that I really don't care if it is:
>> 
>> DW_TAG_vector
>> 
>> or
>> 
>> DW_TAG_array
>>   DW_AT_vector
>> 
>> Tony let me know he's become convinced that they do not need a vector type for their GPU work and are planning to drop their vectors as base type proposal. That leaves my needs around ABI as the only pending concern and that may be handled by Kyle's proposal to make the location of the return value something encoded in the DWARF rather than having to infer it from the platform ABI.
>> 
>> If we didn't get something like DW_TAG_vector or DW_TAG_array + DW_AT_vector, and instead went with only Kyle's proposal specifying the location of the return value, then libabigail would need to be taught to process location information.
> 
> My knee-jerk reaction initially was that we should be able to just describe vector registers as arrays, and then
> they just happen to be arrays whose location happens to be in a vector register sometimes.
> 
> They _are_ mostly arrays, and you can index into them as such, but a crucial difference is that they don't
> implicitly decay to pointers, unlike plain C arrays.  You've already pointed that out, but I'm stressing it
> because that is the key thing for me.
> 
> With:
> 
>  typedef float __m128[4];
>  __m128 vec;
> 
> ... this is valid:
> 
>  float *ptr = vec;
> 
> and thus a debugger such as GDB allows doing such an assignment:
> 
> (gdb) print ptr = vec
> 
> however with this:
> 
>  typedef float __m128 __attribute__((vector_size(16)));
>  __m128 vec;
> 
> ... this is NOT valid, and compilers reject it:
> 
>  float *ptr = vec;
> 
> and thus a debugger may want to prohibit doing such an assignment
> in its C/C++ expression evaluator as well.
> 
> Similarly, when debugging a C++ program, and the user calls a function
> passing in a vector, a debugger should not pick a "float *" overload.  Like:
> 
>  typedef float __m128 __attribute__((vector_size(16)));
>  __m128 vec;
>  void func(float *a) { printf("ouch!\n"); }
> 
>  int main()
>  {
>    // func(vec); // can't do this.
>  }
> 
> however, GDB does pick the float pointer overload:
> 
>  (gdb) p func(vec)
>  ouch!
>  $1 = void
>  (gdb) 
> 
> Whoops.  That is a bug.  It should have errored out the same way it errors out when
> you call the function with some other argument that also shouldn't convert:
> 
>  (gdb) p func((int *) 0)
>  Cannot resolve function func to any overloaded instance
> 
> GDB supports DW_AT_GNU_vector, so it should have been able to reject that.
> 
> So teaching DWARF to specify the location of the return value is actually orthogonal
> here -- we need to be able to distinguish regular C array from vector arrays for
> other reasons too, specifically, the types are different at the language level.
> So vector-ness should indeed be a property of the type.
> 
> Pedro Alves
>
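
The language-level difference between a plain array and a vector type can be checked with a short sketch like the following. It assumes GCC or Clang, since vector_size is a compiler extension, and the names v4sf, array_decays_to_pointer, vector_decays_to_pointer and lane are made up for illustration.

```c
#include <assert.h>

/* GCC/Clang extension: a 16-byte vector of four floats. */
typedef float v4sf __attribute__((vector_size(16)));

/* A plain array decays to a pointer when used as an expression. */
int array_decays_to_pointer(void)
{
    float arr[4] = {0};
    (void)arr;
    return _Generic(arr, float *: 1, default: 0);
}

/* A vector value does not decay: it keeps its own type. */
int vector_decays_to_pointer(void)
{
    v4sf vec = {0};
    (void)vec;
    return _Generic(vec, float *: 1, default: 0);
}

/* Vectors can still be indexed like arrays. */
float lane(v4sf v, int i)
{
    assert(i >= 0 && i < 4);
    return v[i];
}
```

Both objects are 16 bytes and both can be subscripted, yet only the array converts to float *, so a consumer that sees only "array of 4 floats" in the debug info cannot reproduce this distinction.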