[Dwarf-Discuss] EXTERNAL: Corner-cases with bitfields

Mon May 9 11:13:40 GMT 2022

Hi,

Thanks for the feedback.

> 
> It sounds like your ABI is basing its decision on a boolean: is the field a bit
> field or not.  And you're trying to deduce this from DW_AT_bit_offset.  Perhaps
> a better solution would be to make this explicit in the DWARF, some new
> DW_AT_bitfield flag.  There's very little that the DWARF standard can do to
> mandate such an attribute.  (Permissive standard yadda yadda.)  But if it's
> necessary for debuggers to work correctly in a given ABI, compilers should be
> well-motivated to produce it when generating code for that ABI.
> 
> [...]
> 
> I'm agreeing with Michael that describing the unnamed bitfield seems dubious.
> If it does impact the ABI, I'm wondering if that impact is indirect: that is,
> the presence of this 0-width bit field changes an attribute of the next field,
> and that attribute is responsible for difference in the behavior.  If so, is
> there any way other than a 0-width bit field to cause the same behavior?  This
> might be another case where describing the attribute that's directly responsible
> might be better.
> 
> -- 
> Todd Allen
> Concurrent Real-Time

Indeed, two flags can conceptually provide sufficient information to the
debugger.  I am not aware of a situation not involving the 0-wide
unnamed bitfield which can trigger the same behavior.  The unnamed
field is not really important in itself, only the effect it has on the
following field is.

To help the discussion, let me use a concrete example and show how it is
handled in our ABI as it stands today.  Let's suppose you have a struct like:

  struct Foo {
      int fill[4];
      char c1;
      char c2 : 8;
      char c3 : 8;
      char : 0;
      char c4 : 8;
      char c5 : 8;
  };

If we have a function returning a value of such type by value, the
return value would be passed in registers back to the caller (the
situation would be similar when calling a function with a Foo
parameter).  In our architecture, the registers used are 32 bits wide*
and called Vx, x being the register number (*this is a simplified
model sufficient for the current discussion).

In this example, the `fill` member is not really important.  It is only
there because a value under 64 bits would otherwise be returned packed.

So suppose that we stop a program just after returning from a function
returning such a value.  This is what a debugger such as GDB would do when
using the `finish` command.  To figure out the value returned by the
function, the debugger needs to inspect the debuggee's registers based
on the function's return type.

In this situation, the ABI conceptually decomposes the struct into its
members and places each member in a register.  Consecutive bitfield
members are treated as packed into one member.  The result is:

+-------------------+-------------------+-------------------+-------------------+
| V0                | V1                | V2                | V3                |
|      fill[0]      |      fill[1]      |      fill[2]      |      fill[3]      |
+-------------------+-------------------+-------------------+-------------------+
+-------------------+-------------------+-------------------+
| V4                | V5                | V6                |
| xxxxxxxxxxxx | c1 | xxxxxxx | c3 | c2 | xxxxxxx | c5 | c4 |
+-------------------+-------------------+-------------------+

If we do not specify the `: 8`s, then each char is allocated its own
register (so c1 in V4, c2 in V5, and so on until c5 in V8).

If we remove the unnamed 0-sized field, then c2, c3, c4, c5 are all
packed into V5.
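
For illustration, the three cases above can be sketched in C as follows.
This is a simplified model, not the real ABI implementation:
`member_desc`, `assign_registers` and the two flags `is_bitfield` and
`after_zero_width` are hypothetical names, and the sketch assumes that a
run of bitfields always fits in one 32-bit register.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-member description.  The two flags are illustrative
 * only and do not correspond to existing DWARF attributes. */
struct member_desc {
    const char *name;
    unsigned    nregs;            /* registers used when not a bitfield */
    int         is_bitfield;      /* declared with a `: N` width */
    int         after_zero_width; /* preceded by an unnamed `: 0` field */
};

/* Store in first_reg[i] the index x of the register Vx holding member i.
 * Returns the total number of registers used.
 * Assumption: a run of bitfields never overflows one register. */
unsigned assign_registers(const struct member_desc *m, size_t n,
                          unsigned *first_reg)
{
    unsigned next = 0;   /* next free register */
    int in_pack = 0;     /* was the previous member a bitfield? */
    size_t i;

    assert(m != NULL && first_reg != NULL);
    for (i = 0; i < n; i++) {
        if (m[i].is_bitfield && in_pack && !m[i].after_zero_width) {
            /* Continue the current pack: share the previous register. */
            first_reg[i] = next - 1;
        } else {
            /* Start a new register (or several, for the array). */
            first_reg[i] = next;
            next += m[i].is_bitfield ? 1 : m[i].nregs;
        }
        in_pack = m[i].is_bitfield;
    }
    return next;
}

/* Foo as written above: c2/c3 share a register, `: 0` splits off c4/c5. */
const struct member_desc foo_as_written[] = {
    {"fill", 4, 0, 0}, {"c1", 1, 0, 0}, {"c2", 1, 1, 0},
    {"c3", 1, 1, 0},   {"c4", 1, 1, 1}, {"c5", 1, 1, 0},
};

/* Foo without any `: 8`: every char gets its own register. */
const struct member_desc foo_plain[] = {
    {"fill", 4, 0, 0}, {"c1", 1, 0, 0}, {"c2", 1, 0, 0},
    {"c3", 1, 0, 0},   {"c4", 1, 0, 0}, {"c5", 1, 0, 0},
};

/* Foo without the unnamed `: 0` field: c2..c5 form a single pack. */
const struct member_desc foo_no_zero[] = {
    {"fill", 4, 0, 0}, {"c1", 1, 0, 0}, {"c2", 1, 1, 0},
    {"c3", 1, 1, 0},   {"c4", 1, 1, 0}, {"c5", 1, 1, 0},
};
```

Calling `assign_registers` on each table with an output array of six
entries gives the register index of every member.  The only inputs that
differ between the three tables are the two flags, which is precisely
the information the debugger currently cannot recover.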

My main problem today is that the debugger needs to figure out how such
a type is passed as an argument or returned from a function, while basing
its decision solely on the description of the Foo type in the debugging
information.

Unfortunately, as far as I know, the debugging information does not allow
me to differentiate the Foo type as written above from a type without the
`: 8` or without the unnamed field.  As a consequence, the debugger will
in some situations make wrong decisions.  Also note that such a difference
would not change how a value of type Foo is laid out in memory; only the
decomposition into registers for function call purposes is impacted.
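
The claim that the memory layout is unchanged can be checked with a small
C sketch like the one below.  It only tests the compiler it is built
with, not our target, and it assumes that compiler accepts `char`
bitfields (GCC and Clang do).  `FooPlain`, `FooNoZero`, `BYTE_OF`,
`SAME_BYTE` and `same_layout` are names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct Foo {            /* as written above */
    int fill[4];
    char c1;
    char c2 : 8;
    char c3 : 8;
    char : 0;
    char c4 : 8;
    char c5 : 8;
};

struct FooPlain {       /* without the `: 8`s and the unnamed field */
    int fill[4];
    char c1, c2, c3, c4, c5;
};

struct FooNoZero {      /* without the unnamed `: 0` field */
    int fill[4];
    char c1;
    char c2 : 8;
    char c3 : 8;
    char c4 : 8;
    char c5 : 8;
};

/* Store in `out` the index of the first non-zero byte after setting
 * `member` to 1 in an otherwise zeroed object.  offsetof cannot be used
 * here because it is not allowed on bitfields. */
#define BYTE_OF(type, member, out)                      \
    do {                                                \
        type obj_;                                      \
        unsigned char raw_[sizeof(type)];               \
        size_t i_ = 0;                                  \
        memset(&obj_, 0, sizeof obj_);                  \
        obj_.member = 1;                                \
        memcpy(raw_, &obj_, sizeof obj_);               \
        while (i_ < sizeof(type) && raw_[i_] == 0)      \
            i_++;                                       \
        assert(i_ < sizeof(type));                      \
        (out) = i_;                                     \
    } while (0)

/* Set `result` to 1 when `member` lives at the same byte in all three. */
#define SAME_BYTE(member, result)                       \
    do {                                                \
        size_t a_, b_, c_;                              \
        BYTE_OF(struct Foo, member, a_);                \
        BYTE_OF(struct FooPlain, member, b_);           \
        BYTE_OF(struct FooNoZero, member, c_);          \
        (result) = (a_ == b_ && b_ == c_);              \
    } while (0)

/* Returns 1 when sizes and char member positions match in all variants. */
int same_layout(void)
{
    int ok1, ok2, ok3, ok4, ok5;

    SAME_BYTE(c1, ok1);
    SAME_BYTE(c2, ok2);
    SAME_BYTE(c3, ok3);
    SAME_BYTE(c4, ok4);
    SAME_BYTE(c5, ok5);
    return sizeof(struct Foo) == sizeof(struct FooPlain)
        && sizeof(struct Foo) == sizeof(struct FooNoZero)
        && ok1 && ok2 && ok3 && ok4 && ok5;
}
```

On a compiler where `same_layout()` returns 1, the three declarations are
indistinguishable from the byte layout alone, which matches the
situation described above.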

I do not claim the ABI is flawless, good or bad in any way.  It is the
way it is and for the time being it is going to remain stable.
Nevertheless, if we want to provide good debugging experience for our
target (which we do), we need to address those cases one way or
another (two flags are good candidates).

I hope this example helps to illustrate the limitations we are
currently facing.

Best,
Lancelot.