[Dwarf-Discuss] Corner-cases with bitfields

Lancelot SIX lsix.dwarf at lancelotsix.com
Mon May 9 04:50:56 PDT 2022


Hi,

Thanks for the comments.

I gave additional examples in another reply[1] which I think will help
clarify the difficulty I am seeing.  The core of the issue is that even
if different types have identical memory layouts, they can be handled
differently by our ABI.  The DWARF description of such types is not
sufficient to differentiate them, leading to incorrect behavior from the
debugger.

> 
> What is the difference in the layout for
>   struct { char a; char b; } X;
> and
>   struct { char a:8; char b:8; } Y;
> 
> I would have to review the ISO C standard in detail, but I believe that
> these are equivalent declarations.  In both structs, a is at byte offset
> 0 and b is at offset 1.
> 
> What is the difference in the DWARF description?

As far as I know, X and Y are equivalent when it comes to the way
they are laid out in memory.
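
As a minimal sketch of that equivalence, the helper below stores the same
values in both kinds of struct and compares the raw bytes.  The names
`struct plain`, `struct bitfield` and `same_layout` are made up for this
illustration, and the result is only what I would expect from GCC or Clang
on common targets, since bit-field layout is implementation-defined in ISO C.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names standing in for the types of X and Y. */
struct plain    { char a; char b; };
struct bitfield { char a:8; char b:8; };

/* Returns 1 when both structs have the same size and hold the same
   bytes after the same values are stored in a and b, 0 otherwise. */
int same_layout(void)
{
    struct plain x;
    struct bitfield y;

    if (sizeof x != sizeof y)
        return 0;

    /* Clear both objects first so the comparison is deterministic. */
    memset(&x, 0, sizeof x);
    memset(&y, 0, sizeof y);

    x.a = 'A'; x.b = 'B';
    y.a = 'A'; y.b = 'B';

    return memcmp(&x, &y, sizeof x) == 0;
}

/* Aborts if the two layouts differ on the current compiler and target. */
void check_same_layout(void)
{
    assert(same_layout() == 1);
}
```

Calling check_same_layout() from a small main() is enough to try it.  Note
that offsetof cannot be applied to a bit-field member, which is why the
sketch compares bytes instead of offsets.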

When compiling your example with GCC (11.2.0, with -g3 -gdwarf-5), I get:

0x00000055:   DW_TAG_structure_type
                DW_AT_byte_size (2)
                DW_AT_decl_file ("/home/lsix/tmp/Dw.c")
                DW_AT_decl_line (2)
                DW_AT_decl_column       (1)
                DW_AT_sibling   (0x0000006e)

0x0000005b:     DW_TAG_member
                  DW_AT_name    ("a")
                  DW_AT_decl_file       ("/home/lsix/tmp/Dw.c")
                  DW_AT_decl_line       (2)
                  DW_AT_decl_column     (0x0f)
                  DW_AT_type    (0x0000003b "char")
                  DW_AT_bit_size        (8)
                  DW_AT_data_bit_offset (0x00)

0x00000064:     DW_TAG_member
                  DW_AT_name    ("b")
                  DW_AT_decl_file       ("/home/lsix/tmp/Dw.c")
                  DW_AT_decl_line       (2)
                  DW_AT_decl_column     (0x19)
                  DW_AT_type    (0x0000003b "char")
                  DW_AT_bit_size        (8)
                  DW_AT_data_bit_offset (0x08)

for the Y variable's type.

The members of X's type do not have DW_AT_bit_size/DW_AT_data_bit_offset
but have a DW_AT_data_member_location attribute instead.

Clang (13.0.0-2) gives similar descriptions for X and Y, both with a
DW_AT_data_member_location attribute.

> 
> > What Clang does seems to be a reasonable thing to do if one is only
> > interested in the memory layout of the type.  This however is not
> > sufficient in our case to decide how to handle such type when
> > placing/inspecting arguments in registers in the context of function
> > calls. In our ABI, bitfield members are passed packed together, while
> > two chars in a struct would be placed in separate registers.
> 
> It's not my position to critique an ABI, but this seems to require
> additional packing and unpacking of char data.  Why not pack all struct
> data into a minimum number of registers?  Or place all containing
> entities in separate registers?

To be completely honest, I do not know why the arguments are not passed
packed, and I was surprised when I first encountered this behavior.
That being said, the ABI is what I am given to work with.  It is
unlikely to change in the near future because, as with all ABIs, it needs
to maintain some backward compatibility.

> 
> > To clarify this situation, it would be helpful that a producer always
> > includes the DW_AT_bit_size attribute for bit field, which the standard
> > does not suggest nor require.
> 
> DWARF is a permissive standard, which means that at times there may be
> different ways of describing the same source code.  DWARF is unlikely to
> require that one description is prescribed and the other proscribed.
>

I do not dispute this, nor do I want to impose a particular way to
describe things.  The problem I am currently facing is that today's
DWARF does not provide a way to distinguish between two types which
behave differently under our ABI but have equivalent memory layouts.  At
least this is my understanding; I would be happy to learn of a way to make
the distinction.
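
To make the ambiguity concrete, here is a rough sketch of the only check a
consumer can perform today.  Everything in it (`struct member_attrs`,
`classify_member`, the constant names) is hypothetical and not taken from any
real debugger or DWARF library; the attribute sets simply mirror the GCC and
Clang descriptions discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the attributes on a DW_TAG_member. */
struct member_attrs {
    bool has_bit_size;             /* DW_AT_bit_size is present */
    bool has_data_bit_offset;      /* DW_AT_data_bit_offset is present */
    bool has_data_member_location; /* DW_AT_data_member_location is present */
};

enum member_kind { MEMBER_BITFIELD, MEMBER_PLAIN_OR_UNKNOWN };

/* Without DW_AT_bit_size, a consumer has nothing to tell a full-width
   bit-field apart from an ordinary member. */
enum member_kind classify_member(const struct member_attrs *m)
{
    return m->has_bit_size ? MEMBER_BITFIELD : MEMBER_PLAIN_OR_UNKNOWN;
}

/* Attribute sets mirroring the descriptions reported in this thread. */
const struct member_attrs gcc_y_member   = { true,  true,  false };
const struct member_attrs gcc_x_member   = { false, false, true  };
const struct member_attrs clang_y_member = { false, false, true  };

/* GCC's output lets the consumer see that Y uses bit-fields; Clang's
   description of Y is indistinguishable from a plain member of X. */
void check_examples(void)
{
    assert(classify_member(&gcc_y_member) == MEMBER_BITFIELD);
    assert(classify_member(&gcc_x_member) == MEMBER_PLAIN_OR_UNKNOWN);
    assert(classify_member(&clang_y_member) == MEMBER_PLAIN_OR_UNKNOWN);
}
```

With the Clang-style description, a debugger applying our calling convention
would treat Y like X and look for its members in separate registers.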

> It seems to me that this usage of :0 bitfields to indicate some
> difference in struct layout (although I remain unclear what) does not
> follow either the ISO C or C++ standards.
> 

As with the previous case, I think my initial example was probably not
clear enough.  The one in [1] hopefully gives a clearer description of
what the impact is on the calling convention, and where my difficulty
lies.

Best,
Lancelot SIX.

[1] http://lists.dwarfstd.org/pipermail/dwarf-discuss-dwarfstd.org/2022-May/004899.html

