[Dwarf-Discuss] [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)

Thu Jul 21 17:55:18 GMT 2011

On Thu, Jul 21, 2011 at 10:10:39AM -0700, Richard Henderson wrote:
> On 07/21/2011 04:22 AM, Jakub Jelinek wrote:
> > Currently, the patch emits 3 byte section headers at the start of
> > the .debug_gnu_macro chunks referenced from .debug_info (through
> > DW_AT_GNU_macros), containing version number (2 byte, 4 ATM) and
> > 1 byte section offset size, but the DW_GNU_MACRO_transparent_include
> > referenced sequences don't have it.
> > The .debug_gnu_macro section isn't completely usable without the referencing
> > CUs anyway, so IMHO we could still get away completely without
> > any section header, but if we need it, the question is if
> > the offset size there is useful and if the section header shouldn't
> > go before the transparent_include chains as well (only with that
> > e.g. readelf -wm would be able to dump .debug_gnu_macro without
> > reading .debug_info and tracking offsets to it).
> 
> I've been wondering if the header shouldn't contain the opcode
> definitions, similar to .debug_line, and drop your _define_opcode.
> It does mean that you couldn't re-define opcodes within any one
> sequence, but does that actually seem useful?

I talked to Tom about it last night.  The advantage of
not having it in the header is saving 1 byte in the case when
no extension opcodes need to be defined.  Also, if we changed
the wording so that the defined opcodes no longer end at the 0
termination but stay in effect after it, then with many opcodes we
could share the opcode argument descriptions between sections.
So we could have
DW_GNU_MACRO_transparent_include .Ldebug_macro17
after the header of many sections and then
.Ldebug_macro17:
DW_GNU_MACRO_define_opcode DW_GNU_MACRO_lo_user+0 1 DW_FORM_udata
DW_GNU_MACRO_define_opcode DW_GNU_MACRO_lo_user+1 2 DW_FORM_udata DW_FORM_sdata
DW_GNU_MACRO_define_opcode DW_GNU_MACRO_lo_user+2 1 DW_FORM_strp
0

If the opcode definitions were in the header, then they could
either follow a uleb128 that says how many definitions there
are, each definition being just what I've been proposing as the
DW_GNU_MACRO_define_opcode arguments (i.e. opcode number and
DW_FORM_block array of forms for arguments).  Or the uleb128 could be
dropped and the list zero terminated instead.
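
To make the two header layouts concrete, here is a minimal Python sketch
that encodes the three example definitions from above both ways.  The exact
byte layout is an assumption for illustration only (a 1 byte opcode number
followed by a DW_FORM_block style uleb128 length and the form codes), and
LO_USER is a placeholder value, not something the proposal has fixed.  The
DW_FORM_* values are the standard DWARF ones.

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("uleb128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


# Standard DWARF form codes.
DW_FORM_sdata = 0x0D
DW_FORM_strp = 0x0E
DW_FORM_udata = 0x0F

# Placeholder for DW_GNU_MACRO_lo_user (assumed value, illustration only).
LO_USER = 0xE0

EXAMPLE_DEFS = [
    (LO_USER + 0, [DW_FORM_udata]),
    (LO_USER + 1, [DW_FORM_udata, DW_FORM_sdata]),
    (LO_USER + 2, [DW_FORM_strp]),
]


def encode_definition(opcode: int, forms: list) -> bytes:
    """Opcode number, then a DW_FORM_block style array of argument forms."""
    return bytes([opcode]) + uleb128(len(forms)) + bytes(forms)


def counted_table(defs) -> bytes:
    """Variant 1: uleb128 count of definitions, then the definitions."""
    return uleb128(len(defs)) + b"".join(encode_definition(o, f) for o, f in defs)


def terminated_table(defs) -> bytes:
    """Variant 2: definitions followed by a 0 byte (opcode 0 is never defined)."""
    return b"".join(encode_definition(o, f) for o, f in defs) + b"\x00"


if __name__ == "__main__":
    for name, table in (("counted", counted_table), ("terminated", terminated_table)):
        print(name, len(table([])), len(table(EXAMPLE_DEFS)))
```

Under these assumptions both variants cost the same: 1 byte when no
extension opcodes are defined, and one byte more than the definitions
themselves otherwise.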

> Defining the opcodes in the header makes it clear that there 
> should be a header for the include sequences, and that makes it
> clear that the defined opcodes are local to a given sequence,
> without having to have awkward wording as for _define_opcode.
> 
> I do like mjw's idea of using the version number to distinguish
> our implementation and one with the dwarf5 stamp of approval.
> This suggests going ahead with .debug_macro as the section name.

If we knew that DWARF5 would start the .debug_macro sections
with a header beginning with a 2 byte version, and that the version there
would be 5 (I think if it does start with a 2 byte version number, it would
use 5), then perhaps it would be safe to use the .debug_macro section with
version 4 (or 1?).  Shall we use DW_GNU_MACRO_* names, or DW_MACRO_GNU_* names?

> > In x86_64 cc1plus for which I've been posting figures, I see
> > 395 CUs referencing .debug_gnu_macro and at most 511 different
> > .debug_gnu_macro chains with unique md5sums.  So, the cost of the
> > 3 byte headers is for cc1plus just in CU referenced chunks
> > 1185 bytes, 3 byte headers in all .debug_gnu_macro chunks
> > 2718 bytes.
> 
> Putting the opcode definitions into the header would increase
> the overhead more, somewhere between 12 and 20 bytes per chain.
> Which is, I think, still manageable.
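
For reference, the figures quoted above can be reproduced with a few lines
of Python.  The chunk counts and the 3 byte header size come from the
cc1plus numbers; treating the 12 to 20 byte estimate as extra bytes added
to every chunk (CU referenced and transparent_include alike) is an
assumption made here just to get a rough total.

```python
HEADER_BYTES = 3        # 2 byte version + 1 byte offset size
CU_CHUNKS = 395         # chunks referenced from .debug_info
INCLUDE_CHAINS = 511    # upper bound on transparent_include chains

ALL_CHUNKS = CU_CHUNKS + INCLUDE_CHAINS

cu_only = CU_CHUNKS * HEADER_BYTES      # headers on CU referenced chunks only
all_chunks = ALL_CHUNKS * HEADER_BYTES  # headers on every chunk


def extra_overhead(bytes_per_chain: int, chains: int = ALL_CHUNKS) -> int:
    """Additional bytes if every chain carried opcode definitions in its header."""
    return bytes_per_chain * chains


if __name__ == "__main__":
    print(cu_only, all_chunks)                       # 1185 2718
    print(extra_overhead(12), extra_overhead(20))    # 10872 18120
```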

The question is, do we want to always describe all the opcodes we use,
or can we take the ops described in the corresponding standard as
given?  Say DWARF 5 (and our version 4) defines 8 standard opcodes,
DWARF 6 adds another 3, and we want to use the new opcodes.  With
-gdwarf-5 -gno-strict-dwarf we'd define the opcode arguments for the
3 DWARF 6 ops (or the subset of them that we actually use), while for
-gdwarf-6 we wouldn't define any and would just put version 6 into the section.
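
A small sketch of that rule, in Python.  The opcode numbers are made up
for the example (1..8 standing in for the DWARF 5 set, 9..11 for the
DWARF 6 additions); only the counts 8 and 3 come from the scenario above.

```python
# Hypothetical opcode numbering, for illustration only.
DWARF5_OPS = frozenset(range(1, 9))                 # 8 standard opcodes
DWARF6_OPS = DWARF5_OPS | frozenset(range(9, 12))   # 3 more added later

# Opcodes a consumer is assumed to know from the section version alone.
STANDARD_OPCODES = {
    4: DWARF5_OPS,   # our version 4 mirrors the DWARF 5 set
    5: DWARF5_OPS,
    6: DWARF6_OPS,
}


def opcodes_to_define(section_version: int, used) -> list:
    """Return the used opcodes that need explicit argument descriptions."""
    known = STANDARD_OPCODES[section_version]
    return sorted(set(used) - known)


if __name__ == "__main__":
    used = {1, 2, 9, 11}
    print(opcodes_to_define(5, used))  # [9, 11]
    print(opcodes_to_define(6, used))  # []
```

So only the ops the section version does not already imply would cost
anything in the header.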

> > Also, should the decision whether to emit .debug_gnu_macro or .debug_macinfo
> > depend on -gstrict-dwarf, or should we have a separate switch for that?
> 
> I'm fine with strict.  Anyone else have an opinion?

Ok.

	Jakub