[Dwarf-Discuss] Line table "no-op" sequence

Thu Apr 26 15:30:36 GMT 2018

On Wed, Apr 25, 2018 at 11:38 AM,  <paul.robinson at sony.com> wrote:
>> One technique you haven't mentioned is to stretch out LEB128 numbers
>> with extra 0x80's.
>
> Yeah, I kind of don't like abusing the LEB format like that.  Maybe
> for one or two bytes, but not arbitrarily long strings (as you note,
> some consumers will decide it's corrupted data).

I don't think it's abuse of the format at all, as long as you don't go
over the reasonable maximum length. There's nothing in the spec that
requires an LEB128 to be minimum length, and in fact there's a case in
the generation of GNU exception handling tables where the assembler
actually *must* generate a non-minimal LEB128. (It needs to generate a
length field for a block that must be padded to a multiple of 4 bytes.
If the required length is, say, 128, the length field takes 2 bytes.
But that extra byte decreases the padding by one, so the length is now
127, which takes only 1 byte, which in turn adds the byte back to the
padding and makes the length 128 again. To avoid an infinite loop
there, it generates a two-byte 127 value.)
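
To make the two-byte 127 concrete, here is a minimal Python sketch of
a ULEB128 encoder that can pad its output with 0x80 continuation
bytes. The function names and the min_len parameter are made up for
illustration; this is not the assembler's actual code.

```python
def encode_uleb128(value, min_len=1):
    """Encode value as ULEB128, padded to at least min_len bytes."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        # Keep the continuation bit set while payload bits remain,
        # or while we still owe padding bytes.
        if value or len(out) + 1 < min_len:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_uleb128(data, offset=0):
    """Return (value, offset of the first byte after the number)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, offset
```

With this, encode_uleb128(127) is the single byte 0x7f, while
encode_uleb128(127, min_len=2) is 0xff 0x00. A decoder that does not
insist on minimal encodings reads both back as 127.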

>> When doing an incremental link, gold will pad the .debug_line section
>> with a dummy line number program of appropriate length (minimum 29
>> bytes). Here are the relevant comments:
>> ...
>>   // We set the header_length field to cover the entire hole, so the
>>   // line number program is empty.
>
> I have a consumer that whines if the header_length doesn't exactly match
> the fields as defined in the appropriate DWARF version.  But maybe I can
> make it tolerate this.
>
>>   // Some consumers don't check the header_length field, and simply
>>   // start reading the line number program immediately following the
>>   // header.  For those consumers, we fill the remainder of the free
>>   // space with DW_LNS_set_basic_block opcodes.  These are effectively
>>   // no-ops: the resulting line table program will not create any rows.
>
> I still say it's syntactically invalid unless it ends with end_sequence.
> But otherwise this is "great minds think alike."

What DWARF says is: "Every line number program sequence must end with
a DW_LNE_end_sequence instruction which creates a row whose address is
that of the byte after the last target machine instruction of the
sequence."

If the line table program contains no sequences (i.e., it's empty),
you don't need an end_sequence instruction.
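
As a rough illustration, the Python sketch below counts the rows a
line number program would emit. It assumes an opcode_base of 13 and
the standard opcode operand counts from DWARF 3/4, and it tracks no
state machine registers at all; count_rows and the other names are
hypothetical, not taken from gold or any real consumer.

```python
DW_LNS_copy = 0x01
DW_LNS_set_basic_block = 0x07
DW_LNS_fixed_advance_pc = 0x09
DW_LNE_end_sequence = 0x01
OPCODE_BASE = 13  # assumption: the usual DWARF 3/4 value

# LEB128 operand counts for standard opcodes 1..12 (DWARF 3/4).
# Opcode 9 takes a fixed 2-byte operand and is handled separately.
STANDARD_OPERANDS = [0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1]


def read_uleb128(data, pos):
    """Return (value, position after the number)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def count_rows(program):
    """Count rows appended to the line table by the opcode bytes."""
    rows = 0
    pos = 0
    while pos < len(program):
        opcode = program[pos]
        pos += 1
        if opcode >= OPCODE_BASE:
            rows += 1  # every special opcode appends a row
        elif opcode == 0:
            # Extended opcode: ULEB128 length, then sub-opcode + data.
            length, pos = read_uleb128(program, pos)
            if length and program[pos] == DW_LNE_end_sequence:
                rows += 1
            pos += length  # unknown extended opcodes are skipped
        elif opcode == DW_LNS_fixed_advance_pc:
            pos += 2  # one uhalf operand, not LEB128
        else:
            if opcode == DW_LNS_copy:
                rows += 1
            for _ in range(STANDARD_OPERANDS[opcode - 1]):
                _, pos = read_uleb128(program, pos)  # skip operand
    return rows
```

Under those assumptions, a filler made only of DW_LNS_set_basic_block
bytes produces zero rows, so there is no sequence for an end_sequence
to terminate.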

>> I use a similar technique to pad the .debug_info and .debug_types
>> sections. Those are a bit easier, since we can simply pad the actual
>> data area with zeroes.
>
> Again I think I have a whiny consumer but it can probably be fixed.

We need to crack down on whiny consumers. They defeat the
extensibility that we designed into DWARF.

>> Another thing you can do is "hide" stuff inside an undocumented
>> extended opcode. Because extended ops always declare their length, you
>> can make a single extended op cover whatever hole you have (as long as
>> it's at least 3 bytes). If you use an extended opcode of, say 0x7f,
>> which hopefully no one has implemented, any conforming DWARF reader
>> will simply skip over it without complaint.
>
> Ooh I like that.  Should we officially reserve an opcode for this?

A DW_LNE_comment opcode? You could propose it for DWARF 6.
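
For what it's worth, here is a hedged Python sketch of sizing such a
filler. The 0x7f opcode is the example value from the quoted text;
the function names are invented for illustration. Note that the
non-minimal LEB128 trick comes up again: a 130-byte hole needs a
two-byte encoding of 127 for the length.

```python
def encode_uleb128(value, min_len=1):
    """Encode value as ULEB128, padded to at least min_len bytes."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value or len(out) + 1 < min_len:
            out.append(byte | 0x80)  # more payload or padding follows
        else:
            out.append(byte)
            return bytes(out)


def pad_with_extended_op(hole_size, opcode=0x7F):
    """Build one extended opcode that occupies exactly hole_size bytes.

    Layout: 0x00, ULEB128 length, opcode byte, zero filler.
    The length counts the opcode byte plus the filler.
    """
    if hole_size < 3:
        raise ValueError("an extended opcode needs at least 3 bytes")
    for width in range(1, 11):
        length = hole_size - 1 - width
        if length < 1:
            break
        # Accept the first width that can hold the length; pad the
        # encoding with 0x80 bytes if the minimal form is shorter.
        if len(encode_uleb128(length)) <= width:
            return (bytes([0x00]) + encode_uleb128(length, width)
                    + bytes([opcode]) + bytes(length - 1))
    raise ValueError("no length encoding fits this hole")
```

A 3-byte hole becomes 0x00 0x01 0x7f, and a 130-byte hole starts with
0x00 0xff 0x00 0x7f followed by 126 zero bytes.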

-cary