[Dwarf-Discuss] DWARF and source text embedding

Simon Brand simon@codeplay.com
Wed Feb 14 09:17:59 GMT 2018


Hi all,

I was the author of the original proposal for DWARF source text 
embedding, and my use case was OpenCL kernel code, which is often 
generated at runtime. For debugging, this generated source needs to be 
embedded in the binary in many cases since you may not have a writeable 
file system, or a file system at all (this may seem like a niche case, 
but is actually common in embedded and mobile OpenCL environments).

FWIW I'm happy to get behind this new proposal since there's already an 
LLVM implementation and it seems to cover my use-cases.

Simon


On 14/02/18 00:45, Michael Eager wrote:
> On 02/13/18 09:37, scott at scottlinder.com wrote:
>> Michael, Paul,
>>
>> In the current proposal, it is not an error to have any value (including
>> an empty string) in the _source attribute when the _has_source flag is
>> true, which allows for embedding an empty source string.
>>
>> After seeing more feedback on this point, I think you are right that the
>> extra flag is unnecessary. Looking at similar attributes like MD5 and how
>> they are handled, I think it would be best to modify the proposal to
>> remove the flag and require the source be present on all files in the
>> same line table if the attribute is present in the prologue. I still
>> think we should have wording which indicates an empty string is still a
>> valid value for embedded source, and should not be interpreted as
>> indicating the absence of embedded source for that file. This is
>> analogous to the current MD5 attribute, as even 16 null bytes is a valid
>> MD5. What are your thoughts on this approach?
>>
>> Scott
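To make the revised rule concrete, here is a minimal sketch of how a consumer might tell "no embedded source" apart from "embedded source that happens to be empty". The function name and the dict/set representation of a line table are my own assumptions for illustration, not anything taken from the proposal or from LLVM.

```python
from typing import Dict, Optional, Set


def embedded_source(entry_format: Set[str], entry: Dict[str, str]) -> Optional[str]:
    """Return the embedded source text for one file entry, or None.

    entry_format stands in for the content types declared in the line
    table prologue; entry stands in for one decoded file entry.
    Both representations are hypothetical.
    """
    if "DW_LNCT_source" not in entry_format:
        # The prologue does not declare the field: no file has embedded source.
        return None
    # Once declared, the field is required on every file in the table.
    # An empty string is a valid source text, not a marker for "absent".
    return entry["DW_LNCT_source"]
```

Under this reading, absence is expressed only by leaving the field out of the prologue, so the return value `""` and the return value `None` never mean the same thing.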
>
> Are you saying that if any source file is embedded, that all need to be?
> Including both ephemeral generated source as well as fixed include
> files?
>
> What does it mean to have embedded source which is an empty string?
> How is that different from saying that embedded source is absent?
>
> I can imagine situations where embedding ephemeral generated source in a
> DWARF debug entry can make sense.  But I have the feeling that there is
> more to this in your environment than what I imagine.  Can you give a
> description of the use case in which this might be used?
>
>>
>> On 2018-02-01 17:20, Michael Eager wrote:
>>> On 02/01/2018 12:01 PM, scott at scottlinder.com wrote:
>>>> Hi Paul,
>>>>
>>>> My intention was to support an empty source string; I want to be 
>>>> explicit about the presence of embedded source for each file.
>>>
>>> I'm not fond of the belt and suspenders approach.  If there is one
>>> specifier for an attribute, there's no need for a second to say that
>>> it's valid.  There's always the issue of what it means when the two
>>> attributes disagree, such as when you have a flag saying that there
>>> is embedded source, but the source string is empty.  Is that an error?
>>>
>>>> When reading the spec I did notice places where an empty string can 
>>>> indicate the absence of the attribute (e.g. DW_AT_name), but I 
>>>> would prefer to be explicit here.
>>>>
>>>> Scott
>>>>
>>>> On 2018-02-01 11:19, paul.robinson at sony.com wrote:
>>>>>> -----Original Message-----
>>>>>> From: Dwarf-Discuss 
>>>>>> [mailto:dwarf-discuss-bounces at lists.dwarfstd.org] On
>>>>>> Behalf Of scott at scottlinder.com
>>>>>> Sent: Wednesday, January 31, 2018 2:05 PM
>>>>>> To: dwarf-discuss at lists.dwarfstd.org
>>>>>> Subject: [Dwarf-Discuss] DWARF and source text embedding
>>>>>>
>>>>>> Hello all,
>>>>>>
>>>>>> I am a compiler engineer at AMD, working on tools for debugging
>>>>>> online-compiled programs. The problem I am attempting to solve was
>>>>>> brought up previously in the DWARF Standard issue 161018.1 titled
>>>>>> "DWARF-embedded source for online-compiled programs", and is the
>>>>>> result of runtimes like OpenCL doing online compilation in an
>>>>>> environment where it is not desirable (or even feasible) to write
>>>>>> sources to disk. In these cases, it would be useful to support
>>>>>> embedding the source directly in the resulting DWARF. I would like to
>>>>>> propose a similar solution to the one outlined in the above issue, but
>>>>>> without structural changes to the specification.
>>>>>>
>>>>>> ====
>>>>>>
>>>>>> Add two new optional fields to the file_names prologue of the line
>>>>>> table.
>>>>>>
>>>>>> Section 6.2.4.1:
>>>>>> Add two bullets after "5. DW_LNCT_MD5"
>>>>>> 6. DW_LNCT_has_source
>>>>>>    DW_LNCT_has_source indicates that the value is a boolean which
>>>>>>    affects the interpretation of an accompanying DW_LNCT_source value.
>>>>>>    When present there must be an accompanying DW_LNCT_source value.
>>>>>>    When true, consumers may use the embedded source instead of
>>>>>>    attempting to discover the source on disk. When false, consumers
>>>>>>    will ignore the DW_LNCT_source value. This code point is always
>>>>>>    paired with a flag form (e.g. DW_FORM_flag or DW_FORM_flag_present).
>>>>>> 7. DW_LNCT_source
>>>>>>    DW_LNCT_source indicates that the value is a null-terminated string
>>>>>>    which is the original source text of the file. When present there
>>>>>>    must be an accompanying DW_LNCT_has_source value. The string will
>>>>>>    contain the UTF-8 encoded source text with '\n' line endings. When
>>>>>>    the accompanying DW_LNCT_has_source value is false, the value of
>>>>>>    DW_LNCT_source will be the empty string. This code point is always
>>>>>>    paired with a string form (e.g. DW_FORM_string, DW_FORM_line_strp,
>>>>>>    DW_FORM_strp).
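The pairing rules in bullets 6 and 7 can be written down as a small check, which may help when comparing them against the flag-free variant discussed further up the thread. This is only a sketch: the function name is hypothetical, `None` is my stand-in for "field not present", and raising on a non-empty string under a false flag is a strict producer-side reading (a consumer would simply ignore the value).

```python
from typing import Optional


def check_pairing(has_source: Optional[bool], source: Optional[str]) -> Optional[str]:
    """Return the usable embedded source text, or None if there is none.

    Raises ValueError when an entry breaks the pairing rules of the
    original proposal. None means the field is absent (an assumption of
    this sketch, not DWARF terminology).
    """
    # The two fields must appear together or not at all.
    if (has_source is None) != (source is None):
        raise ValueError("DW_LNCT_has_source and DW_LNCT_source must accompany each other")
    if has_source is None:
        return None  # neither field is present in this line table
    if not has_source:
        # A false flag requires the empty string; the value is then ignored.
        if source != "":
            raise ValueError("DW_LNCT_source must be empty when DW_LNCT_has_source is false")
        return None
    # With a true flag, any string is usable, including the empty string.
    return source
```

Note that `check_pairing(True, "")` returns `""` while `check_pairing(False, "")` returns `None`, which is exactly the distinction the separate flag was meant to carry.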
>>>>>
>>>>> Would a zero-length string indicate something other than 
>>>>> "has_source=false"?
>>>>> If not, then a separate has_source flag seems redundant.
>>>>> --paulr
>>>>>
>>>>>>
>>>>>> New type codes can be allocated for them in a backwards-compatible
>>>>>> way, or codes for these new content types can be added in the range of
>>>>>> [DW_LNCT_lo_user, DW_LNCT_hi_user] to avoid changing the spec itself.
>>>>>>
>>>>>> Table 7.27:
>>>>>> Add DW_LNCT_has_source  0x6
>>>>>> Add DW_LNCT_source      0x7
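For reference, the two allocation options look like this next to the content type codes DWARF v5 already defines. Codes 0x1 to 0x5 and the user range bounds are the published DWARF v5 values; 0x6 and 0x7 are only the values proposed in this message. The class and helper names are mine.

```python
from enum import IntEnum


class LineContentType(IntEnum):
    # Content type codes defined by DWARF v5.
    DW_LNCT_path = 0x1
    DW_LNCT_directory_index = 0x2
    DW_LNCT_timestamp = 0x3
    DW_LNCT_size = 0x4
    DW_LNCT_MD5 = 0x5
    # Proposed in this thread; not part of the published standard.
    DW_LNCT_has_source = 0x6
    DW_LNCT_source = 0x7


# Bounds of the vendor extension range in DWARF v5.
DW_LNCT_lo_user = 0x2000
DW_LNCT_hi_user = 0x3FFF


def is_user_code(code: int) -> bool:
    """True when a content type code falls in the vendor extension range."""
    return DW_LNCT_lo_user <= code <= DW_LNCT_hi_user
```

The proposed 0x6 and 0x7 sit outside the user range, which is why adopting them would need a change to Table 7.27, whereas any code for which `is_user_code` is true could be used without touching the spec.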
>>>>>>
>>>>>> Any DWARFv5 consumer which is unaware of this extension would continue
>>>>>> to operate as before, ignoring the new fields. Any consumer which is
>>>>>> aware of the extension would know to check DW_LNCT_has_source for each
>>>>>> file_name entry in order to determine whether the embedded source
>>>>>> field (DW_LNCT_source) contains the source text of the corresponding
>>>>>> file.
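A rough model of the two kinds of consumer described in this paragraph, assuming a file entry has already been decoded into a mapping from content type code to value. The names and the dict representation are hypothetical; a real consumer would skip unknown fields by using the form code that the prologue pairs with each content type.

```python
from typing import Dict, Optional

# Subset of content type codes, for illustration only.
BASE_FIELDS = {0x1: "path"}
EXTENSION_FIELDS = {0x6: "has_source", 0x7: "source"}  # codes proposed in this thread


def decode_entry(raw: Dict[int, object], aware: bool) -> Dict[str, object]:
    """Keep only the fields this consumer understands; drop the rest."""
    known = dict(BASE_FIELDS)
    if aware:
        known.update(EXTENSION_FIELDS)
    return {known[code]: value for code, value in raw.items() if code in known}


def source_for(entry: Dict[str, object]) -> Optional[str]:
    """Return the embedded source if the flag says it is valid, else None."""
    if entry.get("has_source"):
        return entry.get("source")
    return None  # caller falls back to looking for the file on disk


raw = {0x1: "kernel.cl", 0x6: True, 0x7: "kernel void f() {}\n"}
old_view = decode_entry(raw, aware=False)  # unaware consumer sees only the path
new_view = decode_entry(raw, aware=True)   # aware consumer also sees the source
```

Here `source_for(old_view)` is `None`, so an unaware consumer behaves as it always has, while `source_for(new_view)` yields the embedded text.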
>>>>>>
>>>>>> ====
>>>>>>
>>>>>> My team and I believe this simplifies the design by removing the need
>>>>>> for changes to the compile unit sections, and by avoiding the addition
>>>>>> of multiple file_name_entry_formats in a single program, all without
>>>>>> sacrificing any information. We have a preliminary implementation in
>>>>>> LLVM/Clang, which supports embedding source (clang -gdwarf-5
>>>>>> -gembed-source) and inspecting it via llvm-dwarfdump and llvm-objdump
>>>>>> (with the -source flag). The patches are available at
>>>>>> https://reviews.llvm.org/D42765 (LLVM) and
>>>>>> https://reviews.llvm.org/D42766 (Clang).
>>>>>>
>>>>>> I would like any and all feedback on the design, and want to see about
>>>>>> the possibility of adding the new content type codes outside of the
>>>>>> "user" range (i.e. adding new entries for them in Table 7.27) in the
>>>>>> next version of the specification.
>>>>>>
>>>>>> Regards,
>>>>>> Scott Linder
>>>>>>
>>>>>> _______________________________________________
>>>>>> Dwarf-Discuss mailing list
>>>>>> Dwarf-Discuss at lists.dwarfstd.org
>>>>>> http://lists.dwarfstd.org/listinfo.cgi/dwarf-discuss-dwarfstd.org
>>>>
>>
>

-- 
Simon Brand
Senior Software Engineer, GPGPU Toolchains
Codeplay Software Ltd
Level C, Argyle House, 3 Lady Lawson St, Edinburgh, EH3 9DR
Tel: 0131 466 0503
Fax: 0131 557 6600
Website: http://www.codeplay.com
Twitter: https://twitter.com/codeplaysoft
