[Dwarf-Discuss] PROPOSAL: DW_TAG_call_site

Jakub Jelinek jakub@redhat.com
Tue Aug 17 07:24:36 GMT 2010


Hi!

This is the second extension from us today, again we've implemented it as a
vendor extension, but we're looking for feedback here before deploying it
and we intend to submit it formally as DWARF 5 extension eventually.
The D.13 section below contains some examples which show uses of both
DW_OP_entry_value and DW_TAG_call_site.


Representation of call sites in the debugging information
=========================================================

Overview
--------

Many architectures pass arguments in registers and quite often
the register in which an argument has been passed is quickly reused
for something else, in that case all the debugger can say is that
a value has been optimized out.  If the argument is never modified
in the function, often the value can be discovered through extra
effort, by unwinding in the debugger to the caller and seeing what
value has been passed to the function.  If a constant is passed
to the function, or the argument is loaded from a call preserved
register or call preserved memory, then that is the value of
the argument in the callee.  For arguments passed in stack slots
this is needed less often, as the stack slot in which the value
has been passed is usually not reused for something else, but
it could be in some cases.

For backtraces it is often worthwhile to print what value has been
passed to an argument at the time a function has been called, rather
than what the argument currently has.  E.g. for
void foo (char *p)
{
  /* some code */
  p = strchr (p, '\0');
  bar ();
  /* some further code */
}
if a backtrace is done from within bar, argument p will be printed
as "", which isn't much useful, more interesting is the string before
it.  The debugger then could annotate the values in the backtrace
whether they mean the value passed to the function on function entry,
the current value of the parameter, that the value is known to be
the same in both places, print both values, etc.

The proposed extensions involve adding optional information about
call sites in the programs which say at which location what other
function (if known which one) is called and what values are passed
to its arguments and also a new DWARF expression opcode that
can be used to push the value a register argument or some memory location
had on entry of the current function.

If tail calls are involved, the proposed extensions allow this fact to
be detected.  In that case the return address of a callee is not the
function that actually called it, but a caller of the function that did the
tail call and call target in the call site at return address doesn't match
the current subprogram.  In some cases where the tail call sequence reaching
current subprogram is unambiguous, the debug information consumer might
print a virtual backtrace including the tail calls.  Even when the call site
target matches the current subprogram tail calls still might be involved -
if the current function could possibly indirectly tail call itself.
The extensions allow that case to be detected.  The debugging information
consumer should be conservative with that check and assume that is possible
unless proven otherwise.

Proposed changes to DWARF
-------------------------

New DWARF tags

DW_TAG_call_site		0x44

Allowable attributes:

DW_AT_abstract_origin
DW_AT_call_column
DW_AT_call_file
DW_AT_call_line
DW_AT_call_site_target
DW_AT_call_site_target_clobbered
DW_AT_low_pc
DW_AT_sibling
DW_AT_tail_call
DW_AT_type

DW_TAG_call_site_parameter	0x45

Allowable attributes:

DW_AT_abstract_origin
DW_AT_call_site_data_value
DW_AT_call_site_value
DW_AT_data_location
DW_AT_location
DW_AT_name
DW_AT_sibling
DW_AT_type

New DWARF attributes

DW_AT_call_site_value			0x6f	exprloc
DW_AT_call_site_data_value		0x70	exprloc
DW_AT_call_site_target			0x71	exprloc
DW_AT_call_site_target_clobbered	0x72	exprloc
DW_AT_tail_call				0x73	flag
DW_AT_all_tail_call_sites		0x74	flag
DW_AT_all_call_sites			0x75	flag
DW_AT_all_source_call_sites		0x76	flag

2.2

Add DW_TAG_call_site and DW_TAG_call_site_parameter to figure 1.

Add:

DW_AT_call_site_value			Value passed to a function argument
DW_AT_call_site_data_value		Value pointed to by passed function argument
DW_AT_call_site_target			Address of called subroutine
DW_AT_call_site_target_clobbered	Callee address value, which may
					use call clobbered registers or
					memory locations
DW_AT_tail_call				Call site is a tail call
DW_AT_all_tail_call_sites		All tail call sites in a subprogram
					have corresponding DW_TAG_call_site entries
DW_AT_all_call_sites			All normal and tail call sites in a
					subprogram have corresponding
					DW_TAG_call_site entries
DW_AT_all_source_call_sites		All normal and tail call sites and all
					inline calls have DW_TAG_call_site
					resp. DW_TAG_inlined_subroutine entries

to figure 2.

3.3.1

Add:

A subprogram entry may have DW_AT_all_tail_call_sites, DW_AT_all_call_sites
or DW_AT_all_source_call_sites attributes which are flags.
The DW_AT_all_tail_call_sites flag indicates that no DW_TAG_call_site
entries with DW_AT_tail_call flag set are missing for this subprogram entry.
The DW_AT_all_call_sites flag indicates that no DW_TAG_call_site entries
are missing for this subprogram entry.  If this attribute is set, the
DW_AT_all_tail_call_sites attribute is superfluous.
The DW_AT_all_source_call_sites attribute indicates that no DW_TAG_call_site
nor DW_TAG_inlined_subroutine entries are missing for this subprogram entry.
If this attribute is set, the DW_AT_all_tail_call_sites and
DW_AT_all_call_sites attributes are superfluous.
DW_TAG_call_site entries represent normal and tail call sites in the
subprogram or subroutines inlined into it, the flags cover entries owned
by the subroutine and entry point entries.
If the flags aren't set, some or all DW_TAG_call_site resp.
DW_TAG_inlined_subroutines entries might be missing.
<i>Entries owned by subroutine or entry point entries are children of
the subprogram or entry point entry they are attributes of and any of
its DW_TAG_lexical_block, DW_TAG_inlined_subroutine, DW_TAG_try_block
or DW_TAG_catch_block child entries, but not children of other nested
DW_TAG_subprogram or DW_TAG_entry_point entries.</i>

New section:

3.8 Call site entries

A call site is a way to represent the static or dynamic call graph in
the debugging information.


A call site is represented by a debugging information entry with the tag
DW_TAG_call_site.  The entry for a call site should be owned by the
debugging information entry representing the scope within which the
call is present in the source program.

<i>A source call can be compiled into different types of machine code:
Normal calls are call-like instructions which transfer control to the start
of some subprogram and leave the call site location address somewhere where
unwind information sees it.
Tail calls are jump-like instructions which transfer control to the start
of some subprogram, but the call site location address isn't visible in
the unwind information.
Tail recursion is a call to the current function which is compiled as a loop
into the middle of the current function.
Optimized out call is a call that is in unreachable code that hasn't been
emitted, like if (0) foo ();.
Inline call is a call to inlined subprogram, where at least one instruction
has the location of the inlined subprogram or any of its blocks or inlined
subprograms.
Optimized out inline call is a call to inlined subprogram which either
didn't expand to any instructions or only parts of instructions belong to it
and for debug information those instructions are given location in the
caller.
The DW_TAG_call_site entries describe normal and tail calls.</i>

The call site entry has a DW_AT_low_pc attribute which is the return
address after the call.  This corresponds to the return address
computed by CFI in the called function (6.4).
<i>On many architectures this is the address immediately following the
call instruction, but e.g. on architectures with delay slots it might
be an address after the delay slot of the call.</i>

If the call site is a tail call optimized through tail call optimization
into a jump instead of call, that doesn't leave traces of the function call
in the unwind information, it should have the DW_AT_tail_call attribute,
which is a flag.

The call site may have a DW_AT_abstract_origin attribute which is
a reference.  For direct calls or jumps where the called subprogram is
known it should be a reference to the called subprogram's debugging
information entry.  For indirect calls it may be a reference to a
DW_TAG_variable, DW_TAG_formal_parameter or DW_TAG_member entry representing
the subroutine pointer that is called.

The call site may have a DW_AT_call_site_target attribute which is
a DWARF expression.  For indirect calls or jumps where it is unknown at
compile time which subprogram will be called the expression computes the
address of the subprogram that will be called.  The DWARF expression should
not use register or memory locations that might be clobbered by the call.

The call site may have a DW_AT_call_site_target_clobbered attribute
which is a DWARF expression.  For indirect calls or jumps where the
address is not computable without use of registers or memory locations that
might be clobbered by the call this attribute may be used instead of
the DW_AT_call_site_target attribute.

The call site entry may have a DW_AT_type attribute referencing
a debugging information entry of the type of the called
function.  When DW_AT_abstract_origin is present, DW_AT_type is
usually omitted.

The call site entry may have DW_AT_call_file, DW_AT_call_line and
DW_AT_call_column attributes, each of whose value is an integer constant.
These attributes represent the source file, source line number, and source
column number, respectively, of the first character of the call statement or
expression.  The call file, call line, and call column attributes are
interpreted in the same way as the declaration file, declaration
line, and declaration column attributes, respectively (see Section 2.14).
<i>The call file, call line and call column coordinates do not describe the
coordinates of the subroutine declaration that was inlined, rather they describe
the coordinates of the call.</i>

The call site entry may own DW_TAG_call_site_parameter debugging information
entries representing the parameters passed to the call.

Each such DW_TAG_call_site_parameter debugging information entry should
have a DW_AT_location attribute which is a location expression.
This location expression shall describe where the parameter is passed
in (usually either some register, or a memory at stack register plus some
offset).

Each DW_TAG_call_site_parameter entry may have a DW_AT_call_site_value
attribute which is a DWARF expression.  This expression computes the value
passed to that parameter.  The expression should not use registers or memory
locations that might be clobbered by the call, as it might be evaluated after
unwinding from the called function back to the caller.

For parameters passed by reference, where the code passes a pointer to
a location which contains the parameter, or for reference type parameters
the DW_TAG_call_site_parameter entry may also have DW_AT_data_location
which is a location expression and DW_AT_call_site_data_value attribute
which is a DWARF expression.  The DW_AT_data_location attribute describes where
the referenced value lives during the call.  If it is just
DW_OP_push_object_address, it may be left out.  The
DW_AT_call_site_data_value attribute describes the value in that location.
The expression should not use registers or memory locations that might be
clobbered by the call, as it might be evaluated after unwinding from the
called function back to the caller.

Each DW_TAG_call_site_parameter entry may also have a DW_AT_abstract_origin
attribute which contains a reference to a DW_TAG_formal_parameter entry,
DW_AT_type attribute referencing the type of the parameter and/or DW_AT_name
attribute describing the parameter's name.

7.5.4

Add

DW_TAG_call_site		0x44
DW_TAG_call_site_parameter	0x45

to figure 18.

Add

DW_AT_call_site_value			0x6f	exprloc
DW_AT_call_site_data_value		0x70	exprloc
DW_AT_call_site_target			0x71	exprloc
DW_AT_call_site_target_clobbered	0x72	exprloc
DW_AT_tail_call				0x73	flag
DW_AT_all_tail_call_sites		0x74	flag
DW_AT_all_call_sites			0x75	flag
DW_AT_all_source_call_sites		0x76	flag

to figure 20.

Appendix A

Add DW_AT_all_tail_call_sites, DW_AT_all_call_sites and
DW_AT_all_source_call_sites as allowable attribute to DW_TAG_subprogram
and DW_TAG_entry_point.
Add:

DW_TAG_call_site		DW_AT_abstract_origin
				DW_AT_call_column
				DW_AT_call_file
				DW_AT_call_line
				DW_AT_call_site_target
				DW_AT_call_site_target_clobbered
				DW_AT_low_pc
				DW_AT_sibling
				DW_AT_tail_call
				DW_AT_type

DW_TAG_call_site_parameter	DW_AT_abstract_origin
				DW_AT_call_site_data_value
				DW_AT_call_site_value
				DW_AT_data_location
				DW_AT_location
				DW_AT_name
				DW_AT_sibling
				DW_AT_type

entries.

New section:

D.13  Call Site Examples

The following examples use a hypothetical machine which passes first
argument in register 0, second in register 1, third in register 2,
the stack pointer is register 3 and the machine has one call preserved
register 4.  Return value from function is passed in register 0.

/* C source */

extern void fn1 (long int, long int, long int);

long int
fn2 (long int a, long int b, long int c) 
{
  long int q = 2 * a;
  fn1 (5, 6, 7); 
  return 0;
}
 
long int
fn3 (long int x, long int (*fn4) (long int *))
{
  long int v, w, w2, z;
  w = (*fn4) (&w2);
  v = (*fn4) (&w2);
  z = fn2 (1, v + 1, w);
  {
    int v1 = v + 4;
    z += fn2 (w, v * 2, x);
  }
  return z;
}

/* Assembly */

fn2:
L1:
  %reg2 = 7
  %reg1 = 6
  %reg0 = 5
L2:
  call fn1
  %reg0 = 0
  return
L3:

fn3:
  %reg3 = %reg3 - 32
  [%reg3] = %reg4
  [%reg3 + 8] = %reg0
  [%reg3 + 16] = %reg1
  %reg0 = %reg3 + 24
  call %reg1
L6:
  %reg2 = [%reg3 + 16]
  [%reg3 + 16] = %reg0
  %reg0 = %reg3 + 24
  call %reg2
L7:
  %reg4 = %reg0
  %reg2 = [%reg3 + 16]
  %reg1 = %reg4 + 1
  %reg0 = 1
  call fn2
L4:
  %reg2 = [%reg3 + 8]
  [%reg3 + 8] = %reg0
  %reg0 = [%reg3 + 16]
  %reg1 = %reg4 + %reg4
  call fn2
L5:
  %reg2 = [%reg3 + 8]
  %reg0 = %reg0 + %reg2
L8:
  %reg4 = [%reg3]
  %reg3 = %reg3 + 32
  return

The location list for variable a in fn2 then might be:

<L1, L2> DW_OP_reg0
<L2, L3> DW_OP_entry_value 1 DW_OP_reg0 DW_OP_stack_value
<0, 0>

variable q in fn2 then might have location list:

<L1, L2> DW_OP_lit2 DW_OP_breg0 0 DW_OP_mul DW_OP_stack_value
<L2, L3> DW_OP_lit2 DW_OP_entry_value 1 DW_OP_reg0 DW_OP_mul DW_OP_stack_value
<0, 0>

Variables b and c would be similar location list to variable a,
except for different label in between the two ranges and would
use DW_OP_reg1 resp. DW_OP_reg2 instead of DW_OP_reg0
and DW_OP_breg1 resp. DW_OP_breg2 instead of DW_OP_breg0.

The call sites for all the calls in fn3 would be children of the
DW_TAG_subprogram (or its DW_TAG_lexical_block if it has any for the whole
function):

DW_TAG_call_site
  DW_AT_low_pc(L6)
  DW_AT_call_site_target(DW_OP_breg3 16 DW_OP_deref)
  DW_TAG_call_site_parameter
    DW_AT_location(DW_OP_reg0)
    DW_AT_call_site_value(DW_OP_breg3 24)
DW_TAG_call_site
  DW_AT_low_pc(L7)
  DW_AT_call_site_target(DW_OP_entry_value 1 DW_OP_reg1)
  DW_TAG_call_site_parameter
    DW_AT_location(DW_OP_reg0)
    DW_AT_call_site_value(DW_OP_breg3 24)
DW_TAG_call_site
  DW_AT_low_pc(L4)
  DW_AT_abstract_origin(reference to fn2 DW_TAG_subprogram)
  DW_TAG_call_site_parameter
    DW_AT_abstract_origin(reference to DW_TAG_formal_parameter a in fn2 subprogram)
    DW_AT_location(DW_OP_reg0)
    DW_AT_call_site_value(DW_OP_lit1)
  DW_TAG_call_site_parameter
    DW_AT_abstract_origin(reference to DW_TAG_formal_parameter b in fn2 subprogram)
    DW_AT_location(DW_OP_reg1)
    DW_AT_call_site_value(DW_OP_breg4 1)
  DW_TAG_call_site_parameter
    DW_AT_abstract_origin(reference to DW_TAG_formal_parameter c in fn2 subprogram)
    DW_AT_location(DW_OP_reg2)
    DW_AT_call_site_value(DW_OP_breg3 16 DW_OP_deref)
DW_TAG_lexical_block
  DW_AT_low_pc(L4)
  DW_AT_high_pc(L8)
  DW_TAG_variable
    DW_AT_name("v1")
    DW_AT_type(reference to int)
    DW_AT_location(DW_OP_breg4 4)
  DW_TAG_call_site
    DW_AT_low_pc(L5)
    DW_AT_call_site_target(reference to fn2 DW_TAG_subprogram)
    DW_TAG_call_site_parameter
    DW_AT_abstract_origin(reference to DW_TAG_formal_parameter a in fn2 subprogram)
      DW_AT_location(DW_OP_reg0)
      DW_AT_call_site_value(DW_OP_breg3 16 DW_OP_deref)
    DW_TAG_call_site_parameter
    DW_AT_abstract_origin(reference to DW_TAG_formal_parameter b in fn2 subprogram)
      DW_AT_location(DW_OP_reg1)
      DW_AT_call_site_value(DW_OP_lit2 DW_OP_breg4 0 DW_OP_mul)
    DW_TAG_call_site_parameter
    DW_AT_abstract_origin(reference to DW_TAG_formal_parameter c in fn2 subprogram)
      DW_AT_location(DW_OP_reg2)
      DW_AT_call_site_value(DW_OP_entry_value 1 DW_OP_reg0)

! Fortran source to show passing parameters by reference.
subroutine fn4 (n)
  integer :: n, x
  x = n
  n = n / 2
  call fn6
end subroutine
subroutine fn5 (n)
  interface fn4
    subroutine fn4 (n)
      integer :: n
    end subroutine
  end interface fn4
  integer :: n, x
  call fn4 (n)
  x = 5
  call fn4 (x)
end subroutine fn5

/* Assembly */

fn4:
  %reg2 = [%reg0]
  %reg2 = %reg2 / 2
  [%reg0] = %reg2
  call fn6
  return

fn5:
  %reg3 = %reg3 - 8
  call fn4
L9:
  [%reg3] = 5
  %reg0 = %reg3
  call fn4
L10:
  %reg3 = %reg3 + 8
  return

The location description for x in fn4 might be
DW_OP_entry_value 4 DW_OP_breg0 0 DW_OP_deref_size 4 DW_OP_stack_value.

The call sites in fn5 might be:

DW_TAG_call_site
  DW_AT_low_pc(L9)
  DW_AT_abstract_origin(reference to fn4 DW_TAG_subprogram)
  DW_TAG_call_site_parameter
    DW_AT_abstract_origin(reference to DW_TAG_formal_parameter n in fn4 subprogram)
    DW_AT_location(DW_OP_reg0)
    DW_AT_call_site_value(DW_OP_entry_value 1 DW_OP_reg0)
    ! DW_AT_data_location(DW_OP_push_object_address) ! left out
    DW_AT_call_site_data_value(DW_OP_entry_value 4 DW_OP_breg0 0 DW_OP_deref_size 4)
DW_TAG_call_site
  DW_AT_low_pc(L10)
  DW_AT_abstract_origin(reference to fn4 DW_TAG_subprogram)
  DW_TAG_call_site_parameter
    DW_AT_abstract_origin(reference to DW_TAG_formal_parameter n in fn4 subprogram)
    DW_AT_location(DW_OP_reg0)
    DW_AT_call_site_value(DW_OP_breg3 0)
    ! DW_AT_data_location(DW_OP_push_object_address) ! left out
    DW_AT_call_site_data_value(DW_OP_lit5)

Change History
--------------

May   28, 2010 - allow DW_AT_call_{file,line,column} instead of
		 DW_AT_decl_{file,line,column} on DW_TAG_call_site, disallow
		 the latter on DW_TAG_call_site_parameter, other minor
		 changes
May   21, 2010 - introduced DW_TAG_call_site_parameter, removed
		 DW_AT_call_site_count, added DW_AT_call_site_target_clobbered,
		 DW_AT_all_tail_call_sites, DW_AT_all_call_sites,
		 DW_AT_all_source_call_sites.  Further additions to appendix D
May    6, 2010 - add assembly code for the appendix D example
April 27, 2010 - split into separate DW_OP_entry_value and DW_TAG_call_site
		 proposals.
April 26, 2010 - remove DW_AT_tail_call_count, add instead
		 DW_AT_call_site_count.
April 15, 2010 - add optimized out parameters that aren't passed at all.
April 13, 2010 - initial draft.

	Jakub




More information about the Dwarf-discuss mailing list