[Dwarf-Discuss] compilers generating ABI non-compliant function calls?

Luke Drummond luke.drummond@codeplay.com
Tue Mar 9 15:43:21 GMT 2021


Hi Andrew

On Tue Mar 9, 2021 at 3:05 PM GMT, Andrew Cagney via Dwarf-Discuss wrote:

> Part of a typical Application Binary Interface is to specify the
> function calling convention. Several uses are:
>
> - ensuring function calls across interface boundaries work (function
> in one object calls function in second object)
> - the debugger supplementing the debug information describing the
> location of parameters
> - the debugger implementing inferior function calls
>
> Typically calls both between and within object files (DWARF
> compilation unit) follow the ABI (with exceptions for things like
> __mul, but good ABIs even defined those).
>
> Technically, however, only functions visible via an interface need
> comply with the ABI. This means that:
>
> - for simple objects, local functions; and
> - with link-time-optimization, everything except library interface
> functions
>
> are fair game for ABI non-compliant call optimizations.
>
> Is anyone aware of a compiler doing this (I figure with LTO there's a
> strong incentive)? And if so, how is this described to the debugger.
> The ABI / calling-convention is no longer on hand for filling in the
> blanks.
>
Both GCC and LLVM do this in some capacity. Probably many others too. An easy
way to trigger this behaviour is with a combination of `static` and `noinline`.

Here is an example with gcc10/amd64:

	$ cc -g -O2 -xc - <<EOF
	#include <stdio.h>
	#include <stdlib.h>

	static __attribute__((noinline)) int three_args(int i, int j, int k)
	{
			return i + k;
	}

	int add2(int i, int k)
	{
			return three_args(i, 0, k);
	}

	int main(int argc, char **argv)
	{
			int i = strtol(argv[1], NULL, 0);
			int k = strtol(argv[2], NULL, 0);
			printf("i + k == %d\n", add2(i, k));
	}
	EOF
	$ gdb --args ./a.out 3 4
	Reading symbols from ./a.out...
	(gdb) b three_args
	Breakpoint 1 at 0x11a0: file <stdin>, line 6.
	(gdb) r
	Starting program: /tmp/a.out 3 4

	Breakpoint 1, three_args (i=3, k=4, j=0) at <stdin>:6
	6       <stdin>: No such file or directory.
	(gdb) disassemble
	Dump of assembler code for function three_args:
	=> 0x00005555555551a0 <+0>:     lea    (%rdi,%rsi,1),%eax
	   0x00005555555551a3 <+3>:     ret
	End of assembler dump.
	(gdb)

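Notice that `%rsi` is used in place of `%rdx`: under the System V AMD64 ABI the
third integer argument, `k`, would be passed in `%rdx`, but because the second
formal argument, `j`, is unused, the compiler drops it and passes `k` in `%rsi`
instead.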

The dwarfdump for this program shows where the formal arguments were put by the
compiler:

< 1><0x000001eb>    DW_TAG_subprogram
                      DW_AT_abstract_origin       <0x000001bb>
                      DW_AT_low_pc                0x000011a0
                      DW_AT_high_pc               <offset-from-lowpc>4
                      DW_AT_frame_base            len 0x0001: 0x9c:
                          DW_OP_call_frame_cfa
                      DW_AT_GNU_all_call_sites    yes(1)
                      DW_AT_sibling               <0x0000021b>
< 2><0x00000206>      DW_TAG_formal_parameter
                        DW_AT_abstract_origin       <0x000001cc>
                        DW_AT_location              len 0x0001: 0x55:
                            DW_OP_reg5
< 2><0x0000020d>      DW_TAG_formal_parameter
                        DW_AT_abstract_origin       <0x000001e0>
                        DW_AT_location              len 0x0001: 0x54:
                            DW_OP_reg4
< 2><0x00000214>      DW_TAG_formal_parameter
                        DW_AT_abstract_origin       <0x000001d6>
                        DW_AT_const_value           0

Therefore the debugger doesn't have to special-case anything to find the
parameters: the DW_AT_location (or DW_AT_const_value) on each formal parameter
communicates whatever convention the compiler actually used.
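
To make the mapping concrete, here is a minimal sketch of how a consumer might
turn the one-byte location expressions in the dump above into register names.
The opcode range DW_OP_reg0..DW_OP_reg31 (0x50..0x6f) comes from the DWARF
specification, and the register numbering is the one the System V AMD64 psABI
assigns. The helper name `amd64_reg_for_location` is made up for illustration;
it is not how gdb does it, and a real consumer has to evaluate full location
expressions and location lists.

```c
#include <stddef.h>
#include <stdint.h>

/* Single-byte "value lives in register N" opcodes from the DWARF spec. */
#define DW_OP_reg0  0x50
#define DW_OP_reg31 0x6f

/* DWARF register numbers 0..15 for x86-64 (System V AMD64 psABI). */
static const char *const amd64_dwarf_regs[16] = {
    "rax", "rdx", "rcx", "rbx", "rsi", "rdi", "rbp", "rsp",
    "r8",  "r9",  "r10", "r11", "r12", "r13", "r14", "r15",
};

/*
 * Hypothetical helper: given the raw bytes of a DW_AT_location expression,
 * return the general-purpose register name if the expression is a single
 * DW_OP_regN with N in 0..15, or NULL for anything else.
 */
const char *amd64_reg_for_location(const uint8_t *expr, size_t len)
{
    if (expr == NULL || len != 1)
        return NULL;
    if (expr[0] < DW_OP_reg0 || expr[0] > DW_OP_reg31)
        return NULL;

    unsigned regno = (unsigned)(expr[0] - DW_OP_reg0);
    if (regno >= 16)
        return NULL; /* registers beyond r15 are not handled in this sketch */

    return amd64_dwarf_regs[regno];
}
```

Fed the bytes from the dump, 0x55 (DW_OP_reg5) maps to rdi for `i` and 0x54
(DW_OP_reg4) maps to rsi for `k`, which matches the disassembly rather than the
ABI's rdi, rsi, rdx order for three integer arguments.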

All the best

Luke
-- 
Codeplay Software Ltd.
Company registered in England and Wales, number: 04567874
Registered office: Regent House, 316 Beulah Hill, London, SE19 3HF


