[Dwarf-discuss] Re: Dwarf-discuss Digest, Vol 8, Issue 2

Mon Apr 18 12:42:53 GMT 2005

Ron,

Daniel has it pretty much spot on, but I'll run through it again.

Note that I'm ignoring the complexities of integral promotions,
conversion and so on here (though, as you point out they enter into the
final decision on function overloading choices in C++), so let's just
consider the case where there are no such issues to worry about, e.g.

  extern int foo (int);
  extern int foo (long);

  int  int_value;
  long long_value;

on a machine in which the ABI specifies that sizeof(int) == sizeof(long)
and we only have one numeric encoding to worry about (so we're not on a
PDP-7 where we could have one's complement or two's complement).

As a result of the ABI the only way in which the encodings of the two
base types will differ is in their name.

However, to perform the overload resolution of

   print "foo (int_value)"         

or

   print "foo (long_value)"         

we need to know how the type of "int_value" or "long_value" maps onto
the C++ language standard types "int" or "long".
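A minimal sketch of why the mapping matters. The function bodies, the return
values (used only as tags to show which overload was chosen) and the helper
names same_representation and check_overloads are assumptions made for this
illustration; none of them come from the original example.

```cpp
#include <cassert>
#include <type_traits>

// Two overloads that differ only in the standard type of the argument.
// The return values are arbitrary tags identifying the chosen overload.
int foo(int)  { return 1; }
int foo(long) { return 2; }

// int and long are distinct types in the language, whatever the ABI
// says about their size.
static_assert(!std::is_same<int, long>::value,
              "int and long are distinct types");

// True on ABIs where the two types have the same size and signedness,
// which is the situation described in the text.
bool same_representation() {
    return sizeof(int) == sizeof(long) &&
           std::is_signed<int>::value == std::is_signed<long>::value;
}

// Each argument type selects its own overload, regardless of
// whether same_representation() holds on this machine.
void check_overloads() {
    int  int_value  = 0;
    long long_value = 0;
    assert(foo(int_value)  == 1);
    assert(foo(long_value) == 2);
}
```

The compiler makes this choice from the declared types alone; a debugger has
to recover the same information from the debug data.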

At present the only way to achieve that is by looking at the name which
was given in the basic type DIE.

The idea that you can do it by comparing DIEs fails because these
variables and functions can come from many different compilation units,
and there is nothing in DWARF which mandates that there is only a single
definition of each basic type. What seems much more common at present is
that there's (at least one) definition of each basic type in each
compilation unit.

Consider, also, that you may be trying to call a routine for which
there is _no_ debug information, merely the mangled loader symbol.
Demangling the loader symbol gives you the type names of the formal
arguments, but clearly no references to DIEs.
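A small sketch of that situation, assuming a GCC or Clang toolchain that
follows the Itanium C++ ABI and provides abi::__cxa_demangle in <cxxabi.h>.
The wrapper name demangle and the helper check_demangle are made up for this
illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangle a loader symbol; returns an empty string on failure.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle returns a malloc'd string.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return "";
    }
    std::string result(text);
    std::free(text);
    return result;
}

// The only type information recovered is a name, as plain text.
void check_demangle() {
    assert(demangle("_Z3fooi") == "foo(int)");
    assert(demangle("_Z3fool") == "foo(long)");
}
```

All the debugger gets back is the text "int" or "long", so the argument's DIE
still has to be matched against a type name.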

Comparing names is 

1) unpleasant and slow (consider 
     unsigned long long int
     long unsigned long int
     long long unsigned int
     unsigned long long 
     long unsigned long 
     long long unsigned 
  all of which are valid names for the same type.)

2) error prone, particularly in the case where typedefs come into the
   picture.

   If a compiler generates DWARF for 

     typedef long my_type;
     my_type my_value;

   like this :-

      <1><455>: Abbrev Number: 2 (DW_TAG_base_type)
       DW_AT_name        : my_type
       DW_AT_byte_size   : 4
       DW_AT_encoding    : 5	(signed)

   (and I don't see anything which prevents it doing that),
   then the debugger cannot ever correctly resolve a call

   print "foo(my_value)"

   in the same context as before, since comparing that type by size,
   representation and signedness it can only determine that it is
   _either_ "int" _or_ "long".
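The ambiguity can be sketched as follows. BaseTypeDie is a hypothetical,
much simplified stand-in for a DW_TAG_base_type DIE, and the table of sizes
assumes the ABI discussed above, in which int and long are both four bytes.
The function names are invented for the illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified view of a DW_TAG_base_type DIE.
struct BaseTypeDie {
    std::string name;       // DW_AT_name
    unsigned    byte_size;  // DW_AT_byte_size
    unsigned    encoding;   // DW_AT_encoding
};

// DW_ATE_signed has the value 0x05 in DWARF.
const unsigned kDwAteSigned = 0x05;

// Signed integer types under the assumed ABI (int and long both 4 bytes).
const std::vector<BaseTypeDie>& abi_types() {
    static const std::vector<BaseTypeDie> types = {
        {"short",     2, kDwAteSigned},
        {"int",       4, kDwAteSigned},
        {"long",      4, kDwAteSigned},
        {"long long", 8, kDwAteSigned},
    };
    return types;
}

// Every standard type whose size and encoding match the DIE,
// deliberately ignoring the DIE's name.
std::vector<std::string> candidates(const BaseTypeDie& die) {
    std::vector<std::string> result;
    for (const BaseTypeDie& t : abi_types()) {
        if (t.byte_size == die.byte_size && t.encoding == die.encoding) {
            result.push_back(t.name);
        }
    }
    return result;
}

// The DIE from the dump above matches two standard types.
void check_ambiguity() {
    BaseTypeDie my_type{"my_type", 4, kDwAteSigned};
    std::vector<std::string> c = candidates(my_type);
    assert(c.size() == 2);
    assert(c[0] == "int" && c[1] == "long");
}
```

With two candidates left and a name that matches neither, there is nothing in
the DIE to pick between foo(int) and foo(long).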

> Yes, well this is a different problem, easily fixed by correcting
> the compiler to stop eliding the typedef definitions.

My preferred solution too, and easy if you're a compiler company, not so
easy if you don't have any pull on the compiler writers since they work
for another company entirely. My experience in the past is that compiler
vendors view the correctness and completeness of debug information as a
very low priority. (That's a rant for another day, though...)

Now let me go over your specific questions

> >Something like
> >
> >  DW_AT_cplus_true_type  <constant>
> >
> >where <constant> has one of the values
> >
> >  DW_true_type_short,
> >  DW_true_type_int,
> >  DW_true_type_long,
> >  DW_true_type_long_long,
> >
> >  DW_true_type_float,
> >  DW_true_type_double,
> >  DW_true_type_long_double,
> >  DW_true_type_long_long_double
> 
> Don't you need unsigned variants for all of these too?

No, because the signed/unsigned property is already encoded in the
DIE (in DW_AT_encoding). There's no need to have it again.
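A sketch of how the two pieces of information would combine. The
DW_true_type_* names are the ones proposed above and are not part of DWARF;
their numeric values here are whatever the enum assigns, and the function
name standard_type_name is made up. The DW_ATE_signed (0x05) and
DW_ATE_unsigned (0x07) values are the real DWARF encodings.

```cpp
#include <cassert>
#include <string>

// Proposed (not standardised) values for DW_AT_cplus_true_type.
enum TrueType {
    DW_true_type_short,
    DW_true_type_int,
    DW_true_type_long,
    DW_true_type_long_long
};

// Existing DWARF base type encodings.
const unsigned kDwAteSigned   = 0x05;
const unsigned kDwAteUnsigned = 0x07;

// Combine the proposed attribute with the encoding already in the DIE
// to name the language standard type.
std::string standard_type_name(TrueType true_type, unsigned encoding) {
    std::string base;
    switch (true_type) {
        case DW_true_type_short:     base = "short";     break;
        case DW_true_type_int:       base = "int";       break;
        case DW_true_type_long:      base = "long";      break;
        case DW_true_type_long_long: base = "long long"; break;
    }
    if (encoding == kDwAteUnsigned) {
        return "unsigned " + base;
    }
    return base;
}

// One enumeration value serves both the signed and unsigned type.
void check_true_type() {
    assert(standard_type_name(DW_true_type_long, kDwAteSigned) == "long");
    assert(standard_type_name(DW_true_type_long, kDwAteUnsigned) == "unsigned long");
}
```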

> I'm still not understanding the problem. Is the issue that C/C++
> say, for example, that 'int' and 'signed int' are the *same type*
> but that 'int' and 'long' are not the same type even if they
> have the identical representation (on some systems)? 

Yes.

> So the problem is making it easier for the debugger to compare two
> type dies and discover whether they are the same type or not?

Not really; the problem is determining from a DIE which of the types in
the language standard it represents.

Overloading is specified in the language standard in terms of types
defined in the language standard. Until we know which type in the
language standard a DIE represents we can't sanely do overload
resolution. At the moment the only way of deciding that is by looking at
the name, and that's grubby and unpleasant for the reasons previously
discussed.
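To give a feel for what name comparison involves, here is a minimal sketch of
canonicalising the spelling of an integer type by counting keywords. It
handles only short, int, long and long long with optional signed/unsigned,
does no typedef handling, and the function names are invented for the
illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Reduce any valid keyword ordering to one canonical spelling.
// Returns "" for spellings this sketch does not understand.
std::string canonical_integer_name(const std::string& spelling) {
    int  longs = 0;
    int  words = 0;
    bool is_unsigned = false;
    bool is_short = false;

    std::istringstream in(spelling);
    std::string word;
    while (in >> word) {
        ++words;
        if (word == "long") {
            ++longs;
        } else if (word == "short") {
            is_short = true;
        } else if (word == "unsigned") {
            is_unsigned = true;
        } else if (word != "signed" && word != "int") {
            return "";  // char, float, typedef names and so on
        }
    }
    if (words == 0 || longs > 2 || (is_short && longs > 0)) {
        return "";
    }

    std::string base;
    if (is_short)        base = "short";
    else if (longs == 0) base = "int";
    else if (longs == 1) base = "long";
    else                 base = "long long";

    if (is_unsigned) {
        return "unsigned " + base;
    }
    return base;
}

// Different orderings of the same keywords collapse to one name.
void check_canonical_names() {
    assert(canonical_integer_name("long unsigned long int") == "unsigned long long");
    assert(canonical_integer_name("long long unsigned") == "unsigned long long");
}
```

Even this much has to run for every name comparison, and it still returns
nothing useful for a name such as my_type.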

> Assuming that we are really violently agreeing, it seems to me
> that Cownie's suggested solution is very C/C++ specific and does
> not extend at all well to other languages and/or other classes of
> types that might have a similar equivalence problem.

Yes, it's C++ specific because the equivalence I'm looking for is with
types named in the C++ standard. It's not for equivalence between DIEs.

Similar information may well be required for Ada, Java, Fortran and
other languages, and I'd be happy with changing it to 

  DW_AT_language_standard_type

and having more enumeration values which allowed those languages also to
assert which language standard type this DIE represents.

-- 
-- Jim
--
James Cownie	<jcownie at etnus.com>
Etnus, LLC.     +44 117 9071438
http://www.etnus.com