[Dwarf-discuss] Seeking a test program with a >4GB .debug_info section

John DelSignore JDelSignore@perforce.com
Fri Apr 21 21:16:21 GMT 2023


On 4/21/23 16:36, David Blaikie wrote:
On Fri, Apr 21, 2023 at 12:44 PM John DelSignore <JDelSignore@perforce.com> wrote:

Well, it took a long time to compile 5 CUs that contained your test code, and things were looking promising, but the link failed:

rocm2 42 04/21 15:14 /build/jdelsign/fatty % make
g++ -g -c fatty4.cxx -o fatty4.o
g++ -g -c fatty5.cxx -o fatty5.o
g++ -g -o fatty fatty.o fatty2.o fatty3.o fatty4.o fatty5.o
fatty5.o:(.debug_aranges+0x6): relocation truncated to fit: R_X86_64_32 against `.debug_info'
collect2: error: ld returned 1 exit status
make: *** [Makefile:5: fatty] Error 1
rocm2 43 04/21 15:39 /build/jdelsign/fatty %

I guess I'm now in favor of the proposal to get rid of .debug_aranges. :-)

I guess, backing up - what's your goal/what're you trying to do with DWARF over 4GB?

We have a TotalView user that has a gigantic executable (~9GB) and the .debug_info section alone is 4.9GB. A few of the other sections are about 1GB, but under the 32-bit limit. That's about all I know, because the user is at a secure location and cannot share the executable or other details.

TotalView had a built-in limit of 4GB for DWARF sections. Recently, I increased TV's DWARF section sizes to 64 bits to handle this user's code, but I'm sure there are other changes that are needed to properly handle a .debug_info section that is that big. So, I'm looking for a test case to feed to TotalView.

I changed "fatty" to compile with clang++ (instead of g++), and that worked. I have the following section sizes:

rocm2 55 04/21 16:40 /build/jdelsign/fatty % readelf -SW fatty|grep debug
  [30] .debug_abbrev     PROGBITS        0000000000000000 002948 000670 00      0   0  1
  [31] .debug_info       PROGBITS        0000000000000000 002fb8 15e82990e 00      0   0  1
  [32] .debug_str_offsets PROGBITS        0000000000000000 15e82c8c6 c0db3e8 00      0   0  1
  [33] .debug_str        PROGBITS        0000000000000000 16a907cae 217ee2aa 01  MS  0   0  1
  [34] .debug_addr       PROGBITS        0000000000000000 18c0f5f58 000050 00      0   0  1
  [35] .debug_line       PROGBITS        0000000000000000 18c0f5fa8 0001dd 00      0   0  1
  [36] .debug_line_str   PROGBITS        0000000000000000 18c0f6185 00004c 01  MS  0   0  1
rocm2 56 04/21 16:41 /build/jdelsign/fatty %

That looks like a decent test case (I haven't tried TotalView on it yet), so thanks for the suggestion.

You do have to use DWARF64 for a .debug_info section over 4GB - for any section-relative reference in that section, such as cross-CU references (sec_offset), or aranges or debug_names I think, etc. Because with DWARF32 the section references are 32bit, so can't exceed 4GB.
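For readers who want to check which format a unit uses, here is a minimal sketch of decoding the initial length field at the start of a unit header. The names `DwarfFormat`, `InitialLength`, and `read_initial_length` are hypothetical and not taken from any tool mentioned in this thread; the sketch assumes C++17 and little-endian data on a little-endian host (as on x86_64). In DWARF, a first 4-byte value of 0xffffffff marks DWARF64 and is followed by an 8-byte length; values 0xfffffff0 through 0xfffffffe are reserved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>

// Hypothetical types for illustration only.
enum class DwarfFormat { Dwarf32, Dwarf64 };

struct InitialLength {
  DwarfFormat format;
  std::uint64_t unit_length;  // bytes following the initial length field
  std::size_t field_size;     // 4 for DWARF32, 12 for DWARF64
};

// Decodes the initial length at p, where avail is the number of readable bytes.
// Assumes little-endian data on a little-endian host.
std::optional<InitialLength> read_initial_length(const unsigned char* p,
                                                 std::size_t avail) {
  assert(p != nullptr);
  if (avail < 4) return std::nullopt;
  std::uint32_t len32;
  std::memcpy(&len32, p, 4);
  if (len32 < 0xfffffff0u) {
    // Ordinary 32-bit length: the unit is DWARF32.
    return InitialLength{DwarfFormat::Dwarf32, len32, 4};
  }
  if (len32 != 0xffffffffu || avail < 12) {
    // Reserved escape value, or a truncated DWARF64 header.
    return std::nullopt;
  }
  std::uint64_t len64;
  std::memcpy(&len64, p + 4, 8);
  return InitialLength{DwarfFormat::Dwarf64, len64, 12};
}
```

Section-relative offsets inside a unit have the same width as its format (4 bytes for DWARF32, 8 for DWARF64), which is why a DWARF32 unit cannot name a location past 4GB in another section or unit.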

I suspected that the application might contain 64-bit DWARF, so I asked the user to dump/grep the .debug_info section and oddly enough all of the CUs are 32-bit DWARF. I have no idea if they are damaged. The CUs must not reference any DIEs outside their own CU.

(also, with a different example, you'd get string data over 4GB, which you'd also need DWARF64 for, or in DWARFv5 (with some dispensations from DWARFv6) you could use DWARF64 for .debug_str_offsets (assuming all string references were strx forms) without using DWARF64 for everything else)

Understood. Luckily, my test program's string table made it with some room to spare.

In Split DWARF if each contribution is <4GB you only need a 64 bit cu/tu_index and 64 bit str_offsets (& you could even do that selectively, only using DWARF64 str_offsets for contributions that need 64 bit offsets), since nothing else references across whole sections - which makes it much more scalable/easier to solve.

Yes, I asked if they'd be willing to consider Split DWARF, but I haven't received an answer yet. I think it would save considerable link time and disk space, but they may have reasons why Split DWARF would not work for them.

BTW, given that "gold" is deprecated, is there a linker that can build a decent ".gdb_index" to use with the Split DWARF? Without a decent index, it's hard to get same-day service out of the debugger with these giant executables.

Also, another issue is that even if you have simple/small bits of DWARF32 (in some precompiled library, etc.), if your total program exceeds the 32-bit limit, you may not be able to link the program, because that DWARF32 might end up being put at the end of the section, and so the debug_aranges, for instance, needs to record the offset of the CU in 32 bits but can't. So there are various discussions about linkers being able to sort the DWARF32 contributions earlier/first before the DWARF64 contributions. Then you could still link DWARF32 precompiled libraries into huge programs that exceed the DWARF32 limits. I can go find some links to those threads/discussions if you need them; I think some happened in the LLVM open source community.
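The sorting idea can be sketched roughly as follows. This is only an illustration of the layout policy, not how any real linker implements it: the `Contribution` struct and `layout_dwarf32_first` function are hypothetical, and the check only covers the starting offset of each DWARF32 contribution (a stricter variant might also require the contribution to end below 4GB).

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical description of one input file's .debug_info contribution.
struct Contribution {
  const char* name;
  bool is_dwarf64;
  std::uint64_t size;
  std::uint64_t offset = 0;  // filled in by layout_dwarf32_first
};

// Moves DWARF32 contributions ahead of DWARF64 ones (keeping relative order
// within each group), assigns output offsets, and reports whether every
// DWARF32 contribution starts at an offset representable in 32 bits.
bool layout_dwarf32_first(std::vector<Contribution>& inputs) {
  std::stable_partition(inputs.begin(), inputs.end(),
                        [](const Contribution& c) { return !c.is_dwarf64; });
  std::uint64_t offset = 0;
  bool ok = true;
  for (auto& c : inputs) {
    c.offset = offset;
    if (!c.is_dwarf64 && offset > 0xffffffffULL) ok = false;
    offset += c.size;
  }
  return ok;
}
```

With this ordering, a small DWARF32 library stays addressable even when a single DWARF64 contribution is larger than 4GB; without it, the same library placed after the large contribution would start past the 32-bit limit.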

Looks like clang dodged all of those problems somehow:

rocm2 60 04/21 16:58 /build/jdelsign/fatty % llvm-dwarfdump fatty | grep "Compile Unit"
0x00000000: Compile Unit: length = 0x461a1e96, format = DWARF32, version = 0x0005, unit_type = DW_UT_compile, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x461a1e9a)
0x461a1e9a: Compile Unit: length = 0x461a1e99, format = DWARF32, version = 0x0005, unit_type = DW_UT_compile, abbr_offset = 0x0148, addr_size = 0x08 (next unit at 0x8c343d37)
0x8c343d37: Compile Unit: length = 0x461a1e99, format = DWARF32, version = 0x0005, unit_type = DW_UT_compile, abbr_offset = 0x0292, addr_size = 0x08 (next unit at 0xd24e5bd4)
0xd24e5bd4: Compile Unit: length = 0x461a1e99, format = DWARF32, version = 0x0005, unit_type = DW_UT_compile, abbr_offset = 0x03dc, addr_size = 0x08 (next unit at 0x118687a71)
0x118687a71: Compile Unit: length = 0x461a1e99, format = DWARF32, version = 0x0005, unit_type = DW_UT_compile, abbr_offset = 0x0526, addr_size = 0x08 (next unit at 0x15e82990e)
rocm2 61 04/21 17:13 /build/jdelsign/fatty %
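As a sanity check on the dump above, the unit offsets can be recomputed from the lengths: in DWARF32 each unit occupies its length plus the 4-byte initial length field. The sketch below hard-codes the five (offset, length) pairs from the llvm-dwarfdump output; the helper names are hypothetical, and it assumes C++14 or later for the constexpr loop.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

struct UnitHeader {
  std::uint64_t offset;       // offset of the unit in .debug_info
  std::uint32_t unit_length;  // DWARF32 length field value
};

// Values copied from the llvm-dwarfdump output above.
constexpr std::array<UnitHeader, 5> kUnits = {{
    {0x000000000ULL, 0x461a1e96u},
    {0x0461a1e9aULL, 0x461a1e99u},
    {0x08c343d37ULL, 0x461a1e99u},
    {0x0d24e5bd4ULL, 0x461a1e99u},
    {0x118687a71ULL, 0x461a1e99u},
}};

// A DWARF32 unit spans its 4-byte initial length field plus unit_length bytes.
constexpr std::uint64_t next_unit_offset(const UnitHeader& u) {
  return u.offset + u.unit_length + 4;
}

constexpr bool fits_in_32_bits(std::uint64_t v) { return v <= 0xffffffffULL; }

// True if each unit ends exactly where the next one begins.
constexpr bool chain_is_consistent() {
  for (std::size_t i = 0; i + 1 < kUnits.size(); ++i) {
    if (next_unit_offset(kUnits[i]) != kUnits[i + 1].offset) return false;
  }
  return true;
}

static_assert(chain_is_consistent(), "unit offsets should chain");
// The last unit ends at the .debug_info section size reported by readelf.
static_assert(next_unit_offset(kUnits[4]) == 0x15e82990eULL, "section size");
// Only the fifth unit starts beyond what a 32-bit section offset can express.
static_assert(fits_in_32_bits(kUnits[3].offset), "fourth unit below 4GB");
static_assert(!fits_in_32_bits(kUnits[4].offset), "fifth unit above 4GB");
```

Each unit is individually under 4GB, so its DWARF32 length field is valid; only a 32-bit section-relative reference to the fifth unit would overflow. The readelf listing above shows no .debug_aranges section in the clang build, which may be why the relocation that failed with g++ never came up.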

This should give the debuggers something to chew on for a while.

Cheers, John D.


- Dave


Cheers, John D.

On 4/21/23 13:28, John DelSignore wrote:

Thanks David, this is useful. I'll see what I can cobble together.

Cheers, John D.

On 4/20/23 21:58, David Blaikie wrote:
Oh, and I guess you could always make something even more artificial by hand - if you compile some random code with -g to assembly, you could then just pad out a .debug_info contribution with lots of zeros (there are some assembly directives for that, I think, but don't know assembly that well off hand) - would make it arbitrarily large without the need to tax the compiler creating novel/real DWARF, etc.

On Thu, Apr 20, 2023 at 6:54 PM David Blaikie <dblaikie@gmail.com> wrote:
I /believe/ that Chromium (maybe specifically on ARM? not sure) may have hit/had problems with the 4GB limit - probably trivially if you build with clang but pass `-fstandalone-debug` which disables many type reduction/deduplication strategies.

If you want something more standalone... this:


#define MEMBERS(BASE) \
  int BASE##0 (int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int); \
  int BASE##1 (int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int); \
  int BASE##2 (int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int); \
  int BASE##3 (int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int); \
  int BASE##4 (int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int); \
  int BASE##5 (int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int); \
  int BASE##6 (int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int); \
  int BASE##7 (int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int); \
  int BASE##8 (int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int); \
  int BASE##9 (int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int);
template<int ... i>
struct t1 {
  MEMBERS(f0)
  MEMBERS(f1)
  MEMBERS(f2)
  MEMBERS(f3)
  MEMBERS(f4)
  MEMBERS(f5)
  MEMBERS(f6)
  MEMBERS(f7)
  MEMBERS(f8)
  MEMBERS(f9)
};
#define ITER(A, B)    \
  template <int... i> \
  struct A {         \
    B<i..., 0> v0;   \
    B<i..., 1> v1;   \
    B<i..., 2> v2;   \
    B<i..., 3> v3;   \
    B<i..., 4> v4;   \
    B<i..., 5> v5;   \
    B<i..., 6> v6;   \
    B<i..., 7> v7;   \
    B<i..., 8> v8;   \
    B<i..., 9> v9;   \
  };
ITER(t2, t1);
ITER(t3, t2);
ITER(t4, t3);
ITER(t5, t4);
ITER(t6, t5);
ITER(t7, t6);
ITER(top, t7);
int main() {
  t6<> v;
}

This doesn't quite hit 4GB on its own: it's about 1.2GB in .debug_info (and takes 2.5 minutes to compile with clang), so 5 of these linked together should get you past 4GB (you could stamp them out by including this file into a few other source files and just changing the `main` function to some other name in each).

This specifically doesn't push the .debug_str section as hard - it's about half the size of the .debug_info in this program.



On Thu, Apr 20, 2023 at 7:08 AM John DelSignore via Dwarf-discuss <dwarf-discuss@lists.dwarfstd.org> wrote:
Is anyone aware of an open-source program or test program that when compiled and built on Linux x86_64, results in a .debug_info section that is greater than 4GB? I'm looking for a test program (realistic or not) that contains 32-bit DWARF CUs in a .debug_info section that is about 5GB long, or longer.

Thanks, John D.



This e-mail may contain information that is privileged or confidential. If you are not the intended recipient, please delete the e-mail and any attachments and notify us immediately.

--
Dwarf-discuss mailing list
Dwarf-discuss@lists.dwarfstd.org
https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss


CAUTION: This email originated from outside of the organization. Do not click on links or open attachments unless you recognize the sender and know the content is safe.


