I got the following from supplying a qemu_gmake.core (an armv6 file) to /usr/local/bin/gdb:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000006f3822 in get_core_register_section (regcache=0x80497ac00, regset=0x0, name=0x1291e6d ".reg", min_size=0, which=0, human_name=0x123deb4 "general-purpose", required=1) at corelow.c:544
544       if (size != min_size && !(regset->flags & REGSET_VARIABLE_SIZE))
(gdb) bt
#0  0x00000000006f3822 in get_core_register_section (regcache=0x80497ac00, regset=0x0, name=0x1291e6d ".reg", min_size=0, which=0, human_name=0x123deb4 "general-purpose", required=1) at corelow.c:544
#1  0x00000000006f256e in get_core_registers (ops=0x22556c8 <core_ops>, regcache=0x80497ac00, regno=15) at corelow.c:629
#2  0x00000000007d5cee in delegate_fetch_registers (self=0x22556c8 <core_ops>, arg1=0x80497ac00, arg2=15) at ./target-delegates.c:143
#3  0x00000000007d3ff5 in target_fetch_registers (regcache=0x80497ac00, regno=15) at target.c:3540
#4  0x00000000006ec2b8 in regcache_raw_read (regcache=0x80497ac00, regnum=15, buf=0x7fffffffdb40 "") at regcache.c:660
#5  0x00000000006ecc05 in regcache_cooked_read (regcache=0x80497ac00, regnum=15, buf=0x7fffffffdb40 "") at regcache.c:751
#6  0x00000000006ed1e0 in regcache_cooked_read_unsigned (regcache=0x80497ac00, regnum=15, val=0x7fffffffdbb0) at regcache.c:855
#7  0x00000000006ee486 in regcache_read_pc (regcache=0x80497ac00) at regcache.c:1221
#8  0x000000000075cb9e in post_create_inferior (target=0x22556c8 <core_ops>, from_tty=1) at infcmd.c:429
#9  0x00000000006f1ed6 in core_open (arg=0x8049d924a "qemu_gmake.core", from_tty=1) at corelow.c:407
#10 0x000000000085d00e in core_file_command (filename=0x8049d924a "qemu_gmake.core", from_tty=1) at corefile.c:77
#11 0x000000000063b75e in do_cfunc (c=0x8049af3a0, args=0x8049d924a "qemu_gmake.core", from_tty=1) at ./cli/cli-decode.c:105
#12 0x000000000063f538 in cmd_func (cmd=0x8049af3a0, args=0x8049d924a "qemu_gmake.core", from_tty=1) at ./cli/cli-decode.c:1913
#13 0x00000000008e229d in execute_command (p=0x8049d9258 "e", from_tty=1) at top.c:674
#14 0x000000000079c606 in command_handler (command=0x8049d9240 "") at event-top.c:628
#15 0x000000000079ca8f in command_line_handler (rl=0x80481c020 " \222\235\004\b") at event-top.c:820
#16 0x000000000079bf73 in gdb_rl_callback_handler (rl=0x80481c020 " \222\235\004\b") at event-top.c:200
#17 0x0000000802280fa4 in rl_callback_read_char () from /usr/local/lib/libreadline.so.6
#18 0x000000000079bbff in gdb_rl_callback_read_char_wrapper (client_data=0x804821000) at event-top.c:173
#19 0x000000000079c438 in stdin_event_handler (error=0, client_data=0x804821000) at event-top.c:555
#20 0x000000000079ba92 in handle_file_event (file_ptr=0x804893d70, ready_mask=1) at event-loop.c:733
#21 0x000000000079a521 in gdb_wait_for_event (block=1) at event-loop.c:859
#22 0x0000000000799ecc in gdb_do_one_event () at event-loop.c:347
#23 0x000000000079a6f7 in start_event_loop () at event-loop.c:371
#24 0x0000000000793b37 in captured_command_loop (data=0x0) at main.c:324
#25 0x000000000078c7e5 in catch_errors (func=0x793af0 <captured_command_loop(void*)>, func_args=0x0, errstring=0x1cc597f "", mask=RETURN_MASK_ALL) at exceptions.c:236
#26 0x0000000000793370 in captured_main (data=0x7fffffffe5e8) at main.c:1149
#27 0x0000000000792038 in gdb_main (args=0x7fffffffe5e8) at main.c:1159
#28 0x0000000000408ac9 in main (argc=2, argv=0x7fffffffe678) at gdb.c:38

The crash is from regset being NULL in:

507 static void
508 get_core_register_section (struct regcache *regcache,
509                            const struct regset *regset,
510                            const char *name,
511                            int min_size,
512                            int which,
513                            const char *human_name,
514                            int required)
. . . (no references to regset) . . .
544   if (size != min_size && !(regset->flags & REGSET_VARIABLE_SIZE))
545     {
546       warning (_("Unexpected size of section `%s' in core file."),
547                section_name);
548     }
. . .

There are calls that pass NULL for regset as a constant argument, for example:

627   else
628     {
629       get_core_register_section (regcache, NULL,
630                                  ".reg", 0, 0, "general-purpose", 1);
631       get_core_register_section (regcache, NULL,
632                                  ".reg2", 0, 2, "floating-point", 0);
633     }

The call at line 629 is the one in the crash backtrace listed above.
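To make the failure mode concrete: the test at line 544 reads regset->flags unconditionally, while the callers at lines 629-632 pass NULL. Below is a minimal sketch of one way to guard that test. It is not the patch that was later attached or committed. The struct here is a stand-in with only a flags field, the REGSET_VARIABLE_SIZE value is assumed, and size_is_unexpected is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for gdb's struct regset; only the field read at line 544. */
struct regset
{
  unsigned flags;
};

/* Assumed value, for illustration only; gdb has its own definition. */
#define REGSET_VARIABLE_SIZE 1u

/* Hypothetical helper: returns 1 when the "Unexpected size" warning
   would be issued, 0 otherwise.  A NULL regset is treated as having no
   flags set, so the flags test never dereferences a null pointer.  */
int
size_is_unexpected (const struct regset *regset, int size, int min_size)
{
  unsigned flags = (regset != NULL) ? regset->flags : 0u;

  assert (size >= 0 && min_size >= 0);
  return size != min_size && !(flags & REGSET_VARIABLE_SIZE);
}
```

With min_size=0 and regset=NULL, as in the call at line 629, this version would still warn for any non-empty section. Whether a NULL regset should instead suppress the size check entirely is a design choice the sketch does not settle.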
Thanks for the report and the analysis. I have a silly workaround in mind, but my gdb does not recognize the core files I can generate with qemu as core files, so I cannot test it. Do you have a core file you could send me?
I'll note that I use the likes of:

# /usr/local/bin/gdb
GNU gdb (GDB) 7.12 [GDB v7.12 for FreeBSD]
. . .
(gdb) set gnutarget arm-gnueabi-freebsd
(gdb) core-file qemu_gmake.core
[New process 51247]
Segmentation fault (core dumped)

Without the gnutarget specification I just get:

"/root/poudriere_failure/work/binutils-2.27/ld/qemu_gmake.core" is not a core dump: File format is ambiguous

Does doing similarly let you progress?

As for providing the core file. . . It is from a gmake crash (GPLv3). As near as I can tell uploading the arm gmake file and/or the qemu_gmake.core file is a binary distribution under GPLv3 and would introduce whatever obligations GPLv3 indicates. I'll look and see if I'm willing to do whatever GPLv3 indicates. Otherwise I'd need to come up with an example core file that does not impose such obligations.
Created attachment 179006
armv6 (cortex-a7) gmake core that crashes /usr/local/bin/gdb

Looks like GPLv3 clause 6d is reasonable to use for uploading the core file as far as obligations go, as long as I explicitly point out that the source is available by combining ftp.gnu.org and svn.freebsd.org materials (in the classic FreeBSD port manner). The sources are available from, for example:

http://ftp.gnu.org/gnu/make/. . . (original sources)
svn://svn.freebsd.org/ports/devel/gmake/. . . (FreeBSD port materials)

The FreeBSD context determines the vintage of the gnu make materials and was head -r431413 of the FreeBSD ports tree. The PORTVERSION was 4.2.1 and the PORTREVISION was 1. DISTNAME was make-4.2.1. FreeBSD's files folder for gmake has a patch-default.c file that is used during the build.
Hi Mark, the gnutarget I was using was wrong; that was the problem. Thanks for the help. Are you running an amd64 gdb 7.12 to load the armv6 core file, or an armv6 gdb?
(In reply to luca.pizzamiglio from comment #4)

I'm running an amd64 /usr/local/bin/gdb 7.12 in an amd64 FreeBSD context to look at the arm core file:

# uname -apKU
FreeBSD FreeBSDx64 12.0-CURRENT FreeBSD 12.0-CURRENT #13 r312009M: Thu Jan 12 20:11:34 PST 2017 markmi@FreeBSDx64:/usr/obj/amd64_clang/amd64.amd64/usr/src/sys/GENERIC-NODBG amd64 amd64 1200019 1200019

# svnlite info /usr/ports/ | grep "Re[plv]"
Relative URL: ^/head
Repository Root: svn://svn.freebsd.org/ports
Repository UUID: 35697150-7ecd-e111-bb59-0022644237b5
Revision: 431413
Last Changed Rev: 431413

# /usr/local/bin/gdb
GNU gdb (GDB) 7.12 [GDB v7.12 for FreeBSD]
. . .

The gmake and qemu_gmake.core are from a context based on:

poudriere . . . -a arm.armv6 -x . . .

activity to build ports, my first experiments with such cross building of ports.
Hi Mark, thanks for the full explanation. It's quite strange: I use poudriere to cross build gdb for arm and mips without problems. I used the poudriere jail to create a binary that crashes, producing a core file. The file utility tells me that the binary and the core file are correct:

$ file ccreator.arm
ccreator.arm: ELF 32-bit LSB executable, ARM, EABI5 version 1 (FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, for FreeBSD 11.0 (1100122), FreeBSD-style, not stripped
$ file ccreator.arm.core
ccreator.arm.core: ELF 32-bit LSB core file ARM, version 1 (FreeBSD), FreeBSD-style, from 'creator.arm'

In gdb I can load ccreator.arm, but not the core file:

(gdb) file ./ccreator.arm
Reading symbols from ./ccreator.arm...done
(gdb) core-file ./ccreator.arm.core
"./ccreator.arm.core" is not a core dump: File format not recognized

I'll post a patch here so that you can test whether it works.
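For comparison with the file output above, here is a rough sketch of the ELF header fields that identify a file as a 32-bit little-endian ARM core. The offsets and values come from the ELF specification (e_type 4 is ET_CORE, e_machine 40 is EM_ARM). The function and helper names are hypothetical, and the sketch says nothing about how bfd or gdb actually select a target.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read a 16-bit little-endian value from a byte buffer. */
static unsigned
read_le16 (const unsigned char *p)
{
  return (unsigned) p[0] | ((unsigned) p[1] << 8);
}

/* Hypothetical helper: returns 1 if the first bytes of a file look like
   a 32-bit little-endian ARM ELF core header, 0 otherwise.  */
int
is_arm32_le_core (const unsigned char *hdr, size_t len)
{
  assert (hdr != NULL);

  if (len < 20)                           /* need e_ident, e_type, e_machine */
    return 0;
  if (memcmp (hdr, "\177ELF", 4) != 0)    /* ELF magic */
    return 0;
  if (hdr[4] != 1 || hdr[5] != 1)         /* EI_CLASS: 32-bit, EI_DATA: LSB */
    return 0;

  /* e_type at offset 16 must be ET_CORE (4), e_machine at offset 18
     must be EM_ARM (40).  */
  return read_le16 (hdr + 16) == 4 && read_le16 (hdr + 18) == 40;
}
```

A core that passes this kind of check and is still rejected with "File format not recognized" points at target selection rather than at the file itself, which is consistent with the gnutarget discussion earlier in the thread.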
Created attachment 179008
Silly workaround, still not working

This is my silly workaround. It avoids the crash, but it doesn't solve the problem. I need more time to understand whether the problem is in the bfd library or in gdb itself...
(In reply to luca.pizzamiglio from comment #6) I did provide an attachment with the qemu_gmake.core file (in a compressed tar). I'll also note that the qemu_gmake.core file name was what poudriere produced in the tar archive for the failure (I was using poudriere -w). So you should be able to look at the exact file content that I get the problem with.
(In reply to Mark Millard from comment #8)

Thanks for the core file. I've attached the silly patch and used your core file with my patched gdb: the crash is gone, but the problem is not solved. There is some issue accessing the register information. I'm not sure if the problem is in qemu, gdb or bfd (the bundled library for the binary format).
(In reply to luca.pizzamiglio from comment #9)

Just an FYI: lldb on an arm (bpim3) was able to get a useful interpretation of the qemu_gmake.core file when also given a copy of gmake. Below I give some information for potential comparison uses.

For "TCG temporary leak before 00021826" the symbol dump in address order shows:

Dumping symbol table for 4 modules.
Symtab, file = /usr/local/bin/gmake, num_symbols = 957 (sorted by address):
               Debug symbol
               |Synthetic symbol
               ||Externally Visible
               |||
Index   UserID DSX Type            File Address/Value Load Address       Size               Flags      Name
------- ------ --- --------------- ------------------ ------------------ ------------------ ---------- ----------------------------------
. . .
[  538]   6121   X Code            0x0000000000021820 0x00029820         0x0000000000000038 0x00000012 child_handler
[  592]   6175   X Code            0x0000000000021858 0x00029858         0x0000000000000d7c 0x00000012 reap_children
. . .

This tends to confirm that the SIGCHLD handling is involved. And objdump on gmake shows:

00021820 <child_handler>       push {fp, lr}
00021824 <child_handler+0x4>   mov  fp, sp
00021828 <child_handler+0x8>   sub  sp, sp, #8
0002182c <child_handler+0xc>   mov  r1, r0
00021830 <child_handler+0x10>  str  r0, [sp, #4]
00021834 <child_handler+0x14>  movw r0, #36636 ; 0x8f1c
00021838 <child_handler+0x18>  movt r0, #5
0002183c <child_handler+0x1c>  ldr  r2, [r0]
00021840 <child_handler+0x20>  add  r2, r2, #1
00021844 <child_handler+0x24>  str  r2, [r0]
00021848 <child_handler+0x28>  str  r1, [sp]
0002184c <child_handler+0x2c>  bl   0002e9f0 <jobserver_signal>
00021850 <child_handler+0x30>  mov  sp, fp
00021854 <child_handler+0x34>  pop  {fp, pc}

Interestingly 00021826 is between instructions, and lldb reported for the registers:

(lldb) register read
General Purpose Registers:
        r0 = 0x9fffc0f8
        r1 = 0x9fffc138
        r2 = 0x000a18c0
        r3 = 0xf4fde858
        r4 = 0x9fffc138
        r5 = 0xf4a00000
        r6 = 0xb6db6db7
        r7 = 0x00000012
        r8 = 0xf4a0c000
        r9 = 0xf4aa18c0
       r10 = 0x9fffc260
       r11 = 0x00000004
       r12 = 0x9fffc0f8
        sp = 0x9fffc0f8
        lr = 0x9fffffcc
        pc = 0x00021822
      cpsr = 0x80000030

i.e., the pc is 0x00021822. That is in the middle of the "push {fp, lr}" instruction and 4 bytes before the 00021826 figure. If it really tried to fetch an instruction at 0x00021822, that likely would also explain getting a SIGILL classification for the 4 bytes starting there.
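The address arithmetic above can be checked mechanically. This is a small sketch that assumes the 4-byte instruction encoding shown in the objdump listing; both helper names are hypothetical. It also extracts bit 5 of the reported cpsr, which is the T (Thumb state) bit in the ARM architecture. For 0x80000030 that bit is set, which may be worth looking at alongside the half-word-aligned pc, although nothing in this thread establishes that as the cause.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of pc inside the instruction that
   starts at insn_addr, or -1 if pc lies outside that instruction.  */
int
offset_within_insn (uint32_t insn_addr, unsigned insn_size, uint32_t pc)
{
  assert (insn_size == 2 || insn_size == 4);

  if (pc < insn_addr || pc - insn_addr >= insn_size)
    return -1;
  return (int) (pc - insn_addr);
}

/* Hypothetical helper: value of CPSR bit 5, the T (Thumb state) bit. */
int
cpsr_thumb_bit (uint32_t cpsr)
{
  return (int) ((cpsr >> 5) & 1u);
}
```

For the values reported here, offset_within_insn (0x00021820, 4, 0x00021822) gives 2, i.e. the pc sits two bytes into the push, and cpsr_thumb_bit (0x80000030) gives 1.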
(In reply to Mark Millard from comment #10)

I should have warned that the example is from a later failure in gmake, with a different qemu_gmake.core file. I was really only thinking of the PC information for comparison, not the other register values. If needed I can produce more information from a specific qemu_gmake.core file and make sure that you have that file and the matching information. Outside gmake (such as in system libraries) things may not be a full match between the live arm system and what poudriere has going in its qemu use.
(In reply to Mark Millard from comment #10 and #11)

Hi Mark, thanks for your analysis. I worked on the gdb side and found that the problem is in the target definition area: gdb is using the CRIS architecture functions to interpret the register values stored in the core file, which obviously cannot work. CRIS is a RISC CPU that has nothing to do with ARM, and I have no idea why it is being selected.
A commit references this bug:

Author: olivier
Date: Tue Feb 14 10:29:38 UTC 2017
New revision: 434072
URL: https://svnweb.freebsd.org/changeset/ports/434072

Log:
  Update to 7.12.1

  Updating gdb to the last stable version and cleaning it up.

  PR:           217090
  Submitted by: luca.pizzamiglio@gmail.com (maintainer)

  PR:           216027 - Recognizing the compiler to adopt options properly
  Reported by:  julian@FreeBSD.org

  PR:           216132 - Fixing the segmentation fault, but arm core dump not yet usable
  Reported by:  Mark Millard

Changes:
  head/devel/gdb/Makefile
  head/devel/gdb/distinfo
  head/devel/gdb/files/patch-gdb-corelow.c
Can we close this PR now?
(In reply to Olivier Cochard from comment #14)

Unfortunately, we cannot close this PR. The crash is gone, but gdb is still not using the arm bfd functions to interpret an arm core file; it uses the cris architecture definition instead. Please leave this bug report open.
Is this PR still relevant after we switched to an upstream FreeBSD/arm target?
See question comment18!
...meant comment16.
I don't find that it uses cris, but I do find that it uses the wrong arm OSABI by default unless I manually set the architecture via 'set architecture', and that is only on 32-bit ARM. I don't think it crashes anymore, though, and if it picks the wrong OSABI that should probably be a separate bug.