Bug 216132 - devel/gdb: at e.g. -r431413: get_core_register_section can get SIGSEGV from NULL regset arguments
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-ports-bugs (Nobody)
Reported: 2017-01-16 07:30 UTC by Mark Millard
Modified: 2019-08-13 15:32 UTC (History)

See Also:
luca.pizzamiglio: maintainer-feedback+


Attachments
armv6 (cortex-a7) gmake core that crashes /usr/local/bin/gdb (274.99 KB, application/x-gzip)
2017-01-17 20:53 UTC, Mark Millard
Silly workaround, still not working (1.67 KB, patch)
2017-01-17 21:55 UTC, luca.pizzamiglio

Description Mark Millard 2017-01-16 07:30:51 UTC
I got the following from supplying a qemu_gmake.core (an armv6 file) to
/usr/local/bin/gdb :

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000006f3822 in get_core_register_section (regcache=0x80497ac00, regset=0x0, name=0x1291e6d ".reg", min_size=0, which=0, human_name=0x123deb4 "general-purpose", required=1) at corelow.c:544
544       if (size != min_size && !(regset->flags & REGSET_VARIABLE_SIZE))
(gdb) bt
#0  0x00000000006f3822 in get_core_register_section (regcache=0x80497ac00, regset=0x0, name=0x1291e6d ".reg", min_size=0, which=0, human_name=0x123deb4 "general-purpose", required=1) at corelow.c:544
#1  0x00000000006f256e in get_core_registers (ops=0x22556c8 <core_ops>, regcache=0x80497ac00, regno=15) at corelow.c:629
#2  0x00000000007d5cee in delegate_fetch_registers (self=0x22556c8 <core_ops>, arg1=0x80497ac00, arg2=15) at ./target-delegates.c:143
#3  0x00000000007d3ff5 in target_fetch_registers (regcache=0x80497ac00, regno=15) at target.c:3540
#4  0x00000000006ec2b8 in regcache_raw_read (regcache=0x80497ac00, regnum=15, buf=0x7fffffffdb40 "") at regcache.c:660
#5  0x00000000006ecc05 in regcache_cooked_read (regcache=0x80497ac00, regnum=15, buf=0x7fffffffdb40 "") at regcache.c:751
#6  0x00000000006ed1e0 in regcache_cooked_read_unsigned (regcache=0x80497ac00, regnum=15, val=0x7fffffffdbb0) at regcache.c:855
#7  0x00000000006ee486 in regcache_read_pc (regcache=0x80497ac00) at regcache.c:1221
#8  0x000000000075cb9e in post_create_inferior (target=0x22556c8 <core_ops>, from_tty=1) at infcmd.c:429
#9  0x00000000006f1ed6 in core_open (arg=0x8049d924a "qemu_gmake.core", from_tty=1) at corelow.c:407
#10 0x000000000085d00e in core_file_command (filename=0x8049d924a "qemu_gmake.core", from_tty=1) at corefile.c:77
#11 0x000000000063b75e in do_cfunc (c=0x8049af3a0, args=0x8049d924a "qemu_gmake.core", from_tty=1) at ./cli/cli-decode.c:105
#12 0x000000000063f538 in cmd_func (cmd=0x8049af3a0, args=0x8049d924a "qemu_gmake.core", from_tty=1) at ./cli/cli-decode.c:1913
#13 0x00000000008e229d in execute_command (p=0x8049d9258 "e", from_tty=1) at top.c:674
#14 0x000000000079c606 in command_handler (command=0x8049d9240 "") at event-top.c:628
#15 0x000000000079ca8f in command_line_handler (rl=0x80481c020 " \222\235\004\b") at event-top.c:820
#16 0x000000000079bf73 in gdb_rl_callback_handler (rl=0x80481c020 " \222\235\004\b") at event-top.c:200
#17 0x0000000802280fa4 in rl_callback_read_char () from /usr/local/lib/libreadline.so.6
#18 0x000000000079bbff in gdb_rl_callback_read_char_wrapper (client_data=0x804821000) at event-top.c:173
#19 0x000000000079c438 in stdin_event_handler (error=0, client_data=0x804821000) at event-top.c:555
#20 0x000000000079ba92 in handle_file_event (file_ptr=0x804893d70, ready_mask=1) at event-loop.c:733
#21 0x000000000079a521 in gdb_wait_for_event (block=1) at event-loop.c:859
#22 0x0000000000799ecc in gdb_do_one_event () at event-loop.c:347
#23 0x000000000079a6f7 in start_event_loop () at event-loop.c:371
#24 0x0000000000793b37 in captured_command_loop (data=0x0) at main.c:324
#25 0x000000000078c7e5 in catch_errors (func=0x793af0 <captured_command_loop(void*)>, func_args=0x0, errstring=0x1cc597f "", mask=RETURN_MASK_ALL) at exceptions.c:236
#26 0x0000000000793370 in captured_main (data=0x7fffffffe5e8) at main.c:1149
#27 0x0000000000792038 in gdb_main (args=0x7fffffffe5e8) at main.c:1159
#28 0x0000000000408ac9 in main (argc=2, argv=0x7fffffffe678) at gdb.c:38


The crash is from regset being NULL in:

507     static void
508     get_core_register_section (struct regcache *regcache,
509                                const struct regset *regset,
510                                const char *name,
511                                int min_size,
512                                int which,
513                                const char *human_name,
514                                int required)
. . . (no references to regset) . . .
544       if (size != min_size && !(regset->flags & REGSET_VARIABLE_SIZE))
545         {
546           warning (_("Unexpected size of section `%s' in core file."),
547                    section_name);
548         }
. . .

There are calls around with regset set to NULL as a constant
argument, for example:

627       else
628         {
629           get_core_register_section (regcache, NULL,
630                                      ".reg", 0, 0, "general-purpose", 1);
631           get_core_register_section (regcache, NULL,
632                                      ".reg2", 0, 2, "floating-point", 0);
633         }

The call at line 629 is the one in the crash backtrace listed above.
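
The failure mode described above can be sketched in isolation. This is a minimal sketch of one possible guard, not the patch that was committed to the port: `struct regset` and `REGSET_VARIABLE_SIZE` below are simplified stand-ins for the gdb definitions, and `section_size_is_unexpected` is a hypothetical helper name.

```c
#include <stddef.h>

/* Stand-in for the gdb flag; the value here is an assumption. */
#define REGSET_VARIABLE_SIZE 1

/* Stand-in for gdb's struct regset; only the field used below is modeled. */
struct regset
{
  unsigned flags;
};

/* Return nonzero when the "Unexpected size of section" warning would be
   issued.  A NULL regset is treated as having no flags set, so it is
   never dereferenced. */
int
section_size_is_unexpected (const struct regset *regset, int size,
                            int min_size)
{
  int variable_size = (regset != NULL
                       && (regset->flags & REGSET_VARIABLE_SIZE) != 0);

  return size != min_size && !variable_size;
}
```

With the arguments of the call at line 629 (regset NULL, min_size 0), this sketch reports a size mismatch for any non-empty section instead of crashing. Whether a warning is the right behavior in that case is a separate question.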
Comment 1 luca.pizzamiglio 2017-01-17 16:24:21 UTC
Thanks for reporting and the analysis.

I've a silly workaround in mind, but the core files I'm able to generate with qemu are not recognized as core files by my gdb and I cannot test it.

Do you have a core file to send to me?
Comment 2 Mark Millard 2017-01-17 19:41:47 UTC
I'll note that I use the likes of:

# /usr/local/bin/gdb
GNU gdb (GDB) 7.12 [GDB v7.12 for FreeBSD]
. . .
(gdb) set gnutarget arm-gnueabi-freebsd
(gdb) core-file qemu_gmake.core 
[New process 51247]
Segmentation fault (core dumped)

Without the gnutarget specification I just
get:

"/root/poudriere_failure/work/binutils-2.27/ld/qemu_gmake.core" is not a core dump: File format is ambiguous

Does doing similarly let you progress?


As for providing the core file. . .

It is from a gmake crash (GPLv3).

As near as I can tell uploading the arm
gmake file and/or the qemu_gmake.core file
is a binary distribution under GPLv3
and would introduce whatever obligations
GPLv3 indicates.

I'll look and see if I'm willing to do whatever
GPLv3 indicates. Otherwise I'd need to come
up with an example core file that does not
impose such obligations.
Comment 3 Mark Millard 2017-01-17 20:53:51 UTC
Created attachment 179006
armv6 (cortex-a7) gmake core that crashes /usr/local/bin/gdb

Looks like GPLv3 clause 6d is reasonable to use for
uploading the core file as far as obligations go,
as long as I explicitly point out that the source is
available by combining ftp.gnu.org and
svn.freebsd.org materials (in the classic FreeBSD
port manner). The sources are available from,
for example:

http://ftp.gnu.org/gnu/make/. . . (original sources)
svn://svn.freebsd.org/ports/devel/gmake/. . . (FreeBSD port materials)

The FreeBSD context determines the vintage of the gnu
make materials and was head -r431413 on FreeBSD. The
PORTVERSION was 4.2.1 and the PORTREVISION was 1 .
DISTNAME was make-4.2.1 . FreeBSD's files folder for
gmake has a patch-default.c file that is used during
the build.
Comment 4 luca.pizzamiglio 2017-01-17 20:54:46 UTC
Hi Mark,

the gnutarget I was using was wrong, that was the problem.
Thanks for the help.

Are you running an amd64 gdb 7.12 to load an armv6 core file or an armv6 gdb to load the armv6 core file?
Comment 5 Mark Millard 2017-01-17 21:17:03 UTC
(In reply to luca.pizzamiglio from comment #4)

I'm running an amd64 /usr/local/bin/gdb 7.12 in an amd64 FreeBSD context
to look at the arm core file:

# uname -apKU
FreeBSD FreeBSDx64 12.0-CURRENT FreeBSD 12.0-CURRENT #13 r312009M: Thu Jan 12 20:11:34 PST 2017     markmi@FreeBSDx64:/usr/obj/amd64_clang/amd64.amd64/usr/src/sys/GENERIC-NODBG  amd64 amd64 1200019 1200019

# svnlite info /usr/ports/ | grep "Re[plv]"
Relative URL: ^/head
Repository Root: svn://svn.freebsd.org/ports
Repository UUID: 35697150-7ecd-e111-bb59-0022644237b5
Revision: 431413
Last Changed Rev: 431413

# /usr/local/bin/gdb
GNU gdb (GDB) 7.12 [GDB v7.12 for FreeBSD]
. . .

The gmake and qemu_gmake.core are from a context based
on:

poudriere . . . -a arm.armv6 -x . . .

activity to build ports, my first experiments with such
cross building of ports.
Comment 6 luca.pizzamiglio 2017-01-17 21:48:44 UTC
hi Mark,

thanks for the full explanation.

It's quite strange: I use poudriere to cross build gdb for arm and mips, without problem.

I used the poudriere jail to create a binary that crashes, producing a core file.

The file utility tells me that the binary and the core file are correct:

$ file ccreator.arm
ccreator.arm: ELF 32-bit LSB executable, ARM, EABI5 version 1 (FreeBSD), dynamically linked, interpreter /libexec/ld-elf.so.1, for FreeBSD 11.0 (1100122), FreeBSD-style, not stripped

$ file ccreator.arm.core
ccreator.arm.core: ELF 32-bit LSB core file ARM, version 1 (FreeBSD), FreeBSD-style, from 'creator.arm'

In gdb I can load the ccreator.arm

(gdb) file ./ccreator.arm
Reading symbols from ./ccreator.arm...done
(gdb) core-file ./ccreator.arm.core
"./ccreator.arm.core" is not a core dump: File format not recognized

I'll post here a patch and you can test it, if it works.
Comment 7 luca.pizzamiglio 2017-01-17 21:55:22 UTC
Created attachment 179008
Silly workaround, still not working

This is my silly workaround,

it avoids the crash, but it doesn't solve the problem.
I need more time to understand if the problem is in the bfd library or in gdb itself...
Comment 8 Mark Millard 2017-01-17 21:59:12 UTC
(In reply to luca.pizzamiglio from comment #6)

I did provide an attachment with the qemu_gmake.core file
(in a compressed tar).

I'll also note that the qemu_gmake.core file name was what
poudriere produced in the tar archive for the failure (I was
using poudriere -w).

So you should be able to look at the exact file content
that I get the problem with.
Comment 9 luca.pizzamiglio 2017-01-17 22:19:03 UTC
(In reply to Mark Millard from comment #8)

Thanks for the core file.
I've attached the silly patch and I've used your core file with my patched gdb, solving the crash, but not the problem.

There is some issue accessing the register information. I'm not sure if the problem is qemu, gdb or bfd (the bundled library for the binary format)
Comment 10 Mark Millard 2017-01-26 22:23:31 UTC
(In reply to luca.pizzamiglio from comment #9)

Just an FYI: lldb on an arm (bpim3) was able to get a
useful interpretation of the qemu_gmake.core file
when also given a copy of gmake.

Below I give some information for potential comparison
uses:

For "TCG temporary leak before 00021826" the symbol dump in address
order shows:

Dumping symbol table for 4 modules.
Symtab, file = /usr/local/bin/gmake, num_symbols = 957 (sorted by address):
              Debug symbol
              |Synthetic symbol
              ||Externally Visible
              |||
Index   UserID DSX Type            File Address/Value Load Address       Size               Flags      Name
------- ------ --- --------------- ------------------ ------------------ ------------------ ---------- ----------------------------------
. . .
[  538]   6121   X Code            0x0000000000021820 0x00029820 0x0000000000000038 0x00000012 child_handler
[  592]   6175   X Code            0x0000000000021858 0x00029858 0x0000000000000d7c 0x00000012 reap_children
. . .

This looks like it tends to confirm the SIGCHLD handling is involved.
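
The reasoning above amounts to a range lookup: an address belongs to a symbol when it falls inside that symbol's file address plus size. A minimal sketch using only the two entries quoted from the dump follows; the type and function names (`symbol_range`, `find_symbol`) are hypothetical and are not lldb APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of the symbol dump: name, file address, and size in bytes. */
struct symbol_range
{
  const char *name;
  uint32_t file_address;
  uint32_t size;
};

/* The two entries quoted from the lldb symbol dump above. */
const struct symbol_range gmake_symbols[] = {
  { "child_handler", 0x00021820u, 0x38u },
  { "reap_children", 0x00021858u, 0xd7cu },
};

/* Return the entry whose [file_address, file_address + size) range
   contains addr, or NULL if no quoted entry contains it. */
const struct symbol_range *
find_symbol (uint32_t addr)
{
  size_t count = sizeof gmake_symbols / sizeof gmake_symbols[0];

  for (size_t i = 0; i < count; i++)
    {
      const struct symbol_range *sym = &gmake_symbols[i];

      if (addr >= sym->file_address
          && addr - sym->file_address < sym->size)
        return sym;
    }
  return NULL;
}
```

Looking up 0x00021826 with this table lands in child_handler at offset 6, which is why the message points at the SIGCHLD handler.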

And objdump on gmake shows:

00021820 <child_handler> push   {fp, lr}
00021824 <child_handler+0x4> mov        fp, sp
00021828 <child_handler+0x8> sub        sp, sp, #8
0002182c <child_handler+0xc> mov        r1, r0
00021830 <child_handler+0x10> str       r0, [sp, #4]
00021834 <child_handler+0x14> movw      r0, #36636      ; 0x8f1c
00021838 <child_handler+0x18> movt      r0, #5
0002183c <child_handler+0x1c> ldr       r2, [r0]
00021840 <child_handler+0x20> add       r2, r2, #1
00021844 <child_handler+0x24> str       r2, [r0]
00021848 <child_handler+0x28> str       r1, [sp]
0002184c <child_handler+0x2c> bl        0002e9f0 <jobserver_signal>
00021850 <child_handler+0x30> mov       sp, fp
00021854 <child_handler+0x34> pop       {fp, pc}

Interestingly 00021826 is between instructions and
lldb reported for the registers:

(lldb) register read
General Purpose Registers:
       r0 = 0x9fffc0f8
       r1 = 0x9fffc138
       r2 = 0x000a18c0
       r3 = 0xf4fde858
       r4 = 0x9fffc138
       r5 = 0xf4a00000
       r6 = 0xb6db6db7
       r7 = 0x00000012
       r8 = 0xf4a0c000
       r9 = 0xf4aa18c0
      r10 = 0x9fffc260
      r11 = 0x00000004
      r12 = 0x9fffc0f8
       sp = 0x9fffc0f8
       lr = 0x9fffffcc
       pc = 0x00021822
     cpsr = 0x80000030

i.e., the pc is 0x00021822. That would be in the
middle of the "push   {fp, lr}" instruction and 4
bytes before the 00021826 figure.

If it really tried to fetch an instruction at
0x00021822 that likely would also explain getting a
SIGILL classification for the 4 bytes starting
there.
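
The alignment argument above can be checked mechanically. The sketch below assumes the code is meant to run in ARM (A32) state, where instructions are 4 bytes long and 4-byte aligned, which matches the objdump listing. It also decodes bit 5 of the quoted cpsr value, the Thumb state bit in the ARM CPSR layout. That second check is this sketch's own reading of the register dump, not a conclusion stated in the thread, and the function names are hypothetical.

```c
#include <stdint.h>

/* T (Thumb state) bit in the ARM CPSR. */
#define CPSR_THUMB_BIT (1u << 5)

/* ARM-state (A32) instructions are 4 bytes long and 4-byte aligned,
   so a valid A32 pc has its two low bits clear. */
int
is_a32_instruction_boundary (uint32_t pc)
{
  return (pc & 3u) == 0;
}

/* Nonzero when the CPSR value says the core was in Thumb state. */
int
cpsr_is_thumb_state (uint32_t cpsr)
{
  return (cpsr & CPSR_THUMB_BIT) != 0;
}
```

For the values quoted above, 0x00021822 is not an A32 instruction boundary, and 0x80000030 has the Thumb bit set.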
Comment 11 Mark Millard 2017-01-26 23:00:56 UTC
(In reply to Mark Millard from comment #10)

I should have warned that the example is from a later failure
in gmake, a different qemu_gmake.core file.

I was really only thinking of the PC information for
comparison, not the other register values.

If needed I can produce more information from a specific
qemu_gmake.core file and make sure that you have that file
and the matching information.

Outside gmake (such as system libraries) things may not
be a full match between the live arm system and what
poudriere has going in its qemu use.
Comment 12 luca.pizzamiglio 2017-01-27 09:32:33 UTC
(In reply to Mark Millard from comment #10 and #11)

Hi Mark, thanks for your analysis.
I worked on the gdb side and found that the problem is in the target definition area: gdb is using CRIS architecture functions to interpret the register values stored in the core file; obviously, that doesn't work.
CRIS is a RISC CPU that has nothing to do with ARM, and I have no idea why this wrong setting is selected.
Comment 13 commit-hook freebsd_committer freebsd_triage 2017-02-14 10:29:50 UTC
A commit references this bug:

Author: olivier
Date: Tue Feb 14 10:29:38 UTC 2017
New revision: 434072
URL: https://svnweb.freebsd.org/changeset/ports/434072

Log:
  Update to 7.12.1
  Updating gdb to the last stable version and cleaning it up.

  PR:		217090
  Submitted by:	luca.pizzamiglio@gmail.com (maintainer)

  PR:		216027
    - Recognizing the compiler to adopt options properly
    Reported by:   julian@FreeBSD.org

  PR:		216132
    - Fixing the segmentation fault, but arm core dump not yet usable
    Reported by:   Mark Millard

Changes:
  head/devel/gdb/Makefile
  head/devel/gdb/distinfo
  head/devel/gdb/files/patch-gdb-corelow.c
Comment 14 Olivier Cochard freebsd_committer freebsd_triage 2017-02-14 10:31:54 UTC
Can we close this PR now?
Comment 15 luca.pizzamiglio 2017-02-14 10:37:08 UTC
(In reply to Olivier Cochard from comment #14)

Unfortunately, we cannot close this PR. The crash is gone, but gdb interprets an arm core file with the cris architecture definition instead of the arm bfd functions.

Please, leave this bug report open.
Comment 16 John Baldwin freebsd_committer freebsd_triage 2018-09-09 05:00:31 UTC
Is this PR still relevant after we switched to an upstream FreeBSD/arm target?
Comment 17 Walter Schwarzenfeld 2019-08-13 15:20:48 UTC
See question comment18!
Comment 18 Walter Schwarzenfeld 2019-08-13 15:21:35 UTC
...meant comment16.
Comment 19 John Baldwin freebsd_committer freebsd_triage 2019-08-13 15:32:47 UTC
I don't find that it uses cris, but I do find that it uses the wrong arm OSABI by default unless I manually set the architecture via 'set architecture', and that is only on 32-bit ARM.  I don't think it crashes anymore, though, and if it picks the wrong OSABI that should probably be a separate bug.