Bug 257036 - devel/gdb: Fails to run gdb/thread.c:1309: internal-error: void switch_to_thread(thread_info *): Assertion `thr != NULL' failed.
Status: In Progress
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: riscv Any
Importance: --- Affects Only Me
Assignee: Luca Pizzamiglio
URL:
Keywords: needs-qa
Reported: 2021-07-07 05:54 UTC by dgilbert
Modified: 2021-07-08 21:45 UTC
pizzamig: maintainer-feedback+


Description dgilbert 2021-07-07 05:54:22 UTC
When debugging vmcore.3 against its kernel (both available at https://nextcloud.towernet.ca/s/FkZqQ5pY2cGHD9A ), I get the following:

[1:12:12]root@ump:/var/crash> kgdb /boot/kernel.old/kernel /var/crash/vmcore.3
GNU gdb (GDB) 10.2 [GDB v10.2 for FreeBSD]
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "riscv64-portbld-freebsd14.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel.old/kernel...
(No debugging symbols found in /boot/kernel.old/kernel)
/usr/ports/devel/gdb/work-py38/gdb-10.2/gdb/thread.c:1309: internal-error: void switch_to_thread(thread_info *): Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)

The vmcore actually contains a rather worrisome panic involving NFS and the NFS cache, so it would be good to solve this.
Comment 1 Kubilay Kocak freebsd_committer freebsd_triage 2021-07-07 06:06:18 UTC
@Reporter Could you please add the following information:

 - uname -a output
 - gdb installed from ports or packages? (assuming ports due to presence of WRKSRC)
 - Configuration of gdb (include build log as attachment)
Comment 2 dgilbert 2021-07-07 06:14:33 UTC
FreeBSD ump.daveg.ca 14.0-CURRENT FreeBSD 14.0-CURRENT #2 unmatched-n247472-2c2ed1f58a18: Wed Jul  7 01:02:27 EDT 2021     dgilbert@vr.home.dclg.ca:/home/dgilbert/FreeBSD/obj/home/dgilbert/FreeBSD/src/riscv.riscv64/sys/GENERIC  riscv

Ports, not pkg.

No options changed.
Comment 3 Luca Pizzamiglio freebsd_committer 2021-07-07 09:52:42 UTC
Hi. Last question: portrevision 1 or 0?
Comment 4 dgilbert 2021-07-07 16:57:08 UTC
gdb-10.2_1
Comment 5 dgilbert 2021-07-07 17:24:38 UTC
I'm not going to close this yet, but I now understand what is happening and I can open the much more serious bug.

The kernel debug symbols now live in /usr/lib/debug/boot/kernel/kernel.debug.  When I replaced /boot/kernel (manually, cross-compiled, because reasons) I didn't replace that hidden file.  The mismatch causes kgdb to crash with this impenetrable message.

... seems like this bug should morph into putting in a simple warning if the debug file doesn't match properly.

_and_ ... it seems "some" symbols are in the kernel file itself --- so don't nerf that functionality.
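
A minimal sketch of the kind of check such a warning could rest on, in Python. It assumes the stripped kernel carries a .gnu_debuglink section, which stores the debug file's name followed by a CRC32 of that file's contents; whether the kernel in this report has such a section is not known from the thread. The function names are hypothetical, and the raw section bytes are taken as input, so extracting them from the ELF file (for example with readelf or objcopy) is outside the sketch.

```python
import struct
import zlib


def parse_debuglink(section: bytes, byteorder: str = "little"):
    """Split raw .gnu_debuglink bytes into (file name, expected CRC32).

    Layout: NUL-terminated file name, zero padding up to the next
    4-byte boundary, then a 4-byte CRC32 in the object file's byte order.
    """
    nul = section.index(b"\x00")           # end of the file name
    name = section[:nul].decode("ascii")
    crc_offset = (nul + 1 + 3) & ~3        # round up to a multiple of 4
    fmt = "<I" if byteorder == "little" else ">I"
    (crc,) = struct.unpack_from(fmt, section, crc_offset)
    return name, crc


def debug_file_matches(section: bytes, debug_bytes: bytes,
                       byteorder: str = "little") -> bool:
    """Return True if the debug file contents match the recorded CRC32."""
    _, expected = parse_debuglink(section, byteorder)
    return (zlib.crc32(debug_bytes) & 0xFFFFFFFF) == expected
```

A caller would read the whole kernel.debug file into `debug_bytes` and print a plain warning when `debug_file_matches` returns False, instead of continuing into the core file. The check only covers the separate debug file, so symbols that live in the kernel file itself are unaffected.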
Comment 6 dgilbert 2021-07-08 03:34:52 UTC
I don't agree.  I don't think this is _specific_ to riscv... unless you're saying that this failure to work with the wrong debug file (which is now well hidden) is also risc-v specific.
Comment 7 Zhenlei Huang 2021-07-08 05:07:52 UTC
Also observed this bug on amd64. It is gdb-10.1_1 installed via pkg.


kgdb -c /var/crash/vmcore.6 /boot/kernel/kernel 
GNU gdb (GDB) 10.1 [GDB v10.1 for FreeBSD]
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd13.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...
Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...
/wrkdirs/usr/ports/devel/gdb/work-py37/gdb-10.1/gdb/thread.c:1309: internal-error: void switch_to_thread(thread_info *): Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) y

This is a bug, please report it.  For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

/wrkdirs/usr/ports/devel/gdb/work-py37/gdb-10.1/gdb/thread.c:1309: internal-error: void switch_to_thread(thread_info *): Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) n
Comment 8 Luca Pizzamiglio freebsd_committer 2021-07-08 21:25:46 UTC
(In reply to dgilbert from comment #5)
I agree, the gdb message is not very informative.
However, those asserts are there to avoid tampering with the debugged process, not as a form of "input validation".

gdb doesn't provide a feature to verify whether the debug symbol file and the binary are in sync; this information is just not available, and it could be very hard to implement, IMO.

In general, I would close this PR, because:
1. it's not a FreeBSD specific bug
2. it's a potential gdb feature that should be reported upstream
Comment 9 dgilbert 2021-07-08 21:45:01 UTC
Regardless of the type of bug, the GDB behavior is to lock hard as it takes a ginormous dump, in the creating-a-dump-file sense of the term.  I realize that gdb doesn't need to be certain types of "user friendly", but this is actively "user hostile", and there are certainly people who _should_ be encouraged to use GDB who do not yet know that we've hidden the kernel debug symbols.