I've seen this for a long time when using ddb over an IPMI serial console, and people suggested that IPMI might be to blame.
Now I hit a "KDB: enter: manual escape to debugger" situation (not yet clear where from) on my laptop. I typed "db> cont" on v0 and it is stuck there; the shell session does not come back. However, I can switch to v1, v2, and v3, and those shells all work fine.
I wonder what is broken that returning from ddb doesn't work anymore?
Coming back hours later, I pressed <Enter> again on the "cont" line, which had not shown a shell prompt. That delivered all 4 <Enter>s I had pressed in total (3 earlier, 1 now), and the command prompt came back. Something is still fishy.
Reports from anyone else with other setups (IPMI, serial, tty) would be welcome.
And I have now seen this on a classic serial line machine as well.
Kernel printfs still come through, but anything I type has no effect.
I hope someone has an idea... It's annoying if you cannot remotely power cycle a machine and need hands-on.
(In reply to Bjoern A. Zeeb from comment #2)
I gave this a try in vt0 and, after entering cont at the ddb prompt, the terminal recovered and my bash prompt reappeared. A number of carriage returns were apparently emitted by the kernel, but that was not a problem.
BUT I'm using sc and NOT vt. What are you using?
I see it on an amd64 system. With debug.kdb.alt_break_to_debugger=1, I can enter ddb using the alt break sequence and resuming works fine. When I enter with sysctl debug.kdb.enter=1, I get the same hang. Happily, I can re-enter ddb in this state using the alt break sequence, so it's possible to debug a bit.
In this state, the shell is stuck:
Tracing pid 1447 tid 100097 td 0xfffffe000b532c00
sched_switch() at sched_switch+0x5b2/frame 0xfffffe003bac74c0
mi_switch() at mi_switch+0x155/frame 0xfffffe003bac74e0
sleepq_switch() at sleepq_switch+0x11a/frame 0xfffffe003bac7520
sleepq_catch_signals() at sleepq_catch_signals+0x262/frame 0xfffffe003bac7570
sleepq_timedwait_sig() at sleepq_timedwait_sig+0x12/frame 0xfffffe003bac75b0
_cv_timedwait_sig_sbt() at _cv_timedwait_sig_sbt+0x184/frame 0xfffffe003bac7620
tty_drain() at tty_drain+0x1cc/frame 0xfffffe003bac7680
tty_ioctl() at tty_ioctl+0x26d/frame 0xfffffe003bac76d0
ttydev_ioctl() at ttydev_ioctl+0x247/frame 0xfffffe003bac7720
devfs_ioctl() at devfs_ioctl+0xcc/frame 0xfffffe003bac7770
vn_ioctl() at vn_ioctl+0x132/frame 0xfffffe003bac7880
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe003bac78a0
kern_ioctl() at kern_ioctl+0x276/frame 0xfffffe003bac7900
sys_ioctl() at sys_ioctl+0x127/frame 0xfffffe003bac79d0
amd64_syscall() at amd64_syscall+0x135/frame 0xfffffe003bac7af0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe003bac7af0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8005e016a, rsp = 0x7fffffffd8e8, rbp = 0x7fffffffd930 ---
Indeed, there are some bytes "stuck" in the tty queues:
db> show tty 0xfffff80004075000
inq: 0xfffff80004075048 begin 0 linestart 2 reprint 2 end 2 nblocks 180 quota 180
outq: 0xfffff80004075088 begin 16 end 29 nblocks 93 quota 93
termios: iflag 0x2b02 oflag 0x7 cflag 0xcb00 lflag 0x5cb ispeed 115200 ospeed 115200
winsize: row 87 col 319 xpixel 0 ypixel 0
termios_init_in: iflag 0x2b02 oflag 0x3 cflag 0xcb00 lflag 0x5cb ispeed 115200 ospeed 115200
termios_init_out: iflag 0x2b02 oflag 0x3 cflag 0xcb00 lflag 0x5cb ispeed 115200 ospeed 115200
termios_lock_in: iflag 0x0 oflag 0x0 cflag 0x0 lflag 0x0 ispeed 0 ospeed 0
termios_lock_out: iflag 0x0 oflag 0x0 cflag 0x0 lflag 0x0 ispeed 0 ospeed 0
devsw: uart_tty_class (0xffffffff818d0a08)
hook: 0 (0)
pgrp: 0xfffff800036f6080 gid 1447 jobc 1
session: 0xfffff80004587b80 count 2 leader 0xfffff80006b43000 tty 0xfffff80004075000 sid 1443 login root
So I guess there is some race that results in uart(4) not handling an interrupt, so ttydisc_getc() isn't getting called to drain the outq.
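To put a number on the "stuck" bytes, here is a rough Python sketch that reads the outq line from the show tty output above. It assumes the amount of queued output is simply end minus begin, which is my understanding of how the tty output queue accounts for used bytes; the helper name outq_bytes_pending is made up for this illustration and is not part of any FreeBSD tool.

```python
import re


def outq_bytes_pending(line):
    """Return end - begin for an 'outq:' line of ddb 'show tty' output.

    Assumption: the number of queued output bytes is end minus begin.
    """
    # Collect the 'begin N' and 'end N' fields; other fields are ignored.
    fields = dict(re.findall(r"\b(begin|end)\s+(\d+)", line))
    return int(fields["end"]) - int(fields["begin"])


# The outq line copied verbatim from the ddb output in this report.
outq = "outq: 0xfffff80004075088 begin 16 end 29 nblocks 93 quota 93"

print("outq bytes pending:", outq_bytes_pending(outq))  # prints 13
```

Under that assumption, 13 bytes are still queued for output, which fits the shell sitting in tty_drain() waiting for the output queue to empty.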