220358 – panic in tcp_lro_flush_all

Status:	Closed FIXED

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	kern
Version:	CURRENT
Hardware:	i386 Any

Importance:	--- Affects Only Me
Assignee:	freebsd-net (Nobody)

Keywords:	regression

Duplicates (1):	220404

Reported:	2017-06-29 13:08 UTC by rz-rpi03
Modified:	2022-06-20 23:56 UTC
CC List:	7 users

Attachments
Fix for panic (383 bytes, patch) 2017-07-04 08:44 UTC, Hans Petter Selasky

Description rz-rpi03 2017-06-29 13:08:15 UTC

Hi,

A recent (r320396) CURRENT kernel crashes reproducibly in tcp_lro_flush_all()
after connecting to the network via cable.
A three-week-old r319620 kernel is stable in the same environment (hardware, network).

Regards, Ralf

Excerpt from core0.txt:

FreeBSD  12.0-CURRENT FreeBSD 12.0-CURRENT #1 r320396: Wed Jun 28 09:14:27 CEST 
2017     root@IZ-T193196065251a:/usr/obj/usr/src/sys/E4300  i386

panic: privileged instruction fault

GNU gdb (GDB) 7.12.1 [GDB v7.12.1 for FreeBSD]
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-portbld-freebsd12.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...Reading symbols from /usr/lib/debug//
boot/kernel/kernel.debug...done.
done.

Unread portion of the kernel message buffer:


Fatal trap 1: privileged instruction fault while in kernel mode
cpuid = 1; apic id = 01
instruction pointer     = 0x20:0xc7efd41b
stack pointer           = 0x28:0xe37d979c
frame pointer           = 0x28:0xe37d97e8
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (if_io_tqg_1)
trap number             = 1
panic: privileged instruction fault
cpuid = 1
time = 1498722247
KDB: stack backtrace:
#0 0xc07dadaf at kdb_backtrace+0x4f
#1 0xc079ccb3 at vpanic+0x133
#2 0xc079cb7b at panic+0x1b
#3 0xc0ae38fe at trap_fatal+0x31e
#4 0xc0ae2e5e at trap+0xce
#5 0xc0ad1fea at calltrap+0x6
#6 0xc096bb4f at tcp_do_segment+0x219f
#7 0xc0968d67 at tcp_input+0x13a7
#8 0xc08f39a6 at ip_input+0x256
#9 0xc089328c at netisr_dispatch_src+0xcc
#10 0xc0893550 at netisr_dispatch+0x20
#11 0xc087d9b0 at ether_demux+0x140
#12 0xc087e65b at ether_nh_input+0x35b
#13 0xc089328c at netisr_dispatch_src+0xcc
#14 0xc0893550 at netisr_dispatch+0x20
#15 0xc087dc3a at ether_input+0x2a
#16 0xc096dfc5 at tcp_lro_flush+0x1d5
#17 0xc096e161 at tcp_lro_flush_all+0x141
Uptime: 4m50s

Physical memory: 3523 MB
Dumping 144 MB: 129 113 97 81 65 49 33 17 1

Reading symbols from /boot/kernel/snd_hda.ko...Reading symbols from /usr/lib/debug//boot/kernel/snd_hda.ko.debug...done.
done.
Reading symbols from /boot/kernel/sound.ko...Reading symbols from /usr/lib/debug//boot/kernel/sound.ko.debug...done.
done.
Reading symbols from /boot/kernel/cuse.ko...Reading symbols from /usr/lib/debug//boot/kernel/cuse.ko.debug...done.
done.
Reading symbols from /boot/kernel/ums.ko...Reading symbols from /usr/lib/debug//boot/kernel/ums.ko.debug...done.
done.
Reading symbols from /boot/kernel/ng_ubt.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_ubt.ko.debug...done.
done.
Reading symbols from /boot/kernel/netgraph.ko...Reading symbols from /usr/lib/debug//boot/kernel/netgraph.ko.debug...done.
done.
Reading symbols from /boot/kernel/ng_hci.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_hci.ko.debug...done.
done.
Reading symbols from /boot/kernel/ng_bluetooth.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_bluetooth.ko.debug...done.
done.
Reading symbols from /boot/kernel/ng_l2cap.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_l2cap.ko.debug...done.
done.
Reading symbols from /boot/kernel/ng_btsocket.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_btsocket.ko.debug...done.
done.
Reading symbols from /boot/kernel/ng_socket.ko...Reading symbols from /usr/lib/debug//boot/kernel/ng_socket.ko.debug...done.
done.
__curthread () at ./machine/pcpu.h:225
225             __asm("movl %%fs:%1,%0" : "=r" (td)
(kgdb) #0  __curthread () at ./machine/pcpu.h:225
#1  doadump (textdump=-949457280) at /usr/src/sys/kern/kern_shutdown.c:318
#2  0xc079c924 in kern_reboot (howto=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:386
#3  0xc079cceb in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:779
#4  0xc079cb7b in panic (fmt=0xc0b23936 "%s")
    at /usr/src/sys/kern/kern_shutdown.c:710
#5  0xc0ae38fe in trap_fatal (frame=<optimized out>, eva=<optimized out>)
    at /usr/src/sys/i386/i386/trap.c:978
#6  0xc0ae2e5e in trap (frame=<optimized out>)
    at /usr/src/sys/i386/i386/trap.c:213
#7  <signal handler called>
#8  0xc7efd41b in ?? ()
#9  0xc096bb4f in tcp_do_segment (m=<optimized out>, th=<optimized out>, 
    so=<optimized out>, tp=<optimized out>, drop_hdrlen=<optimized out>, 
    tlen=<optimized out>, iptos=<optimized out>, 
    ti_locked=<error reading variable: Cannot access memory at address 0x1>)
    at /usr/src/sys/netinet/tcp_input.c:2444
#10 0xc0968d67 in tcp_input (mp=<optimized out>, offp=<optimized out>, 
    proto=<optimized out>) at /usr/src/sys/netinet/tcp_input.c:1191
#11 0xc08f39a6 in ip_input (m=0x0) at /usr/src/sys/netinet/ip_input.c:823
#12 0xc089328c in netisr_dispatch_src (proto=<optimized out>, 
    source=<optimized out>, m=0xc7efd408) at /usr/src/sys/net/netisr.c:1120
#13 0xc0893550 in netisr_dispatch (proto=1, m=0xc866f500)
    at /usr/src/sys/net/netisr.c:1211
#14 0xc087d9b0 in ether_demux (ifp=0xc77ca800, m=0x0)
    at /usr/src/sys/net/if_ethersubr.c:848
#15 0xc087e65b in ether_input_internal (ifp=0xc77ca800, m=0xc7efd408)
    at /usr/src/sys/net/if_ethersubr.c:637
#16 ether_nh_input (m=<optimized out>) at /usr/src/sys/net/if_ethersubr.c:667
#17 0xc089328c in netisr_dispatch_src (proto=<optimized out>, 
    source=<optimized out>, m=0xc7efd408) at /usr/src/sys/net/netisr.c:1120
#18 0xc0893550 in netisr_dispatch (proto=5, m=0xc866f500)
    at /usr/src/sys/net/netisr.c:1211
#19 0xc087dc3a in ether_input (ifp=0xc77ca800, m=0x0)
    at /usr/src/sys/net/if_ethersubr.c:757
#20 0xc096dfc5 in tcp_lro_flush (lc=0xc77ad424, le=<optimized out>)
    at /usr/src/sys/netinet/tcp_lro.c:394
#21 0xc096e161 in tcp_lro_rx_done (lc=0xc77ad424)
    at /usr/src/sys/netinet/tcp_lro.c:284
#22 tcp_lro_flush_all (lc=<optimized out>)
    at /usr/src/sys/netinet/tcp_lro.c:532
#23 0xc088dc90 in iflib_rxeof (budget=16, rxq=<optimized out>)
    at /usr/src/sys/net/iflib.c:2564
#24 _task_fn_rx (context=<optimized out>) at /usr/src/sys/net/iflib.c:3499
#25 0xc07d9aa8 in gtaskqueue_run_locked (queue=0xc7688000)
    at /usr/src/sys/kern/subr_gtaskqueue.c:329
#26 0xc07d97c7 in gtaskqueue_thread_loop (arg=0xc7671814)
    at /usr/src/sys/kern/subr_gtaskqueue.c:504
#27 0xc0764a16 in fork_exit (callout=0xc07d9720 <gtaskqueue_thread_loop>, 
    arg=<optimized out>, frame=<optimized out>)
    at /usr/src/sys/kern/kern_fork.c:1038
#28 <signal handler called>
(kgdb)

Comment 1 Hans Petter Selasky freebsd_committer

2017-06-29 20:29:53 UTC

Hi,

Are you using RSS?

There haven't been any LRO-related changes recently, so the crash is likely caused by code outside the LRO code.

--HPS

Comment 2 rz-rpi03 2017-06-30 07:45:59 UTC

(In reply to Hans Petter Selasky from comment #1)

Hi,

Not knowingly. I had to look up what RSS means.
https://wiki.freebsd.org/NetworkRSS does not mention an Intel(R) PRO/1000
network interface (em), so I think I do not use it.

I am trying to find the change which causes this, but as the machine is
an older laptop, building new kernels takes time.

Ralf

Comment 3 Hans Petter Selasky freebsd_committer

2017-06-30 08:11:08 UTC

RSS means "options RSS" in the kernel config

Second question: Are you using hyperthreading? Can you try to enter:

machdep.hyperthreading_allowed=0

in /boot/loader.conf

Is this issue reproducible?

--HPS

Comment 4 rz-rpi03 2017-06-30 09:28:41 UTC

There is no "option RSS" in the used kernel config. So, the answer is: No.

Hyperthreading was used.
As you suggested, I disabled it via /boot/loader.conf, but the panic happened
again.
The cause changed to "page fault while in kernel mode", but the location is
almost the same: "tcp_lro_flush" instead of the former "tcp_lro_flush_all".

Ralf



Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x55ea51aa
fault code              = supervisor write, page not present
instruction pointer     = 0x20:0xc7f3f21b
stack pointer           = 0x28:0xe37d97bc
frame pointer           = 0x28:0xe37d97e8
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 0 (if_io_tqg_1)
trap number             = 12
panic: page fault
cpuid = 1
time = 1498813503
KDB: stack backtrace:
#0 0xc07dadaf at kdb_backtrace+0x4f
#1 0xc079ccb3 at vpanic+0x133
#2 0xc079cb7b at panic+0x1b
#3 0xc0ae38fe at trap_fatal+0x31e
#4 0xc0ae3943 at trap_pfault+0x33
#5 0xc0ae304e at trap+0x2be
#6 0xc0ad1fea at calltrap+0x6
#7 0xc096bb4f at tcp_do_segment+0x219f
#8 0xc0968d67 at tcp_input+0x13a7
#9 0xc08f39a6 at ip_input+0x256
#10 0xc089328c at netisr_dispatch_src+0xcc
#11 0xc0893550 at netisr_dispatch+0x20
#12 0xc087d9b0 at ether_demux+0x140
#13 0xc087e65b at ether_nh_input+0x35b
#14 0xc089328c at netisr_dispatch_src+0xcc
#15 0xc0893550 at netisr_dispatch+0x20
#16 0xc087dc3a at ether_input+0x2a
#17 0xc096dfc5 at tcp_lro_flush+0x1d5
Uptime: 6m23s
Physical memory: 3523 MB
Dumping 149 MB: 134 118 102 86 70 54 38 22 6

Reading symbols from /boot/kernel.r320396.crash/snd_hda.ko...Reading symbols from /usr/lib/debug//boot/kernel.r320396.crash/snd_hda.ko.debug...done.
done.
Reading symbols from /boot/kernel.r320396.crash/sound.ko...Reading symbols from /usr/lib/debug//boot/kernel.r320396.crash/sound.ko.debug...done.
done.
Reading symbols from /boot/kernel.r320396.crash/cuse.ko...Reading symbols from /usr/lib/debug//boot/kernel.r320396.crash/cuse.ko.debug...done.
done.
Reading symbols from /boot/kernel.r320396.crash/ums.ko...Reading symbols from /usr/lib/debug//boot/kernel.r320396.crash/ums.ko.debug...done.
done.
__curthread () at ./machine/pcpu.h:225
225             __asm("movl %%fs:%1,%0" : "=r" (td)
(kgdb) #0  __curthread () at ./machine/pcpu.h:225
#1  doadump (textdump=-949457280) at /usr/src/sys/kern/kern_shutdown.c:318
#2  0xc079c924 in kern_reboot (howto=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:386
#3  0xc079cceb in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:779
#4  0xc079cb7b in panic (fmt=0xc0b23936 "%s")
    at /usr/src/sys/kern/kern_shutdown.c:710
#5  0xc0ae38fe in trap_fatal (frame=<optimized out>, eva=<optimized out>)
    at /usr/src/sys/i386/i386/trap.c:978
#6  0xc0ae3943 in trap_pfault (frame=<optimized out>, 
    usermode=<optimized out>, eva=<optimized out>)
    at /usr/src/sys/i386/i386/trap.c:804
#7  0xc0ae304e in trap (frame=<optimized out>)
    at /usr/src/sys/i386/i386/trap.c:512
#8  <signal handler called>
#9  0xc7f3f21b in ?? ()
#10 0xc096bb4f in tcp_do_segment (m=<optimized out>, th=<optimized out>, 
    so=<optimized out>, tp=<optimized out>, drop_hdrlen=<optimized out>, 
    tlen=<optimized out>, iptos=<optimized out>, 
    ti_locked=<error reading variable: Cannot access memory at address 0x1>)
    at /usr/src/sys/netinet/tcp_input.c:2444
#11 0xc0968d67 in tcp_input (mp=<optimized out>, offp=<optimized out>, 
    proto=<optimized out>) at /usr/src/sys/netinet/tcp_input.c:1191
#12 0xc08f39a6 in ip_input (m=0x0) at /usr/src/sys/netinet/ip_input.c:823
#13 0xc089328c in netisr_dispatch_src (proto=<optimized out>, 
    source=<optimized out>, m=0xc7f3f219) at /usr/src/sys/net/netisr.c:1120
#14 0xc0893550 in netisr_dispatch (proto=1, m=0xc8172000)
    at /usr/src/sys/net/netisr.c:1211
#15 0xc087d9b0 in ether_demux (ifp=0xc77ca800, m=0x0)
    at /usr/src/sys/net/if_ethersubr.c:848
#16 0xc087e65b in ether_input_internal (ifp=0xc77ca800, m=0xc7f3f219)
    at /usr/src/sys/net/if_ethersubr.c:637
#17 ether_nh_input (m=<optimized out>) at /usr/src/sys/net/if_ethersubr.c:667
#18 0xc089328c in netisr_dispatch_src (proto=<optimized out>, 
    source=<optimized out>, m=0xc7f3f219) at /usr/src/sys/net/netisr.c:1120
#19 0xc0893550 in netisr_dispatch (proto=5, m=0xc8172000)
    at /usr/src/sys/net/netisr.c:1211
#20 0xc087dc3a in ether_input (ifp=0xc77ca800, m=0x0)
    at /usr/src/sys/net/if_ethersubr.c:757
#21 0xc096dfc5 in tcp_lro_flush (lc=0xc77ad424, le=<optimized out>)
    at /usr/src/sys/netinet/tcp_lro.c:394
#22 0xc096e161 in tcp_lro_rx_done (lc=0xc77ad424)
    at /usr/src/sys/netinet/tcp_lro.c:284
#23 tcp_lro_flush_all (lc=<optimized out>)
    at /usr/src/sys/netinet/tcp_lro.c:532
#24 0xc088dc90 in iflib_rxeof (budget=16, rxq=<optimized out>)
    at /usr/src/sys/net/iflib.c:2564
#25 _task_fn_rx (context=<optimized out>) at /usr/src/sys/net/iflib.c:3499
#26 0xc07d9aa8 in gtaskqueue_run_locked (queue=0xc7688000)
    at /usr/src/sys/kern/subr_gtaskqueue.c:329
#27 0xc07d97c7 in gtaskqueue_thread_loop (arg=0xc7671814)
    at /usr/src/sys/kern/subr_gtaskqueue.c:504
#28 0xc0764a16 in fork_exit (callout=0xc07d9720 <gtaskqueue_thread_loop>, 
    arg=<optimized out>, frame=<optimized out>)
    at /usr/src/sys/kern/kern_fork.c:1038
#29 <signal handler called>
(kgdb)

Comment 5 Hans Petter Selasky freebsd_committer

2017-06-30 10:45:14 UTC

Adding Sean Bruno. I also notice your hardware is 32-bit. Have you seen this issue with 64-bit kernels?

--HPS

Comment 6 rz-rpi03 2017-06-30 11:39:29 UTC

> I also notice your hardware is 32-bit. Have you seen this issue with 64-bit kernels?

No, but I have not run a 64-bit kernel on this hardware yet.
On different hardware, a very recent 64-bit kernel, also with an "em" interface, does not show this issue.

Ralf

Comment 7 rz-rpi03 2017-06-30 12:19:28 UTC

Just an intermediate result:
An r320008 32-bit kernel, with hyperthreading disabled, also panics with
"privileged instruction fault" in "tcp_lro_flush_all".

Ralf

Comment 8 oleg.nauman 2017-06-30 15:46:43 UTC

I'm observing crashes too (CURRENT/i386 r320466), for example a crash due to an incoming SSH connection attempt:

__curthread () at ./machine/pcpu.h:225
225             __asm("movl %%fs:%1,%0" : "=r" (td)
(kgdb) #0  __curthread () at ./machine/pcpu.h:225
#1  doadump (textdump=-968634112) at ../../../kern/kern_shutdown.c:318
#2  0xc06e8954 in kern_reboot (howto=<optimized out>)
    at ../../../kern/kern_shutdown.c:386
#3  0xc06e8ceb in vpanic (fmt=<optimized out>,
    ap=0xea5c56ec "K\336\235\300H\325\065\306\001")
    at ../../../kern/kern_shutdown.c:779
#4  0xc06e8bab in panic (fmt=0xc092e2de "%s")
    at ../../../kern/kern_shutdown.c:710
#5  0xc08eee71 in trap_fatal (frame=0xea5c584c, eva=<optimized out>)
    at ../../../i386/i386/trap.c:978
#6  0xc08eefbb in trap_pfault (frame=0xea5c584c, usermode=0,
    eva=<optimized out>) at ../../../i386/i386/trap.c:890
#7  0xc08ee5de in trap (frame=<optimized out>)
    at ../../../i386/i386/trap.c:512
#8  <signal handler called>
#9  0xc6be0a1b in ?? ()
#10 0xc082ef73 in tcp_do_segment (m=<optimized out>, th=<optimized out>,
    so=<optimized out>, tp=<optimized out>, drop_hdrlen=<optimized out>,
    tlen=<optimized out>, iptos=<optimized out>,
    ti_locked=<error reading variable: Cannot access memory at address 0x1>)
    at ../../../netinet/tcp_input.c:2444
#11 0xc082c3a1 in tcp_input (mp=<optimized out>, offp=<optimized out>,
    proto=<optimized out>) at ../../../netinet/tcp_input.c:1191
#12 0xc0820a98 in ip_input (m=0x0) at ../../../netinet/ip_input.c:823
#13 0xc07d57db in netisr_dispatch_src (proto=<optimized out>,
    source=<optimized out>, m=0xc6be0a18) at ../../../net/netisr.c:1120
#14 0xc07d5aa0 in netisr_dispatch (proto=1, m=0xc6c01800)
    at ../../../net/netisr.c:1211
#15 0xc07c74b2 in ether_demux (ifp=0xc634e800, m=0x0)
    at ../../../net/if_ethersubr.c:848
#16 0xc07c8140 in ether_input_internal (ifp=0xc634e800, m=0xc6be0a18)
    at ../../../net/if_ethersubr.c:637
#17 ether_nh_input (m=<optimized out>) at ../../../net/if_ethersubr.c:667
#18 0xc07d57db in netisr_dispatch_src (proto=<optimized out>,
    source=<optimized out>, m=0xc6be0a18) at ../../../net/netisr.c:1120
#19 0xc07d5aa0 in netisr_dispatch (proto=5, m=0xc6c01800)
    at ../../../net/netisr.c:1211
#20 0xc07c773a in ether_input (ifp=0xc634e800, m=0x0)
    at ../../../net/if_ethersubr.c:757
#21 0xc04f5058 in age_rxeof (sc=<optimized out>, rxrd=<optimized out>)
    at ../../../dev/age/if_age.c:2442
#22 age_rxintr (rr_prod=4, count=<optimized out>, sc=<optimized out>)
    at ../../../dev/age/if_age.c:2488
#23 age_int_task (arg=<optimized out>, pending=1)
    at ../../../dev/age/if_age.c:2167
#24 0xc0735bfc in taskqueue_run_locked (queue=0xc631a300)
    at ../../../kern/subr_taskqueue.c:454
#25 0xc0736ae7 in taskqueue_thread_loop (arg=0xc6344a6c)
    at ../../../kern/subr_taskqueue.c:746
#26 0xc06b8b06 in fork_exit (callout=0xc0736a40 <taskqueue_thread_loop>,
    arg=<optimized out>, frame=<optimized out>)
    at ../../../kern/kern_fork.c:1038
#27 <signal handler called>
(kgdb)

machdep.hyperthreading_allowed is set to 0

My system also panics reproducibly on named reconfigure/flush/shutdown events with a (partially) similar backtrace, and there are panics caused by IPC as well; for example:

__curthread () at ./machine/pcpu.h:225
225             __asm("movl %%fs:%1,%0" : "=r" (td)
(kgdb) #0  __curthread () at ./machine/pcpu.h:225
#1  doadump (textdump=-968633856) at ../../../kern/kern_shutdown.c:318
#2  0xc06e88c4 in kern_reboot (howto=<optimized out>)
    at ../../../kern/kern_shutdown.c:386
#3  0xc06e8c5b in vpanic (fmt=<optimized out>,
    ap=0xefd5c73c "\340\334\235\300\310\370\266\306\001")
    at ../../../kern/kern_shutdown.c:779
#4  0xc06e8b1b in panic (fmt=0xc092e18e "%s")
    at ../../../kern/kern_shutdown.c:710
#5  0xc08eed21 in trap_fatal (frame=0xefd5c878, eva=<optimized out>)
    at ../../../i386/i386/trap.c:978
#6  0xc08eea38 in trap (frame=<optimized out>)
    at ../../../i386/i386/trap.c:704
#7  <signal handler called>
#8  0xc6bcda1b in ?? ()
#9  0xc0770281 in unp_connect2 (so=<optimized out>, so2=<optimized out>,
    req=<optimized out>) at ../../../kern/uipc_usrreq.c:1497
#10 0xc076ff17 in unp_connectat (fd=<optimized out>, so=<optimized out>,
    nam=<optimized out>, td=<optimized out>)
    at ../../../kern/uipc_usrreq.c:1446
#11 0xc076d510 in unp_connect (so=0xc71c9400, nam=0xc662d500,
    td=<optimized out>) at ../../../kern/uipc_usrreq.c:1310
#12 uipc_connect (so=0xc71c9400, nam=0xc662d500, td=<optimized out>)
    at ../../../kern/uipc_usrreq.c:587
#13 0xc076a042 in kern_connectat (td=<optimized out>, dirfd=-100,
    fd=<optimized out>, sa=0xc662d500) at ../../../kern/uipc_syscalls.c:505
#14 0xc0769f49 in sys_connect (td=0xc6bcda18, uap=0xc6b6f988)
    at ../../../kern/uipc_syscalls.c:470
#15 0xc08ef679 in syscallenter (td=<optimized out>)
    at ../../../i386/i386/../../kern/subr_syscall.c:132
#16 syscall (frame=<optimized out>) at ../../../i386/i386/trap.c:1103
#17 <signal handler called>
#18 0x283a4747 in ?? ()
Backtrace stopped: Cannot access memory at address 0xbfbfe794
(kgdb)

Comment 9 Hans Petter Selasky freebsd_committer

2017-07-01 16:48:08 UTC

If this issue is a recent regression and reproduces easily, can you try to bisect, i.e. binary search for the exact revision that causes it? This is easiest when using Git (see git bisect).
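
Bisection is a binary search over the revisions between a known-good and a known-bad kernel. A minimal sketch in C, assuming a hypothetical `is_bad` callback that stands in for "build the kernel at this revision and check whether it panics" (the names `first_bad_revision`, `is_bad_fn` and `bad_from_threshold` are illustrative and not part of any FreeBSD or Git tooling):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical callback: returns nonzero if the kernel built from
 * revision `rev` panics. In practice this is a manual build-and-test.
 */
typedef int (*is_bad_fn)(long rev, void *ctx);

/*
 * Binary search for the first bad revision in the range (good, bad].
 * Preconditions: `good` is known good, `bad` is known bad, good < bad,
 * and every revision after the first bad one is also bad.
 * If `steps` is non-NULL it receives the number of revisions tested.
 */
static long
first_bad_revision(long good, long bad, is_bad_fn is_bad, void *ctx,
    int *steps)
{
	int n = 0;

	assert(good < bad);
	while (bad - good > 1) {
		long mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;	/* regression is at or before mid */
		else
			good = mid;	/* regression is after mid */
		n++;
	}
	if (steps != NULL)
		*steps = n;
	return (bad);
}

/*
 * Example predicate, for illustration only: every revision at or after
 * the threshold stored in *ctx is treated as bad.
 */
static int
bad_from_threshold(long rev, void *ctx)
{
	return (rev >= *(long *)ctx);
}
```

With the two revisions from the description, r319620 (stable) and r320396 (panics), the range spans 776 revisions, so a search like this needs at most 10 kernel builds, provided every revision after the first bad one also panics.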

Comment 10 oleg.nauman 2017-07-02 14:17:57 UTC

An attempt to revert the suspected r319722 on top of r320466 caused a conflict that I don't know how to resolve correctly.
Tomorrow I will try to build plain r319721 and r319722.

Comment 11 rz-rpi03 2017-07-03 06:09:20 UTC

I can add to this that a 32-bit ARM (Raspberry Pi) kernel r320571 panics immediately when an SSH connection is attempted.
This behaviour is repeatable (3 panics out of 3 attempts).

Just for the record:

login: panic: Undefined instruction in kernel.

time = 1499024534
KDB: stack backtrace:
$a.6() at $a.6
         pc = 0xc05fe050  lr = 0xc01577b0 (db_trace_self_wrapper+0x30)
         sp = 0xdc65a818  fp = 0xdc65a930
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
         pc = 0xc01577b0  lr = 0xc02f9130 (vpanic+0xc0)
         sp = 0xdc65a938  fp = 0xdc65a958
         r4 = 0x00000100  r5 = 0xdc65a96c
         r6 = 0xc061b004  r7 = 0x00000001
vpanic() at vpanic+0xc0
         pc = 0xc02f9130  lr = 0xc02f9070 (vpanic)
         sp = 0xdc65a960  fp = 0xdc65a964
         r4 = 0x00000000  r5 = 0x00000000
         r6 = 0x00000010  r7 = 0xffffff80
         r8 = 0xdc65a9f0  r9 = 0x00000001
        r10 = 0xc2caafdc
vpanic() at vpanic
         pc = 0xc02f9070  lr = 0xc061afec ($d.14)
         sp = 0xdc65a96c  fp = 0xdc65a9e8
         r4 = 0xffffff80  r5 = 0xdc65a9f0
         r6 = 0x00000001  r7 = 0xc2caafdc
         r8 = 0xdc65a964  r9 = 0xc02f9070
        r10 = 0xdc65a96c
$d.14() at $d.14
         pc = 0xc061afec  lr = 0xc0600b94 (exception_exit)
         sp = 0xdc65a9f0  fp = 0xdc65aa88
         r4 = 0xa0000013  r5 = 0xc2caaa78
         r6 = 0x00000010  r7 = 0xc2bd4700
         r8 = 0xc2c51824  r9 = 0xc2caa430
        r10 = 0x00000000
exception_exit() at exception_exit
         pc = 0xc0600b94  lr = 0xc0389544 (solisten_wakeup+0x28)
         sp = 0xdc65aa80  fp = 0xdc65aa88
         r0 = 0xc2caaa78  r1 = 0x00000000
         r2 = 0x00000001  r3 = 0xc2caaa90
         r4 = 0x00000001  r5 = 0xc2caaa78
         r6 = 0x00000010  r7 = 0xc2bd4700
         r8 = 0xc2c51824  r9 = 0xc2caa430
        r10 = 0x00000000 r12 = 0x00000000
_end() at 0xc2caafdc
         pc = 0xc2caafdc  lr = 0xc0389544 (solisten_wakeup+0x28)
         sp = 0xdc65aa80  fp = 0xdc65aa88
KDB: enter: panic
[ thread pid 13 tid 100033 ]
Stopped at      $d.12:  ldrb    r15, [r15, r15, ror r15]!
db> 
db> show proc   
Process 13 (usb) at 0xc288a398:
 state: NORMAL
 uid: 0  gids: 0
 parent: pid 0 at 0xc0783548
 ABI: null
 threads: 6
100031                   D       -       0xc291b02c  [usbus0]
100032                   D       -       0xc291b05c  [usbus0]
100033                   Run     CPU 0               [usbus0]
100034                   D       -       0xc291b0bc  [usbus0]
100035                   D       -       0xc291b0ec  [usbus0]
100069                   D       -       0xc2a9da28  [smsc0]
db> show thread 
Thread 100033 at 0xc2936000:
 proc (pid 13): 0xc288a398
 name: usbus0
 stack: 0xdc659000-0xdc65afff
 flags: 0x1000004  pflags: 0x200000
 state: RUNNING (CPU 0)
 priority: 28
 container lock: sched lock (0xc07713c4)
 last voluntary switch: 1 ms ago
 last involuntary switch: 0 ms ago
db>

Comment 12 Hans Petter Selasky freebsd_committer

2017-07-03 10:22:12 UTC

Hi,

I started with user-space from:

http://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/12.0/FreeBSD-12.0-CURRENT-arm-armv6-RPI2-20170626-r320360.img.xz

And built working kernels off the following revisions.

r320360
r320571
r320580

I cannot reproduce this issue.

Can you try to tarball your kernel, /boot/kernel, and try to flash a new image and then copy back the kernel you are using and see if it still fails?

--HPS

Comment 13 rz-rpi03 2017-07-03 13:34:49 UTC

Hi,

> I started with user-space from:
> 
> http://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/12.0/FreeBSD-12.0-CURRENT-arm-armv6-RPI2-20170626-r320360.img.xz
> 
> And built working kernels off the following revisions.
> 
> r320360
> r320571

r320571 is exactly the version of kernel and userland I am using.

> r320580
> 
> I cannot reproduce this issue.

Interesting. I just panicked it again via ssh.

> Can you try to tarball your kernel, /boot/kernel, and try to flash a new image
> and then copy back the kernel you are using and see if it still fails?

I will try to do so.

Comment 14 oleg.nauman 2017-07-03 19:47:36 UTC

I can confirm that r319722 is the source of the network- and IPC-related panics on i386.
I performed three tests for the r319721 and r319722 worlds:
a) Incoming SSH connection attempt (network)
b) Issuing the 'rndc flush' command (network)
c) Attempting to start the 'hald' service (IPC with the 'dbus' service)
r319721 passed all tests; r319722 panics on all tests.

Comment 15 Sylvain Garrigues 2017-07-03 21:01:52 UTC

Please see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220452, comments #5 and #8: the kernel panic seems to disappear when INVARIANTS is used in the kernel config or, more precisely, when one #ifdef INVARIANTS is removed in sys/kern/uipc_socket.c.

Comment 16 rz-rpi03 2017-07-04 06:56:01 UTC

- i386:
Like Oleg Nauman, I can also confirm that r319722 is the source of the problem here (i386, 32-bit, no hyperthreading and no INVARIANTS).

- armv6:
I had to use the RPi B image and started with the r320360 snapshot.
It worked well. Replacing the original kernel with mine, as Hans Petter suggested, exposes the described panic.
This armv6 kernel was cross-compiled on an i386, 32-bit machine from a separate repository, also without INVARIANTS.

Ralf

Comment 17 rz-rpi03 2017-07-04 07:45:52 UTC

With INVARIANTS enabled the same i386 kernel from comment #16 does not panic.
So it looks like it is the same cause as mentioned in the bug report in comment #15.

Ralf

Comment 18 Hans Petter Selasky freebsd_committer

2017-07-04 08:44:08 UTC

Created attachment 184049
Fix for panic

Hi, Can you try the attached patch?

Comment 19 Hans Petter Selasky freebsd_committer

2017-07-04 08:54:11 UTC

https://reviews.freebsd.org/D11475

Comment 20 rz-rpi03 2017-07-04 10:04:21 UTC

r319722 with the patch mentioned in comment #18 does not panic any more.
I double checked that INVARIANTS is disabled in my kernel configuration, so your patch fixes the panic.
Thank you.

Ralf

Comment 21 oleg.nauman 2017-07-04 15:57:49 UTC

It is possible that the patch is incomplete; please see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220452

Comment 22 commit-hook freebsd_committer

2017-07-04 18:23:54 UTC

A commit references this bug:

Author: hselasky
Date: Tue Jul  4 18:23:18 UTC 2017
New revision: 320652
URL: https://svnweb.freebsd.org/changeset/base/320652

Log:
  After r319722 two fields were left uninitialized when transforming a
  socket structure into a listening socket. This resulted in an invalid
  instruction fault for all 32-bit platforms.

  When INVARIANTS is set the union where the two uninitialized fields
  reside gets properly zeroed. This patch ensures the two uninitialized
  fields are zeroed when INVARIANTS is undefined.

  For 64-bit platforms this issue was not visible because so->sol_upcall
  which is uninitialized overlaps with so->so_rcv.sb_state which is
  already zero during soalloc();

  For 32-bit platforms this issue was visible and resulted in an invalid
  instruction fault, because so->sol_upcall overlaps with
  so->so_rcv.sb_sel which is always initialized to a valid data pointer
  during soalloc().

  Verifying the offset locations mentioned above are identical is left
  as an exercise to the reader.

  PR: 220452
  PR: 220358
  Reviewed by:	ae (network), gallatin
  Differential Revision:	https://reviews.freebsd.org/D11475
  Sponsored by:	Mellanox Technologies

Changes:
  head/sys/kern/uipc_socket.c
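
The failure mode described in the log above can be illustrated outside the kernel. The following is only a sketch under stated assumptions: `struct demo_socket`, `struct demo_sockbuf`, `struct demo_listen`, the `demo_*` functions and the `DEMO_INVARIANTS` macro are hypothetical stand-ins and do not reproduce the real `struct socket` layout. The field name `sol_upcallarg` is an assumption for the second field; the log only names `sol_upcall`. The sketch shows a function pointer in one union view sharing storage with a data pointer in the other view, zeroing that happens only in debug builds, and an unconditional fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified stand-ins for illustration only. These are NOT the real
 * struct socket definitions; only the overlap idea is modelled.
 */
struct demo_sockbuf {
	void	*sb_sel;		/* valid data pointer after allocation */
};

struct demo_listen {
	void	(*sol_upcall)(void *);	/* expected to be NULL unless set */
	void	*sol_upcallarg;		/* assumed name of the second field */
};

struct demo_socket {
	int	is_listening;
	union {
		struct demo_sockbuf	rcv;	/* regular socket view */
		struct demo_listen	lst;	/* listening socket view */
	} u;
};

/* Models allocation: the regular view receives a non-NULL pointer. */
static void
demo_soalloc(struct demo_socket *so, void *sel)
{
	assert(sel != NULL);
	memset(so, 0, sizeof(*so));
	so->u.rcv.sb_sel = sel;
}

/* Models the regression: the union is zeroed only in debug builds. */
static void
demo_listen_buggy(struct demo_socket *so)
{
#ifdef DEMO_INVARIANTS
	memset(&so->u, 0, sizeof(so->u));
#endif
	so->is_listening = 1;
}

/* Models the fix: the two fields are cleared unconditionally. */
static void
demo_listen_fixed(struct demo_socket *so)
{
	so->u.lst.sol_upcall = NULL;
	so->u.lst.sol_upcallarg = NULL;
	so->is_listening = 1;
}

/* Nonzero if listening code would try to call through sol_upcall. */
static int
demo_upcall_pending(const struct demo_socket *so)
{
	return (so->u.lst.sol_upcall != NULL);
}

/* Nonzero if the two fields share the same offset in this demo layout. */
static int
demo_fields_overlap(void)
{
	return (offsetof(struct demo_socket, u.rcv.sb_sel) ==
	    offsetof(struct demo_socket, u.lst.sol_upcall));
}
```

Without `DEMO_INVARIANTS`, `demo_listen_buggy()` leaves the stale data pointer visible as `sol_upcall`, so a later call through it would jump into data. That is consistent with the instruction pointers in the traces above landing outside kernel text. `demo_listen_fixed()` avoids this regardless of build options.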

Comment 23 Hans Petter Selasky freebsd_committer

2017-07-04 18:26:35 UTC

Please re-open if this is still an issue with the latest version of FreeBSD-12-current. Thank you!

Comment 24 Mark Millard 2017-07-04 19:36:56 UTC

*** Bug 220404 has been marked as a duplicate of this bug. ***