Bug 243420

Summary: page fault in if_io_tqg_0 on virtualized (VMWare) guest
Product: Base System
Reporter: Mahmoud Al-Qudsi <mqudsi>
Component: kern
Assignee: freebsd-net (Nobody) <net>
Status: Closed Overcome By Events
Severity: Affects Only Me
CC: markj, mqudsi, net
Priority: ---
Keywords: crash, needs-qa
Version: 12.1-RELEASE
Flags: mqudsi: maintainer-feedback-
Hardware: amd64
OS: Any

Description Mahmoud Al-Qudsi 2020-01-18 01:47:19 UTC
Greetings.

I have FreeBSD 12.1-RELEASE-p1 GENERIC (unmodified) amd64 running under ESXi 6.7, and have had no problems whatsoever for a number of years.

While using ctld to serve a zvol as an iSCSI target, I ran into a kernel panic (which I'll try to reproduce), summarized below:

------------------------------------------------------------------
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff80cef252
stack pointer           = 0x28:0xfffffe0075da18c0
frame pointer           = 0x28:0xfffffe0075da19a0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (if_io_tqg_0)
trap number             = 12
panic: page fault
cpuid = 0
time = 1579311312
KDB: stack backtrace:
#0 0xffffffff80c1d297 at kdb_backtrace+0x67
#1 0xffffffff80bd05cd at vpanic+0x19d
#2 0xffffffff80bd0423 at panic+0x43
#3 0xffffffff810a7dcc at trap_fatal+0x39c
#4 0xffffffff810a7e19 at trap_pfault+0x49
#5 0xffffffff810a740f at trap+0x29f
#6 0xffffffff81081a0c at calltrap+0x8
#7 0xffffffff80ce9be5 at _task_fn_rx+0x75
#8 0xffffffff80c1bb54 at gtaskqueue_run_locked+0x144
#9 0xffffffff80c1b7b8 at gtaskqueue_thread_loop+0x98
#10 0xffffffff80b90c23 at fork_exit+0x83
#11 0xffffffff81082a4e at fork_trampoline+0xe
------------------------------------------------------------------

I recently upgraded the host machine and in the process switched from vmxnet3 virtualized network adapters to SR-IOV from the (same) Intel T520-CR. I had no problems in the past with an exclusively vmxnet3 configuration, but before I rush to blame it on if_ixv I must note that iSCSI is being served over the usual and boring vmxnet3 interface.
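
For anyone triaging the trace above, here is a minimal Python sketch that pulls out the panic timestamp, the faulting instruction pointer, and the backtrace frames. The helper names (parse_frames, panic_time, fault_ip) are hypothetical, the regular expressions only assume the line layout seen in this report, and the embedded text is an excerpt of the trace rather than the full dump.

```python
import re
from datetime import datetime, timezone

# Excerpt of the panic output quoted in this report.
PANIC_TEXT = """\
instruction pointer     = 0x20:0xffffffff80cef252
time = 1579311312
#6 0xffffffff81081a0c at calltrap+0x8
#7 0xffffffff80ce9be5 at _task_fn_rx+0x75
#8 0xffffffff80c1bb54 at gtaskqueue_run_locked+0x144
"""

# Matches lines such as "#7 0xffffffff80ce9be5 at _task_fn_rx+0x75".
FRAME_RE = re.compile(r"^#(\d+) (0x[0-9a-f]+) at (\w+)\+(0x[0-9a-f]+)$", re.M)


def parse_frames(text):
    """Return (frame number, return address, symbol, offset) tuples."""
    return [
        (int(num), int(addr, 16), sym, int(off, 16))
        for num, addr, sym, off in FRAME_RE.findall(text)
    ]


def panic_time(text):
    """Convert the 'time = <epoch seconds>' line to an aware UTC datetime."""
    match = re.search(r"^time = (\d+)$", text, re.M)
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


def fault_ip(text):
    """Return the address half of the 'instruction pointer' line."""
    match = re.search(
        r"^instruction pointer\s*=\s*0x[0-9a-f]+:(0x[0-9a-f]+)$", text, re.M
    )
    return int(match.group(1), 16)


if __name__ == "__main__":
    print("panic time (UTC):", panic_time(PANIC_TEXT).isoformat())
    ip = fault_ip(PANIC_TEXT)
    for num, addr, sym, off in parse_frames(PANIC_TEXT):
        # Distance from this frame's return address to the faulting instruction.
        print(f"#{num} {sym}+{off:#x} (fault ip - return address = {ip - addr:#x})")
```

Comparing the faulting instruction pointer with the return address in frame #7 shows whether the fault lies inside _task_fn_rx itself or in code it called; mapping the address to a source line would still need the matching kernel debug symbols.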
Comment 1 Mark Johnston freebsd_committer freebsd_triage 2020-08-11 17:59:07 UTC
Can you try the latest stable/12?  vmx has received a number of fixes for regressions introduced in 12.1.
Comment 2 Mahmoud Al-Qudsi 2020-11-09 19:37:57 UTC
I ended up removing the iSCSI role from the server and the panics went away, so I won't be able to verify this as fixed one way or the other.

I would understand if you would prefer to close this out.
Comment 3 Mark Johnston freebsd_committer freebsd_triage 2020-11-09 20:25:44 UTC
(In reply to Mahmoud Al-Qudsi from comment #2)
12.1 introduced the vmxnet/iflib merge and we had quite a few reports of regressions.  Those bugs are fixed as of 12.2, and given that you're running 12.1 under ESXi I don't think there's anything we can usefully do unless you're able to reproduce these problems with 12.2.  Please re-open if you do continue to see vmxnet3 problems in 12.2.