Bug 223813 - [mps] page fault in mps_user_pass_thru() -> copyout() on 11.1-RELEASE-p4, sys/dev/mps/mps_user.c:1040
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.1-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs mailing list
URL:
Keywords: panic
Depends on:
Blocks:
 
Reported: 2017-11-23 08:40 UTC by Daniel Ylitalo
Modified: 2019-02-12 14:29 UTC

See Also:


Attachments
Entire core dump (379.36 KB, text/plain)
2017-11-23 08:40 UTC, Daniel Ylitalo

Description Daniel Ylitalo 2017-11-23 08:40:49 UTC
Created attachment 188209
Entire core dump

Hi!

I just upgraded our firewall from 11.0 to 11.1-p4, but it now panics after roughly 35-45 minutes. After some poking around I saw that there were quite a few changes in the mps driver, so I'm guessing a bug snuck in there somewhere.

I'm happy to apply a debug patch to get you more information if you need it to sort this out.

It panics with this stack trace:

Unread portion of the kernel message buffer:
panic: vm_fault: fault on nofault entry, addr: fffffe00003eb000
cpuid = 4
KDB: stack backtrace:
#0 0xffffffff80aadac7 at kdb_backtrace+0x67
#1 0xffffffff80a6bba6 at vpanic+0x186
#2 0xffffffff80a6ba13 at panic+0x43
#3 0xffffffff80d58b90 at vm_fault_hold+0x2070
#4 0xffffffff80d56ad5 at vm_fault+0x75
#5 0xffffffff80edf927 at trap_pfault+0xe7
#6 0xffffffff80edf0c6 at trap+0x286
#7 0xffffffff80ec36d1 at calltrap+0x8
#8 0xffffffff8067b346 at mps_ioctl+0x2e86
#9 0xffffffff8093ae38 at devfs_ioctl_f+0x128
#10 0xffffffff80ac9415 at kern_ioctl+0x255
#11 0xffffffff80ac914f at sys_ioctl+0x16f
#12 0xffffffff80ee0394 at amd64_syscall+0x6c4
#13 0xffffffff80ec39bb at Xfast_syscall+0xfb


And here is the doadump log:
(kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:222
#1  0xffffffff80a6b721 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80a6bbe0 in vpanic (fmt=<value optimized out>,
    ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80a6ba13 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:690
#4  0xffffffff80d58b90 in vm_fault_hold (map=<value optimized out>,
    vaddr=<value optimized out>, fault_type=1 '\001',
    fault_flags=<value optimized out>, m_hold=0x0)
    at /usr/src/sys/vm/vm_fault.c:524
#5  0xffffffff80d56ad5 in vm_fault (map=0xfffff80003000000,
    vaddr=<value optimized out>, fault_type=1 '\001', fault_flags=0)
    at /usr/src/sys/vm/vm_fault.c:475
#6  0xffffffff80edf927 in trap_pfault (frame=0xfffffe08595cb510, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:708
#7  0xffffffff80edf0c6 in trap (frame=0xfffffe08595cb510)
    at /usr/src/sys/amd64/amd64/trap.c:421
#8  0xffffffff80ec36d1 in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:236
#9  0xffffffff80edd63f in copyout () at /usr/src/sys/amd64/amd64/support.S:255
#10 0xffffffff8067b346 in mps_ioctl () at /usr/src/sys/dev/mps/mps_user.c:1040
#11 0xffffffff8093ae38 in devfs_ioctl_f (fp=0xfffff80013466e10,
    com=3224914180, data=0xfffffe08595cb870, cred=0xfffff80013892500,
    td=0xfffff8000ab48000) at /usr/src/sys/fs/devfs/devfs_vnops.c:791
#12 0xffffffff80ac9415 in kern_ioctl (td=<value optimized out>, fd=3,
    com=<value optimized out>, data=<value optimized out>) at file.h:323
#13 0xffffffff80ac914f in sys_ioctl (td=<value optimized out>,
    uap=0xfffffe08595cba30) at /usr/src/sys/kern/sys_generic.c:745
#14 0xffffffff80ee0394 in amd64_syscall (td=0xfffff8000ab48000, traced=0)
    at subr_syscall.c:135
#15 0xffffffff80ec39bb in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:396
#16 0x0000000000446adc in ?? ()
Previous frame inner to this frame (corrupt stack?)
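
Frame #11 of the kgdb backtrace shows the ioctl request as a bare decimal, com=3224914180. As a rough aid for reading it, the sketch below splits that value into the fields packed by the _IOC() macros in FreeBSD's sys/ioccom.h (direction bits, 13-bit parameter length, group character, command number). The function name decode_ioctl is hypothetical and exists only for this illustration; nothing here is taken from the mps driver itself.

```python
# Field layout follows the _IOC() macros in FreeBSD's <sys/ioccom.h>.
IOCPARM_MASK = (1 << 13) - 1  # parameter length is a 13-bit field
IOC_VOID = 0x20000000         # no parameters
IOC_OUT = 0x40000000          # kernel copies parameters out to userland
IOC_IN = 0x80000000           # kernel copies parameters in from userland


def decode_ioctl(com):
    """Split an ioctl request number into its encoded fields.

    Hypothetical helper for illustration only.
    """
    direction = []
    if com & IOC_VOID:
        direction.append("void")
    if com & IOC_IN:
        direction.append("in")
    if com & IOC_OUT:
        direction.append("out")
    return {
        "hex": hex(com),
        "direction": "+".join(direction),
        "length": (com >> 16) & IOCPARM_MASK,  # size of the argument struct
        "group": chr((com >> 8) & 0xFF),       # ioctl group character
        "number": com & 0xFF,                  # command number in the group
    }


# Value taken from frame #11 (devfs_ioctl_f) of the backtrace above.
print(decode_ioctl(3224914180))
```

Decoded this way, the request is 0xc0384904: an in+out ioctl in group 'I', command number 4, carrying a 56-byte argument structure. That looks consistent with the pass-through ioctl named in the bug title, although the exact macro should be confirmed against sys/dev/mps/mps_ioctl.h.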
Comment 1 Daniel Ylitalo 2017-11-23 09:35:02 UTC
The chip is perhaps worth noting:

mps0 Adapter:
       Board Name: SAS9207-4i4e
   Board Assembly: H3-25434-00K
        Chip Name: LSISAS2308
    Chip Revision: ALL
    BIOS Revision: 7.39.00.00
Firmware Revision: 20.00.04.00
Comment 2 Mahmoud Al-Qudsi 2018-01-02 01:33:56 UTC
The technical merits of your bug report notwithstanding, I would downgrade the firmware to P19 for sanity's sake. There are a host of issues with the P20 releases, even with the official P20 drivers on other platforms (ESX); it's tough to figure out what's FreeBSD's fault and what's Broadcom/Avago's.

(Speaking from similar experience after upgrading to 11.0 from 10.1)