Bug 228332 - ipfw crashes with lookup tables or similar configurations on Ryzen
Status: Closed Unable to Reproduce
Alias: None
Product: Base System
Classification: Unclassified
Component: misc
Version: 11.1-RELEASE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-ipfw (Nobody)
Reported: 2018-05-18 13:25 UTC by SF
Modified: 2022-03-31 12:48 UTC

Description SF 2018-05-18 13:25:31 UTC
I don't know whether Ryzen is causing this, and if so whether it is the known Ryzen bug or something else.

Commands like these cause kernel panics:

ipfw table test create type number algo number:array
ipfw table test add 1001
ipfw table test add 1002
ipfw table test add 1003
ipfw table test add 1005
ipfw table test add 1007
ipfw table test add 1008
ipfw table test add 1009
ipfw table test add 1010
ipfw table test add 1011
ipfw table test add 1012
ipfw table test add 1013

ipfw add 0 deny src-ip any dst-ip 0.0.0.0 dst-port 0 keep-state lookup uid test

This also causes a kernel panic:

ipfw add 0 deny src-ip any dst-ip 0.0.0.0 dst-port 0 keep-state uid 1001 uid 1002 uid 1003 uid 1005 uid 1007 uid 1008 uid 1009 uid 1010 uid 1011 uid 1012 uid 1013

This does not cause a kernel panic:

ipfw add 3 deny src-ip any dst-ip 0.0.0.0 dst-port 0 keep-state uid 1001
ipfw add 3 deny src-ip any dst-ip 0.0.0.0 dst-port 0 keep-state uid 1002
ipfw add 3 deny src-ip any dst-ip 0.0.0.0 dst-port 0 keep-state uid 1003
ipfw add 3 deny src-ip any dst-ip 0.0.0.0 dst-port 0 keep-state uid 1005
ipfw add 3 deny src-ip any dst-ip 0.0.0.0 dst-port 0 keep-state uid 1007
ipfw add 3 deny src-ip any dst-ip 0.0.0.0 dst-port 0 keep-state uid 1008
ipfw add 3 deny src-ip any dst-ip 0.0.0.0 dst-port 0 keep-state uid 1009
ipfw add 3 deny src-ip any dst-ip 0.0.0.0 dst-port 0 keep-state uid 1010
ipfw add 3 deny src-ip any dst-ip 0.0.0.0 dst-port 0 keep-state uid 1011
ipfw add 3 deny src-ip any dst-ip 0.0.0.0 dst-port 0 keep-state uid 1012
ipfw add 3 deny src-ip any dst-ip 0.0.0.0 dst-port 0 keep-state uid 1013
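For anyone trying to reproduce this, the three rule sets above can be generated from a single uid list so that they stay consistent between test runs. The following is only a dry-run sketch: it prints the commands instead of executing them, since two of the three variants are reported to panic the kernel. The names UIDS, MATCH and the gen_* functions are made up for illustration and are not part of ipfw.

```shell
#!/bin/sh
# Dry-run sketch: prints the ipfw commands from this report, does not run them.
# UIDS, MATCH and the gen_* names are illustrative assumptions.

UIDS="1001 1002 1003 1005 1007 1008 1009 1010 1011 1012 1013"
MATCH="deny src-ip any dst-ip 0.0.0.0 dst-port 0 keep-state"

# Variant 1 (reported to panic): number:array table plus one lookup rule.
gen_table_variant() {
    echo "ipfw table test create type number algo number:array"
    for u in $UIDS; do
        echo "ipfw table test add $u"
    done
    echo "ipfw add 0 $MATCH lookup uid test"
}

# Variant 2 (reported to panic): a single rule carrying every uid option.
gen_single_rule_variant() {
    opts=""
    for u in $UIDS; do
        opts="$opts uid $u"
    done
    echo "ipfw add 0 $MATCH$opts"
}

# Variant 3 (reported stable): one rule per uid, all at rule number 3.
gen_per_uid_variant() {
    for u in $UIDS; do
        echo "ipfw add 3 $MATCH uid $u"
    done
}
```

After sourcing the file, calling one of the functions (for example gen_per_uid_variant) prints the commands for review. Actually applying them requires root and, for the first two variants, may crash the machine.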

Adjusting the hardware settings (memory frequency, CPU frequency, disabling cores, disabling SMT) down to the lowest values in the BIOS stops the kernel from panicking, which confirms something I had noticed previously. The hardware is causing this, but there is also something wrong with the software if small changes like these cause kernel panics.

Please don't mark this as a duplicate until it's clear what's causing it.

According to another bug report, some people said it's a software problem, but that's wrong. I repeatedly said the motherboard is causing this, but that was ignored and they made some changes to the software. Those changes stop the system from crashing but also remove functionality. At least this makes me absolutely sure it's not the software causing this, and if other users experience this kind of bug, this is the workaround.
Comment 1 Conrad Meyer freebsd_committer freebsd_triage 2018-05-18 16:55:53 UTC
For the HW component of this:

Please try running the kill-ryzen script, with all hardware settings at defaults, to see if you have a faulty processor.[0]

[0]: https://github.com/cemeyer/ryzen-test

I'd suggest running it overnight (i.e., for 8+ hrs).  It will consume all CPU so you won't be able to use the machine for much else at the same time.

If you find it reports failures, please RMA your processor to AMD and we can resolve this as Not a Bug.
Comment 2 SF 2018-05-22 21:31:30 UTC
Since reconfiguring my firewall with this knowledge, I have not experienced any more kernel panics. My computer is completely stable.
Comment 3 Mark Linimon freebsd_committer freebsd_triage 2018-08-12 14:15:50 UTC
So should this PR be closed?
Comment 4 SF 2018-08-14 11:25:32 UTC
The developers have not confirmed that this bug has been resolved; it's just a workaround.
Comment 5 Lucas Holt 2020-12-28 18:07:33 UTC
I've also seen this behavior on an early production Ryzen 1700 with FreeBSD 11.4. Switching the CPU out for a 2700 resolves the issue.

ipfw definitely triggers the bug in 11.x.  I could get FreeBSD 10.x to work fine on it.
Comment 6 Lutz Donnerhacke freebsd_committer freebsd_triage 2021-05-10 07:54:22 UTC
Unless we have a clue as to what kind of compiled machine code triggers the bug in the CPU, this ticket will remain stalled.

It's hard for anyone else to reproduce the bug, so even trial-and-error modifications to the code or to the compiler settings are impossible.

Please provide some way out of this situation or close the ticket.
Comment 7 aadhya 2022-03-31 12:48:19 UTC
We have observed a similar type of crash in ipfw_chk().

Environment:
===================
hw.model: Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz
hw.machine: amd64
hw.ncpu: 24
FreeBSD 11.2-RELEASE

Here is the backtrace:
===========================

(kgdb) bt
#0  doadump (textdump=1) at pcpu.h:229
#1  0xffffffff80610f5b in kern_reboot (howto=260) at ../../../kern/kern_shutdown.c:395
#2  0xffffffff80611459 in vpanic (fmt=<value optimized out>, ap=<value optimized out>)
    at ../../../kern/kern_shutdown.c:799
#3  0xffffffff80611193 in panic (fmt=<value optimized out>) at ../../../kern/kern_shutdown.c:719
#4  0xffffffff808967df in trap_fatal (frame=0xfffffe1049161250, eva=2) at ../../../amd64/amd64/trap.c:875
#5  0xffffffff80896839 in trap_pfault (frame=0xfffffe1049161250, usermode=0) at pcpu.h:229
#6  0xffffffff80896028 in trap (frame=0xfffffe1049161250) at ../../../amd64/amd64/trap.c:415
#7  0xffffffff8087534e in calltrap () at ../../../amd64/amd64/exception.S:199
#8  0xffffffff807a431f in ipfw_chk (args=<value optimized out>) at ../../../netpfil/ipfw/ip_fw2.c:1287
#9  0xffffffff807ac22f in ipfw_check_packet (arg=<value optimized out>, m0=0xfffffe10491616d0,
    ifp=<value optimized out>, dir=1, inp=0x0) at ../../../netpfil/ipfw/ip_fw_pfil.c:149
#10 0xffffffff8071f9d4 in pfil_run_hooks (ph=0xffffffff8100e478, mp=<value optimized out>, ifp=0xfffff8000becf000,
    dir=1, flags=0, inp=0x0) at ../../../net/pfil.c:116
#11 0xffffffff80742a99 in ip_input (m=0xfffff802dfad9600) at ../../../netinet/ip_input.c:601
#12 0xffffffff8071ea21 in netisr_dispatch_src (proto=1, source=<value optimized out>, m=<value optimized out>)
    at ../../../net/netisr.c:1120
#13 0xffffffff80707132 in ether_demux (ifp=0xfffff8000becf000, m=<value optimized out>)
    at ../../../net/if_ethersubr.c:884
#14 0xffffffff80708237 in ether_nh_input (m=<value optimized out>) at ../../../net/if_ethersubr.c:660
#15 0xffffffff8071ea21 in netisr_dispatch_src (proto=5, source=<value optimized out>, m=<value optimized out>)
    at ../../../net/netisr.c:1120
#16 0xffffffff807074b6 in ether_input (ifp=<value optimized out>, m=0x0) at ../../../net/if_ethersubr.c:780
#17 0xffffffff803f2ecc in ixgbe_rxeof (que=0xfffff8000becac00) at ../../../dev/ixgbe/ix_txrx.c:1597
#18 0xffffffff803e72b6 in ixgbe_msix_que (arg=0xfffff8000becac00) at ../../../dev/ixgbe/if_ix.c:1960
#19 0xffffffff805e1d1f in intr_event_execute_handlers (p=<value optimized out>, ie=0xfffff8000baf8a00)
    at ../../../kern/kern_intr.c:1336
#20 0xffffffff805e23b7 in ithread_loop (arg=0xfffff8000bec3ac0) at ../../../kern/kern_intr.c:1349
#21 0xffffffff805df396 in fork_exit (callout=0xffffffff805e2300 <ithread_loop>, arg=0xfffff8000bec3ac0,
    frame=0xfffffe1049161ac0) at ../../../kern/kern_fork.c:1054
#22 0xffffffff808761ee in fork_trampoline () at ../../../amd64/amd64/exception.S:951
#23 0x0000000000000000 in ?? ()
(kgdb)
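
When reading a backtrace like the one above, frames #0 through #7 are the panic and trap-handling path, and the frame listed directly after calltrap() (here ipfw_chk at ip_fw2.c:1287) is the code that was running when the fault was taken. As a rough sketch, that frame can be pulled out of a saved kgdb backtrace with a small awk filter. The function name faulting_frame is invented for this example and is not a kgdb feature; the filter assumes the frame-line layout shown in this comment.

```shell
#!/bin/sh
# Sketch: read a kgdb backtrace on stdin and print the frame that follows
# calltrap(). faulting_frame is an illustrative name, not part of kgdb.
faulting_frame() {
    awk '
        # Once calltrap has been seen, print the next frame line and stop.
        found && /^#[0-9]+ / { print; exit }
        # Remember that the trap entry frame has been seen.
        /^#[0-9]+ .* in calltrap \(\)/ { found = 1 }
    '
}
```

With the backtrace saved to a file (the name bt.txt is only an example), `faulting_frame < bt.txt` prints the `#8 ... ipfw_chk` line. The continuation lines that kgdb indents under a frame are skipped.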