Hi,

I'm experiencing complete freezes on two of my systems, which started after updating from 10.3-RELEASE to 11.1-RELEASE. Both systems have the same motherboard and memory; however, the disk and system configurations are quite different.

System 1 has ZFS on root, is an iSCSI target, and additionally runs 2x jails. It is configured with 2x Intel igb NICs with 2x VLANs over lagg.

System 2 has UFS on root and serves several drives, configured as zpools, over NFS. This system also runs lagg over the 2x Intel igb NICs, but without VLANs.

I'm not sure what more information I can provide, as there are no kernel dumps: the systems become completely unresponsive. I have run 'top' on the systems and saw no high CPU load or memory usage at the time of the freeze.

Hopefully someone can help fix this issue? :-)

The dmesg output of the systems is as follows:

Copyright (c) 1992-2017 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.1-RELEASE-p1 #0: Wed Aug 9 11:55:48 UTC 2017
root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 4.0.0)
VT(vga): resolution 640x480
CPU: Intel(R) Celeron(R) CPU J1900 @ 1.99GHz (2000.05-MHz K8-class CPU)
Origin="GenuineIntel" Id=0x30678 Family=0x6 Model=0x37 Stepping=8
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Features2=0x41d8e3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,RDRAND>
AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
AMD Features2=0x101<LAHF,Prefetch>
Structured Extended Features=0x2282<TSCADJ,SMEP,ERMS,NFPUSG>
VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
TSC: P-state invariant, performance statistics
real memory = 8589934592 (8192 MB)
avail memory = 8102825984 (7727 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <SUPERM SMCI--MB>
WARNING: L1 data cache covers less APIC IDs than a core
0 < 1
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
random: unblocking device.
ACPI BIOS Warning (bug): 32/64X length mismatch in FADT/Gpe0Block: 128/32 (20170303/tbfadt-748)
ioapic0 <Version 2.0> irqs 0-86 on motherboard
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #1 Launched!
Timecounter "TSC" frequency 2000049576 Hz quality 1000
random: entropy device external interface
kbd1 at kbdmux0
netmap: loaded module
module_register_init: MOD_LOAD (vesa, 0xffffffff80f5b220, 0) error 19
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
nexus0
vtvga0: <VT VGA driver> on motherboard
cryptosoft0: <software crypto> on motherboard
acpi0: <SUPERM SMCI--MB> on motherboard
acpi0: Power Button (fixed)
unknown: I/O range not supported
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x77 on acpi0
atrtc0: Warning: Couldn't map I/O.
Event timer "RTC" frequency 32768 Hz quality 0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 8 on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 450
Event timer "HPET1" frequency 14318180 Hz quality 440
Event timer "HPET2" frequency 14318180 Hz quality 440
attimer0: <AT timer> port 0x40-0x43,0x50-0x53 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pcib0: _OSC returned error 0x10
pci0: <ACPI PCI bus> on pcib0
vgapci0: <VGA-compatible display> port 0xe080-0xe087 mem 0x90000000-0x903fffff,0x80000000-0x8fffffff irq 16 at device 2.0 on pci0
vgapci0: Boot video device
ahci0: <AHCI SATA controller> port 0xe070-0xe077,0xe060-0xe063,0xe050-0xe057,0xe040-0xe043,0xe020-0xe03f mem 0x90a06000-0x90a067ff irq 19 at device 19.0 on pci0
ahci0: AHCI v1.30 with 2 3Gbps ports, Port Multiplier not supported
ahcich1: <AHCI channel> at channel 1 on ahci0
pci0: <encrypt/decrypt> at device 26.0 (no driver attached)
hdac0: <Intel BayTrail HDA Controller> mem 0x90a00000-0x90a03fff irq 22 at device 27.0 on pci0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pcib1: [GIANT-LOCKED]
pcib2: <ACPI PCI-PCI bridge> irq 18 at device 28.2 on pci0
pcib2: [GIANT-LOCKED]
pci1: <ACPI PCI bus> on pcib2
igb0: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xd000-0xd01f mem 0x90900000-0x9097ffff,0x90980000-0x90983fff irq 18 at device 0.0 on pci1
igb0: Using MSIX interrupts with 5 vectors
igb0: Ethernet address: 0c:c4:7a:b0:5f:30
igb0: Bound queue 0 to cpu 0
igb0: Bound queue 1 to cpu 1
igb0: Bound queue 2 to cpu 2
igb0: Bound queue 3 to cpu 3
igb0: netmap queues/slots: TX 4/1024, RX 4/1024
pcib3: <ACPI PCI-PCI bridge> irq 19 at device 28.3 on pci0
pcib3: [GIANT-LOCKED]
pci2: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> mem 0x90800000-0x90803fff irq 19 at device 0.0 on pci2
pci3: <ACPI PCI bus> on pcib4
pcib5: <PCI-PCI bridge> irq 16 at device 1.0 on pci3
pci4: <PCI bus> on pcib5
igb1: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xc000-0xc01f mem 0x90700000-0x9077ffff,0x90780000-0x90783fff irq 16 at device 0.0 on pci4
igb1: Using MSIX interrupts with 5 vectors
igb1: Ethernet address: 0c:c4:7a:b0:5f:31
igb1: Bound queue 0 to cpu 0
igb1: Bound queue 1 to cpu 1
igb1: Bound queue 2 to cpu 2
igb1: Bound queue 3 to cpu 3
igb1: netmap queues/slots: TX 4/1024, RX 4/1024
pcib6: <PCI-PCI bridge> irq 17 at device 2.0 on pci3
pci5: <PCI bus> on pcib6
pcib7: <PCI-PCI bridge> irq 18 at device 3.0 on pci3
pci6: <PCI bus> on pcib7
ahci1: <Marvell 88SE9230 AHCI SATA controller> port 0xb050-0xb057,0xb040-0xb043,0xb030-0xb037,0xb020-0xb023,0xb000-0xb01f mem 0x90610000-0x906107ff irq 18 at device 0.0 on pci6
ahci1: AHCI v1.20 with 8 6Gbps ports, Port Multiplier not supported
ahci1: quirks=0x900<NOBSYRES,ALTSIG>
ahcich2: <AHCI channel> at channel 0 on ahci1
ahcich3: <AHCI channel> at channel 1 on ahci1
ahcich4: <AHCI channel> at channel 2 on ahci1
ahcich5: <AHCI channel> at channel 3 on ahci1
ahcich6: <AHCI channel> at channel 4 on ahci1
ahcich7: <AHCI channel> at channel 5 on ahci1
ahcich8: <AHCI channel> at channel 6 on ahci1
ahcich9: <AHCI channel> at channel 7 on ahci1
ehci0: <Intel BayTrail USB 2.0 controller> mem 0x90a05000-0x90a053ff irq 23 at device 29.0 on pci0
usbus0: EHCI version 1.0
usbus0 on ehci0
usbus0: 480Mbps High Speed USB v2.0
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
acpi_button0: <Power Button> on acpi0
acpi_button1: <Sleep Button> on acpi0
acpi_tz0: <Thermal Zone> on acpi0
uart0: <16950 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart2: <16950 or compatible> port 0x3e0-0x3e7 irq 3 on acpi0
uart3: <16950 or compatible> port 0x3e8-0x3ef irq 4 on acpi0
uart4: <16950 or compatible> port 0x2e0-0x2e7 irq 3 on acpi0
orm0: <ISA Option ROM> at iomem 0xd2000-0xd2fff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
fdc0: <Enhanced floppy controller> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
ppc0: cannot reserve I/O port range
est0: <Enhanced SpeedStep Frequency Control> on cpu0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
est2: <Enhanced SpeedStep Frequency Control> on cpu2
est3: <Enhanced SpeedStep Frequency Control> on cpu3
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
Timecounters tick every 1.000 msec
tcp_init: WARNING: TCB hash size not a power of 2, clipped from 32000 to 32768.
nvme cam probe device init
hdacc0: <Realtek ALC888 HDA CODEC> at cad 0 on hdac0
hdaa0: <Realtek ALC888 Audio Function Group> at nid 1 on hdacc0
pcm0: <Realtek ALC888 (Front Analog)> at nid 27 and 25 on hdaa0
pcm1: <Realtek ALC888 (Internal Digital)> at nid 17 on hdaa0
hdacc1: <Intel (0x2882) HDA CODEC> at cad 2 on hdac0
hdaa1: <Intel (0x2882) Audio Function Group> at nid 1 on hdacc1
hdaa1: hdaa_audio_as_parse: Duplicate pin 0 (5) in association 1! Disabling association.
pcm2: <Intel (0x2882) (HDMI/DP 8ch)> at nid 6 on hdaa1
ugen0.1: <Intel EHCI root HUB> at usbus0
uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus0
ada0 at ahcich1 bus 0 scbus0 target 0 lun 0
ada0: <SAMSUNG MZ7LM240HMHQ-00005 GXT5204Q> ACS-2 ATA SATA 3.x device
ada0: Serial Number S2TWNX0J300288
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 228936MB (468862128 512 byte sectors)
ada0: quirks=0x3<4K,NCQ_TRIM_BROKEN>
ada1 at ahcich2 bus 0 scbus1 target 0 lun 0
ada1: <WDC WD101KRYZ-01JPDB0 01.01H01> ACS-2 ATA SATA 3.x device
ada1: Serial Number 7JHLZY0C
ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 9537536MB (19532873728 512 byte sectors)
ada2 at ahcich3 bus 0 scbus2 target 0 lun 0
ada2: <WDC WD101KRYZ-01JPDB0 01.01H01> ACS-2 ATA SATA 3.x device
ada2: Serial Number 7JGJPUSC
ada2: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 9537536MB (19532873728 512 byte sectors)
pass3 at ahcich9 bus 0 scbus8 target 0 lun 0
pass3: <Marvell Console 1.01> Removable Processor SCSI device
pass3: Serial Number HKDP221516WL
pass3: 150.000MB/s transfers (SATA 1.x, UDMA4, ATAPI 12bytes, PIO 8192bytes)
Trying to mount root from zfs:zroot/ROOT/default []...
Root mount waiting for: usbus0
Root mount waiting for: usbus0
Root mount waiting for: usbus0
uhub0: 8 ports with 8 removable, self powered
ugen0.2: <vendor 0x8087 product 0x07e6> at usbus0
uhub1 on uhub0
uhub1: <vendor 0x8087 product 0x07e6, class 9/0, rev 2.00/0.14, addr 2> on usbus0
Root mount waiting for: usbus0
uhub1: 4 ports with 4 removable, self powered
ugen0.3: <vendor 0x0409 product 0x005a> at usbus0
uhub2 on uhub1
uhub2: <vendor 0x0409 product 0x005a, class 9/0, rev 2.00/1.00, addr 3> on usbus0
Root mount waiting for: usbus0
uhub2: 4 ports with 4 removable, self powered
lagg0: link state changed to DOWN
igb0: link state changed to UP
lagg0: link state changed to UP
lagg0.192: link state changed to UP
lagg0.300: link state changed to UP
igb1: link state changed to UP
I've done some extra testing by disabling all the tunables in /boot/loader.conf and /etc/sysctl.conf, which didn't have any effect. System 2 is still freezing :-(

I also took out the lagg and currently have just 1x NIC hooked up, to see whether the lagg is the issue, as there was a mention of updated code in the bugtracker reports. I'm still waiting on the results since I've only just changed the setup; though on the first reboot the system wasn't up for even 40 minutes before it became completely unresponsive.

Of the two systems, the freeze seems more prominent on System 2 than on System 1, meaning that something on the 2nd machine is triggering the bug more often than on the 1st... but what? Lagg? NFS? ZFS?

Currently the 'top' output looks fine, though NFS seems a little high, at between 2% and 7% of CPU:

last pid: 9346; load averages: 0.12, 0.18, 0.14 up 0+00:15:12 16:10:00
37 processes: 1 running, 36 sleeping
CPU: % user, % nice, % system, % interrupt, % idle
Mem: 44M Active, 34M Inact, 6646M Wired, 15M Buf, 1061M Free
ARC: 6246M Total, 865M MFU, 5338M MRU, 2051K Anon, 13M Header, 29M Other
     6178M Compressed, 6448M Uncompressed, 1.04:1 Ratio
Swap: 2327M Total, 2327M Free

  PID USERNAME  THR PRI NICE    SIZE     RES STATE   C    TIME    WCPU COMMAND
  673 root      128  20    0   8328K   4008K rpcsvc  3    0:46   5.08% nfsd
  699 root        1  20    0  72120K  16404K select  3    0:01   0.00% snmpd
 2783 root        1  20    0  20160K   3628K select  1    0:01   0.00% top
  847 munin       1  20    0  46300K  14320K select  0    0:00   0.00% perl
  678 root        1  20    0  12524K   3176K rpcsvc  1    0:00   0.00% rpc.lockd
  735 root        1  20    0  20568K  12476K select  3    0:00   0.00% ntpd
 9319 root        1  21    0  62480K   7840K select  0    0:00   0.00% sshd
 1863 root        1  20    0  19660K   3808K pause   3    0:00   0.00% csh
  671 root        1  20    0  10376K   2996K select  0    0:00   0.00% nfsd
  545 root        1  20    0  10492K   2436K select  0    0:00   0.00% syslogd
  959 root        1  24    0  43764K   3056K wait    2    0:00   0.00% login
  839 root        1  20    0  42472K   8748K kqread  1    0:00   0.00% master
 9324 root        1  20    0  19660K   3812K pause   0    0:00   0.00% csh
  841 postfix     1  20    0  44584K   8824K kqread  2    0:00   0.00% qmgr
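
For anyone wanting to put numbers on the memory lines in the 'top' output above, here is a minimal Python sketch. It only parses the two summary lines as pasted in this report; the helper name parse_top_line and the unit table are illustrative assumptions, not part of any FreeBSD tool. It shows how much of the wired memory the ZFS ARC accounts for. A large ARC is expected on a ZFS file server, so this is context for reading the numbers rather than evidence of what causes the freeze.

```python
import re

# Memory summary lines copied verbatim from the top output in this report.
TOP_MEM = "Mem: 44M Active, 34M Inact, 6646M Wired, 15M Buf, 1061M Free"
TOP_ARC = "ARC: 6246M Total, 865M MFU, 5338M MRU, 2051K Anon, 13M Header, 29M Other"

# Multipliers that convert top's K/M/G suffixes to megabytes.
UNIT_MB = {"K": 1 / 1024, "M": 1, "G": 1024}


def parse_top_line(line):
    """Return {label: size_in_MB} for a 'Mem:' or 'ARC:' summary line.

    Hypothetical helper written for this thread; it assumes every field
    has the form '<integer><K|M|G> <Label>'.
    """
    _, _, body = line.partition(":")
    fields = {}
    for value, unit, label in re.findall(r"(\d+)([KMG])\s+(\w+)", body):
        fields[label] = int(value) * UNIT_MB[unit]
    return fields


mem = parse_top_line(TOP_MEM)
arc = parse_top_line(TOP_ARC)

# Share of wired memory that the ARC total accounts for.
arc_share_of_wired = arc["Total"] / mem["Wired"]

print(
    f"Wired: {mem['Wired']:.0f} MB, ARC: {arc['Total']:.0f} MB "
    f"({arc_share_of_wired:.0%} of wired), Free: {mem['Free']:.0f} MB"
)
```

Running the same parser on 'top' output captured at intervals before a freeze would show whether free memory is shrinking over time, which the single snapshot above cannot tell us.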
(In reply to kayasaman from comment #1)

Hi, did you fix your problem?
^Triage: close as OBE. I'm sorry that this PR did not get addressed in a timely fashion. Please let us know if this still occurs on a supported OSVERSION.