181529 – [panic] sysutils/devcpu-data: Panic after CPU microcode update


Status:	Closed Overcome By Events

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	kern
Version:	CURRENT
Hardware:	Any Any

Importance:	Normal Affects Only Me
Assignee:	freebsd-bugs (Nobody)


Reported:	2013-08-25 18:10 UTC by dimka
Modified:	2017-01-09 07:37 UTC
CC List:	3 users


Description dimka 2013-08-25 18:10:00 UTC

# ./microcode_update start
Updating cpucodes...
/usr/local/share/cpucontrol/2185-m04f650b.fw: updating cpu /dev/cpuctl0 from rev 0x7 to rev 0xb... done.
Done.

Approximately 50% of microcode update attempts lead to a panic.

When cpuctl is loaded as a kernel module, I got three identical crashes immediately after a manual microcode update:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xc79a2ba8
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc0b99f60
stack pointer           = 0x28:0xeb414764
frame pointer           = 0x28:0xeb41478c
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (irq21: rl0)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xc0ac2f48 at kdb_backtrace+0x4b
#1 0xc0a91a09 at panic+0x16f
#2 0xc0dee9a7 at trap_fatal+0x324
#3 0xc0deeda6 at trap_pfault+0x3f3
#4 0xc0def9fb at trap+0x451
#5 0xc0dd9b5c at calltrap+0x6
#6 0xc0b9a03d at in_pcbinshash_nopcbgroup+0x10
#7 0xc0c2e50b at syncache_expand+0x947
#8 0xc0c25a1c at tcp_input+0xce4
#9 0xc0bb38fd at ip_input+0x688
#10 0xc0b4d08d at netisr_dispatch_src+0x90
#11 0xc0b4d32c at netisr_dispatch+0x20
#12 0xc0b43199 at ether_demux+0x16d
#13 0xc0b435bb at ether_nh_input+0x365
#14 0xc0b4d08d at netisr_dispatch_src+0x90
#15 0xc0b4d32c at netisr_dispatch+0x20
#16 0xc0b42d12 at ether_input+0x19
#17 0xc0b4bd7e at vlan_input+0x1ad
Uptime: 1m19s
Physical memory: 1486 MB

When cpuctl is compiled into the kernel and the microcode is loaded at startup from /usr/local/etc/rc.d/microcode_update, I got two different crashes after about 10 minutes:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x18
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc0a7d5d9
stack pointer           = 0x28:0xed6878f4
frame pointer           = 0x28:0xed687908
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1176 (clamd)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xc0ac2f48 at kdb_backtrace+0x4b
#1 0xc0a91a09 at panic+0x16f
#2 0xc0dee957 at trap_fatal+0x324
#3 0xc0deea5f at trap_pfault+0xfc
#4 0xc0def9ab at trap+0x451
#5 0xc0dd9b0c at calltrap+0x6
#6 0xc0b050ed at allocbuf+0xa3
#7 0xc0b07862 at getnewbuf+0x42b
#8 0xc0b08e47 at getblk+0x3f4
#9 0xc0b0d8bd at cluster_read+0xf5
#10 0xc0cd9889 at ffs_read+0x2be
#11 0xc0e0e782 at VOP_READ_APV+0x44
#12 0xc0b30f96 at vn_read+0x2b2
#13 0xc0ad3e7c at dofileread+0x93
#14 0xc0ad41c6 at kern_readv+0x62
#15 0xc0ad42a4 at sys_read+0x51
#16 0xc0def0cd at syscall+0x34e
#17 0xc0dd9b71 at Xint0x80_syscall+0x21
Uptime: 11m27s
Physical memory: 1486 MB

panic: free: address 0xc7b61800(0xc7b61000) has not been allocated.

cpuid = 0
KDB: stack backtrace:
#0 0xc0ac2f48 at kdb_backtrace+0x4b
#1 0xc0a91a09 at panic+0x16f
#2 0xc0a7d601 at free+0x80
#3 0xc0b050ed at allocbuf+0xa3
#4 0xc0b07862 at getnewbuf+0x42b
#5 0xc0b08e47 at getblk+0x3f4
#6 0xc0b0d8bd at cluster_read+0xf5
#7 0xc0cd9889 at ffs_read+0x2be
#8 0xc0e0e782 at VOP_READ_APV+0x44
#9 0xc0b30f96 at vn_read+0x2b2
#10 0xc0ad3e7c at dofileread+0x93
#11 0xc0ad41c6 at kern_readv+0x62
#12 0xc0ad42a4 at sys_read+0x51
#13 0xc0def0cd at syscall+0x34e
#14 0xc0dd9b71 at Xint0x80_syscall+0x21
Uptime: 10m51s
Physical memory: 1486 MB
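
The two compiled-in crashes above look different but share the same call path. A minimal sketch, assuming the KDB frame format shown in the traces, that extracts the symbol names and finds the frames both panics have in common. The names `frame_symbols`, `common_suffix` and `offset_in_free` are hypothetical helpers, not part of any FreeBSD tool, and only an excerpt of each trace is included:

```python
import re

# One KDB frame looks like: "#6 0xc0b050ed at allocbuf+0xa3"
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-f]+\s+at\s+(\S+)\+0x[0-9a-f]+$")

# Excerpts copied from the two traces in this report.
PAGE_FAULT_TRACE = """\
#4 0xc0def9ab at trap+0x451
#5 0xc0dd9b0c at calltrap+0x6
#6 0xc0b050ed at allocbuf+0xa3
#7 0xc0b07862 at getnewbuf+0x42b
#8 0xc0b08e47 at getblk+0x3f4
"""

FREE_PANIC_TRACE = """\
#1 0xc0a91a09 at panic+0x16f
#2 0xc0a7d601 at free+0x80
#3 0xc0b050ed at allocbuf+0xa3
#4 0xc0b07862 at getnewbuf+0x42b
#5 0xc0b08e47 at getblk+0x3f4
"""


def frame_symbols(backtrace):
    """Return the symbol name of every frame line, innermost first."""
    symbols = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            symbols.append(match.group(2))
    return symbols


def common_suffix(a, b):
    """Return the outermost frames that both symbol lists share."""
    shared = []
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        shared.append(x)
    return shared[::-1]


def offset_in_free(fault_ip, free_return_addr=0xC0A7D601, free_offset=0x80):
    """Offset of fault_ip from the start of free().

    Assumption: both panics come from the same kernel build, so the
    address of free() derived from "free+0x80" applies to both traces.
    """
    return fault_ip - (free_return_addr - free_offset)


shared = common_suffix(frame_symbols(PAGE_FAULT_TRACE),
                       frame_symbols(FREE_PANIC_TRACE))
print(shared)
print(hex(offset_in_free(0xC0A7D5D9)))
```

Under that same-build assumption, the instruction pointer of the page fault (0xc0a7d5d9) also falls inside free(), so both panics may stem from the same bad free() call made by allocbuf.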

I have had no problems on four other machines with different hardware.

Affected system:

Mainboard: Intel D101GGC
(I cannot check the BIOS revision yet; it is probably the latest)

CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz (3000.18-MHz 686-class CPU)
Origin = "GenuineIntel"  Id = 0xf65  Family = f Model = 6  Stepping = 5

Microcode database: devcpu-data-0.6 (/usr/ports/sysutils/devcpu-data)
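
The "Id = 0xf65" value in the CPU line above is the CPUID signature, and the family, model and stepping are bit fields of it. A minimal sketch of that decoding, assuming the standard Intel layout of CPUID leaf 1 EAX; `decode_cpuid_signature` is a hypothetical helper name used only for illustration:

```python
def decode_cpuid_signature(sig):
    """Split a CPUID leaf-1 EAX signature into family, model and stepping."""
    stepping = sig & 0xF             # bits 3:0
    base_model = (sig >> 4) & 0xF    # bits 7:4
    base_family = (sig >> 8) & 0xF   # bits 11:8
    ext_model = (sig >> 16) & 0xF    # bits 19:16
    ext_family = (sig >> 20) & 0xFF  # bits 27:20

    # The extended family is added only when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model supplies the high nibble for families 0x6 and 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return {"family": family, "model": model, "stepping": stepping}


fields = decode_cpuid_signature(0xF65)
print({name: hex(value) for name, value in fields.items()})
```

For 0xf65 this yields family 0xf, model 0x6 and stepping 0x5, matching the "Family = f Model = 6 Stepping = 5" values reported by the kernel.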

Kernel config:

include         GENERIC
ident           HOSTING

makeoptions     DEBUG=

device          cpuctl
# device          ichwd

options         IPFIREWALL
options         IPFIREWALL_VERBOSE
options         IPFIREWALL_VERBOSE_LIMIT=100
options         IPFIREWALL_FORWARD
options         IPFIREWALL_DEFAULT_TO_ACCEPT

options         DUMMYNET

Fix:

Unknown.
Possibly the update changes the addressing mechanism or TLB behavior.
How-To-Repeat: Run the microcode update; roughly 50% of attempts panic.

Comment 1 Mark Linimon freebsd_committer 2013-08-26 00:39:45 UTC

Responsible Changed
From-To: freebsd-bugs->freebsd-ports-bugs

Although this is a kernel problem, the port maintainer ought to at
least be informed too.

Comment 2 John Marino freebsd_committer 2014-08-11 09:59:35 UTC

Now that the maintainer has been informed (although this has not yet been acknowledged), shouldn't the PR be assigned to kernel?

Who needs to make changes to fix this?

Comment 3 John Clark 2014-08-11 20:46:28 UTC

If this issue still exists, it should be assigned to kernel, specifically the cpuctl driver.

Comment 4 John Marino freebsd_committer 2014-08-11 20:49:00 UTC

Reclassifying the PR per the recommendation of the sysutils/devcpu-data maintainer.

Comment 5 John Marino freebsd_committer 2014-08-11 20:52:20 UTC

Assigned it to 11.0-CURRENT. I don't know whether this PR is valid for CURRENT; the submitter should say which releases this happens with.

Comment 6 Hiren Panchasara freebsd_committer 2017-01-06 00:18:05 UTC

Is this still valid?

Comment 7 dimka 2017-01-09 07:35:39 UTC

No more problems with this for a few years now.

Comment 8 Hiren Panchasara freebsd_committer 2017-01-09 07:37:42 UTC

(In reply to dimka from comment #7)
Thanks for confirming.