While the processor is clearing pages in vm_page_zero_idle, interrupts are enabled. An interrupt that arrives at that point can lead to a second call to pmap_zero_page while the first call is still using CMAP2/CADDR2, that is, at a time when it cannot safely be called.

db> trace
vget
ffs_sync
sync
boot
panic
pmap_zero_page (probably via inlined vm_page_zero_fill)
vm_fault
trap_pfault
trap
calltrap
trap 0xc, ...
ip_input
ipintr
swi_net_next
vm_page_zero_idle
idle_loop

Fix:

Alternative 1: Block some interrupts during the call to pmap_zero_page in vm_page_zero_idle.

Alternative 2: Introduce a modified clone of pmap_zero_page that uses CMAP3/CADDR3 instead of CMAP2/CADDR2, and call that clone in vm_page_zero_idle instead of the original.

The attached file.diff implements Alternative 1:

*** vm_machdep.c.orig	Mon Apr  7 02:34:29 1997
--- vm_machdep.c	Mon Apr  7 03:07:49 1997
***************
*** 883,889 ****
--- 883,891 ----
  			--(*vm_page_queues[m->queue].lcnt);
  			TAILQ_REMOVE(vm_page_queues[m->queue].pl, m, pageq);
  			splx(s);
+ 			(void)splvm();
  			pmap_zero_page(VM_PAGE_TO_PHYS(m));
+ 			splx(s);
  			(void)splvm();
  			m->queue = PQ_ZERO + m->pc;
  			++(*vm_page_queues[m->queue].lcnt);

How-To-Repeat:

1. Put the system under high load, so that the queues of pages waiting to be zero-filled grow long.
2. Lower the load, so that idle_loop is activated.
3. Have an interrupt arrive while pmap_zero_page is using CMAP2.
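
The hazard and Alternative 2 can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: scratch_slot, zero_page_via, interrupt_handler and run_scenario are hypothetical names invented for the illustration, the "interrupt" is an ordinary function call made in the middle of the zeroing, and a busy slot is reported with a false return value where the kernel would misbehave or panic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for a scratch mapping such as CMAP2/CADDR2. */
struct scratch_slot {
    unsigned char *mapped;      /* page mapped through the slot, or NULL */
};

static struct scratch_slot slot2;   /* models CMAP2/CADDR2 */
static struct scratch_slot slot3;   /* models CMAP3/CADDR3 */

/* Models an interrupt that is delivered once, mid-zeroing. */
static void (*pending_interrupt)(void);

/*
 * Zero a page through the given slot. Returns false if the slot was
 * already in use, i.e. the function was re-entered on the same slot.
 */
static bool zero_page_via(struct scratch_slot *slot, unsigned char *page)
{
    if (slot->mapped != NULL)
        return false;               /* slot busy: unsafe re-entry */
    slot->mapped = page;

    if (pending_interrupt != NULL) {
        void (*fire)(void) = pending_interrupt;
        pending_interrupt = NULL;   /* deliver the interrupt only once */
        fire();
    }

    memset(slot->mapped, 0, MODEL_PAGE_SIZE);
    slot->mapped = NULL;
    return true;
}

static unsigned char idle_page[MODEL_PAGE_SIZE];
static unsigned char fault_page[MODEL_PAGE_SIZE];
static bool interrupt_ok;

/* Models the interrupt path that ends up zeroing a page via slot 2. */
static void interrupt_handler(void)
{
    interrupt_ok = zero_page_via(&slot2, fault_page);
}

/*
 * Run the idle-time zeroing through idle_slot with one interrupt
 * arriving in the middle. Returns whether the nested call was safe.
 */
static bool run_scenario(struct scratch_slot *idle_slot)
{
    interrupt_ok = false;
    pending_interrupt = interrupt_handler;
    memset(idle_page, 0xff, MODEL_PAGE_SIZE);
    (void)zero_page_via(idle_slot, idle_page);
    return interrupt_ok;
}
```

In this model, run_scenario(&slot2) reports an unsafe nested call because the idle path and the interrupt path share one slot, while run_scenario(&slot3) does not, which is the idea behind Alternative 2. Alternative 1 would correspond to never delivering pending_interrupt while a slot is mapped.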

Responsible Changed
From-To: freebsd-bugs->tegge

Tor, you applied your patch in rev 1.82 of vm_machdep.c but left this PR open, and I guess only you know why :-)

State Changed
From-To: open->closed

According to Tor: The patch only hides the real problem and was backed out. But the problem might have been solved by one of the added spl*() protections around critical sections or checks for malloc returning 0. The panic has not been seen recently and this PR is considered timed-out.