Bug 235846 - panic: pmap_demote_pde: firstpte and newpte map different physical addresses
Summary: panic: pmap_demote_pde: firstpte and newpte map different physical addresses
Status: Closed Unable to Reproduce
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Mark Johnston
URL:
Keywords: panic, patch, stress2
Depends on:
Blocks:
 
Reported: 2019-02-18 20:49 UTC by Peter Holm
Modified: 2019-02-27 15:57 UTC
Description Peter Holm freebsd_committer 2019-02-18 20:49:34 UTC
With FreeBSD 13.0-CURRENT #0 r344247: Mon Feb 18 07:56:47 CET 2019 I got:

20190218 18:41:14 all (127/614): callout_reset_on.sh
panic: pmap_demote_pde: firstpte and newpte map different physical addresses
cpuid = 9
time = 1550511727
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0182bcb460
vpanic() at vpanic+0x1b4/frame 0xfffffe0182bcb4c0
panic() at panic+0x43/frame 0xfffffe0182bcb520
pmap_demote_pde_locked() at pmap_demote_pde_locked+0xd4a/frame 0xfffffe0182bcb5d0
pmap_remove() at pmap_remove+0x457/frame 0xfffffe0182bcb640
vm_map_delete() at vm_map_delete+0x321/frame 0xfffffe0182bcb6b0
vm_map_remove() at vm_map_remove+0x81/frame 0xfffffe0182bcb6e0
pipeclose() at pipeclose+0x2e1/frame 0xfffffe0182bcb720
pipe_close() at pipe_close+0x60/frame 0xfffffe0182bcb750
_fdrop() at _fdrop+0x1a/frame 0xfffffe0182bcb770
closef() at closef+0x1ec/frame 0xfffffe0182bcb800
fdescfree_fds() at fdescfree_fds+0x8c/frame 0xfffffe0182bcb850
fdescfree() at fdescfree+0x39e/frame 0xfffffe0182bcb910
exit1() at exit1+0x4fe/frame 0xfffffe0182bcb980
sys_sys_exit() at sys_sys_exit+0xd/frame 0xfffffe0182bcb990
amd64_syscall() at amd64_syscall+0x291/frame 0xfffffe0182bcbab0

https://people.freebsd.org/~pho/stress/log/mark080.txt
Comment 1 Mark Johnston freebsd_committer 2019-02-19 01:34:34 UTC
Could you post the vmcore for this one?
Comment 2 Peter Holm freebsd_committer 2019-02-19 05:46:12 UTC
(In reply to Mark Johnston from comment #1)
Sure.
https://people.freebsd.org/~pho/kernel+vmcore.19.r344247.mercat1.txz
Comment 3 Mark Johnston freebsd_committer 2019-02-19 21:12:58 UTC
Could you please try to reproduce the panic with the assertion below?  We don't dump page table pages in minidumps, which makes this one a little harder to track down.

diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
index 94ad7d1d856a..261500adac65 100644
--- a/sys/amd64/amd64/pmap.c
+++ b/sys/amd64/amd64/pmap.c
@@ -4103,6 +4103,8 @@ pmap_demote_pde_locked(pmap_t pmap, pd_entry_t *pde, vm_offset_t va,
                KASSERT((oldpde & PG_W) == 0,
                    ("pmap_demote_pde: page table page for a wired mapping"
                    " is missing"));
+               KASSERT(mpte != NULL || pmap != kernel_pmap,
+                   ("pmap_demote_pde: missing PT page for va %#lx", va));
 
                /*
                 * Invalidate the 2MB page mapping and return "failure" if the
Comment 4 Peter Holm freebsd_committer 2019-02-20 06:08:10 UTC
(In reply to Mark Johnston from comment #3)
This does not seem right to me:

startup_alloc from "UMA Hash", 2 boot pages left
startup_alloc from "UMA Zones", 1 boot pages left
Entering uma_startup1 with 0 boot pages left
Entering uma_startup2 with 0 boot pages left
panic: pmap_demote_pde: missing PT page for va 0xfffff800000a0000
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff827238e0
vpanic() at vpanic+0x1b4/frame 0xffffffff82723940
panic() at panic+0x43/frame 0xffffffff827239a0
pmap_demote_pde_locked() at pmap_demote_pde_locked+0xd69/frame 0xffffffff82723a50
pmap_change_attr_locked() at pmap_change_attr_locked+0x37b/frame 0xffffffff82723ad0
pmap_init() at pmap_init+0x424/frame 0xffffffff82723b10
vm_mem_init() at vm_mem_init+0x60/frame 0xffffffff82723b20
mi_startup() at mi_startup+0x25f/frame 0xffffffff82723b70
btext() at btext+0x2c
KDB: enter: panic
[ thread pid 0 tid 0 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
db> x/s version
version:        FreeBSD 13.0-CURRENT #0 r344337M: Wed Feb 20 06:59:00 CET 2019\012    pho@t2.osted.lan:/usr/src/sys/amd64/compile/PHO\012
db>
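
For readers decoding the address in this panic message, here is a minimal user-space sketch. It assumes the amd64 direct map begins at 0xfffff80000000000 (check DMAP_MIN_ADDRESS in sys/amd64/include/vmparam.h for your tree); the helper names dmap_to_phys and superpage_base are hypothetical and exist only for illustration. Under that assumption, va 0xfffff800000a0000 corresponds to physical address 0xa0000 inside the first 2MB superpage of the direct map, which would fit a kernel_pmap demotion reached from pmap_init() during boot.

```c
#include <stdint.h>

/* Assumed amd64 constants; verify against your source tree. */
#define DMAP_MIN_ADDRESS 0xfffff80000000000ULL /* assumed direct map base */
#define NBPDR            (1ULL << 21)          /* bytes mapped by one 2MB PDE */

/* Hypothetical helper: physical address behind a direct-map VA. */
static uint64_t
dmap_to_phys(uint64_t va)
{
	return (va - DMAP_MIN_ADDRESS);
}

/* Hypothetical helper: base VA of the 2MB superpage containing va. */
static uint64_t
superpage_base(uint64_t va)
{
	return (va & ~(NBPDR - 1));
}
```

Calling dmap_to_phys() on the va from the panic message gives the physical address, and superpage_base() gives the start of the 2MB mapping that pmap_demote_pde_locked() was asked to demote.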
Comment 5 Mark Johnston freebsd_committer 2019-02-20 16:13:22 UTC
(In reply to Peter Holm from comment #4)
I see, sorry about that.  Is the panic reproducible in general?

Please try this one instead:

diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
index c45b4316622b..c6c640dc733d 100644
--- a/sys/amd64/amd64/pmap.c
+++ b/sys/amd64/amd64/pmap.c
@@ -4129,10 +4129,10 @@ pmap_demote_pde_locked(pmap_t pmap, pd_entry_t *pde, vm_offset_t va,
        if (mpte->wire_count == 1) {
                mpte->wire_count = NPTEPG;
                pmap_fill_ptp(firstpte, newpte);
-       }
-       KASSERT((*firstpte & PG_FRAME) == (newpte & PG_FRAME),
-           ("pmap_demote_pde: firstpte and newpte map different physical"
-           " addresses"));
+       } else
+               KASSERT((*firstpte & PG_FRAME) == (newpte & PG_FRAME),
+                   ("pmap_demote_pde: firstpte and newpte map different physical"
+                   " addresses: %#lx and %#lx", *firstpte, newpte));
 
        /*
         * If the mapping has changed attributes, update the page table
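
To make the check in the patch above easier to follow, here is a simplified user-space model of the two steps involved: filling a page table page with 512 consecutive 4KB mappings, and comparing the physical frame of the first entry with that of newpte. It is a sketch, not the kernel code. The function names fill_ptp and frames_match are hypothetical, and the PG_FRAME value is the amd64 frame mask as I understand it (bits 12 through 51).

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL
#define NPTEPG    512                   /* PTEs per page table page */
#define PG_FRAME  0x000ffffffffff000ULL /* assumed physical frame mask */

typedef uint64_t pt_entry_t;

/* Model of filling a page table page: entry i maps the i-th 4KB page. */
static void
fill_ptp(pt_entry_t *firstpte, pt_entry_t newpte)
{
	for (int i = 0; i < NPTEPG; i++) {
		firstpte[i] = newpte;
		newpte += PAGE_SIZE;
	}
}

/* The condition the KASSERT tests: do both entries name the same frame? */
static bool
frames_match(pt_entry_t firstpte, pt_entry_t newpte)
{
	return ((firstpte & PG_FRAME) == (newpte & PG_FRAME));
}
```

In this model the comparison trivially holds right after fill_ptp(), which is why the patch moves the assertion into the else branch: it is only informative when the page table page was already populated before the demotion.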
Comment 6 Peter Holm freebsd_committer 2019-02-20 16:39:39 UTC
(In reply to Mark Johnston from comment #5)
I have tried to reproduce the problem on a different host (t2) for 10 hours, without luck. Once the original host (mercat1) is free, I will try to reproduce it there.
I'll add this patch to the new hosts, just in case.
Comment 7 Peter Holm freebsd_committer 2019-02-27 11:05:15 UTC
(In reply to Peter Holm from comment #6)
I have failed to reproduce this problem. Running the same test on the same host for more than a day did not trigger any problems.

As a last attempt, I returned to r344247 and ran the same test for 3 hours, also without any problems.
Comment 8 Mark Johnston freebsd_committer 2019-02-27 15:57:44 UTC
(In reply to Peter Holm from comment #7)
It is indeed a rather strange panic, of the sort that could be caused by a bit-flip.  Please reopen the bug if you manage to hit it again.
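
If the panic does recur with the extended assertion from comment 5, the two printed values could be compared bit by bit to test the bit-flip theory. The following is a minimal sketch under the assumption that PG_FRAME covers bits 12 through 51 on amd64; frame_bits_differing is a hypothetical helper, not a kernel function. A result of exactly 1 would be consistent with a single flipped bit in the frame portion of one entry, though it would not prove it.

```c
#include <stdint.h>

#define PG_FRAME 0x000ffffffffff000ULL /* assumed physical frame mask */

/*
 * Hypothetical helper: count the bits in which the physical frame
 * portions of two page table entries differ.
 */
static int
frame_bits_differing(uint64_t firstpte, uint64_t newpte)
{
	uint64_t diff = (firstpte ^ newpte) & PG_FRAME;
	int n = 0;

	while (diff != 0) {
		diff &= diff - 1;	/* clear the lowest set bit */
		n++;
	}
	return (n);
}
```

Differences outside PG_FRAME, such as the accessed or modified flags, are masked off and do not count, mirroring what the KASSERT itself compares.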