Summary: | Kernel panic in vm_reserv_populate() | |
---|---|---|---
Product: | Base System | Reporter: | Ivan Kosarev <ivan>
Component: | kern | Assignee: | freebsd-bugs (Nobody) <bugs>
Status: | Closed FIXED | |
Severity: | Affects Some People | CC: | emaste, gonzo, ivan, kib, op
Priority: | --- | |
Version: | CURRENT | |
Hardware: | amd64 | |
OS: | Any | |
Created attachment 153662 [details]
Core dumped
Worth mentioning that this bug prevents Clang's MemorySanitizer (MSan) from working on FreeBSD 11.

I tried to reproduce this on real hardware, a Sandy Bridge machine, and was not able to. The program was run a dozen times without triggering the issue. It is curious how limited the reported CPU features are compared to the CPU herald string. In particular, popcnt support is not claimed, and the amd64 pmap uses popcnt when it is available.

Created attachment 153727 [details]
picture from stack-trace

Try running this program more than once (about 8-10 runs). I'm able to reproduce this error on a Haswell-based system, but there is no core dump, only a stack trace.

Created attachment 153772 [details]
VirtualBox config used
I've attached the VirtualBox config used. I can reproduce the defect on a clean, just-installed system, r277486 specifically.

A commit references this bug:

Author: alc
Date: Thu Mar 19 01:40:44 UTC 2015
New revision: 280238
URL: https://svnweb.freebsd.org/changeset/base/280238

Log:
  Fix the root cause of the "vm_reserv_populate: reserv <address> is already promoted" panics.

  The sequence of events that leads to a panic is rather long and circuitous. First, suppose that process P has a promoted superpage S within vm object O that it can write to. Then, suppose that P forks, which leads to S being write protected. Now, before P's child exits, suppose that P writes to another virtual page within O. Since the pages within O are copy on write, a shadow object for O is created to house the new physical copy of the faulted-on virtual page. Then, before P can fault on S, P's child exits. Now, when P faults on S, it will follow the "optimized" path for copy-on-write faults in vm_fault(), wherein the underlying physical page is moved from O to its shadow object rather than allocating a new page and copying the new page's contents from the old page. Moreover, suppose that every 4 KB physical page making up S is moved to the shadow object in this way. However, the optimized path does not move the underlying superpage reservation, which is the root cause of the panics! Ultimately, P performs vm_object_collapse() on O's shadow object, which destroys O and in doing so breaks any reservations still belonging to O. This leaves the reservation underlying S in an inconsistent state: It's simultaneously not in use and promoted. Breaking a reservation does not demote it because I never intended for a promoted reservation to be broken. It makes little sense. Finally, this inconsistency leads to an assertion failure the next time that the reservation is used.

  The failing assertion does not (currently) exist in FreeBSD 10.x or earlier. There, we will quietly break the promoted reservation. While illogical and unintended, breaking the reservation is essentially harmless.

  PR: 198163
  Reviewed by: kib
  Tested by: pho
  X-MFC after: r267213
  Sponsored by: EMC / Isilon Storage Division

Changes:
  head/sys/vm/vm_fault.c

A commit references this bug:

Author: alc
Date: Thu Apr 2 19:10:34 UTC 2015
New revision: 281001
URL: https://svnweb.freebsd.org/changeset/base/281001

Log:
  MFC r280238
    Fix the root cause of the "vm_reserv_populate: reserv <address> is already promoted" panics.

  PR: 198163

Changes:
_U  stable/10/
  stable/10/sys/vm/vm_fault.c

There is a commit referencing this PR, but the PR is still not closed and has been inactive for some time. Closing the PR as fixed, but feel free to re-open it if the issue hasn't been completely resolved. Thanks
Created attachment 153661 [details]
The test source.

See the attachment for the minimized source.

How to reproduce:

$ clang fork.cc -o fork.cc.tmp
$ ./fork.cc.tmp
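
The attached fork.cc is not shown in this report. Purely as an illustration, the user-space side of the sequence described in the r280238 commit log might look roughly like the sketch below. Everything in it is an assumption made for the example: the function name run_cow_sequence(), the 2 MB superpage and 4 KB page sizes, and the pipe used to order the parent's write before the child's exit. User space cannot force the kernel to promote S, to take the "optimized" copy-on-write path, or to collapse the shadow object, so this is not claimed to reproduce the panic. On a kernel without the fix, run it only in a disposable VM.

```cpp
// Hypothetical sketch, NOT the attached fork.cc. Names and sizes are assumptions.
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed amd64 sizes: 2 MB superpage, 4 KB base page.
static const std::size_t kSuperpage = 2UL * 1024 * 1024;
static const std::size_t kPage = 4096;

// Walks through the user-space steps; returns 0 on success, -1 on failure.
int run_cow_sequence() {
  // Anonymous private mapping standing in for vm object O.
  const std::size_t len = 3 * kSuperpage;
  void *mem = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
  if (mem == MAP_FAILED) return -1;
  char *base = static_cast<char *>(mem);

  // S: the first superpage-aligned 2 MB range inside the mapping.
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(base);
  const std::uintptr_t mask = kSuperpage - 1;
  const std::uintptr_t aligned = (addr + mask) & ~mask;
  // volatile so the compiler keeps every store that is meant to fault.
  volatile char *s = base + (aligned - addr);
  volatile char *other = s + kSuperpage;  // a page in O outside S
  assert(other + kPage <= base + len);

  // Touch every 4 KB page of S so the kernel has a chance to promote it,
  // and make the other page resident as well.
  for (std::size_t off = 0; off < kSuperpage; off += kPage) s[off] = 1;
  other[0] = 1;

  int fds[2];
  if (pipe(fds) != 0) { munmap(mem, len); return -1; }

  pid_t child = fork();  // the private pages become copy on write
  if (child < 0) {
    close(fds[0]);
    close(fds[1]);
    munmap(mem, len);
    return -1;
  }
  if (child == 0) {
    // Child: stay alive until the parent closes its end of the pipe.
    close(fds[1]);
    char c;
    if (read(fds[0], &c, 1) < 0) _exit(1);
    _exit(0);
  }
  close(fds[0]);

  // Parent writes to another page within O while the child is still alive.
  other[0] = 2;

  // Let the child exit before the parent faults on S.
  close(fds[1]);
  int status = 0;
  if (waitpid(child, &status, 0) != child) { munmap(mem, len); return -1; }

  // Parent now write-faults on every 4 KB page of S.
  for (std::size_t off = 0; off < kSuperpage; off += kPage) s[off] = 3;

  const bool ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
  munmap(mem, len);
  return ok ? 0 : -1;
}
```

To try it, add a main() that calls run_cow_sequence() and build the file with a C++ compiler. A commenter above reported needing about 8-10 runs of the attached program, so a single pass of this sketch should not be expected to show anything.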