Summary: | Kernel trap #9 in sys_semop | ||
---|---|---|---|
Product: | Base System | Reporter: | Olef <o.vandestadt> |
Component: | kern | Assignee: | Konstantin Belousov <kib> |
Status: | Closed FIXED | ||
Severity: | Affects Only Me | CC: | emaste, kib, markj |
Priority: | --- | Keywords: | crash |
Version: | 12.1-RELEASE | ||
Hardware: | amd64 | ||
OS: | Any | ||
Description
Olef
2020-10-15 08:56:48 UTC
Can you provide a minimal reproducer for the issue?

Hi, I've tried to create something that replicates the behavior, but unfortunately I don't have any luck with it... What happens internally is that a main process forks off 50-ish smaller processes that need to do system maintenance on 50 database files, and when attaching to the shared memory this segfault occurs.

Without a reproducer I cannot say anything. Perhaps try on 12.2, there was a fix that might be relevant, r358242 MFC of r357984.

Fair enough. Is there a way to find out what the calling process was actually doing to cause this? kgdb only gives me the kernel fault, but doesn't give me anything on the state of the calling process.

You can try to do something with e.g. ktrace, but this would be hard because the system panics and the records are not written. A sync NFS mount from another machine might help, but I do not expect it. So are you able to reproduce it at will, even with a complex scenario? Try 12.2 or HEAD, you can install only the kernel. Perhaps enable INVARIANTS when doing so.

I can reproduce it with relative ease, I'm on 7 vmcore files so far. Updating now to 12.2-RC2.

I suspect I figured it out, please try the patch from https://reviews.freebsd.org/D26826

That said, I am curious why you need to adjust semume.

Hi, Thanks, though I did not receive the panic this morning after upgrading to 12.2, will check again tomorrow. If this fault still persists I'll patch in your suggestion. I needed to increase SEMUME as some processes were complaining they could not semget (EINVAL).

(In reply to Olef from comment #8) I am quite sure that the issue I described in the review is there, and since it is a memory corruption kind of bug, when and how it manifests itself is quite specific to the kernel, machine, and load. I suggest you add the patch to your kernel and try the procedure that caused the panic, manually, several times.

I will. Would it also manifest itself in 12.2, or shall I create a new VM?
(In reply to Olef from comment #10) The issue that the patch fixes is in HEAD, stable/12, and all 12.x releases. But since it is memory corruption, the specific manifestation of it can be arbitrary, for instance you might get data corruption instead of a panic. Yes, you can test with a 12.2 VM.

Hi, So, in 12.2 RC2 I indeed still got the kernel panics after the initial upgrade. After patching the kernel I've not received this anymore in the last 3 days, so all seems to work fine. Thanks for your help! PS: Would increasing SEMUSZ also have done the trick?

(In reply to Olef from comment #12) Yes, increasing kern.ipc.semusz would also help, but you need to carefully calculate how large to set it. For instance, it is arch-dependent.

A commit references this bug:

Author: kib
Date: Thu Oct 22 09:28:12 UTC 2020
New revision: 366932
URL: https://svnweb.freebsd.org/changeset/base/366932

Log:
  sysv_sem: semusz depends on semume.

  Size of the per-process semaphore undo structure (semusz) depends on the
  number of the per-process undos. If kern.ipc.semume is adjusted, semusz
  must be adjusted as well, and it makes no sense to delegate adjustment
  to user. Make it automatic.

  Reported and tested by: Olef <o.vandestadt@gmail.com>
  PR: 250361
  Reviewed by: jhb, markj
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D26826

Changes:
  head/sys/kern/sysv_sem.c

A commit references this bug:

Author: kib
Date: Thu Oct 29 11:09:48 UTC 2020
New revision: 367128
URL: https://svnweb.freebsd.org/changeset/base/367128

Log:
  MFC r366932:
  sysv_sem: semusz depends on semume.

  PR: 250361

Changes:
  _U stable/12/
  stable/12/sys/kern/sysv_sem.c

A commit references this bug:

Author: kib
Date: Thu Oct 29 11:19:48 UTC 2020
New revision: 367129
URL: https://svnweb.freebsd.org/changeset/base/367129

Log:
  MFC r366932:
  sysv_sem: semusz depends on semume.

  PR: 250361

Changes:
  _U stable/11/
  stable/11/sys/kern/sysv_sem.c

FYI, I was able to reproduce this problem on an Intel NUC; applying the patch solved the problem.
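
For readers wondering why semusz has to follow semume and why the right value is arch-dependent, here is a rough userland sketch of the arithmetic. It is not the kernel code: the structure layout, the field names, and the helper `semusz()` are assumptions made for illustration, modeled only on the commit log's statement that the per-process undo structure holds one entry per allowed undo. The size is taken as the offset of the undo array plus `semume` entries, rounded up to the size of a `long`.

```python
import ctypes


class Undo(ctypes.Structure):
    # Hypothetical single undo entry (assumed fields, not copied from the kernel).
    _fields_ = [
        ("un_adjval", ctypes.c_short),   # adjust-on-exit value
        ("un_num", ctypes.c_short),      # semaphore number
        ("un_id", ctypes.c_int),         # semaphore set id
        ("un_seq", ctypes.c_ushort),     # sequence number
    ]


class SemUndo(ctypes.Structure):
    # Hypothetical per-process undo structure: a header followed by the
    # undo array. Pointer-sized fields make the layout arch-dependent.
    _fields_ = [
        ("le_next", ctypes.c_void_p),    # list linkage (assumed)
        ("le_prev", ctypes.c_void_p),
        ("un_proc", ctypes.c_void_p),    # owning process (assumed)
        ("un_cnt", ctypes.c_short),      # number of active entries
        ("un_ent", Undo * 1),            # first element of the undo array
    ]


def roundup(n, align):
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


def semusz(semume):
    """Bytes needed for one undo structure holding `semume` entries."""
    raw = SemUndo.un_ent.offset + semume * ctypes.sizeof(Undo)
    return roundup(raw, ctypes.sizeof(ctypes.c_long))


if __name__ == "__main__":
    # Raising semume without raising semusz leaves the structure too small.
    for n in (10, 50, 100):
        print(f"semume={n:4d} -> semusz={semusz(n)} bytes")
```

Under these assumptions, raising semume while leaving semusz at the size computed for the old value means the kernel indexes undo entries past the end of each per-process structure, which matches the memory corruption described in the thread. The pointer and `long` sizes in the calculation are what make a hand-picked semusz differ between architectures.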