| Summary: | Auditdistd does not recover from TLS errors and just stops | ||
|---|---|---|---|
| Product: | Base System | Reporter: | Peter Wemm <peter> |
| Component: | bin | Assignee: | freebsd-bugs (Nobody) <bugs> |
| Status: | Closed FIXED | ||
| Severity: | Affects Only Me | CC: | brueffer, emaste, gonzo, pjd, rwatson |
| Priority: | --- | ||
| Version: | 10.2-STABLE | ||
| Hardware: | Any | ||
| OS: | Any | ||
Description
Peter Wemm
2015-07-28 23:20:44 UTC
For what it's worth, we still see this every now and then.

Just to cross reference the two sets of bug reports:
https://github.com/openbsm/openbsm/issues/3
https://github.com/openbsm/openbsm/issues/2
Filed by brueffer.

A commit references this bug:

Author: pjd
Date: Thu Oct 4 05:54:58 UTC 2018
New revision: 339177
URL: https://svnweb.freebsd.org/changeset/base/339177

Log:
When the adist_free list is empty and we lose connection to the receiver we move all elements from the adist_send and adist_recv lists back onto the adist_free list, but we don't wake consumers waiting for the adist_free list to become non-empty. This can lead to the sender process stopping audit trail files distribution and waiting forever.

Fix the problem by adding the missing wakeup.

While here slow down spinning on CPU in case of a short race in sender_disconnect() and add an explanation when it can occur.

PR: 201953
Reported by: peter
Approved by: re (kib)

Changes:
head/contrib/openbsm/bin/auditdistd/auditdistd.h
head/contrib/openbsm/bin/auditdistd/sender.c

There is a commit referencing this PR, but it's still not closed and has been inactive for some time. Closing the PR as fixed but feel free to re-open it if the issue hasn't been completely resolved. Thanks
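
For readers unfamiliar with this class of bug, the missed wakeup described in the commit log for r339177 can be illustrated with a minimal pthreads sketch. This is not the code from sender.c: `struct pool`, `pool_take_free()`, `pool_disconnect()` and `pool_demo()` are hypothetical names, and the three lists are reduced to plain counters. It only shows the general pattern of refilling a free list on disconnect and waking any consumer blocked on it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the adist_free, adist_send and adist_recv
 * lists: each list is modelled as a count of the buffers it holds.
 */
struct pool {
	pthread_mutex_t	lock;
	pthread_cond_t	free_nonempty;	/* signalled when nfree becomes > 0 */
	size_t		nfree, nsend, nrecv;
};

static void
pool_init(struct pool *p, size_t nfree, size_t nsend, size_t nrecv)
{
	pthread_mutex_init(&p->lock, NULL);
	pthread_cond_init(&p->free_nonempty, NULL);
	p->nfree = nfree;
	p->nsend = nsend;
	p->nrecv = nrecv;
}

/* Consumer side: block until a free buffer exists, then take one. */
static void
pool_take_free(struct pool *p)
{
	pthread_mutex_lock(&p->lock);
	while (p->nfree == 0)
		pthread_cond_wait(&p->free_nonempty, &p->lock);
	assert(p->nfree > 0);
	p->nfree--;
	pthread_mutex_unlock(&p->lock);
}

/* Disconnect path: return every in-flight buffer to the free list. */
static void
pool_disconnect(struct pool *p)
{
	int was_empty;

	pthread_mutex_lock(&p->lock);
	was_empty = (p->nfree == 0);
	p->nfree += p->nsend + p->nrecv;
	p->nsend = p->nrecv = 0;
	/*
	 * Without this wakeup a consumer already sleeping in
	 * pool_take_free() would never notice the refilled list.
	 */
	if (was_empty && p->nfree > 0)
		pthread_cond_broadcast(&p->free_nonempty);
	pthread_mutex_unlock(&p->lock);
}

static void *
consumer(void *arg)
{
	pool_take_free(arg);
	return (NULL);
}

/* Starts with an empty free list and returns the free count at the end. */
static size_t
pool_demo(void)
{
	struct pool p;
	pthread_t tid;

	pool_init(&p, 0, 2, 1);
	pthread_create(&tid, NULL, consumer, &p);
	pool_disconnect(&p);
	pthread_join(tid, NULL);
	return (p.nfree);
}
```

In this sketch `pool_demo()` finishes whichever thread runs first, because the consumer rechecks the counter under the lock and the disconnect path broadcasts after refilling. If the `pthread_cond_broadcast()` call is removed, a consumer that went to sleep before the disconnect stays blocked and `pthread_join()` never returns, which matches the "waiting forever" symptom in the commit log.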