I'm not sure why, but we are routinely getting an auditidistd failure in the freebsd.org cluster. A couple of log samples and data points.. root@ref11-amd64:~ # ls -l /var/audit/dist/ total 262049 -r--r----- 2 root audit 10535025 May 11 23:00 20150511225958.20150511230009 -r--r----- 2 root audit 11280061 May 11 23:23 20150511232340.20150511232357 -r--r----- 2 root audit 10521949 May 11 23:24 20150511232417.20150511232437 -r--r----- 2 root audit 15688526 May 11 23:25 20150511232533.20150511232553 -r--r----- 2 root audit 15477527 May 11 23:27 20150511232646.20150511232704 -r--r----- 2 root audit 13751547 May 11 23:27 20150511232741.20150511232753 -r--r----- 2 root audit 9764920 May 11 23:28 20150511232840.20150511232849 -r--r----- 2 root audit 10504847 May 11 23:29 20150511232904.20150511232912 -r--r----- 2 root audit 9942299 May 11 23:30 20150511233004.20150511233010 -r--r----- 2 root audit 23953683 May 11 23:30 20150511233010.20150511233025 -r--r----- 2 root audit 10894790 May 11 23:30 20150511233031.20150511233038 -r--r----- 2 root audit 7714981 May 11 23:31 20150511233056.20150511233101 -r--r----- 2 root audit 7938553 May 11 23:31 20150511233138.20150511233145 -r--r----- 2 root audit 12498960 May 11 23:31 20150511233145.20150511233156 -r--r----- 2 root audit 10511961 May 11 23:33 20150511233258.20150511233308 -r--r----- 2 root audit 11620541 May 11 23:33 20150511233308.20150511233319 -r--r----- 2 root audit 10507759 May 11 23:33 20150511233343.20150511233352 -r--r----- 2 root audit 12708772 May 11 23:35 20150511233446.20150511233500 -r--r----- 2 root audit 10542898 May 11 23:35 20150511233524.20150511233538 -r--r----- 2 root audit 10564417 May 11 23:36 20150511233641.20150511233655 -r--r----- 2 root audit 10499802 May 11 23:37 20150511233728.20150511233739 -r--r----- 2 root audit 10571378 May 11 23:38 20150511233809.20150511233818 -r--r----- 2 root audit 8769659 May 12 02:59 20150512011254.not_terminated root@ref11-amd64:~ # grep 20150511233809.20150511233818 /var/log/messages root@ref11-amd64:~ # bzgrep 20150511233809.20150511233818 /var/log/messages* /var/log/messages.0.bz2:May 11 23:38:24 ref11-amd64 auditd[797]: renamed /var/audit/20150511233809.not_terminated to /var/audit/20150511233809.20150511233818 /var/log/messages.0.bz2:May 11 23:38:24 ref11-amd64 root: audit warning: closefile /var/audit/20150511233809.20150511233818 root@ref11-amd64:~ # bzgrep 20150511233809.20150511233818 /var/log/messages* /var/log/messages.0.bz2:May 11 23:38:24 ref11-amd64 auditd[797]: renamed /var/audit/20150511233809.not_terminated to /var/audit/20150511233809.20150511233818 /var/log/messages.0.bz2:May 11 23:38:24 ref11-amd64 root: audit warning: closefile /var/audit/20150511233809.20150511233818 root@ref11-amd64:~ # bzgrep auditdist /var/log/messages* /var/log/messages.0.bz2:May 11 23:00:16 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511225958.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:24:15 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511232340.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:24:51 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511232417.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:26:10 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511232533.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:27:18 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511232646.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:28:07 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511232741.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:29:02 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511232840.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:29:35 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511232904.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:30:22 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511233004.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:30:28 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511233010.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:30:52 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511233031.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:31:08 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511233056.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:31:54 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511233138.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:32:00 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511233145.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:33:18 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511233258.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:33:24 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511233308.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:33:58 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511233343.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:35:08 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511233446.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:35:49 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511233524.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:37:02 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511233641.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:37:44 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511233728.not_terminated", skipping: No such file or directory. /var/log/messages.0.bz2:May 11 23:38:24 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511233809.not_terminated", skipping: No such file or directory. This is the most telling: /var/log/messages.0.bz2:May 11 23:00:05 ref11-amd64 auditd[797]: renamed /var/audit/20150511225947.not_terminated to /var/audit/20150511225947.20150511225958 /var/log/messages.0.bz2:May 11 23:00:05 ref11-amd64 auditd[797]: New audit file is /var/audit/20150511225958.not_terminated /var/log/messages.0.bz2:May 11 23:00:05 ref11-amd64 root: audit warning: closefile /var/audit/20150511225947.20150511225958 /var/log/messages.0.bz2:May 11 23:00:15 ref11-amd64 auditd[797]: renamed /var/audit/20150511225958.not_terminated to /var/audit/20150511225958.20150511230009 /var/log/messages.0.bz2:May 11 23:00:16 ref11-amd64 root: audit warning: closefile /var/audit/20150511225958.20150511230009 /var/log/messages.0.bz2:May 11 23:00:16 ref11-amd64 auditdistd[84495]: [audit.ysv.FreeBSD.org] (sender) Unable to open file "/var/audit/dist/20150511225958.not_terminated", skipping: No such file or directory. It looks like when it loses a rename race (under high loads), it ignores that audit chunk forever.
Is this still happening?
A commit references this bug: Author: pjd Date: Thu Oct 4 05:48:10 UTC 2018 New revision: 339176 URL: https://svnweb.freebsd.org/changeset/base/339176 Log: When we look for a new trail file there might be a race between find trail file name and opening it. This race was not properly handled, because we were copying new name before checking for openat(2) error and when we were trying again we were starting with the next trail file. This could result in skipping distribution of such a trail file. Fix this problem by checking for ENOENT first (only for .not_terminated files) and then updating (or not) tr_filename before restarting the search. PR: 200139 Reported by: peter Approved by: re (kib) Changes: head/contrib/openbsm/bin/auditdistd/trail.c
There is a commit referencing this PR, but it's still not closed and has been inactive for some time. Closing the PR as fixed but feel free to re-open it if the issue hasn't been completely resolved. Thanks