Since 13.0-RELEASE, long-running idle TCP connections are silently terminated. So the endpoints run into transmission timeouts later on, and bad things happen.
It seems that net.inet.ip.fw.dyn_keepalive no longer works.
Due to the nature of the problem (_long running_ and _idle_), it's hard to pin down. Further, I have the following rule for egress packets:
allow ip from any to any out keep-state
So if ipfw really forgets a connection, the state is recreated automatically as long as the next packet is egress.
A simple verification is to compare the TCP connections known to the TCP stack with the states known to ipfw, e.g.:
# sockstat -6 | grep 636
postfix smtpd 96270 19 tcp6 XXX:15743 YYY:636
postfix smtpd 95918 19 tcp6 XXX:41957 YYY:636
root saslauthd 91056 8 tcp6 XXX:43828 YYY:636
root saslauthd 91055 8 tcp6 XXX:43830 YYY:636
root saslauthd 91054 8 tcp6 XXX:43826 YYY:636
root saslauthd 91053 8 tcp6 XXX:43825 YYY:636
root saslauthd 91052 8 tcp6 XXX:17216 YYY:636
# ipfw -d show | grep 636
61005 81 26938 (169s) STATE tcp XXX::13 41957 <-> YYY::10 636 :default
61005 58 19762 (113s) STATE tcp XXX::13 15743 <-> YYY::10 636 :default
As you can see, the ipfw states for all saslauthd connections are gone.
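The manual comparison above can be scripted. The following is only a rough sketch, not a tested tool: it assumes the column layout seen in the two listings above (local and foreign address in the 6th and 7th sockstat columns, and ports placed around "<->" in the ipfw output). The function names and the sample data are made up for illustration.

```python
# Sketch: report local ports of TCP connections that have no ipfw dynamic state.
# Assumption: input looks like the sockstat and "ipfw -d show" listings above.

def ports_from_sockstat(text, remote_port):
    """Local ports of TCP sockets whose foreign address ends in :remote_port."""
    ports = set()
    for line in text.splitlines():
        fields = line.split()
        # Expected columns: USER COMMAND PID FD PROTO LOCAL FOREIGN
        if len(fields) < 7 or not fields[4].startswith("tcp"):
            continue
        local, foreign = fields[5], fields[6]
        # The port is whatever follows the last ':' (also true for IPv6).
        if foreign.rsplit(":", 1)[-1] == str(remote_port):
            ports.add(int(local.rsplit(":", 1)[-1]))
    return ports


def ports_from_ipfw(text, remote_port):
    """Peer ports of dynamic states that involve remote_port on either side."""
    ports = set()
    for line in text.splitlines():
        fields = line.split()
        if "<->" not in fields:
            continue
        i = fields.index("<->")
        if i < 1 or i + 2 >= len(fields):
            continue
        src_port, dst_port = fields[i - 1], fields[i + 2]
        if dst_port == str(remote_port) and src_port.isdigit():
            ports.add(int(src_port))
        elif src_port == str(remote_port) and dst_port.isdigit():
            ports.add(int(dst_port))
    return ports


def missing_states(sockstat_text, ipfw_text, remote_port):
    """Sorted local ports that sockstat knows about but ipfw does not."""
    known = ports_from_ipfw(ipfw_text, remote_port)
    return sorted(ports_from_sockstat(sockstat_text, remote_port) - known)


# Shortened sample data in the same (anonymized) shape as the listings above.
SOCKSTAT_SAMPLE = """\
postfix smtpd 96270 19 tcp6 XXX:15743 YYY:636
root saslauthd 91056 8 tcp6 XXX:43828 YYY:636
"""
IPFW_SAMPLE = """\
61005 58 19762 (113s) STATE tcp XXX::13 15743 <-> YYY::10 636 :default
"""

print(missing_states(SOCKSTAT_SAMPLE, IPFW_SAMPLE, 636))
```

On a live system the two text arguments would come from the output of `sockstat -6` and `ipfw -d show`. Run periodically, such a check could help catch the moment a state disappears while the socket is still open.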
Normally, ipfw should send keepalive packets a few seconds before removing a dynamic state, but a packet capture showed no keepalives for the affected connections.
Seems to be a duplicate of bug 253476
*** This bug has been marked as a duplicate of bug 253476 ***