Created attachment 200557
fight race with mutex

Hi!

lib/libnetgraph/msg.c defines a static int gMsgId and the public functions NgSendMsg() and NgSendAsciiMsg(), both of which increment gMsgId in a racy way in an attempt to produce a unique ID for each request sent over an AF_NETGRAPH socket.

In a long-lived multi-threaded application such as the net/mpd5 daemon, one thread can increment gMsgId up to INT_MAX, and a moment later another thread can increment it again so that it becomes negative. The value is then copied to the unsigned msg.header.token and returned as a signed integer. The caller therefore sees a false error status with errno==0, which breaks the daemon's workflow. I hit this problem in the wild from time to time.

I have a very straightforward and naive patch (attached) that protects the variable with a simple mutex, but it carries a performance penalty. Using atomic operations should be a better approach, but I'm not familiar with FreeBSD atomic operations.

Any help will be appreciated.
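For illustration, here is a minimal sketch of the mutex approach described in this report. It is not the attached patch: the names msg_id, msg_id_lock and next_token_locked() are hypothetical stand-ins, and the 1..INT_MAX range is an assumption based on the wrap-around behaviour described above.

```c
#include <limits.h>
#include <pthread.h>

/* Hypothetical stand-ins for the static counter in lib/libnetgraph/msg.c. */
static int msg_id;
static pthread_mutex_t msg_id_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Return the next token in the range 1..INT_MAX.  The increment, the
 * wrap check and the read all happen under one lock, so no thread can
 * observe the counter in a negative state.
 */
static int
next_token_locked(void)
{
	int token;

	pthread_mutex_lock(&msg_id_lock);
	if (msg_id == INT_MAX)
		msg_id = 1;	/* wrap before the signed overflow */
	else
		msg_id++;
	token = msg_id;
	pthread_mutex_unlock(&msg_id_lock);
	return (token);
}
```

Every sender pays for a lock and unlock per message, which is the performance penalty mentioned above.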
Created attachment 200960
atomically increment gMsgId

The attached patch uses C11 atomics in an attempt to fix the problem. It even compiles and appears to generate the intended code with gcc 4.2.
(In reply to Mark Johnston from comment #1) Note, the patch permits a msgid of 0 instead of starting from 1. The netgraph(3) man page says that tokens only need to be non-negative, so I think this is fine, but I did not test it.
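A minimal sketch of how a C11 atomic counter can yield non-negative tokens starting from 0, as described in the two comments above. This is an illustration, not the attached patch: the names msg_id and next_token_atomic() are hypothetical, and masking with INT_MAX is one assumed way of keeping the result non-negative.

```c
#include <limits.h>
#include <stdatomic.h>

/*
 * Hypothetical stand-in for gMsgId.  The counter is unsigned so that
 * wrap-around is well defined.
 */
static _Atomic unsigned int msg_id;

/*
 * Atomically fetch and increment the counter, then mask off the top
 * bit so the result always fits in a non-negative int.  The first
 * value returned is 0.
 */
static int
next_token_atomic(void)
{
	return ((int)(atomic_fetch_add(&msg_id, 1) & INT_MAX));
}
```

Because the fetch and the increment are a single atomic operation, two threads cannot obtain the same counter value, and no lock is needed.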
Ping? Did you try testing the patch?
(In reply to Mark Johnston from comment #3) Yes, thanks. Due to the nature of the race it's hard to be exact, but at least the change does not make things worse, and the problem has not recurred so far with the patch applied.
(In reply to Eugene Grosbein from comment #4) Thanks. I will commit the change then, with a long MFC timeout.
A commit references this bug:

Author: markj
Date: Fri May 10 16:43:47 UTC 2019
New revision: 347439
URL: https://svnweb.freebsd.org/changeset/base/347439

Log:
  Atomically update the global gMsgId in libnetgraph.

  Otherwise concurrently running threads may inadvertently use the same
  token for different messages.

  Preserve the behaviour of disallowing negative message tokens, but allow
  a message token value of zero since this simplifies the code a bit and
  tokens are documented to be non-negative.

  PR: 234442
  Reported and tested by: eugen
  MFC after: 1 month
  Sponsored by: The FreeBSD Foundation

Changes:
  head/lib/libnetgraph/msg.c
A commit references this bug:

Author: markj
Date: Sun Jun 9 03:31:08 UTC 2019
New revision: 348827
URL: https://svnweb.freebsd.org/changeset/base/348827

Log:
  MFC r347439:
  Atomically update the global gMsgId in libnetgraph.

  PR: 234442

Changes:
_U  stable/12/
  stable/12/lib/libnetgraph/msg.c
I don't have the means to test this on stable/11, but please feel free to MFC the change there if you can verify that it works.
A commit references this bug:

Author: eugen
Date: Fri Oct 11 18:05:06 UTC 2019
New revision: 353445
URL: https://svnweb.freebsd.org/changeset/base/353445

Log:
  MFC r347439 by markj:
  Atomically update the global gMsgId in libnetgraph.

  Otherwise concurrently running threads may inadvertently use the same
  token for different messages.

  Preserve the behaviour of disallowing negative message tokens, but allow
  a message token value of zero since this simplifies the code a bit and
  tokens are documented to be non-negative.

  PR: 234442

Changes:
_U  stable/11/
  stable/11/lib/libnetgraph/msg.c
^Triage: Track MFC's