Tracing from crashing xfdesktop, I found that the following program crashes in seconds (in environment, set LANG=zh_CN.UTF-8): #include <nl_types.h> #include <stdio.h> #include <errno.h> #include <string.h> #include <locale.h> #include <pthread.h> void* work(void *arg) { while (1) { nl_catd catd = catopen("libc", NL_CAT_LOCALE); catgets(catd, 1, 2, "No such file or directory"); catclose(catd); } } int main() { setlocale(LC_MESSAGES, ""); pthread_t thr1, thr2, thr3, thr4; pthread_create(&thr1, NULL, &work, NULL); pthread_create(&thr2, NULL, &work, NULL); pthread_create(&thr3, NULL, &work, NULL); pthread_create(&thr4, NULL, &work, NULL); pthread_join(thr1, NULL); pthread_join(thr2, NULL); pthread_join(thr3, NULL); pthread_join(thr4, NULL); } It always crashes somewhere in catgets: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 801016000 (LWP 101899/test)] catgets (catd=0x801615170, set_id=1, msg_id=2, s=0x400a9b "No such file or directory") at /usr/src/lib/libc/nls/msgcat.c:280 280 u = ntohl((u_int32_t)cat_hdr->__nsets) - 1; and it looks like that catd is freed: (gdb) p *catd $2 = {__data = 0x5a5a5a5a5a5a5a5a, __size = 1515870810} After a closer look, it looks like that the increments to np->refcount is racy. See https://svnweb.freebsd.org/base/head/lib/libc/nls/msgcat.c?annotate=278530. It is only protected by a read lock, so multiple threads may change the refcount at the same time, thus it is a race condition and may corrupt the refcount field. Proposed fix: Index: lib/libc/nls/msgcat.c =================================================================== --- lib/libc/nls/msgcat.c (版本 287028) +++ lib/libc/nls/msgcat.c (工作副本) @@ -141,7 +141,7 @@ } /* Try to get it from the cache first */ - RLOCK(NLERR); + WLOCK(NLERR); SLIST_FOREACH(np, &cache, list) { if ((strcmp(np->name, name) == 0) && ((lang != NULL && np->lang != NULL && @@ -376,7 +376,7 @@ * One more try in cache; if it was not found by name, * it might still be found by absolute path. */ - RLOCK(NLERR); + WLOCK(NLERR); SLIST_FOREACH(np, &cache, list) { if ((np->path != NULL) && (strcmp(np->path, path) == 0)) { np->refcount++; This patch fixes the case in catopen(). This also changes the access in load_msgcat(), which also seems to be incorrect.
I can reproduce this on -CURRENT with LANG=zh_CN.UTF-8 (but not with my default en_CA.UTF-8) and the analysis seems sound. Note that the proposed patch changes the only uses of RLOCK to WLOCK so this change is in effect removing r/w locking.
A commit references this bug: Author: delphij Date: Thu Mar 19 06:33:06 UTC 2020 New revision: 359118 URL: https://svnweb.freebsd.org/changeset/base/359118 Log: Fix race condition in catopen(3). The current code uses a rwlock to protect the cached list, which in turn holds a list of catentry objects, and increments reference count while holding only read lock. Fix this by converting the reference counter to use atomic operations. While I'm there, also perform some clean ups around memory operations. PR: 202636 Reported by: Henry Hu <henry.hu.sh@gmail.com> Reviewed by: markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D24095 Changes: head/lib/libc/nls/msgcat.c