Bug 202636

Summary: race in lib/libc/nls/msgcat.c
Product: Base System Reporter: Henry Hu <henry.hu.sh>
Component: threadsAssignee: Xin LI <delphij>
Status: Closed FIXED    
Severity: Affects Only Me CC: delphij, emaste, gabor
Priority: --- Keywords: patch
Version: CURRENT   
Hardware: Any   
OS: Any   

Description Henry Hu 2015-08-25 03:21:04 UTC
Tracing from crashing xfdesktop, I found that the following program crashes in seconds (in environment, set LANG=zh_CN.UTF-8):

#include <nl_types.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <locale.h>
#include <pthread.h>

void* work(void *arg) {
    while (1) {
        nl_catd catd = catopen("libc", NL_CAT_LOCALE);
        catgets(catd, 1, 2, "No such file or directory");
        catclose(catd);
    }
}

int main() {
    setlocale(LC_MESSAGES, "");
    pthread_t thr1, thr2, thr3, thr4;
    pthread_create(&thr1, NULL, &work, NULL);
    pthread_create(&thr2, NULL, &work, NULL);
    pthread_create(&thr3, NULL, &work, NULL);
    pthread_create(&thr4, NULL, &work, NULL);
    pthread_join(thr1, NULL);
    pthread_join(thr2, NULL);
    pthread_join(thr3, NULL);
    pthread_join(thr4, NULL);
}

It always crashes somewhere in catgets:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 801016000 (LWP 101899/test)]
catgets (catd=0x801615170, set_id=1, msg_id=2, s=0x400a9b "No such file or directory") at /usr/src/lib/libc/nls/msgcat.c:280
280             u = ntohl((u_int32_t)cat_hdr->__nsets) - 1;

and it looks like that catd is freed:
(gdb) p *catd
$2 = {__data = 0x5a5a5a5a5a5a5a5a, __size = 1515870810}

After a closer look, it looks like that the increments to np->refcount is racy. See https://svnweb.freebsd.org/base/head/lib/libc/nls/msgcat.c?annotate=278530. It is only protected by a read lock, so multiple threads may change the refcount at the same time, thus it is a race condition and may corrupt the refcount field. 

Proposed fix:

Index: lib/libc/nls/msgcat.c
===================================================================
--- lib/libc/nls/msgcat.c       (版本 287028)
+++ lib/libc/nls/msgcat.c       (工作副本)
@@ -141,7 +141,7 @@
        }
 
        /* Try to get it from the cache first */
-       RLOCK(NLERR);
+       WLOCK(NLERR);
        SLIST_FOREACH(np, &cache, list) {
                if ((strcmp(np->name, name) == 0) &&
                    ((lang != NULL && np->lang != NULL &&
@@ -376,7 +376,7 @@
         * One more try in cache; if it was not found by name,
         * it might still be found by absolute path.
         */
-       RLOCK(NLERR);
+       WLOCK(NLERR);
        SLIST_FOREACH(np, &cache, list) {
                if ((np->path != NULL) && (strcmp(np->path, path) == 0)) {
                        np->refcount++;


This patch fixes the case in catopen().
This also changes the access in load_msgcat(), which also seems to be incorrect.
Comment 1 Ed Maste freebsd_committer freebsd_triage 2018-04-13 13:42:09 UTC
I can reproduce this on -CURRENT with LANG=zh_CN.UTF-8 (but not with my default en_CA.UTF-8) and the analysis seems sound. Note that the proposed patch changes the only uses of RLOCK to WLOCK so this change is in effect removing r/w locking.
Comment 2 commit-hook freebsd_committer freebsd_triage 2020-03-19 06:34:07 UTC
A commit references this bug:

Author: delphij
Date: Thu Mar 19 06:33:06 UTC 2020
New revision: 359118
URL: https://svnweb.freebsd.org/changeset/base/359118

Log:
  Fix race condition in catopen(3).

  The current code uses a rwlock to protect the cached list, which
  in turn holds a list of catentry objects, and increments reference
  count while holding only read lock.

  Fix this by converting the reference counter to use atomic operations.

  While I'm there, also perform some clean ups around memory operations.

  PR:		202636
  Reported by:	Henry Hu <henry.hu.sh@gmail.com>
  Reviewed by:	markj
  MFC after:	2 weeks
  Differential Revision:	https://reviews.freebsd.org/D24095

Changes:
  head/lib/libc/nls/msgcat.c