| Summary: | A threaded read(2) from a socketpair(2) fd can sometimes fail with errno 19 (ENODEV) | ||
|---|---|---|---|
| Product: | Base System | Reporter: | grubba <grubba> |
| Component: | kern | Assignee: | freebsd-threads (Nobody) <threads> |
| Status: | Closed FIXED | ||
| Severity: | Affects Only Me | ||
| Priority: | Normal | ||
| Version: | 4.0-RELEASE | ||
| Hardware: | Any | ||
| OS: | Any | ||
Responsible Changed From-To: freebsd-bugs->jasone
Over to maintainer.

Responsible Changed From-To: jasone->freebsd-bugs

State Changed From-To: open->feedback
Does this problem still occur on more recent releases?

Yes, it does. The pike developers have a build farm, similar to tinderbox, and my -current machine just failed the testsuite with the error "read(2) failed with ENODEV!". It seems to be very infrequent; it's probably run a couple dozen builds with no problem. I'm going to add a PTHREAD_ASSERT in uthread_read.c to see if other programs are also getting ENODEV but ignoring it. I haven't been able to get crashdumps working on my -current box, so I can't put a panic in the kernel's read().

State Changed From-To: feedback->open
Feedback has been requested and received; throw this PR back open.

Adding to audit trail:

Date: Mon, 9 Jun 2003 12:41:32 +0200 (MET DST)
From: Henrik Grubbström <grubba@roxen.com>
Message-ID: <Pine.GSO.4.21.0306091233430.13083-100000@jms.roxen.com>

Well, since the last followup was from August last year, I can inform you that the bug was last triggered on Dan's FreeBSD 5.1-BETA machine yesterday:

    Fatal error 'read(2) may not return ENODEV' at line 98 in file /usr/src/lib/libc_r/uthread/uthread_read.c (errno = 19)
    Abort trap (core dumped)

    Core was generated by `pike'.
    Program terminated with signal 6, Aborted.
    #0 0x2826239f in kill () at {standard input}:15 in {standard input}

    Active threads
    Current language: auto; currently asm
    * 1 process 33497 0x2826239f in kill () at {standard input}:15

    Backtrace
    #0 0x2826239f in kill () at {standard input}:15
    #1 0x282c219a in abort () at /usr/src/lib/libc/stdlib/abort.c:72
    #2 0x2820f443 in _thread_exit () at /usr/src/lib/libc_r/uthread/uthread_exit.c:99
    #3 0x28209d65 in _read (fd=12, buf=0xbf966fe8, nbytes=3) at /usr/src/lib/libc_r/uthread/uthread_read.c:98
    #4 0x28209d9b in __read (fd=12, buf=0xbf966fe8, nbytes=3) at /usr/src/lib/libc_r/uthread/uthread_read.c:108
    #5 0x080b725a in f_create_process (args=1) at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/signal_handler.c:3512
    #6 0x0807073d in low_mega_apply (type=APPLY_LOW, args=1, arg1=0x85dced8, arg2=0x6) at apply_low.h:195
    #7 0x08071734 in mega_apply (type=APPLY_LOW, args=1, arg1=0x85dced8, arg2=0x6) at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/interpret.c:1702
    #8 0x080cd7a5 in call_pike_initializers (o=0x85dced8, args=1) at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/object.c:326
    #9 0x080cd894 in debug_clone_object (p=0x5, args=1) at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/object.c:352
    #10 0x080711fe in low_mega_apply (type=APPLY_SVALUE_STRICT, args=1, arg1=0x8533554, arg2=0x0) at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/interpret.c:1500
    #11 0x0806e77d in opcode_F_APPLY (arg1=33496) at interpret_functions.h:1873
    #12 0x08533166 in ?? ()
    #13 0x08071750 in mega_apply (type=APPLY_STACK, args=1, arg1=0x0, arg2=0x0) at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/interpret.c:1704
    #14 0x08071874 in f_call_function (args=1) at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/interpret.c:1769
    #15 0x080f6fed in new_thread_func (data=0xbfbff404) at /usr/tmp/xenofarm/pike-7.5/dan.emsphone.com/buildtmp/Pike7.5-20030608-215603/src/threads.c:788
    #16 0x28204e6d in _thread_start () at /usr/src/lib/libc_r/uthread/uthread_create.c:275
    #17 0xbf91c000 in ?? ()

    sysname: FreeBSD
    release: 5.1-BETA
    version: FreeBSD 5.1-BETA #271: Thu May 29 16:33:28 CDT 2003 dan@dan.emsphone.com:/usr/src/sys/i386/compile/DANSMP
    machine: i386
    nodename: dan.emsphone.com
    testname: default
    command: make xenofarm
    clientversion: $Id: client.sh,v 1.73 2003/05/20 12:48:33 mani Exp $
    putversion: $Id: put.c,v 1.14 2003/01/12 21:14:16 ceder Exp $
    contact: dnelson@allantgroup.com

Thanks,
--
Henrik Grubbström grubba@roxen.com
Roxen Internet Software AB

Responsible Changed From-To: freebsd-bugs->freebsd-threads
Assign to threads mailing list

State Changed From-To: open->suspended
In RELENG_5,6 and HEAD libc_r is deprecated in favour of libpthread and libthr. Nobody is working on libc_r bugs so mark this PR as suspended.

State Changed From-To: suspended->closed
libc_r is no longer supported
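The audit trail mentions adding a PTHREAD_ASSERT to uthread_read.c so that an ENODEV from read(2) aborts the program instead of being silently ignored. The same idea can be approximated in application code. The following is only a sketch under assumptions: checked_read() is a hypothetical name, and this is not the actual libc_r change.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical wrapper (not part of libc_r): retry read(2) while it is
 * interrupted by a signal, and abort loudly if it fails with ENODEV,
 * which the report says is not a documented read(2) error.
 */
static ssize_t checked_read(int fd, void *buf, size_t nbytes)
{
    ssize_t r;

    do {
        r = read(fd, buf, nbytes);
    } while (r < 0 && errno == EINTR);

    if (r < 0 && errno == ENODEV) {
        fprintf(stderr, "read(2) may not return ENODEV (fd=%d, errno=%d)\n",
                fd, ENODEV);
        abort();    /* leave a core dump for later inspection */
    }
    return r;
}
```

Any other failure is returned to the caller unchanged, so existing error handling keeps working.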
In the testsuite for a threaded application, a process-spawning test that runs /bin/cat /dev/null 1000 times and waits for each child sometimes fails because read(2) returns -1 with errno set to 19 (ENODEV). ENODEV is not a documented error code for read(2).

Stripped-down code that triggers the bug:

    {
      pid_t pid=-2;
      int control_pipe[2];  /* Used for communication with the child. */
      char buf[4];

      if (socketpair(AF_UNIX, SOCK_STREAM, 0, control_pipe) < 0) {
        error("Failed to create child communication pipe.\n");
      }

      {
        int loop_cnt = 0;
        sigset_t new_sig, old_sig;
        sigfillset(&new_sig);
        while(sigprocmask(SIG_BLOCK, &new_sig, &old_sig))
          ;

        do {
          pid=fork();
          if (pid == -1) {
            if (errno == EAGAIN) {
              /* Process table full or similar.
               * Try sleeping for a bit.
               */
              if (loop_cnt++ < 60) {
                /* Don't sleep for too long... */
                poll(NULL, 0, 100);

                /* Try again */
                continue;
              }
            } else if (errno == EINTR) {
              /* Try again */
              continue;
            }
          }
          break;
        } while(1);

        while(sigprocmask(SIG_SETMASK, &old_sig, 0))
          ;
      }

      if(pid == -1) {
        int e = errno;
        /*
         * fork() failed
         */
        while(close(control_pipe[0]) < 0 && errno==EINTR);
        while(close(control_pipe[1]) < 0 && errno==EINTR);
        error("Process.create_process(): fork() failed. errno:%d\n", e);
      } else if(pid) {
        int olderrno;
        /*
         * The parent process
         */

        /* Close our child's end of the pipe. */
        while(close(control_pipe[1]) < 0 && errno==EINTR);

        /* Wake up the child. */
        buf[0] = 0;
        while (((e = write(control_pipe[0], buf, 1)) < 0) && (errno == EINTR))
          ;
        if(e!=1) {
          /* Paranoia in case close() sets errno. */
          olderrno = errno;
          while(close(control_pipe[0]) < 0 && errno==EINTR)
            ;
          error("Child process died prematurely. (e=%d errno=%d)\n",
                e, olderrno);
        }

        /* Wait for exec or error */
        while (((e = read(control_pipe[0], buf, 3)) < 0) && (errno == EINTR))
          ;
        /* Paranoia in case close() sets errno. */
        olderrno = errno;

        while(close(control_pipe[0]) < 0 && errno==EINTR)
          ;

        if (!e) {
          /* OK! */
          pop_n_elems(args);
          push_int(0);
          return;
        } else {
          /* Something went wrong. */
          switch(buf[0]) {
            /* ... */
          case 0:
            /* read() probably failed. */
          default:
            /******************************************************************
             * This point is reached with buf = {0, 4, 0}, e = -1, olderrno=19.
             *****************************************************************/
            error("Process.create_process(): "
                  "Child failed: %d, %d, %d, %d, %d!\n",
                  buf[0], buf[1], buf[2], e, olderrno);
            break;
          }
        }
      }else{
        /*
         * The child process
         */

        /* Close our parent's end of the pipe. */
        while(close(control_pipe[0]) < 0 && errno==EINTR);

        /* Ensure that the pipe will be closed when the child starts. */
        if(set_close_on_exec(control_pipe[1], 1) < 0)
          PROCERROR(PROCE_CLOEXEC, 0);

        /* Wait for parent to get ready... */
        while ((( e = read(control_pipe[1], buf, 1)) < 0) && (errno == EINTR))
          ;

        /* ... */

        execvp(argv[0], argv);

        PROCERROR(PROCE_EXEC, 0);
        exit(99);
      }
    }

For the full source, please check src/signal_handler.c:f_create_process() in a Pike distribution.

Testsuite report:

    testsuite: Test 9406 (shift 0) (CRNL) failed.
    1: mixed a() { for(int x=0;x<10;x++) { for(int e=0;e<100;e++) if(Process.create_process(({"/bin/cat","/dev/null"}))->wait()) return e; __signal_watchdog(); } return -1;; }
    2: mixed b() { return -1; }
    Error: Process.create_process(): Child failed: 0, 4, 0, -1, 19!
    __builtin.create_process: create(({"/bin/cat","/dev/null"}))
    __builtin: create_process()
    testsuite: Test 9406 (shift 0) (CRNL):1: a()
    /tmp/autobuild/pike7.1-20001021082826.tar/bin/test_pike.pike:572: main(3,({"/tmp/autobuild/pike7.1-20001021082826.tar/bin/test_pike.pike","modules/CommonLog/module_testsuite","modules/Gdbm/module_testsuite","modules/Gettext/module_testsuite","modules/Gmp/module_testsuite",,,34}))

How-To-Repeat:

Unfortunately, the problem is intermittent. It may be triggered by resource exhaustion.
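The handshake in the stripped-down code can be isolated from Pike as a standalone helper, which may be useful when trying to reproduce the failure. This is a rough sketch, not the Pike implementation: spawn_once() is a hypothetical name, the three-byte error report written by the child is an assumed stand-in for PROCERROR (whose definition is not shown in this report), and the signal blocking and EAGAIN retry around fork() are left out. The helper is single-threaded as written; the failure was reported from a threaded program linked against libc_r, so it would have to be called from such a program to exercise the same code path.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] once using the socketpair handshake.
 * Returns 0 if exec succeeded (parent saw EOF), the errno value if the
 * parent's read(2) failed (19 would be ENODEV), or -1 otherwise.
 */
static int spawn_once(char *const argv[])
{
    int sv[2], saved_errno, status;
    char buf[4] = {0, 0, 0, 0};
    ssize_t e;
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: keep only our end, closed automatically on exec. */
        close(sv[0]);
        if (fcntl(sv[1], F_SETFD, FD_CLOEXEC) < 0)
            _exit(98);
        /* Wait for the parent's wake-up byte. */
        do {
            e = read(sv[1], buf, 1);
        } while (e < 0 && errno == EINTR);
        if (e != 1)
            _exit(97);
        execvp(argv[0], argv);
        /* Assumed report format: {tag, errno, 0}; not Pike's PROCERROR. */
        buf[0] = 1;
        buf[1] = (char)errno;
        buf[2] = 0;
        if (write(sv[1], buf, 3) != 3)
            _exit(100);
        _exit(99);
    }

    /* Parent: close the child's end, then wake the child. */
    close(sv[1]);
    do {
        e = write(sv[0], buf, 1);
    } while (e < 0 && errno == EINTR);
    if (e == 1) {
        /* EOF (0) means the close-on-exec fd vanished: exec succeeded. */
        do {
            e = read(sv[0], buf, 3);
        } while (e < 0 && errno == EINTR);
    } else {
        e = 1;              /* wake-up failed; treat as a generic failure */
    }
    saved_errno = errno;    /* close() below may overwrite errno */
    close(sv[0]);
    while (waitpid(pid, &status, 0) < 0 && errno == EINTR)
        ;

    if (e == 0)
        return 0;
    return (e < 0) ? saved_errno : -1;
}
```

Calling spawn_once() in a loop with an argv of {"/bin/cat", "/dev/null", NULL} mirrors the failing test; a return value of 19 would correspond to the ENODEV case described above.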