The device /dev/cuad[0/1] cannot be accessed by mgetty during startup (mgetty entry in /etc/ttys). The mgetty log shows this output:

10/10 18:32:46 ad1 mgetty: interim release 1.1.33-Apr10
10/10 18:32:46 ad1 check for lockfiles
10/10 18:32:46 ad1 locking the line
10/10 18:32:47 ad1 mod: cannot make /dev/cuad1 stdin: Bad file descriptor
10/10 18:32:47 ad1 open device /dev/cuad1 failed: Bad file descriptor
10/10 18:32:47 ad1 cannot get terminal line dev=cuad1, exiting: Bad file descriptor

Fix:

I tried changing cuad1 to ttyd1 in the /etc/ttys entry and sending a HUP signal to init. The first attempt always fails (the usual "Bad file descriptor" error); the second attempt sometimes succeeds. Switching ttyd1 back to cuad1 in /etc/ttys and sending HUP again then fails with "cuad1: Device busy". So I killed mgetty and sent HUP to init once more, and this time it succeeded.

How-To-Repeat:

Put the entry

cuad1 "/usr/local/sbin/mgetty" unknown on insecure

in /etc/ttys and reboot the system.
In this case, mgetty opens /dev/cuad? and dup(2)s it to stdin:

    int fd;

    fd = open(devname, O_RDWR | O_NDELAY | O_NOCTTY);

    /* make new fd == stdin if it isn't already */
    if (fd > 0) {
        (void) close(0);
--->    if (dup(fd) != 0) {
            lprintf(L_FATAL, "mod: cannot make %s stdin", devname);
            return ERROR;
        }
    }

The failure is that dup() did not return descriptor 0. Is this a bug in dup(2)? (Or an incompatible change?)

Workaround: make mgetty use dup2(2) instead of dup(2):

    dup2(fd, 0)
    . .
    dup2(0, 1)
    . .
    dup2(0, 2)
    . .
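The workaround above can be sketched as a small helper. This is only an illustration, not mgetty's actual code: the name claim_descriptor() and its return convention are assumptions made for the example. The point is that dup2(2) names the target descriptor explicitly, so it does not depend on the kernel handing out the lowest free descriptor.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of mgetty): make descriptor `target`
 * refer to the same open file as `fd`.  dup2(2) closes `target` first
 * if it is open, and returns `target` on success, so no assumption
 * about "lowest free descriptor" allocation is needed.
 * Returns 0 on success, -1 on failure (errno is set by dup2).
 */
static int
claim_descriptor(int fd, int target)
{
	assert(fd >= 0 && target >= 0);
	return (dup2(fd, target) == target) ? 0 : -1;
}
```

With such a helper, the three calls in the workaround would correspond to claim_descriptor(fd, 0), claim_descriptor(0, 1) and claim_descriptor(0, 2).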
Hello!

I'm CCing this follow-up to freebsd-stable because this problem can prevent the use of RELENG_6 machines in production (mgetty is quite a usual example of such a use). This bug is a regression vs. RELENG_5/4.

My analysis shows that it isn't only a dup() problem. File descriptor 0 gets somehow "reserved" in RELENG_6, but only IF the process has been started by init via /etc/ttys! Look at this simple program:

    #include <unistd.h>
    #include <syslog.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <stdarg.h>

    int
    main(void)
    {
        int res;

        /* Open /dev/null until descriptor #3 has been handed out. */
        while ((res = open("/dev/null", O_RDONLY)) < 3)
            if (res == -1)
                syslog(LOG_ERR, "open(): %m");
        syslog(LOG_ERR, "Started");
        sleep(10);      /* first observation point */

        if (close(0) == -1)
            syslog(LOG_ERR, "close(0): %m");
        if (close(2) == -1)
            syslog(LOG_ERR, "close(2): %m");
        if ((res = dup(1)) == -1)
            syslog(LOG_ERR, "dup(1): %m");
        syslog(LOG_ERR, "dup() gave %d\n", res);
        sleep(10);      /* second observation point */
        return 0;
    }

One can watch the file descriptor usage at the two points where the program is sleeping: first after the program has opened enough files to use descriptor #3, and second after closing descriptors #0 and #2 and copying descriptor #1.

So, when I start this program under 6.0-RELEASE in the usual way (./a.out), lsof shows me the following (I'll show only plain descriptors and omit cwd/rtd/txt information).

At the first sleep:

a.out 837 root 0u VCHR 0,70 0t77713 70 /dev/ttyv1
a.out 837 root 1u VCHR 0,70 0t77713 70 /dev/ttyv1
a.out 837 root 2u VCHR 0,70 0t77713 70 /dev/ttyv1
a.out 837 root 3r VCHR 0,13 0t0 13 /dev/null
a.out 837 root 4u unix 0xc1c7b9bc 0t0 ->0xc1bf7de8

(descriptor #4 has been created by syslog()). The program logged the following:

a.out: dup() gave 0

At the second sleep:

a.out 837 root 0u VCHR 0,70 0t77713 70 /dev/ttyv1
a.out 837 root 1u VCHR 0,70 0t77713 70 /dev/ttyv1
a.out 837 root 3r VCHR 0,13 0t0 13 /dev/null
a.out 837 root 4u unix 0xc1c7b9bc 0t0 ->0xc1bf7de8

So all is OK in this mode: there were 3 standard files open at the beginning (descr. 0-2), the program has opened descr. 3 (and 4), closed 0 and 2 successfully, and copied 1 to 0.

Now let's start this program from /etc/ttys:

cuad0 "/root/tmp/a.out" unknown on insecure

Now we have the following at the first sleep():

a.out 817 root 1r VCHR 0,13 0t0 13 /dev/null
a.out 817 root 2r VCHR 0,13 0t0 13 /dev/null
a.out 817 root 3r VCHR 0,13 0t0 13 /dev/null
a.out 817 root 4u unix 0xc1c7bde8 0t0 ->0xc1bf7de8

Note that open() has also skipped descr. 0! Then the program tries to close it, and logs an error:

close(0): Bad file descriptor
dup() gave 2

Note that descriptor 0 isn't open: close() refuses to close it. But dup() doesn't "see" it as free and returns descr. 2 instead. At the second sleep, we have exactly the same open file table: descr. 0 is not in use, 1-3 point at /dev/null.

So it seems to me that open() suffers from the same problem here as dup(): descriptor 0 becomes "reserved" somehow.

Sincerely, Dmitry
--
Atlantis ISP, System Administrator
e-mail: dmitry@atlantis.dp.ua
nic-hdl: LYNX-RIPE
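The property this test program probes is the POSIX rule that open(2) and dup(2) return the lowest-numbered descriptor that is not currently open. A minimal sketch of a direct check follows; the function names and the scan limit passed by the caller are assumptions made for this example, and the check is only meaningful in a single-threaded process.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Return the lowest descriptor number below `limit` that is not open,
 * or -1 if every descriptor in that range is in use.
 */
static int
lowest_unused_fd(int limit)
{
	assert(limit > 0);
	for (int fd = 0; fd < limit; fd++)
		if (fcntl(fd, F_GETFD) == -1 && errno == EBADF)
			return fd;
	return -1;
}

/*
 * Return 1 if dup(src) hands out the lowest unused descriptor,
 * 0 if it hands out some other one (the symptom reported here),
 * and -1 if the check itself could not be carried out.
 */
static int
dup_returns_lowest(int src, int limit)
{
	int expected = lowest_unused_fd(limit);
	int got;

	if (expected == -1 || (got = dup(src)) == -1)
		return -1;
	(void)close(got);
	return got == expected;
}
```

Run from /etc/ttys on an affected kernel, a check like this would be expected to return 0, matching the "dup() gave 2" log line above; started from a shell it should return 1.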
Responsible Changed
From-To: freebsd-i386->freebsd-bugs

Probably not i386 specific.
Hello,

this problem is preventing production use here.

Currently I can use /dev/cuad0 if I have the entry

cuad0 "/usr/local/sbin/mgetty" unknown on insecure

twice(*) in "/etc/ttys" and issue "kill -HUP 1" after booting to multi-user. With only the first entry, sending SIGHUP to init doesn't help, but with both entries, so far the first SIGHUP to init gets everything working.

Maybe this is helpful in finding the culprit.

This is on an ASRock CPU EX Upgrade Board (K7UPGRADE-880/A/ASR) with an AMD Athlon(tm) XP 2800+ and 512MB memory, running FreeBSD 6.0-STABLE #3: Sun Nov 20 19:50:43 CET 2005

(*) one entry comes before the pseudo terminal entries, the other afterwards.

Regards,
Holger Kipp
Hi,

The problem is caused by sys/kern/kern_descrip.c 1.279.2.1. When those changes are undone, mgetty works.

I am still figuring out whether the kernel patch is wrong or mgetty is doing something iffy.

Peter
Ok, it seems I have found the problem. Please test the patch below:

Index: sys/kern/kern_descrip.c
===================================================================
RCS file: /usr/local/arch/ncvs/src/sys/kern/kern_descrip.c,v
retrieving revision 1.289
diff -u -r1.289 kern_descrip.c
--- sys/kern/kern_descrip.c 30 Nov 2005 05:12:03 -0000 1.289
+++ sys/kern/kern_descrip.c 19 Dec 2005 16:36:44 -0000
@@ -1512,6 +1512,8 @@
 			newfdp->fd_freefile = i;
 		}
 	}
+	if (newfdp->fd_freefile == -1)
+		newfdp->fd_freefile = i;
 	FILEDESC_UNLOCK_FAST(fdp);
 	FILEDESC_LOCK(newfdp);
 	for (i = 0; i <= newfdp->fd_lastfile; ++i)
@@ -1519,9 +1521,9 @@
 			fdused(newfdp, i);
 	FILEDESC_UNLOCK(newfdp);
 	FILEDESC_LOCK_FAST(fdp);
-	if (newfdp->fd_freefile == -1)
-		newfdp->fd_freefile = i;
 	newfdp->fd_cmask = fdp->fd_cmask;
+	KASSERT(fd_first_free(newfdp, 0, newfdp->fd_nfiles) == newfdp->fd_freefile,
+	    ("fd_first_free != fd_freefile fdp %p newfdp %p p %p", fdp, newfdp, curproc));
 	FILEDESC_UNLOCK_FAST(fdp);
 	return (newfdp);
 }
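The KASSERT in this patch states an invariant: the cached free-slot hint (fd_freefile) must equal the first free slot in the table. The following userland toy model is a sketch of why a stale hint matters. It is not the kernel code: the struct, NSLOTS and the toy_* functions are all hypothetical, and it assumes, as a simplification, that allocation starts its search at the hint. Under that assumption, a hint that points past a free slot 0 makes allocation skip slot 0, which resembles the symptom reported earlier in this PR.

```c
#include <assert.h>
#include <stdbool.h>

#define NSLOTS 8	/* toy table size, chosen arbitrarily */

/* Toy stand-in for a descriptor table; not the kernel's struct filedesc. */
struct toy_fdtable {
	bool used[NSLOTS];
	int  freefile;	/* hint: lowest slot believed to be free */
};

/* First free slot at or above `low`, or NSLOTS if there is none. */
static int
toy_first_free(const struct toy_fdtable *t, int low)
{
	for (int i = low; i < NSLOTS; i++)
		if (!t->used[i])
			return i;
	return NSLOTS;
}

/* Allocate a slot, trusting the hint as the starting point of the search. */
static int
toy_alloc(struct toy_fdtable *t)
{
	int fd = toy_first_free(t, t->freefile);

	if (fd == NSLOTS)
		return -1;
	t->used[fd] = true;
	t->freefile = toy_first_free(t, fd + 1);
	return fd;
}

/* Release a slot and lower the hint if the released slot is below it. */
static void
toy_release(struct toy_fdtable *t, int fd)
{
	assert(fd >= 0 && fd < NSLOTS);
	t->used[fd] = false;
	if (fd < t->freefile)
		t->freefile = fd;
}

/* The invariant the KASSERT expresses, restated for the toy table. */
static bool
toy_hint_ok(const struct toy_fdtable *t)
{
	return toy_first_free(t, 0) == t->freefile;
}
```

In this model, an empty table whose hint is wrongly initialised to 1 fails toy_hint_ok() and hands out slot 1 on the first allocation, while slot 0 stays unused.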
It seems the workaround has already been committed to the ports tree. It makes mgetty use dup2(2) instead of dup(2). mgetty works fine now.
Yes, but the workaround just hides the real kernel bug, which I'm trying to fix in the submitted patch.
Responsible Changed
From-To: freebsd-bugs->des

Dag-Erling, please handle this. Looks like you have introduced the problem.
Responsible Changed
From-To: des->csjp

I will take ownership of this PR as I am working on a fix.
State Changed
From-To: open->patched

An experimental fix has been committed to -CURRENT. Once its testing period expires, we will merge it into RELENG_6.
State Changed
From-To: patched->closed

Merged to RELENG_6.