| Summary: | fstat gives signal 10 (SIGBUS) when outputting data | | |
|---|---|---|---|
| Product: | Base System | Reporter: | greg <greg> |
| Component: | bin | Assignee: | freebsd-bugs (Nobody) <bugs> |
| Status: | Closed FIXED | | |
| Severity: | Affects Only Me | | |
| Priority: | Normal | | |
| Version: | 3.4-STABLE | | |
| Hardware: | Any | | |
| OS: | Any | | |
Hey .. I was playing around a little more with gdb, isolated it a little
more to the exact line, and have some variable context information ..
(gdb) step
350 bcopy(filed0.fd_dfiles, ofiles,
(filed.fd_lastfile+1) * FPSIZE);
(gdb) p filed0
$3 = {fd_fd = {fd_ofiles = 0xc8128d80, fd_ofileflags = 0x0, fd_cdir = 0x0,
    fd_rdir = 0x0, fd_nfiles = 0, fd_lastfile = 6922, fd_freefile = 12635,
    fd_cmask = 12859, fd_refcnt = 29236},
  fd_dfiles = {0x32325b1b, 0x1b48313b, 0x20204b5b, 0x20202020, 0x20202020,
    0x20202020, 0x20202020, 0x2f232020, 0x4057753c, 0x23205469, 0x57753c2f,
    0x20546940, 0x753c2f23, 0x54694057, 0x3c2f2320, 0x69405775, 0x2f232054,
    0x4057753c, 0x23205469, 0x57753c2f},
  fd_dfileflags = "@iT\e[K\e[1;22r\e[22;1H"}
(gdb) p filed0.fd_fd.fd_lastfile
$4 = 6922
(gdb) p ofiles
$5 = (struct file **) 0x8068000
(gdb) p *ofiles
$6 = (struct file *) 0x0
(gdb) p (filed0.fd_fd.fd_lastfile+1)
$7 = 6923
[note: FPSIZE must be a define, I had several errors printing the whole
expression]
(gdb) p filed0.fd_dfiles
$8 = {0x32325b1b, 0x1b48313b, 0x20204b5b, 0x20202020, 0x20202020, 0x20202020,
0x20202020, 0x2f232020, 0x4057753c, 0x23205469, 0x57753c2f, 0x20546940,
0x753c2f23, 0x54694057, 0x3c2f2320, 0x69405775, 0x2f232054, 0x4057753c,
0x23205469, 0x57753c2f}
I'm still puzzled .. if no requests for more info come back within the
next 30 minutes to an hour, I'm going to kill the offending pid and end
this. (note: I tracked down the pid by fstat'ing each of the user's
processes).
/gp
.... .. . ... . . . . .
g r e g @ s t r a y n e t . c o m
.-----.----.-----.-----. senior administrator, straynet online
| _ | _| -__| _ | head network administrator, wen dot net
|___ |__| |_____|___ | staff consultant, micro web company
|_____| |_____| icq: 10405504 / aol im: xysters
State Changed
From-To: open->feedback

Have you seen this problem occur again since? I seem to remember seeing
something similar quite a while ago, but I never attempted to track it down.

State Changed
From-To: feedback->closed

Feedback timeout.
This problem popped up when I was running sockstat, as stated earlier, and
was then isolated to fstat specifically. It appeared to core when listing
information for a specific user (it was the last user in the list that
appeared onscreen when doing a plain 'fstat' before it SIGBUS'd), and the
behaviour repeats when I use fstat -u username. Example follows with gdb
output.

[root@voyager] /usr/src/usr.bin/fstat: make clean all
[root@voyager] /usr/src/usr.bin/fstat: cd /usr/obj/usr/src/usr.bin/fstat
[root@voyager] /usr/obj/usr/src/usr.bin/fstat: gdb ./fstat
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
(gdb) run -u bin2ooo
Starting program: /usr/obj/usr/src/usr.bin/fstat/./fstat -u bin2ooo
USER     CMD    PID   FD   MOUNT  INUM     MODE        SZ|DV   R/W
bin2ooo  bnc    10558 root /      2        drwxr-xr-x  1024    r
bin2ooo  bnc    10558 wd   /home  1714378  drwxr-xr-x  512     r
bin2ooo  bnc    10558 text /usr   3190287  -rwxr-xr-x  79658   r
bin2ooo  bnc    10558 0    /      6766     crw--w----  ttyp2   rw
bin2ooo  bnc    10558 1    /      6766     crw--w----  ttyp2   rw
bin2ooo  bnc    10558 2    /      6766     crw--w----  ttyp2   rw
bin2ooo  bnc    10558 3*   internet stream tcp dc5b9180
bin2ooo  bnc    10558 4    /home  1714190  -rw-r--r--  25717   w
bin2ooo  bnc    10558 6*   internet stream tcp dc5d4840
bin2ooo  bnc    10558 7*   internet stream tcp dc5ff2a0
bin2ooo  bash   10547 text /usr   3190357  -rwxr-xr-x  367780  r

Program received signal SIGBUS, Bus error.
0x280cd832 in bcopy () from /usr/lib/libc.so.3
(gdb) bt
#0 0x280cd832 in bcopy () from /usr/lib/libc.so.3
#1 0x5 in ?? ()
#2 0x8048e80 in main (argc=3, argv=0xbfbfdbe8) at /usr/src/usr.bin/fstat/fstat.c:265
#3 0x80489f5 in _start ()
(gdb) up
#1 0x5 in ?? ()
(gdb) up
#2 0x8048e80 in main (argc=3, argv=0xbfbfdbe8) at /usr/src/usr.bin/fstat/fstat.c:265
265             dofiles(p);
(gdb) list
260         putchar('\n');
261
262         for (plast = &p[cnt]; p < plast; ++p) {
263             if (p->kp_proc.p_stat == SZOMB)
264                 continue;
265             dofiles(p);
266         }
267         exit(0);
268     }
269
(gdb) quit
The program is running. Exit anyway? (y or n) y
[root@voyager] /usr/obj/usr/src/usr.bin/fstat: ps uwxU bin2ooo
USER     PID   %CPU %MEM VSZ  RSS TT STAT STARTED TIME    COMMAND
bin2ooo  10547 0.0  0.0  0    0   p2 IEs+ -       0:00.00 (bash)
bin2ooo  10558 0.0  0.1  1000 576 ?? Is   Thu08PM 0:03.63 bnc
[root@voyager] /usr/obj/usr/src/usr.bin/fstat:

Fix:

I'm wondering if this is a memory failure somewhere in fstat? I'm no FreeBSD
hacker, so I don't have the slightest clue.

How-To-Repeat:

I'm not sure if this can be reproduced on other systems. I can't seem to
track down this error myself, so I can't pinpoint where it's failing and thus
reproduce it elsewhere, but for the last ten minutes the same action has
caused this to happen again and again. More info available upon request
(that's if it's still failing when you request it :))