| Summary: | /usr/sbin/dev_mkdb dumps core | ||
|---|---|---|---|
| Product: | Base System | Reporter: | Ryan Dooley <dooleyr> |
| Component: | bin | Assignee: | Yar Tikhiy <yar> |
| Status: | Closed FIXED | ||
| Severity: | Affects Only Me | ||
| Priority: | Normal | ||
| Version: | Unspecified | ||
| Hardware: | Any | ||
| OS: | Any | ||
On Wed, Jan 16, 2002 at 08:52:41AM -0600, Ryan Dooley wrote:
>
> From GDB: dev_mkdb was compiled with -g here.
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x280cc8df in __free_ovflpage () from /usr/lib/libc.so.4
> (gdb) where
> #0 0x280cc8df in __free_ovflpage () from /usr/lib/libc.so.4
> #1 0x280ccf9f in __big_delete () from /usr/lib/libc.so.4
> #2 0x280cb8e5 in __delpair () from /usr/lib/libc.so.4
> #3 0x280ce5c8 in __hash_open () from /usr/lib/libc.so.4
> #4 0x280ce284 in __hash_open () from /usr/lib/libc.so.4
> #5 0x8048a10 in main (argc=1, argv=0xbfbffc28)
>    at /usr/src/usr.sbin/dev_mkdb/dev_mkdb.c:153
> #6 0x8048739 in _start ()
>
Try compiling a debugging version of libc and linking dev_mkdb
statically with it. You can run a stripped down version of it,
and the dump could still be used with the unstripped version
for post mortem analysis.
Cheers,
--
Ruslan Ermilov          Oracle Developer/DBA,
ru@sunbay.com           Sunbay Software AG,
ru@FreeBSD.org          FreeBSD committer,
+380.652.512.251        Simferopol, Ukraine
http://www.FreeBSD.org  The Power To Serve
http://www.oracle.com   Enabling The Information Age
> Try compiling a debugging version of libc and linking dev_mkdb
> statically with it. You can run a stripped down version of it,
> and the dump could still be used with the unstripped version
> for post mortem analysis.
Unfortunately, I don't have that option off hand right now. The curious
thing is that I just installed on a similar workstation and the problem
doesn't exist there.
The only difference(s) are: (new vs. old where problem exists)
1) Generic Mach32 video card vs. 3Dfx Voodoo3 PCI video card,
2) 128MB ram vs. 384MB ram, and
3) generic newfs options vs. -b 32768 and -f 4096.
Ryan
On Thu, 17 Jan 2002 12:00:05 PST, Ryan Dooley wrote:
> The only difference(s) are: (new vs. old where problem exists)
>
> 1) Generic Mach32 video card vs. 3Dfx Voodoo3 PCI video card,
> 2) 128MB ram vs. 384MB ram, and
> 3) generic newfs options vs. -b 32768 and -f 4096.
I'd be _very_ careful trying a block size anything larger than 16384.
I've heard horrible things about larger block sizes. I'm pretty sure
Matt Dillon warned that >16384 block sizes would cause undesirable
behaviour in the VM system.
Certainly, VM problems could account for your SEGV.
Matt? Am I smoking crack, or did you say Very Bad Things about the VM
system and block sizes >16384?
Ciao,
Sheldon.
Hey,
> I'd be _very_ careful trying a block size anything larger than 16384.
> I've heard horrible things about larger block sizes. I'm pretty sure
> Matt Dillon warned that >16384 block sizes would cause undesirable
> behaviour in the VM system.
>
> Certainly, VM problems could account for your SEGV.
>
> Matt? Am I smoking crack, or did you say Very Bad Things about the VM
> system and block sizes >16384?
Uh oh,
I have a server then with 65536/8192 (bs,fr) for a 953GB
fiber channel raid. I've not noticed anything bad off hand
(it was CVSup'd on Saturday around midnight CST).
This actually concerns me more than my workstation :-)
We changed the block/frag size to speed up file system checks
when we had to fsck that partition (which holds 41000+ user
home directories). We went from fsck's taking about 120 minutes
to 15 minutes which we drastically needed.
I have had reports that reads over NFS (a client running a program
on a file (SAS data)) took two or three attempts to initially
access the file before SAS ran with it. Sounded like an NFS cache
issue, but I couldn't reproduce the error myself. (AIX client to
FreeBSD server).
As the semester is about to start, I can't reformat that array
right now.
Ryan
:On Thu, 17 Jan 2002 12:00:05 PST, Ryan Dooley wrote:
:
:> The only difference(s) are: (new vs. old where problem exists)
:>
:> 1) Generic Mach32 video card vs. 3Dfx Voodoo3 PCI video card,
:> 2) 128MB ram vs. 384MB ram, and
:> 3) generic newfs options vs. -b 32768 and -f 4096.
:
:I'd be _very_ careful trying a block size anything larger than 16384.
:I've heard horrible things about larger block sizes. I'm pretty sure
:Matt Dillon warned that >16384 block sizes would cause undesirable
:behaviour in the VM system.
:
:Certainly, VM problems could account for your SEGV.
:
:Matt? Am I smoking crack, or did you say Very Bad Things about the VM
:system and block sizes >16384?
:
:Ciao,
:Sheldon.
It should work fine as long as the filesystem frag ratio is 8:1. The
buffer cache is optimized for 16384 byte buffers and can become
fragmented if larger block sizes are used, leading to inefficient
operation, but should have no other adverse effects. I would not use
a block size greater than 65536 though because you start to hit up
against internal limitations. Remember, the buffer cache has to reserve
KVA for each buffer, so the system's cache efficiency is going to drop
as the buffer size increases.
-Matt
Matthew Dillon
<dillon@backplane.com>
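The constraints in the message above (an 8:1 block-to-fragment ratio, a block size of at most 65536, and 16384 as the size the buffer cache is optimized for) can be checked mechanically. The following is a minimal sketch in Python; the function name `check_newfs_sizes` and the warning strings are illustrative assumptions and are not part of newfs(8).

```python
def check_newfs_sizes(block_size, frag_size):
    """Return warnings for a newfs -b/-f pair, following the advice in this thread."""
    warnings = []
    # Advice from the thread: keep the block:fragment ratio at 8:1.
    if block_size != 8 * frag_size:
        warnings.append("block:frag ratio is not 8:1")
    # Above 65536 the thread warns about hitting internal limitations.
    if block_size > 65536:
        warnings.append("block size exceeds 65536")
    # The buffer cache is optimized for 16384-byte buffers; larger blocks
    # should still work but may fragment the cache and run less efficiently.
    elif block_size > 16384:
        warnings.append("block size above 16384: expect reduced cache efficiency")
    return warnings


if __name__ == "__main__":
    # The two configurations mentioned in the thread (workstation and server).
    for bs, fs in [(32768, 4096), (65536, 8192)]:
        print(bs, fs, check_newfs_sizes(bs, fs) or "ok")
```

Both configurations from the thread keep the 8:1 ratio, so under this sketch they only draw the efficiency warning.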
On Thu, 17 Jan 2002 15:13:34 CST, Ryan Dooley wrote:
> I have a server then with 65536/8192 (bs,fr) for a 953GB
> fiber channel raid. I've not noticed anything bad off hand
> (it was CVSup'd on Saturday around midnight CST).
>
> This actually concerns me more than my workstation :-)
Wait for feedback from Matt. I might just be horribly confused.
Ciao,
Sheldon.
: Uh oh,
:
: I have a server then with 65536/8192 (bs,fr) for a 953GB
: fiber channel raid. I've not noticed anything bad off hand
: (it was CVSup'd on Saturday around midnight CST).
That should be fine.
: This actually concerns me more than my workstation :-)
:
: We changed the block/frag size to speed up file system checks
: when we had to fsck that partition (which holds 41000+ user
: home directories). We went from fsck's taking about 120 minutes
: to 15 minutes which we drastically needed.
:
: I have had reports that reads over NFS (a client running a program
: on a file (SAS data)) took two or three attempts to initially
: access the file before SAS ran with it. Sounded like an NFS cache
: issue, but I couldn't reproduce the error myself. (AIX client to
: FreeBSD server).
:
: As the semester is about to start, I can't reformat that array
: right now.
:
: Ryan
This would depend on the NFS block size, which is independent of the
filesystem block size. Even a standard NFS block size of 8K requires
7 IP fragments to construct a packet (with a standard ethernet's MTU).
A larger NFS block size would result in even more fragments and
potentially overload the client's packet buffers.
It is usually possible to mitigate NFS 'packet storm' issues by using
TCP NFS mounts rather than UDP.
-Matt
Matthew Dillon
<dillon@backplane.com>
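The fragment count mentioned above follows from dividing the UDP datagram size by the payload each IP fragment can carry. The sketch below estimates it in Python; the function name `udp_ip_fragments` and the RPC/NFS header overhead of 128 bytes are assumptions made for illustration, since the real overhead varies per request.

```python
import math

ETHERNET_MTU = 1500   # standard Ethernet MTU in bytes
IP_HEADER = 20        # IPv4 header without options
UDP_HEADER = 8
RPC_OVERHEAD = 128    # ASSUMPTION: rough RPC + NFS header size; varies per request


def udp_ip_fragments(nfs_block_size, mtu=ETHERNET_MTU, rpc_overhead=RPC_OVERHEAD):
    """Estimate how many IP fragments one NFS-over-UDP datagram needs."""
    datagram = nfs_block_size + rpc_overhead + UDP_HEADER
    # Each fragment carries at most (mtu - IP header) bytes, rounded down to
    # a multiple of 8 because fragment offsets are counted in 8-byte units.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(datagram / per_fragment)


if __name__ == "__main__":
    for size in (8192, 32768):
        print(size, udp_ip_fragments(size))
```

Under these assumptions an 8K block comes out at 6 fragments and a 32K block at 23, so the figure of 7 quoted above may rest on different overhead assumptions. Either way, losing any single fragment forces the whole datagram to be resent, which is why larger NFS block sizes over UDP are more fragile.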
On Thu, 17 Jan 2002 13:41:53 PST, Matthew Dillon wrote:
> It should work fine as long as the filesystem frag ratio is 8:1. The
> buffer cache is optimized for 16384 byte buffers and can become
> fragmented if larger block sizes are used, leading to inefficient
> operation, but should have no other adverse effects.
Okay, so there are no nasty surprises beyond what's already documented
in newfs(8) and tuning(7).
Sorry for the false alarm, Ryan. Back to the drawing board on trying to
find your problem.
It'd be interesting to see whether the problematic box makes it through
a buildworld without any problems.
Ciao,
Sheldon.
> That should be fine.
*whew* :-)
> This would depend on the NFS block size, which is independent of the
> filesystem block size. Even a standard NFS block size of 8K requires
> 7 IP fragments to construct a packet (with a standard ethernet's MTU).
> A larger NFS block size would result in even more fragments and
> potentially overload the client's packet buffers.
Right, we saw this with 32k packet sizes and we just left the default 8k.
> It is usually possible to mitigate NFS 'packet storm' issues by using
> TCP NFS mounts rather than UDP.
For our IRIX and AIX clients that nfsv3/tcp works just fine. With Linux
however, the only thing we've got is nfsv3/udp.... That darn linux :-)
Thanks for getting back with me on this.
Cheers,
Ryan
> Okay, so there are no nasty surprises beyond what's already documented
> in newfs(8) and tuning(7).
:-)
> Sorry for the false alarm, Ryan. Back to the drawing board on trying to
> find your problem.
No problem. I'd rather know if there was something up. I could
have engineered a little down time at oh-dark-thirty during
our maintenance window. The server has a twin machine (different
size raid though) and rsync's over gigE don't take too long :-)
> It'd be interesting to see whether the problematic box makes it through
> a buildworld without any problems.
Actually, it doesn't have any issues doing a buildworld (but the
system drive has the defaults set for bs/fg).
Cheers,
Ryan
:
:
:> Okay, so there are no nasty surprises beyond what's already documented
:> in newfs(8) and tuning(7).
:
: :-)
:
:> Sorry for the false alarm, Ryan. Back to the drawing board on trying to
:> find your problem.
:
: No problem. I'd rather know if there was something up. I could
: have engineered a little down time at oh-dark-thirty during
: our maintenance window. The server has a twin machine (different
: size raid though) and rsync's over gigE don't take too long :-)
:
:> It'd be interesting to see whether the problematic box makes it through
:> a buildworld without any problems.
:
: Actually, it doesn't have any issues doing a buildworld (but the
: system drive has the defaults set for bs/fg).
:
: Cheers,
: Ryan
Ryan, if you can make the dev_mkdb core dump and (-g compiled) binary
available for download I will take a look at it.
Also check for duplicate device nodes in /dev or device nodes that
exist on the machine exhibiting the problem that do not exist on
machines that do not exhibit the problem.
It sounds like a program bug to me rather than an OS bug.
-Matt
Matthew Dillon
<dillon@backplane.com>
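The /dev check suggested above can be scripted. The sketch below is in Python and assumes that "duplicate" means two names sharing the same node type and device number; the function names `scan_dev`, `duplicate_nodes`, and `only_on_first` are hypothetical helpers, not existing tools.

```python
import os
import stat
from collections import defaultdict


def scan_dev(root="/dev"):
    """Return (name, kind, rdev) for each character or block device directly under root."""
    nodes = []
    for name in sorted(os.listdir(root)):
        st = os.lstat(os.path.join(root, name))
        if stat.S_ISCHR(st.st_mode):
            nodes.append((name, "c", st.st_rdev))
        elif stat.S_ISBLK(st.st_mode):
            nodes.append((name, "b", st.st_rdev))
    return nodes


def duplicate_nodes(nodes):
    """Group names that share the same (kind, rdev); ASSUMED meaning of 'duplicate'."""
    by_dev = defaultdict(list)
    for name, kind, rdev in nodes:
        by_dev[(kind, rdev)].append(name)
    return {key: names for key, names in by_dev.items() if len(names) > 1}


def only_on_first(nodes_a, nodes_b):
    """Names present in the first listing but missing from the second."""
    return sorted({n for n, _, _ in nodes_a} - {n for n, _, _ in nodes_b})
```

One way to use it is to run `scan_dev()` on the failing machine and on a working one, save both listings, and then pass them to `only_on_first` in each direction.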
> Ryan, if you can make the dev_mkdb core dump and (-g compiled) binary
> available for download I will take a look at it.
I'll see what I can do.
> Also check for duplicate device nodes in /dev or device nodes that
> exist on the machine exhibiting the problem that do not exist on
> machines that do not exhibit the problem.
No dup /dev entries (and they both have the same list).
> It sounds like a program bug to me rather than an OS bug.
>
Yeah. The system itself runs as expected.
Cheers,
Ryan
On Thu, Jan 17, 2002 at 01:56:29PM -0600, Ryan Dooley wrote:
> > Try compiling a debugging version of libc and linking dev_mkdb
> > statically with it. You can run a stripped down version of it,
> > and the dump could still be used with the unstripped version
> > for post mortem analysis.
>
> Unfortunately, I don't have that option off hand right now. The curious
> thing is that I just installed on a similar workstation and the problem
> doesn't exist there.
>
> The only difference(s) are: (new vs. old where problem exists)
>
> 1) Generic Mach32 video card vs. 3Dfx Voodoo3 PCI video card,
> 2) 128MB ram vs. 384MB ram, and
> 3) generic newfs options vs. -b 32768 and -f 4096.
>
This is unlikely to be the case.
Cheers,
--
Ruslan Ermilov          Oracle Developer/DBA,
ru@sunbay.com           Sunbay Software AG,
ru@FreeBSD.org          FreeBSD committer,
+380.652.512.251        Simferopol, Ukraine
http://www.FreeBSD.org  The Power To Serve
http://www.oracle.com   Enabling The Information Age
State Changed
From-To: open->closed
dev_mkdb has been removed from all supported branches after the advent of devfs.
Responsible Changed
From-To: freebsd-bugs->yar
So I can see feedback.
Dell Optiplex GX1
FreeBSD 4.5-RC (cvsup'd 15-Jan-2002)
dev_mkdb dumps core when run
How-To-Repeat:
Can't. I've got two other -STABLE machines which don't exhibit the same behavior.
From GDB: dev_mkdb was compiled with -g here.
Program received signal SIGSEGV, Segmentation fault.
0x280cc8df in __free_ovflpage () from /usr/lib/libc.so.4
(gdb) where
#0 0x280cc8df in __free_ovflpage () from /usr/lib/libc.so.4
#1 0x280ccf9f in __big_delete () from /usr/lib/libc.so.4
#2 0x280cb8e5 in __delpair () from /usr/lib/libc.so.4
#3 0x280ce5c8 in __hash_open () from /usr/lib/libc.so.4
#4 0x280ce284 in __hash_open () from /usr/lib/libc.so.4
#5 0x8048a10 in main (argc=1, argv=0xbfbffc28)
   at /usr/src/usr.sbin/dev_mkdb/dev_mkdb.c:153
#6 0x8048739 in _start ()