A lightly-loaded news server (but with a full newsfeed) crashes most nights
while running backups from remote machines to a local DLT tape drive.
Crashes have only occurred while doing I/O to tape. Newsfeed activity or
newsreader load doesn't seem to matter.

crash.4:

IdlePTD 208000
current pcb at 1f58c0
panic: page fault
#0  boot (howto=256) at ../../i386/i386/machdep.c:894
894             dumppcb.pcb_ptd = rcr3();
(kgdb) bt
#0  boot (howto=256) at ../../i386/i386/machdep.c:894
#1  0xf01134c3 in panic (fmt=0xf01a2ecc "page fault")
    at ../../kern/subr_prf.c:124
#2  0xf01a39ce in trap_fatal (frame=0xefbffd04)
    at ../../i386/i386/trap.c:746
#3  0xf01a3540 in trap_pfault (frame=0xefbffd04, usermode=0)
    at ../../i386/i386/trap.c:668
#4  0xf01a31df in trap (frame={tf_es = 16, tf_ds = 16,
      tf_edi = -230151168, tf_esi = -221137920, tf_ebp = -272630412,
      tf_isp = -266940001, tf_ebx = -1073610748, tf_edx = -266940004,
      tf_ecx = -1073543014, tf_eax = 1, tf_trapno = 12, tf_err = 2,
      tf_eip = -266940001, tf_cs = 8, tf_eflags = 66178,
      tf_esp = -267151551, tf_ss = -230151168})
    at ../../i386/i386/trap.c:308
#5  0xf019937d in calltrap ()
#6  0xf016d19f in tulip_addr_filter (sc=0xf2482c00)
    at ../../pci/if_de.c:1847
#7  0xf01455d6 in ip_output (m0=0xf2d1b400, opt=0x0, ro=0xf2d5e9ac,
      flags=0, imo=0x0) at ../../netinet/ip_output.c:324
#8  0xf01494ee in tcp_output (tp=0xf2abe400)
    at ../../netinet/tcp_output.c:668
#9  0xf014a2e2 in tcp_usrreq (so=0xf2a1ba00, req=8, m=0x0, nam=0x0,
      control=0x0) at ../../netinet/tcp_usrreq.c:272
#10 0xf01207c7 in soreceive (so=0xf2a1ba00, paddr=0x0, uio=0xefbfff2c,
      mp0=0x0, controlp=0x0, flagsp=0x0) at ../../kern/uipc_socket.c:786
#11 0xf01158a9 in soo_read (fp=0xf2e29880, uio=0xefbfff2c,
      cred=0xf1c75000) at ../../kern/sys_socket.c:63
#12 0xf01146e7 in read (p=0xf2a74200, uap=0xefbfff94, retval=0xefbfff8c)
    at ../../kern/sys_generic.c:112
#13 0xf01a3c9b in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi = 9,
      tf_esi = -272639912, tf_ebp = -272640008, tf_isp = -272629788,
      tf_ebx = 5, tf_edx = 0, tf_ecx = 5, tf_eax = 3, tf_trapno = 514,
      tf_err = 514, tf_eip = 134865573, tf_cs = 31, tf_eflags = 514,
      tf_esp = -272656416, tf_ss = 39}) at ../../i386/i386/trap.c:906
#14 0xf01993cb in Xsyscall ()
#15 0x9aa3 in ?? ()
#16 0x4139 in ?? ()
#17 0x3e9b in ?? ()
#18 0x3896 in ?? ()
#19 0x3142 in ?? ()
#20 0x2d22 in ?? ()
#21 0x10d3 in ?? ()

crash.5:

Copyright 1994 Free Software Foundation, Inc...
IdlePTD 208000
current pcb at 1f58c0
panic: page fault
#0  boot (howto=256) at ../../i386/i386/machdep.c:894
894             dumppcb.pcb_ptd = rcr3();
(kgdb) bt
#0  boot (howto=256) at ../../i386/i386/machdep.c:894
#1  0xf01134c3 in panic (fmt=0xf01a2ecc "page fault")
    at ../../kern/subr_prf.c:124
#2  0xf01a39ce in trap_fatal (frame=0xefbffcd8)
    at ../../i386/i386/trap.c:746
#3  0xf01a3540 in trap_pfault (frame=0xefbffcd8, usermode=0)
    at ../../i386/i386/trap.c:668
#4  0xf01a31df in trap (frame={tf_es = 16, tf_ds = 16,
      tf_edi = -230151168, tf_esi = -238595840, tf_ebp = -272630488,
      tf_isp = 1963065219, tf_ebx = -230151168, tf_edx = -266932336,
      tf_ecx = 1017, tf_eax = 1963065219, tf_trapno = 12, tf_err = 0,
      tf_eip = 1963065219, tf_cs = 8, tf_eflags = 66050,
      tf_esp = -266902197, tf_ss = -230151168})
    at ../../i386/i386/trap.c:308
#5  0xf019937d in calltrap ()
#6  0x7501ff83 in ?? ()
#7  0xf016f2c6 in ncr_complete (np=0xf2482c00, cp=0xf2d35680)
    at ../../pci/ncr.c:4317
#8  0xf01455d6 in ip_output (m0=0xf2d35680, opt=0x0, ro=0xf28ff82c,
      flags=0, imo=0x0) at ../../netinet/ip_output.c:324
#9  0xf01494ee in tcp_output (tp=0xf289c100)
    at ../../netinet/tcp_output.c:668
#10 0xf014a2e2 in tcp_usrreq (so=0xf2884300, req=8, m=0x0, nam=0x0,
      control=0x0) at ../../netinet/tcp_usrreq.c:272
#11 0xf01207c7 in soreceive (so=0xf2884300, paddr=0x0, uio=0xefbfff2c,
      mp0=0x0, controlp=0x0, flagsp=0x0) at ../../kern/uipc_socket.c:786
#12 0xf01158a9 in soo_read (fp=0xf2e12dc0, uio=0xefbfff2c,
      cred=0xf1c75000) at ../../kern/sys_socket.c:63
#13 0xf01146e7 in read (p=0xf2b25000, uap=0xefbfff94, retval=0xefbfff8c)
    at ../../kern/sys_generic.c:112
#14 0xf01a3c9b in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi = 9,
      tf_esi = -272639912, tf_ebp = -272640008, tf_isp = -272629788,
      tf_ebx = 5, tf_edx = 0, tf_ecx = 5, tf_eax = 3, tf_trapno = 514,
      tf_err = 514, tf_eip = 134865573, tf_cs = 31, tf_eflags = 514,
      tf_esp = -272656416, tf_ss = 39}) at ../../i386/i386/trap.c:906
#15 0xf01993cb in Xsyscall ()
#16 0x9aa3 in ?? ()
#17 0x4139 in ?? ()
#18 0x3e9b in ?? ()
#19 0x3896 in ?? ()
#20 0x3142 in ?? ()
#21 0x2d22 in ?? ()
#22 0x10d3 in ?? ()

A colleague commented:

> Notice that both have this in common:
>
> #7  0xf01455d6 in ip_output (m0=0xf2d1b400, opt=0x0, ro=0xf2d5e9ac, flags=0,
>     imo=0x0) at ../../netinet/ip_output.c:324
>
> Both are followed by a completely unrelated procedure call
> (ncr_complete and tulip_addr_filter).  Either they're interrupt
> handlers or random jumps.  tulip_addr_filter is called in two places:
> to add or delete multicast addresses to the DEC21140 board's filter
> list, and when resetting the board (which happens when initializing
> it, when recovering from certain errors in the interrupt handler, when
> changing the physical port, etc.).  I would expect to see a function
> on the call stack before it though, because tulip_addr_filter doesn't
> seem to be an interrupt handler itself.
>
> Here's the relevant part of ip_output.c:
>
> sendit:
>         /*
>          * If small enough for interface, can just send directly.
>          */
>         if ((u_short)ip->ip_len <= ifp->if_mtu) {
>                 ip->ip_len = htons((u_short)ip->ip_len);
>                 ip->ip_off = htons((u_short)ip->ip_off);
>                 ip->ip_sum = 0;
>                 ip->ip_sum = in_cksum(m, hlen);
> 324:            error = (*ifp->if_output)(ifp, m,
>                                 (struct sockaddr *)dst, ro->ro_rt);
>                 goto done;
>         }
>
> Perhaps ifp->if_output has been corrupted somehow?

Fix: None.

How-To-Repeat: Run backups. :-\
State Changed
From-To: open->closed

Was supposed to close this months ago according to Originator.