Bug 32184

Summary: Kernel crashes in ufs code
Product: Base System
Reporter: Roman Shterenzon <roman>
Component: kern
Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED
Severity: Affects Only Me
CC: stable
Priority: Normal
Version: 4.4-STABLE
Hardware: Any
OS: Any

Description Roman Shterenzon 2001-11-22 11:20:00 UTC
When heavy traffic passes through the xl0 interface, the system crashes in the UFS code.

Script started on Thu Nov 22 13:04:33 2001
alchemy:/usr/src/sys/compile/ALCHEMY# gdb -k kernel.debug /var/crash/vmcore.0 
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
IdlePTD 3870720
initial pcb at 30f4a0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address	= 0xf03e7837
fault code		= supervisor read, page not present
instruction pointer	= 0x8:0xc01eb738
stack pointer	        = 0x10:0xc02cca28
frame pointer	        = 0x10:0xc02cca34
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= Idle
interrupt mask		= net 
trap number		= 12
panic: page fault

syncing disks... 

Fatal trap 12: page fault while in kernel mode
fault virtual address	= 0x10
fault code		= supervisor read, page not present
instruction pointer	= 0x8:0xc021cb17
stack pointer	        = 0x10:0xc02cc6c0
frame pointer	        = 0x10:0xc02cc718
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= Idle
interrupt mask		= net 
trap number		= 12
panic: page fault
Uptime: 3m44s

dumping to dev #ad/0x30001, offset 532448
dump ata0: resetting devices .. done
255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 
---
#0  dumpsys () at ../../kern/kern_shutdown.c:473
473		if (dumping++) {
(kgdb) bt
#0  dumpsys () at ../../kern/kern_shutdown.c:473
#1  0xc017151c in boot (howto=260) at ../../kern/kern_shutdown.c:313
#2  0xc01718f8 in poweroff_wait (junk=0xc02c452c, howto=-1070841777)
    at ../../kern/kern_shutdown.c:581
#3  0xc027558b in trap_fatal (frame=0xc02cc680, eva=16)
    at ../../i386/i386/trap.c:956
#4  0xc027524d in trap_pfault (frame=0xc02cc680, usermode=0, eva=16)
    at ../../i386/i386/trap.c:849
#5  0xc0274e2f in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16, tf_edi = 1, 
      tf_esi = 0, tf_ebp = -1070807272, tf_isp = -1070807380, tf_ebx = 0, 
      tf_edx = 160, tf_ecx = 4096, tf_eax = 0, tf_trapno = 12, tf_err = 0, 
      tf_eip = -1071527145, tf_cs = 8, tf_eflags = 66118, tf_esp = -716537984, 
      tf_ss = 0}) at ../../i386/i386/trap.c:448
#6  0xc021cb17 in vnode_pager_generic_putpages (vp=0xd54a7f80, m=0xc02cc7b8, 
    bytecount=4096, flags=0, rtvals=0xc02cc788) at ../../vm/vnode_pager.c:1003
#7  0xc020753b in ffs_putpages (ap=0xc02cc74c)
    at ../../ufs/ufs/ufs_readwrite.c:722
#8  0xc021c96e in vnode_pager_putpages (object=0xd54b0180, m=0xc02cc7b8, 
    count=1, sync=0, rtvals=0xc02cc788) at vnode_if.h:1147
#9  0xc0219885 in vm_pageout_flush (mc=0xc02cc7b8, count=1, flags=0)
    at ../../vm/vm_pager.h:145
#10 0xc0216d72 in vm_object_page_clean (object=0xd54b0180, start=0, end=0, 
    flags=4) at ../../vm/vm_object.c:680
#11 0xc01a13de in vfs_msync (mp=0xc0ddd200, flags=2)
    at ../../kern/vfs_subr.c:2599
#12 0xc01a2350 in sync (p=0xc03252a0, uap=0x0) at ../../kern/vfs_syscalls.c:546
#13 0xc01712ce in boot (howto=256) at ../../kern/kern_shutdown.c:234
#14 0xc01718f8 in poweroff_wait (junk=0xc02c452c, howto=-1070841777)
    at ../../kern/kern_shutdown.c:581
#15 0xc027558b in trap_fatal (frame=0xc02cc9e8, eva=4030625847)
    at ../../i386/i386/trap.c:956
#16 0xc027524d in trap_pfault (frame=0xc02cc9e8, usermode=0, eva=4030625847)
    at ../../i386/i386/trap.c:849
#17 0xc0274e2f in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16, 
      tf_edi = -1062710272, tf_esi = 6685184, tf_ebp = -1070806476, 
      tf_isp = -1070806508, tf_ebx = -1062710272, tf_edx = -1063047168, 
      tf_ecx = -264341449, tf_eax = 389973, tf_trapno = 12, tf_err = 0, 
      tf_eip = -1071728840, tf_cs = 8, tf_eflags = 66050, 
      tf_esp = -1062819840, tf_ss = -1059785052}) at ../../i386/i386/trap.c:448
#18 0xc01eb738 in xl_newbuf (sc=0xc0d4f000, c=0xc0d4f6a4)
    at ../../pci/if_xl.c:1724
#19 0xc01eb90a in xl_rxeof (sc=0xc0d4f000) at ../../pci/if_xl.c:1825
#20 0xc01ec048 in xl_intr (arg=0xc0d4f000) at ../../pci/if_xl.c:2060
#21 0xc027d917 in intr_mux (arg=0xc0a34800)
    at ../../i386/isa/intr_machdep.c:582
(kgdb) up 6
#6  0xc021cb17 in vnode_pager_generic_putpages (vp=0xd54a7f80, m=0xc02cc7b8, 
    bytecount=4096, flags=0, rtvals=0xc02cc788) at ../../vm/vnode_pager.c:1003
1003		error = VOP_WRITE(vp, &auio, ioflags, curproc->p_ucred);
(kgdb) print vp
$1 = (struct vnode *) 0xd54a7f80
(kgdb) print *vp
$2 = {v_flag = 8192, v_usecount = 5, v_writecount = 2, v_holdcnt = 0, 
  v_id = 1115, v_mount = 0xc0ddd200, v_op = 0xc0d60300, v_freelist = {
    tqe_next = 0x0, tqe_prev = 0x0}, v_nmntvnodes = {tqe_next = 0xd54a7ec0, 
    tqe_prev = 0xd54a8064}, v_cleanblkhd = {tqh_first = 0x0, 
    tqh_last = 0xd54a7fac}, v_dirtyblkhd = {tqh_first = 0x0, 
    tqh_last = 0xd54a7fb4}, v_synclist = {le_next = 0xd54a8040, 
    le_prev = 0xd54a7efc}, v_numoutput = 0, v_type = VREG, v_un = {
    vu_mountedhere = 0x0, vu_socket = 0x0, vu_spec = {vu_specinfo = 0x0, 
      vu_specnext = {sle_next = 0x0}}, vu_fifoinfo = 0x0}, v_lease = 0x0, 
  v_lastw = 0, v_cstart = 0, v_lasta = 199904, v_clen = 15, 
  v_object = 0xd54b0180, v_interlock = {lock_data = 0}, v_vnlock = 0xc0eccc00, 
  v_tag = VT_UFS, v_data = 0xc0eccc00, v_cache_src = {lh_first = 0x0}, 
  v_cache_dst = {tqh_first = 0xc0ec97c0, tqh_last = 0xc0ec97d0}, 
  v_dd = 0xd54a7f80, v_ddid = 0, v_pollinfo = {vpi_lock = {lock_data = 0}, 
    vpi_selinfo = {si_pid = 0, si_note = {slh_first = 0x0}, si_flags = 0}, 
    vpi_events = 0, vpi_revents = 0}, v_vxproc = 0x0}
(kgdb) print auio
$3 = {uio_iov = 0xc02cc6dc, uio_iovcnt = 1, uio_offset = 0, uio_resid = 4096, 
  uio_segflg = UIO_NOCOPY, uio_rw = UIO_WRITE, uio_procp = 0x0}
(kgdb) print ioflags
$4 = 0
(kgdb) print curproc->p_ucred
Attempt to extract a component of a value that is not a structure pointer.
(kgdb) print curproc->p_ucred         
$5 = 0
(kgdb) up
#7  0xc020753b in ffs_putpages (ap=0xc02cc74c)
    at ../../ufs/ufs/ufs_readwrite.c:722
722             return vnode_pager_generic_putpages(ap->a_vp, ap->a_m, ap->a_count,
(kgdb) print ap
$1 = (struct vop_putpages_args *) 0x0

(kgdb) quit
alchemy:/usr/src/sys/compile/ALCHEMY#

Script done on Thu Nov 22 13:05:48 2001

Fix: 

No fix is known to me.
How-To-Repeat: Pass heavy traffic through the xl0 interface. I have verified that the crash is repeatable.
Comment 1 iedowse 2001-11-22 13:58:49 UTC
In message <200111221112.fAMBCs300652@alchemy.oven.org>, Roman Shterenzon writes:
>
>>Synopsis:       Kernel crashes in ufs code

The first trap occurs at frame #18 in xl_newbuf(). This is the
frame of interest. The UFS trap is probably caused by the kernel
attempting to sync the disks before rebooting at an inappropriate
time.

The line in question is

	MCLGET(m_new, M_DONTWAIT);

so some sort of mbuf cluster free-list corruption has probably
occurred. Tracking down such problems is not particularly easy...

Ian
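
To illustrate why free-list corruption tends to surface at allocation time rather than at the point of the stray write, here is a minimal userland sketch. It is not the FreeBSD mbuf code: the names pool, free_head, cluster_alloc_checked() and cluster_free() are hypothetical, and the model only assumes that each free cluster stores the link to the next free cluster in its own storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_CLUSTERS 4
#define CLUSTER_BYTES 2048

/* Hypothetical model: a free cluster keeps its "next" link in its own storage. */
union cluster {
    union cluster *next;
    char buf[CLUSTER_BYTES];
};

union cluster pool[POOL_CLUSTERS];
union cluster *free_head;

/* Chain every cluster onto the free list. */
void pool_init(void)
{
    for (size_t i = 0; i < POOL_CLUSTERS; i++)
        pool[i].next = (i + 1 < POOL_CLUSTERS) ? &pool[i + 1] : NULL;
    free_head = &pool[0];
}

/* Nonzero if p points at the start of a cluster inside the pool. */
int pool_contains(const union cluster *p)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)&pool[0];
    uintptr_t hi = (uintptr_t)&pool[POOL_CLUSTERS];
    return a >= lo && a < hi && (a - lo) % sizeof(union cluster) == 0;
}

/*
 * Pop the head of the free list, validating it first. An unchecked pop
 * would read free_head->next through a wild pointer and fault here,
 * far away from whatever code overwrote the link.
 */
union cluster *cluster_alloc_checked(void)
{
    union cluster *c = free_head;
    if (c == NULL || !pool_contains(c))
        return NULL;
    free_head = c->next;
    return c;
}

/* Return a cluster to the free list. */
void cluster_free(union cluster *c)
{
    assert(pool_contains(c));
    c->next = free_head;
    free_head = c;
}
```

In this model, a stray write over the link in a free cluster goes unnoticed until a later allocation loads the bad link into free_head and the one after that dereferences it, which is one reason such problems are hard to trace back to their cause.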

>#17 0xc0274e2f in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16, 
>      tf_edi = -1062710272, tf_esi = 6685184, tf_ebp = -1070806476, 
>      tf_isp = -1070806508, tf_ebx = -1062710272, tf_edx = -1063047168, 
>      tf_ecx = -264341449, tf_eax = 389973, tf_trapno = 12, tf_err = 0, 
>      tf_eip = -1071728840, tf_cs = 8, tf_eflags = 66050, 
>      tf_esp = -1062819840, tf_ss = -1059785052}) at ../../i386/i386/trap.c:448
>#18 0xc01eb738 in xl_newbuf (sc=0xc0d4f000, c=0xc0d4f6a4)
>    at ../../pci/if_xl.c:1724
>#19 0xc01eb90a in xl_rxeof (sc=0xc0d4f000) at ../../pci/if_xl.c:1825
>#20 0xc01ec048 in xl_intr (arg=0xc0d4f000) at ../../pci/if_xl.c:2060
>#21 0xc027d917 in intr_mux (arg=0xc0a34800)
>    at ../../i386/isa/intr_machdep.c:582
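
gdb prints the trap frame registers as signed decimals, which makes them hard to compare with the hexadecimal addresses in the panic message. The following sketch (the helper names reg_to_addr() and format_reg() are made up for illustration) reinterprets such a value as an unsigned 32-bit address, assuming the i386 layout of this dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Reinterpret a signed 32-bit register value, as printed by gdb, as an address. */
uint32_t reg_to_addr(int32_t v)
{
    return (uint32_t)v;
}

/* Format "name = 0x........" into buf; returns the snprintf() result. */
int format_reg(char *buf, size_t n, const char *name, int32_t v)
{
    assert(buf != NULL && n > 0);
    return snprintf(buf, n, "%s = 0x%08x", name, (unsigned)reg_to_addr(v));
}
```

Applied to frame #17, tf_eip = -1071728840 becomes 0xc01eb738, the instruction pointer of the first trap, and tf_ecx = -264341449 becomes 0xf03e7837, the reported fault virtual address.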
Comment 2 iedowse freebsd_committer freebsd_triage 2002-12-01 20:32:33 UTC
State Changed
From-To: open->feedback


Does this still happen with a more recent -STABLE?
Comment 3 iedowse freebsd_committer freebsd_triage 2002-12-01 20:51:43 UTC
State Changed
From-To: feedback->closed


Mail to submitter bounces.