Summary: | [mutex] _mtx_unlock_sleep() race condition if ADAPTIVE_MUTEXES is defined | | |
---|---|---|---|
Product: | Base System | Reporter: | Stephan Uphoff <ups> |
Component: | kern | Assignee: | John Baldwin <jhb> |
Status: | Closed FIXED | | |
Severity: | Affects Only Me | | |
Priority: | Normal | | |
Version: | Unspecified | | |
Hardware: | Any | | |
OS: | Any | | |
Description
Stephan Uphoff
2004-04-12 02:10:17 UTC
Responsible Changed
From-To: freebsd-bugs->jhb

jhb is likely to be interested in this

The same race in 8.0-STABLE amd64 (~middle of December '09). It happens very rarely (once a week). This router runs on dual Intel Xeon X5570.

It traps in _mtx_lock_sleep at 'if (TD_IS_RUNNING(owner)) {' with owner (ecx) equal to NULL (0x0) and, very strangely, m->mtx_lock (eax) is also 0x0. Why did this happen?

    Fatal trap 12: page fault while in kernel mode
    cpuid = 1; apic id = 02
    fault virtual address = 0x288
    fault code = supervisor read data, page not present
    instruction pointer = 0x20:0xffffffff802ef1be
    stack pointer = 0x28:0xffffff80761d97c0
    frame pointer = 0x28:0xffffff80761d97e0
    code segment = base rx0, limit 0xfffff, type 0x1b
    = DPL 0, pres 1, long 1, def32 0, gran 1
    processor eflags = interrupt enabled, resume, IOPL = 0
    current process = 4461 (bgpd)
    [thread pid 4461 tid 100169 ]
    Stopped at _mtx_lock_sleep+0x4e: movl 0x288(%rcx),%esi

    db:0:kdb.enter.default> show pcpu
    cpuid = 1
    dynamic pcpu = 0xffffff807f909c80
    curthread = 0xffffff0003439ae0: pid 4461 "bgpd"
    curpcb = 0xffffff80761d9d40
    fpcurthread = none
    idlethread = 0xffffff00014ac000: pid 10 "idle: cpu1"
    curpmap = 0
    tssp = 0xffffffff80794ee8
    commontssp = 0xffffffff80794ee8
    rsp0 = 0xffffff80761d9d40
    gs32p = 0xffffffff80793d20
    ldt = 0xffffffff80793d60
    tss = 0xffffffff80793d50

    db:0:kdb.enter.default> show reg
    cs 0x20 WAKEUP_efer
    ds 0x3b WAKEUP_lstar+0x3
    es 0x3b003b
    fs 0x288001b0013
    gs 0x288001b
    ss 0x28 WAKEUP_pat
    rax 0
    rcx 0
    rdx 0xffffff0003439ae0
    rbx 0xffffff800001c810
    rsp 0xffffff80761d97c0
    rbp 0xffffff80761d97e0
    rsi 0xffffff0003439ae0
    rdi 0xffffff800001c810
    r8 0
    r9 0xffffff011b721d00
    r10 0x9 WAKEUP_xpcb+0x1
    r11 0x2
    r12 0xffffff0003439ae0
    r13 0xffffff800001c918
    r14 0x3
    r15 0xffffff0150480d00
    rip 0xffffffff802ef1be _mtx_lock_sleep+0x4e
    rflags 0x10246
    _mtx_lock_sleep+0x4e: movl 0x288(%rcx),%esi

    db:0:kdb.enter.default> bt
    Tracing pid 4461 tid 100169 td 0xffffff0003439ae0
    _mtx_lock_sleep() at _mtx_lock_sleep+0x4e
    netisr_queue_internal() at netisr_queue_internal+0xe1
    netisr_queue_src() at netisr_queue_src+0x3c
    route_output() at route_output+0x11b
    sosend_generic() at sosend_generic+0x3f6
    soo_write() at soo_write+0x37
    dofilewrite() at dofilewrite+0x85
    kern_writev() at kern_writev+0x60
    writev() at writev+0x41
    syscall() at syscall+0x1da
    Xfast_syscall() at Xfast_syscall+0xe1
    --- syscall (121, FreeBSD ELF64, writev), rip = 0x80075679c, rsp = 0x7fffffffe998, rbp = 0x8013e3040 ---

--
best regards,
Pavel Nikiforov
ARTX network administrator
Mail to: pavel@co.artx.ru

On Monday 11 January 2010 6:00:11 am Pavel Nikiforov wrote:
> The following reply was made to PR kern/65448; it has been noted by GNATS.
>
> From: Pavel Nikiforov <pavel@artx.ru>
> To: bug-followup@FreeBSD.org, ups@tree.com
> Cc:
> Subject: Re: kern/65448: _mtx_unlock_sleep() race condition if ADAPTIVE_MUTEXES
> is defined
> Date: Mon, 11 Jan 2010 13:39:26 +0300
>
> The same race in 8.0-STABLE amd64 (~middle of december'09).
> It happens very rarely (once a week). This router runs on dual Intel Xeon X5570.
>
> It traps in _mtx_lock_sleep 'if (TD_IS_RUNNING(owner)) {'
> with owner (ecx) equal NULL (0x0), and very strange, that m->mtx_lock (eax)
> is also 0x0, but why it is happened ?

If mtx_lock is zero, that sounds like a different bug where the memory containing the lock has been cleared via bzero() or the like. Perhaps a use-after-free bug?

--
John Baldwin

I believe that https://svnweb.freebsd.org/changeset/base/289661 should address this.

I think that r289661 fixes the first issue, about accessing invalid memory. The second issue, about the absence of barriers in the spin loop, seems to be already covered by the owner local variable being declared volatile.
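
For readers following the spin-loop discussion, here is a minimal userland sketch in C11 of an adaptive spin that takes one snapshot of the lock word per iteration and checks it for the unowned value before dereferencing the owner. It is not the code from kern_mutex.c and does not reproduce r289661. All names (thread_sketch, mutex_sketch, spin_while_owner_runs, LOCK_UNOWNED, the spin_result values) are hypothetical. The sketch also assumes that a thread structure remains valid memory while another CPU spins on it, and that the lock word carries no flag bits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding: 0 means free, otherwise the owner's address. */
#define LOCK_UNOWNED ((uintptr_t)0)

/* Stand-in for a kernel thread; "running" mimics TD_IS_RUNNING(). */
struct thread_sketch {
	atomic_bool running;
};

/* Stand-in for a mutex whose lock word stores the owner pointer. */
struct mutex_sketch {
	_Atomic uintptr_t lock_word;
};

enum spin_result { SPIN_LOCK_FREE, SPIN_OWNER_OFF_CPU, SPIN_GAVE_UP };

/*
 * Spin while the owner stays on a CPU. Each iteration re-reads the lock
 * word once and derives the owner from that single snapshot, so a lock
 * released between iterations is seen as LOCK_UNOWNED and the owner
 * pointer is never dereferenced when it would be NULL.
 */
static enum spin_result
spin_while_owner_runs(struct mutex_sketch *m, unsigned max_spins)
{
	assert(m != NULL);
	for (unsigned i = 0; i < max_spins; i++) {
		uintptr_t v = atomic_load_explicit(&m->lock_word,
		    memory_order_acquire);
		if (v == LOCK_UNOWNED)
			return SPIN_LOCK_FREE;      /* caller retries the acquire */
		struct thread_sketch *owner = (struct thread_sketch *)v;
		if (!atomic_load_explicit(&owner->running,
		    memory_order_relaxed))
			return SPIN_OWNER_OFF_CPU;  /* caller should block */
	}
	return SPIN_GAVE_UP;                        /* spin budget exhausted */
}
```

In this sketch the atomic load forces a fresh read on every iteration, which is the role the comment above attributes to the volatile owner variable. A lock word that has been zeroed by something like bzero() would simply look unowned here, so the check avoids the NULL dereference but would not reveal the underlying memory corruption.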