Bug 195000 - [patch] rsync -a to fuse fs may cause kernel panic
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Rick Macklem
Duplicates: 187261
Reported: 2014-11-14 04:49 UTC by Henry Hu
Modified: 2017-12-04 01:03 UTC

Attachments
temporary fix for the crash (672 bytes, patch)
2014-11-14 04:49 UTC, Henry Hu

Description Henry Hu 2014-11-14 04:49:51 UTC
Created attachment 149389
temporary fix for the crash

I've hit a crash in the fuse module when running rsync to an NTFS volume mounted with ntfs-3g. To reproduce, put a socket in a directory and use "rsync -a" to sync that directory to an NTFS volume mounted with ntfs-3g.
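
The backtrace quoted in this report enters the kernel through bind(2) on a local socket (sys_bind, uipc_bindat, VOP_CREATE), so a trigger much smaller than a full rsync run should be possible. Here is a minimal sketch of one, assuming that binding an AF_UNIX socket to a path on the fuse mount is enough to reach the same code. The helper name bind_unix_socket is made up for illustration. Be careful: on an unpatched kernel, pointing this at a fuse mount may panic the machine.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any library): bind an AF_UNIX socket
 * at `path`, which asks the file system holding `path` to create a
 * socket node. Returns 0 on success or a positive errno value on failure.
 */
static int
bind_unix_socket(const char *path)
{
	struct sockaddr_un sun;
	int error, s;

	assert(path != NULL);
	if (strlen(path) >= sizeof(sun.sun_path))
		return (ENAMETOOLONG);

	s = socket(AF_UNIX, SOCK_STREAM, 0);
	if (s == -1)
		return (errno);

	memset(&sun, 0, sizeof(sun));
	sun.sun_family = AF_UNIX;
	strcpy(sun.sun_path, path);	/* length was checked above */

	error = 0;
	if (bind(s, (struct sockaddr *)&sun, sizeof(sun)) == -1)
		error = errno;
	close(s);
	return (error);
}
```

Calling bind_unix_socket() with a path on an ordinary local file system should return 0 and leave a socket node behind. With a path on a fuse mount, a patched kernel is expected to return an error instead of crashing.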

The crash is the same as ones reported before, in

https://lists.freebsd.org/pipermail/freebsd-current/2013-October/045993.html

and there are other similar reports:

http://www.bsdforen.de/threads/probleme-mit-rsync-und-sshfs.29323/

I'm posting their backtrace here:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address	= 0x64
fault code		= supervisor read, page not present
instruction pointer	= 0x20:0xcae6adb6
stack pointer	        = 0x28:0xf0ac29a0
frame pointer	        = 0x28:0xf0ac2a0c
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 14116 (conftest)
trap number		= 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
#0 0xc0aed942 at kdb_backtrace+0x52
#1 0xc0ab37e1 at panic+0x121
#2 0xc0f8df09 at trap_fatal+0x339
#3 0xc0f8e23d at trap_pfault+0x31d
#4 0xc0f8d819 at trap+0x519
#5 0xc0f776ec at calltrap+0x6
#6 0xc0fb2864 at VOP_CREATE_APV+0x94
#7 0xc0b355ab at uipc_bindat+0x36b
#8 0xc0b33307 at uipc_bind+0x27
#9 0xc0b2c277 at kern_bindat+0x147
#10 0xc0b2c064 at sys_bind+0x74
#11 0xc0f8e939 at syscall+0x479
#12 0xc0f77781 at Xint0x80_syscall+0x21
Uptime: 1d23h57m34s
Physical memory: 2027 MB
<...>
(kgdb) #0  doadump (textdump=-961984384) at pcpu.h:233
#1  0xc0ab3459 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:447
#2  0xc0ab381f in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:754
#3  0xc0f8df09 in trap_fatal (frame=<value optimized out>, eva=100)
    at /usr/src/sys/i386/i386/trap.c:1047
#4  0xc0f8e23d in trap_pfault (frame=0x0, usermode=<value optimized out>, 
    eva=0) at /usr/src/sys/i386/i386/trap.c:859
#5  0xc0f8d819 in trap (frame=0xf0ac2960) at /usr/src/sys/i386/i386/trap.c:556
#6  0xc0f776ec in calltrap () at /usr/src/sys/i386/i386/exception.s:170
#7  0xcae6adb6 in fuse_vnop_create (ap=0x0)
    at /usr/src/sys/modules/fuse/../../fs/fuse/fuse_vnops.c:368
#8  0xc0fb2864 in VOP_CREATE_APV (vop=<value optimized out>, a=0xf0ac2b88)
    at vnode_if.c:265
#9  0xc0b355ab in uipc_bindat (so=0xf0ac2b20, nam=<value optimized out>, 
    td=<value optimized out>) at vnode_if.h:109
#10 0xc0b33307 in uipc_bind (so=0xc80ab9f0, nam=0xc8580e80, td=0xce271620)
    at /usr/src/sys/kern/uipc_usrreq.c:573
#11 0xc0b2c277 in kern_bindat (td=0xce271620, dirfd=<value optimized out>, 
    fd=<value optimized out>, sa=0xce271620)
    at /usr/src/sys/kern/uipc_syscalls.c:283
#12 0xc0b2c064 in sys_bind (td=0x0, uap=<value optimized out>)
    at /usr/src/sys/kern/uipc_syscalls.c:297
#13 0xc0f8e939 in syscall (frame=<value optimized out>) at subr_syscall.c:134
#14 0xc0f77781 in Xint0x80_syscall ()
    at /usr/src/sys/i386/i386/exception.s:270
#15 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
(kgdb) 

This may be related to bug 167362, but in that report the instruction pointer (RIP) is 0, so it is probably a different problem. Bug 182739 is similar to bug 167362.

After digging into it a bit, I found that the problem is in fuse_vnop_create().
Check https://github.com/freebsd/freebsd/blame/master/sys/fs/fuse/fuse_vnops.c#L337.
At line 337, it checks whether vap->va_type is VREG, and if it is not, it jumps to the label bringup.
There, feo is assigned from fdip->answ and then used. But fdip, which points to fdi, is only initialized after the goto. As a result, when vap->va_type != VREG, fdi is never initialized and feo is invalid.
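
To make the control flow easier to see, here is a small user-space model of the same shape. It is only a sketch: every name in it (node_type, dispatcher, entry_out, dispatch, create_buggy, create_guarded) is a made-up stand-in, not the kernel's definition, and the EINVAL return in the guarded variant is an assumption about what a reasonable error would be. The model zeroes the structure up front so the result is deterministic.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical stand-ins for illustration only. */
enum node_type { NODE_REG, NODE_SOCK };

struct entry_out {
	int nodeid;
};

struct dispatcher {
	struct entry_out *answ;	/* reply buffer; NULL until a request is sent */
};

static struct entry_out reply = { 42 };

/* Stand-in for sending a request and receiving the reply. */
static void
dispatch(struct dispatcher *d)
{
	d->answ = &reply;
}

/*
 * Buggy shape: a non-regular type jumps straight to "bringup" and skips
 * the dispatch that fills in d.answ. Returns the pointer that the code
 * after the label would go on to dereference.
 */
static struct entry_out *
create_buggy(enum node_type type)
{
	struct dispatcher d;

	memset(&d, 0, sizeof(d));
	if (type != NODE_REG)
		goto bringup;
	dispatch(&d);
bringup:
	return (d.answ);
}

/* Guarded shape: reject unsupported types before touching the reply. */
static int
create_guarded(enum node_type type, int *nodeid)
{
	struct dispatcher d;

	assert(nodeid != NULL);
	if (type != NODE_REG)
		return (EINVAL);	/* error value is an assumption */
	memset(&d, 0, sizeof(d));
	dispatch(&d);
	*nodeid = d.answ->nodeid;
	return (0);
}
```

In this model, create_buggy(NODE_SOCK) hands back a null reply pointer, which is what the code after the label would then read through. create_guarded(NODE_SOCK, ...) returns an error before the reply is ever consulted.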

I made a patch and it works for me. In my case, the problematic file is a socket.

But I think that fuse filesystems may support file types other than VREG, so maybe we should remove that check completely?

In fuse4x for mac, https://github.com/fuse4x/kext/blob/master/fuse_vnops.c#L375, the logic is similar, but the flow is different.
Comment 1 commit-hook freebsd_committer freebsd_triage 2016-05-18 22:24:02 UTC
A commit references this bug:

Author: rmacklem
Date: Wed May 18 22:23:20 UTC 2016
New revision: 300169
URL: https://svnweb.freebsd.org/changeset/base/300169

Log:
  If a local (AF_LOCAL, AF_UNIX) socket creation (bind) is attempted
  on a fuse mounted file system, it will crash. Although it may be
  possible to make this work correctly, this patch avoids the crash
  in the meantime.
  I removed the MPASS(), since panicking for the FIFO case didn't make
  a lot of sense when it returns an error for the others.

  PR:		195000
  Submitted by:	henry.hu.sh@gmail.com (earlier version)
  MFC after:	2 weeks

Changes:
  head/sys/fs/fuse/fuse_vnops.c
Comment 2 Henry Hu 2017-12-03 21:50:07 UTC
Should this PR be closed? head, stable/11 and stable/10 all have this patch already.
Comment 3 Mark Linimon freebsd_committer freebsd_triage 2017-12-04 00:30:23 UTC
Already merged to all relevant branches.
Comment 4 Conrad Meyer freebsd_committer freebsd_triage 2017-12-04 01:03:25 UTC
*** Bug 187261 has been marked as a duplicate of this bug. ***