1) The system may panic when accessing devices over nullfs: mount_null(fs) /dev /mnt dd if=/mnt/zero of=/tmp/t.dat The system will panic. 2) It is impossible to work with unix sockets (PF_LOCAL) over nullfs: To reproduce: start mysqld. It creates socket /tmp/mysql.sock to communicate with local mysql client. mount_null(fs) /tmp /mnt Try to use this socket: mysql -S /mnt/mysql.sock mysql Connection to mysqld will fail. Fix: I am attaching patches for both -current and -stable. I am not sure how correct they are, but they do work. -STABLE: -CURRENT:
Responsible Changed From-To: freebsd-bugs->tjr I'll check the patch out and commit it if it seems ok.
Responsible Changed From-To: tjr->freebsd-bugs Unassign. I haven't had time to fix this.
State Changed From-To: open->feedback Is this still a problem with more recent versions of -stable and -current, or has it been fixed on one or the other?
State Changed From-To: feedback->open It's still a problem.
Adding to audit trail from email received: From: Jeremie Le Hen <jeremie.le-hen@epita.fr> I tried the attached patch, and it's appear to be only a half-fix. To test it, I made a small server which listens on a named Unix socket and a client sending an hello message through it. Here are the results on a 5-CURRENT (about 50 days ago) (`lower' means the program opens the socket on the lower filesystem while `upper' means it is on the null layer) : Without patch With Dmitry Sivachenko's patch client(lower) / OK OK server(lower) client(upper) / Failed OK server(lower) client(lower) / Failed Failed server(upper) client(upper) / OK OK server(upper) I hope this can help. I tried to understand why it fails, but my programming skills are not good enough. Regards, -- Jeremie LE HEN aka TtZ/TataZ jeremie.le-hen@epita.fr ttz@epita.fr
The reason it fails in case #3 is that vp->v_socket is being set on the upper layer by unp_bind(), but not on the lower layer, because that code is ignorant of filesystem layering. I'm not sure what the best fix is for this. Is it acceptable to have null_lookup simply return the lower vnode when it's of type VSOCK, or would that break something else?
This patch is working on my 5.3-STABLE Oct 22 (referring to Mark Linimon's classification, I have "client(upper)/server(lower)" situation), but "find" and "ls -R" on nullfs mount point after outputing some amount of files exit with "fts_read: No such file or directory".
I confirm that this is still a problem on 6.1-RELEASE (-p11). In particular, it prevents using nullfs to run X clients through Unix domain sockets from a chroot or jail (i.e. connecting via /tmp/.X11-unix/X0). (http://lists.freebsd.org/pipermail/freebsd-emulation/2006-December/002912.html) The Linimon/Le Hen tests can be verified using the net/netcat port: setup ----- mkdir lower mkdir upper mount -t nullfs lower upper touch lower/testport test.sh -------- #!/bin.sh rm lower/testport nc -lU $BIND/testport & SERVER=$! echo test | nc -U $CONNECT/testport echo $? kill $SERVER 2>/dev/null tests ----- BIND=lower CONNECT=lower ./test.sh # works BIND=upper CONNECT=upper ./test.sh # works BIND=lower CONNECT=upper ./test.sh # FAILS before patch, works after BIND=upper CONNECT=lower ./test.sh # FAILS before and after patch The Sivachenko patch corrects the following situation: analysis: BIND=lower CONNECT=upper ---------------------------------- 1. s = socket(AF_UNIX, SOCK_STREAM, 0) falloc: creates a new open file in the process descriptor table socreate: associates a socket with this file and associates the unix domain protosw functions 2. connect(s, name, namelen) kern_connect -> so_connect -> pru_connect -> unp_connect * unp_connect (src/sys/kern/uipc_usrreq.c, v1.155.2.3) * call namei to retrieve the requested vnode -calls null_lookup in src/sys/fs/null_vnops.c -in turn calls null_nodeget in src/sys/fs/null_subr.c returns the UPPER vnode DOES NOT copy the v_un field from the lower vnode. * line 962: so2 = vp->v_socket; (vp->v_un.vu_socket) DIRECT ACCESS to v_un field of (UPPER) vnode. The Buchanan analysis refers to a different location: analysis: BIND=upper CONNECT=lower ---------------------------------- 1. s = socket(AF_UNIX, SOCK_STREAM, 0) 2. bind(s, name, namelen) * unp_bind (src/sys/kern/uipc_usrreq.c, v1.155.2.3) * creates new vnodes * line 902: vp->v_socket = unp->unp_socket; (vp->v_un.vu_socket) DIRECT ACCESS to v_un field of (UPPER) vnode. Not fixed by the submitted patch. Messy. The socket information in the upper (nullfs) vnode must somehow be passed down into the lower vnode... A proper fix is beyond me.
Responsible Changed From-To: freebsd-bugs->freebsd-fs Over to maintainer(s).
Author: trociny Date: Wed Feb 29 21:38:31 2012 New Revision: 232317 URL: http://svn.freebsd.org/changeset/base/232317 Log: Introduce VOP_UNP_BIND(), VOP_UNP_CONNECT(), and VOP_UNP_DETACH() operations for setting and accessing vnode's v_socket field. The operations are necessary to implement proper unix socket handling on layered file systems like nullfs(5). This change fixes the long standing issue with nullfs(5) being in that unix sockets did not work between lower and upper layers: if we bound to a socket on the lower layer we could connect only to the lower path; if we bound to the upper layer we could connect only to the upper path. The new behavior is one can connect to both the lower and the upper paths regardless what layer path one binds to. PR: kern/51583, kern/159663 Suggested by: kib Reviewed by: arch MFC after: 2 weeks Modified: head/UPDATING head/sys/kern/uipc_usrreq.c head/sys/kern/vfs_default.c head/sys/kern/vnode_if.src head/sys/sys/vnode.h Modified: head/UPDATING ============================================================================== --- head/UPDATING Wed Feb 29 21:11:02 2012 (r232316) +++ head/UPDATING Wed Feb 29 21:38:31 2012 (r232317) @@ -22,6 +22,14 @@ NOTE TO PEOPLE WHO THINK THAT FreeBSD 10 machines to maximize performance. (To disable malloc debugging, run ln -s aj /etc/malloc.conf.) +20120229: + Now unix domain sockets behave "as expected" on nullfs(5). Previously + nullfs(5) did not pass through all behaviours to the underlying layer, + as a result if we bound to a socket on the lower layer we could connect + only to the lower path; if we bound to the upper layer we could connect + only to the upper path. The new behavior is one can connect to both the + lower and the upper paths regardless what layer path one binds to. + 20120211: The getifaddrs upgrade path broken with 20111215 has been restored. If you have upgraded in between 20111215 and 20120209 you need to Modified: head/sys/kern/uipc_usrreq.c ============================================================================== --- head/sys/kern/uipc_usrreq.c Wed Feb 29 21:11:02 2012 (r232316) +++ head/sys/kern/uipc_usrreq.c Wed Feb 29 21:38:31 2012 (r232317) @@ -542,7 +542,7 @@ restart: UNP_LINK_WLOCK(); UNP_PCB_LOCK(unp); - vp->v_socket = unp->unp_socket; + VOP_UNP_BIND(vp, unp->unp_socket); unp->unp_vnode = vp; unp->unp_addr = soun; unp->unp_flags &= ~UNP_BINDING; @@ -638,7 +638,7 @@ uipc_detach(struct socket *so) * XXXRW: Should assert vp->v_socket == so. */ if ((vp = unp->unp_vnode) != NULL) { - unp->unp_vnode->v_socket = NULL; + VOP_UNP_DETACH(vp); unp->unp_vnode = NULL; } unp2 = unp->unp_conn; @@ -1308,7 +1308,7 @@ unp_connect(struct socket *so, struct so * and to protect simultaneous locking of multiple pcbs. */ UNP_LINK_WLOCK(); - so2 = vp->v_socket; + VOP_UNP_CONNECT(vp, &so2); if (so2 == NULL) { error = ECONNREFUSED; goto bad2; @@ -2318,17 +2318,15 @@ vfs_unp_reclaim(struct vnode *vp) active = 0; UNP_LINK_WLOCK(); - so = vp->v_socket; + VOP_UNP_CONNECT(vp, &so); if (so == NULL) goto done; unp = sotounpcb(so); if (unp == NULL) goto done; UNP_PCB_LOCK(unp); - if (unp->unp_vnode != NULL) { - KASSERT(unp->unp_vnode == vp, - ("vfs_unp_reclaim: vp != unp->unp_vnode")); - vp->v_socket = NULL; + if (unp->unp_vnode == vp) { + VOP_UNP_DETACH(vp); unp->unp_vnode = NULL; active = 1; } Modified: head/sys/kern/vfs_default.c ============================================================================== --- head/sys/kern/vfs_default.c Wed Feb 29 21:11:02 2012 (r232316) +++ head/sys/kern/vfs_default.c Wed Feb 29 21:38:31 2012 (r232317) @@ -123,6 +123,9 @@ struct vop_vector default_vnodeops = { .vop_unlock = vop_stdunlock, .vop_vptocnp = vop_stdvptocnp, .vop_vptofh = vop_stdvptofh, + .vop_unp_bind = vop_stdunp_bind, + .vop_unp_connect = vop_stdunp_connect, + .vop_unp_detach = vop_stdunp_detach, }; /* @@ -1037,6 +1040,30 @@ vop_stdadvise(struct vop_advise_args *ap return (error); } +int +vop_stdunp_bind(struct vop_unp_bind_args *ap) +{ + + ap->a_vp->v_socket = ap->a_socket; + return (0); +} + +int +vop_stdunp_connect(struct vop_unp_connect_args *ap) +{ + + *ap->a_socket = ap->a_vp->v_socket; + return (0); +} + +int +vop_stdunp_detach(struct vop_unp_detach_args *ap) +{ + + ap->a_vp->v_socket = NULL; + return (0); +} + /* * vfs default ops * used to fill the vfs function table to get reasonable default return values. Modified: head/sys/kern/vnode_if.src ============================================================================== --- head/sys/kern/vnode_if.src Wed Feb 29 21:11:02 2012 (r232316) +++ head/sys/kern/vnode_if.src Wed Feb 29 21:38:31 2012 (r232317) @@ -640,6 +640,26 @@ vop_advise { IN int advice; }; +%% unp_bind vp E E E + +vop_unp_bind { + IN struct vnode *vp; + IN struct socket *socket; +}; + +%% unp_connect vp L L L + +vop_unp_connect { + IN struct vnode *vp; + OUT struct socket **socket; +}; + +%% unp_detach vp = = = + +vop_unp_detach { + IN struct vnode *vp; +}; + # The VOPs below are spares at the end of the table to allow new VOPs to be # added in stable branches without breaking the KBI. New VOPs in HEAD should # be added above these spares. When merging a new VOP to a stable branch, Modified: head/sys/sys/vnode.h ============================================================================== --- head/sys/sys/vnode.h Wed Feb 29 21:11:02 2012 (r232316) +++ head/sys/sys/vnode.h Wed Feb 29 21:38:31 2012 (r232317) @@ -703,6 +703,9 @@ int vop_stdpathconf(struct vop_pathconf_ int vop_stdpoll(struct vop_poll_args *); int vop_stdvptocnp(struct vop_vptocnp_args *ap); int vop_stdvptofh(struct vop_vptofh_args *ap); +int vop_stdunp_bind(struct vop_unp_bind_args *ap); +int vop_stdunp_connect(struct vop_unp_connect_args *ap); +int vop_stdunp_detach(struct vop_unp_detach_args *ap); int vop_eopnotsupp(struct vop_generic_args *ap); int vop_ebadf(struct vop_generic_args *ap); int vop_einval(struct vop_generic_args *ap); _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Author: trociny Date: Tue Apr 24 19:08:40 2012 New Revision: 234660 URL: http://svn.freebsd.org/changeset/base/234660 Log: MFC r232317: Introduce VOP_UNP_BIND(), VOP_UNP_CONNECT(), and VOP_UNP_DETACH() operations for setting and accessing vnode's v_socket field. The operations are necessary to implement proper unix socket handling on layered file systems like nullfs(5). This change fixes the long standing issue with nullfs(5) being in that unix sockets did not work between lower and upper layers: if we bound to a socket on the lower layer we could connect only to the lower path; if we bound to the upper layer we could connect only to the upper path. The new behavior is one can connect to both the lower and the upper paths regardless what layer path one binds to. PR: kern/51583, kern/159663 Suggested by: kib Reviewed by: arch Modified: stable/9/UPDATING (contents, props changed) stable/9/sys/kern/uipc_usrreq.c stable/9/sys/kern/vfs_default.c stable/9/sys/kern/vnode_if.src stable/9/sys/sys/vnode.h Directory Properties: stable/9/sys/ (props changed) Modified: stable/9/UPDATING ============================================================================== --- stable/9/UPDATING Tue Apr 24 19:00:42 2012 (r234659) +++ stable/9/UPDATING Tue Apr 24 19:08:40 2012 (r234660) @@ -9,6 +9,14 @@ handbook. Items affecting the ports and packages system can be found in /usr/ports/UPDATING. Please read that file before running portupgrade. +20120422: + Now unix domain sockets behave "as expected" on nullfs(5). Previously + nullfs(5) did not pass through all behaviours to the underlying layer, + as a result if we bound to a socket on the lower layer we could connect + only to the lower path; if we bound to the upper layer we could connect + only to the upper path. The new behavior is one can connect to both the + lower and the upper paths regardless what layer path one binds to. + 20120109: The acpi_wmi(4) status device /dev/wmistat has been renamed to /dev/wmistat0. Modified: stable/9/sys/kern/uipc_usrreq.c ============================================================================== --- stable/9/sys/kern/uipc_usrreq.c Tue Apr 24 19:00:42 2012 (r234659) +++ stable/9/sys/kern/uipc_usrreq.c Tue Apr 24 19:08:40 2012 (r234660) @@ -541,7 +541,7 @@ restart: UNP_LINK_WLOCK(); UNP_PCB_LOCK(unp); - vp->v_socket = unp->unp_socket; + VOP_UNP_BIND(vp, unp->unp_socket); unp->unp_vnode = vp; unp->unp_addr = soun; unp->unp_flags &= ~UNP_BINDING; @@ -637,7 +637,7 @@ uipc_detach(struct socket *so) * XXXRW: Should assert vp->v_socket == so. */ if ((vp = unp->unp_vnode) != NULL) { - unp->unp_vnode->v_socket = NULL; + VOP_UNP_DETACH(vp); unp->unp_vnode = NULL; } unp2 = unp->unp_conn; @@ -1307,7 +1307,7 @@ unp_connect(struct socket *so, struct so * and to protect simultaneous locking of multiple pcbs. */ UNP_LINK_WLOCK(); - so2 = vp->v_socket; + VOP_UNP_CONNECT(vp, &so2); if (so2 == NULL) { error = ECONNREFUSED; goto bad2; @@ -2317,17 +2317,15 @@ vfs_unp_reclaim(struct vnode *vp) active = 0; UNP_LINK_WLOCK(); - so = vp->v_socket; + VOP_UNP_CONNECT(vp, &so); if (so == NULL) goto done; unp = sotounpcb(so); if (unp == NULL) goto done; UNP_PCB_LOCK(unp); - if (unp->unp_vnode != NULL) { - KASSERT(unp->unp_vnode == vp, - ("vfs_unp_reclaim: vp != unp->unp_vnode")); - vp->v_socket = NULL; + if (unp->unp_vnode == vp) { + VOP_UNP_DETACH(vp); unp->unp_vnode = NULL; active = 1; } Modified: stable/9/sys/kern/vfs_default.c ============================================================================== --- stable/9/sys/kern/vfs_default.c Tue Apr 24 19:00:42 2012 (r234659) +++ stable/9/sys/kern/vfs_default.c Tue Apr 24 19:08:40 2012 (r234660) @@ -123,6 +123,9 @@ struct vop_vector default_vnodeops = { .vop_unlock = vop_stdunlock, .vop_vptocnp = vop_stdvptocnp, .vop_vptofh = vop_stdvptofh, + .vop_unp_bind = vop_stdunp_bind, + .vop_unp_connect = vop_stdunp_connect, + .vop_unp_detach = vop_stdunp_detach, }; /* @@ -1037,6 +1040,30 @@ vop_stdadvise(struct vop_advise_args *ap return (error); } +int +vop_stdunp_bind(struct vop_unp_bind_args *ap) +{ + + ap->a_vp->v_socket = ap->a_socket; + return (0); +} + +int +vop_stdunp_connect(struct vop_unp_connect_args *ap) +{ + + *ap->a_socket = ap->a_vp->v_socket; + return (0); +} + +int +vop_stdunp_detach(struct vop_unp_detach_args *ap) +{ + + ap->a_vp->v_socket = NULL; + return (0); +} + /* * vfs default ops * used to fill the vfs function table to get reasonable default return values. Modified: stable/9/sys/kern/vnode_if.src ============================================================================== --- stable/9/sys/kern/vnode_if.src Tue Apr 24 19:00:42 2012 (r234659) +++ stable/9/sys/kern/vnode_if.src Tue Apr 24 19:08:40 2012 (r234660) @@ -640,23 +640,31 @@ vop_advise { IN int advice; }; -# The VOPs below are spares at the end of the table to allow new VOPs to be -# added in stable branches without breaking the KBI. New VOPs in HEAD should -# be added above these spares. When merging a new VOP to a stable branch, -# the new VOP should replace one of the spares. +%% unp_bind vp E E E -vop_spare1 { +vop_unp_bind { IN struct vnode *vp; + IN struct socket *socket; }; -vop_spare2 { +%% unp_connect vp L L L + +vop_unp_connect { IN struct vnode *vp; + OUT struct socket **socket; }; -vop_spare3 { +%% unp_detach vp = = = + +vop_unp_detach { IN struct vnode *vp; }; +# The VOPs below are spares at the end of the table to allow new VOPs to be +# added in stable branches without breaking the KBI. New VOPs in HEAD should +# be added above these spares. When merging a new VOP to a stable branch, +# the new VOP should replace one of the spares. + vop_spare4 { IN struct vnode *vp; }; Modified: stable/9/sys/sys/vnode.h ============================================================================== --- stable/9/sys/sys/vnode.h Tue Apr 24 19:00:42 2012 (r234659) +++ stable/9/sys/sys/vnode.h Tue Apr 24 19:08:40 2012 (r234660) @@ -707,6 +707,9 @@ int vop_stdpathconf(struct vop_pathconf_ int vop_stdpoll(struct vop_poll_args *); int vop_stdvptocnp(struct vop_vptocnp_args *ap); int vop_stdvptofh(struct vop_vptofh_args *ap); +int vop_stdunp_bind(struct vop_unp_bind_args *ap); +int vop_stdunp_connect(struct vop_unp_connect_args *ap); +int vop_stdunp_detach(struct vop_unp_detach_args *ap); int vop_eopnotsupp(struct vop_generic_args *ap); int vop_ebadf(struct vop_generic_args *ap); int vop_einval(struct vop_generic_args *ap); _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
State Changed From-To: open->closed The fix merged to stable/9.