Bug 127213 - [tmpfs] [patch] sendfile on tmpfs data corruption
Summary: [tmpfs] [patch] sendfile on tmpfs data corruption
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 7.1-PRERELEASE
Hardware: Any Any
: Normal Affects Only Me
Assignee: Xin LI
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-09-08 18:50 UTC by citrin
Modified: 2011-02-19 13:40 UTC (History)
0 users

See Also:


Attachments
tmpfs-sendfile.patch.txt (3.57 KB, text/plain; charset=utf-8)
2009-10-02 21:48 UTC, Gleb Kurtsou
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description citrin 2008-09-08 18:50:02 UTC
sendfile(2) currupts data for files on tmpfs

Fix: 

Not known.
How-To-Repeat: 1. Create tmpfs and write test file

# mount -t tmpfs tmpfs /spool/tmpfs
# echo 'this is test file' > /spool/tmpfs/tt
# cat /spool/tmpfs/tt
this is test file

2. Transfer this file using sendfile. E. g. using web server from port
www/nginx

# cat nginx.conf
events {
    worker_connections  1024;
}

http {
    default_type  application/octet-stream;
    access_log off;
    sendfile   on;

    server {
        listen       8081;
        location / {
            root   /spool/tmpfs;
        }
    }
}

# nginx -c nginx.conf

Content of fetched file will be different garbage on each request:

# fetch -qo - http://127.0.0.1:8081/tt | hd
00000000  d4 98 7a 00 08 00 00 00  dc 98 7a 00 08 00 00 00  |Ôz.....Üz.....|
00000010  e4 98                                             |ä|
00000012

# fetch -qo - http://127.0.0.1:8081/tt | md5
ff035bff2dcf972ee7dfd023455997ef
# fetch -qo - http://127.0.0.1:8081/tt | md5
bb07cd1a0ede3c7af24e4a86a14b0493
# fetch -qo - http://127.0.0.1:8081/tt | md5
e84539257c2650da866a186435acc93d

With sendfile disbled in nginx.conf file is not corrupted, but we need
sendfile support for other application (where only sendfile supported).
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2008-09-08 23:26:04 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
Comment 2 Nate Eldredge 2008-10-06 07:20:42 UTC
Hi,

I investigated this a bit.

First, note that this bug has some security implications, because it 
appears that the garbage written by sendfile is kernel memory contents, 
which could contain something sensitive.  It is sufficient for an attacker 
to have read access to a file on a mounted tmpfs.  So it should really get 
fixed.

I'm not terribly familiar with vfs or vm internals, but it appears that 
sendfile causes VOP_READ to be called with the IO_VMIO flag and a dummy 
uio.  tmpfs_read (in sys/fs/tmpfs/tmpfs_vnops.c) doesn't handle this 
correctly; it always just copies the data to the supplied uio, which in 
this case does nothing.  It looks like the data is supposed to make it 
into vn->v_object, and tmpfs_read doesn't do that.  (If I understand it 
correctly, on a normal filesystem this is taken care of by bread().)

I am not sure what the correct semantics of IO_VMIO are supposed to be, so 
I don't know what the correct fix would be.  However, a quick fix is to 
not have a v_object at all; remove the call to vnode_create_vobject in 
tmpfs_open.  This seems to be legal since procfs, etc, work that way.  It 
does however mean that sendfile doesn't work at all.

I am curious what was the point of having a v_object in the first place, 
since the data is already in virtual memory.  Unless the goal was just to 
make sendfile work, which evidently wasn't successful.

Incidentally, to the initial reporter, what application do you have that 
requires sendfile?  In my experience, most things will fall back to a 
read/write loop if sendfile fails, since sendfile isn't available on all 
systems or under all circumstances.  So if you apply the quick fix so that 
sendfile always fails, it might at least get your application working 
again.

-- 

Nate Eldredge
neldredge@math.ucsd.edu
Comment 3 Maxim Konovalov 2008-10-06 08:36:38 UTC
Hello,

On Mon, 6 Oct 2008, 06:40-0000, Nate Eldredge wrote:

[...]
>  Incidentally, to the initial reporter, what application do you have
>  that requires sendfile?  In my experience, most things will fall
>  back to a read/write loop if sendfile fails, since sendfile isn't
>  available on all systems or under all circumstances.  So if you
>  apply the quick fix so that sendfile always fails, it might at
>  least get your application working again.
>
As stated in the PR Andrey used nginx (ports/www/nginx).  But I could
easily reproduce the bug with the stock ftpd(8) with the ftproot on
tmpfs.

-- 
Maxim Konovalov
Comment 4 citrin 2008-10-06 22:38:50 UTC
 > Incidentally, to the initial reporter, what application do you have that
 > requires sendfile?

We want to use tmpfs with our homegrown application, which can work only using sendfile().
Currently we use md+ufs, but with md data present in memory twice - in md and in VM cache.

-- 
  Anton Yuzhaninov
Comment 5 Nate Eldredge 2008-10-06 23:22:57 UTC
On Mon, 6 Oct 2008, Maxim Konovalov wrote:

> Hello,
>
> On Mon, 6 Oct 2008, 06:40-0000, Nate Eldredge wrote:
>
> [...]
>>  Incidentally, to the initial reporter, what application do you have
>>  that requires sendfile?  In my experience, most things will fall
>>  back to a read/write loop if sendfile fails, since sendfile isn't
>>  available on all systems or under all circumstances.  So if you
>>  apply the quick fix so that sendfile always fails, it might at
>>  least get your application working again.
>>
> As stated in the PR Andrey used nginx (ports/www/nginx).  But I could
> easily reproduce the bug with the stock ftpd(8) with the ftproot on
> tmpfs.

To simplify matters further, here is the testcase I used when testing 
this, which uses sendfile to send some data over a unix domain socket. 
Do:

./server /tmpfs/data mysocket &
./client mysocket >data.out
cmp /tmpfs/data data.out

If things work right, data and data.out should be identical.  But if data 
is a file on a tmpfs, data.out contains apparently random kernel memory 
contents.

# This is a shell archive.  Save it in a file, remove anything before
# this line, and then unpack it by entering "sh file".  Note, it may
# create directories; files and directories will be owned by you and
# have default permissions.
#
# This archive contains:
#
#	Makefile
#	client.c
#	server.c
#	util.c
#	util.h
#
echo x - Makefile
sed 's/^X//' >Makefile << 'END-of-Makefile'
XCC = gcc
XCFLAGS = -Wall -W -g
X
Xall : server client
X
Xserver : server.o util.o
X	$(CC) -o $@ $>
X
Xclient : client.o util.o
X	$(CC) -o $@ $>
X
Xserver.o client.o util.o : util.h
X
Xclean :
X	rm -f server client *.o
END-of-Makefile
echo x - client.c
sed 's/^X//' >client.c << 'END-of-client.c'
X#include <stdio.h>
X#include <fcntl.h>
X#include <unistd.h>
X#include <stdlib.h>
X#include "util.h"
X
Xint main(int argc, char *argv[]) {
X	int s;
X	if (argc < 2) {
X		fprintf(stderr, "Usage: %s socketname\n", argv[0]);
X		exit(1);
X	}
X	if ((s = connect_unix_socket(argv[1])) < 0) {
X		exit(1);
X	}
X	fake_sendfile(s, 1);
X	return 0;
X}
X
X
X 
END-of-client.c
echo x - server.c
sed 's/^X//' >server.c << 'END-of-server.c'
X#include <stdio.h>
X#include <fcntl.h>
X#include <unistd.h>
X#include <stdlib.h>
X#include "util.h"
X
Xint main(int argc, char *argv[]) {
X	int f, listener, connection;
X	if (argc < 3) {
X		fprintf(stderr, "Usage: %s filename socketname\n", argv[0]);
X		exit(1);
X	}
X	if ((f = open(argv[1], O_RDONLY)) < 0) {
X		perror(argv[1]);
X		exit(1);
X	}
X	if ((listener = listen_unix_socket(argv[2])) < 0) {
X		exit(1);
X	}
X	if ((connection = accept_unix_socket(listener)) >= 0) {
X		real_sendfile(f, connection);
X	}
X	return 0;
X}
X
X
X 
END-of-server.c
echo x - util.c
sed 's/^X//' >util.c << 'END-of-util.c'
X/* send data from file to unix domain socket */
X
X#include <stdio.h>
X#include <time.h>
X#include <signal.h>
X#include <errno.h>
X#include <sys/types.h>
X#include <sys/socket.h>
X#include <sys/un.h>
X#include <string.h>
X#include <stdlib.h>
X#include <unistd.h>
X
Xint create_unix_socket(void) {
X	int fd;
X	if ((fd = socket(PF_LOCAL, SOCK_STREAM, 0)) < 0) {
X		perror("socket");
X		return -1;
X	}
X	return fd;
X}
X
Xint make_unix_sockaddr(const char *pathname, struct sockaddr_un *sa) {
X	memset(sa, 0, sizeof(*sa));
X	sa->sun_family = PF_LOCAL;
X	if (strlen(pathname) + 1 > sizeof(sa->sun_path)) {
X		fprintf(stderr, "%s: pathname too long (max %lu)\n",
X			pathname, sizeof(sa->sun_path));
X		errno = ENAMETOOLONG;
X		return -1;
X	}
X	strcpy(sa->sun_path, pathname);
X	return 0;
X}
X
Xstatic char *sockname;
Xvoid delete_socket(void) {
X	unlink(sockname);
X}
X
Xint listen_unix_socket(const char *path) {
X	int fd;
X	struct sockaddr_un sa;
X	if (make_unix_sockaddr(path, &sa) < 0)
X		return -1;
X	if ((fd = create_unix_socket()) < 0)
X		return -1;
X	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
X		perror("bind");
X		close(fd);
X		return -1;
X	}
X	sockname = strdup(path);
X	atexit(delete_socket);
X
X	if (listen(fd, 5) < 0) {
X		perror("listen");
X		close(fd);
X		return -1;
X	}
X	return fd;
X}
X
Xint accept_unix_socket(int fd) {
X	int s;
X	if ((s = accept(fd, NULL, 0)) < 0) {
X		perror("accept");
X		return -1;
X	}
X	return s;
X}
X
Xint connect_unix_socket(const char *path) {
X	int fd;
X	struct sockaddr_un sa;
X	if (make_unix_sockaddr(path, &sa) < 0)
X		return -1;
X	if ((fd = create_unix_socket()) < 0)
X		return -1;
X	if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
X		perror("connect");
X		return -1;
X	}
X	return fd;
X}
X
X#define BUFSIZE 65536
X
Xint fake_sendfile(int from, int to) {
X	char buf[BUFSIZE];
X	int v;
X	int sent = 0;
X	while ((v = read(from, buf, BUFSIZE)) > 0) {
X		int d = 0;
X		while (d < v) {
X			int w = write(to, buf, v - d);
X			if (w <= 0) {
X				perror("write");
X				return -1;
X			}
X			d += w;
X			sent += w;
X		}
X	}
X	if (v != 0) {
X		perror("read");
X		return -1;
X	}
X	return sent;
X}
X
Xint real_sendfile(int from, int to) {
X	int v;
X	v = sendfile(from, to, 0, 0, NULL, NULL, 0);
X	if (v < 0) {
X		perror("sendfile");
X	}
X	return v;
X}
X 
X 
END-of-util.c
echo x - util.h
sed 's/^X//' >util.h << 'END-of-util.h'
X/* send data from file to unix domain socket */
X
X#include <stdio.h>
X#include <time.h>
X#include <signal.h>
X#include <errno.h>
X#include <sys/types.h>
X#include <sys/socket.h>
X#include <sys/un.h>
X
Xint create_unix_socket(void);
Xint make_unix_sockaddr(const char *pathname, struct sockaddr_un *sa);
Xint listen_unix_socket(const char *path);
Xint accept_unix_socket(int fd);
Xint connect_unix_socket(const char *path);
Xint fake_sendfile(int from, int to);
Xint real_sendfile(int from, int to);
X 
X 
END-of-util.h
exit


-- 

Nate Eldredge
neldredge@math.ucsd.edu
Comment 6 Maxim Konovalov 2008-10-07 04:43:17 UTC
On Mon, 6 Oct 2008, 15:22-0700, Nate Eldredge wrote:

> On Mon, 6 Oct 2008, Maxim Konovalov wrote:
>
> > Hello,
> >
> > On Mon, 6 Oct 2008, 06:40-0000, Nate Eldredge wrote:
> >
> > [...]
> > >  Incidentally, to the initial reporter, what application do you have
> > >  that requires sendfile?  In my experience, most things will fall
> > >  back to a read/write loop if sendfile fails, since sendfile isn't
> > >  available on all systems or under all circumstances.  So if you
> > >  apply the quick fix so that sendfile always fails, it might at
> > >  least get your application working again.
> > >
> > As stated in the PR Andrey used nginx (ports/www/nginx).  But I could
> > easily reproduce the bug with the stock ftpd(8) with the ftproot on
> > tmpfs.
>
> To simplify matters further, here is the testcase I used when
> testing this, which uses sendfile to send some data over a unix
> domain socket. Do:
>
> ./server /tmpfs/data mysocket &
> ./client mysocket >data.out
> cmp /tmpfs/data data.out
>
> If things work right, data and data.out should be identical.  But if
> data is a file on a tmpfs, data.out contains apparently random
> kernel memory contents.
>
Hi Nate,

It'd be really nice if you extend
src/tools/regression/sockets/sendfile regression test for this bug.
Now it doesn't detect this case.

-- 
Maxim Konovalov
Comment 7 Marc Olzheim 2009-07-29 09:51:40 UTC
This is not just a bug, but a security hole. What you are seeing is the
content of other VM pages. This has exactly the same symptoms as
http://security.freebsd.org/advisories/FreeBSD-SA-05:02.sendfile.asc

In short: do not use tmpfs unless you trust all users on the system with
root access.
Comment 8 Rebecca Cran freebsd_committer freebsd_triage 2009-09-07 03:41:11 UTC
I've submitted the PR kern/138465 which contains a patch to
tools/regression/sockets/sendfile which adds an MD5 field to the header
structure; when all the data has been received the MD5 is recalculated
and the test fails if they differ.

As expected it fails when /tmp is a tmpfs filesystem.

-- 
Bruce
Comment 9 Gleb Kurtsou 2009-10-02 21:48:29 UTC
[ resending it with changed subject.
looks like it was ignored by bug-followup@ ]

Try the patch attached, it fixes the bug for me.
The same workaround is used by zfs.
Comment 10 dfilter service freebsd_committer freebsd_triage 2009-10-08 00:17:24 UTC
Author: delphij
Date: Wed Oct  7 23:17:15 2009
New Revision: 197850
URL: http://svn.freebsd.org/changeset/base/197850

Log:
  Add a special workaround to handle UIO_NOCOPY case.  This fixes data
  corruption observed when sendfile() is being used.
  
  PR:		kern/127213
  Submitted by:	gk
  MFC after:	2 weeks

Modified:
  head/sys/fs/tmpfs/tmpfs_vnops.c

Modified: head/sys/fs/tmpfs/tmpfs_vnops.c
==============================================================================
--- head/sys/fs/tmpfs/tmpfs_vnops.c	Wed Oct  7 23:01:31 2009	(r197849)
+++ head/sys/fs/tmpfs/tmpfs_vnops.c	Wed Oct  7 23:17:15 2009	(r197850)
@@ -43,6 +43,8 @@ __FBSDID("$FreeBSD$");
 #include <sys/priv.h>
 #include <sys/proc.h>
 #include <sys/resourcevar.h>
+#include <sys/sched.h>
+#include <sys/sf_buf.h>
 #include <sys/stat.h>
 #include <sys/systm.h>
 #include <sys/unistd.h>
@@ -428,15 +430,72 @@ tmpfs_setattr(struct vop_setattr_args *v
 }
 
 /* --------------------------------------------------------------------- */
+static int
+tmpfs_nocacheread(vm_object_t tobj, vm_pindex_t idx,
+    vm_offset_t offset, size_t tlen, struct uio *uio)
+{
+	vm_page_t	m;
+	int		error;
+
+	VM_OBJECT_LOCK(tobj);
+	vm_object_pip_add(tobj, 1);
+	m = vm_page_grab(tobj, idx, VM_ALLOC_WIRED |
+	    VM_ALLOC_ZERO | VM_ALLOC_NORMAL | VM_ALLOC_RETRY);
+	if (m->valid != VM_PAGE_BITS_ALL) {
+		if (vm_pager_has_page(tobj, idx, NULL, NULL)) {
+			error = vm_pager_get_pages(tobj, &m, 1, 0);
+			if (error != 0) {
+				printf("tmpfs get pages from pager error [read]\n");
+				goto out;
+			}
+		} else
+			vm_page_zero_invalid(m, TRUE);
+	}
+	VM_OBJECT_UNLOCK(tobj);
+	error = uiomove_fromphys(&m, offset, tlen, uio);
+	VM_OBJECT_LOCK(tobj);
+out:
+	vm_page_lock_queues();
+	vm_page_unwire(m, TRUE);
+	vm_page_unlock_queues();
+	vm_page_wakeup(m);
+	vm_object_pip_subtract(tobj, 1);
+	VM_OBJECT_UNLOCK(tobj);
+
+	return (error);
+}
+
+static __inline int
+tmpfs_nocacheread_buf(vm_object_t tobj, vm_pindex_t idx,
+    vm_offset_t offset, size_t tlen, void *buf)
+{
+	struct uio uio;
+	struct iovec iov;
+
+	uio.uio_iovcnt = 1;
+	uio.uio_iov = &iov;
+	iov.iov_base = buf;
+	iov.iov_len = tlen;
+
+	uio.uio_offset = 0;
+	uio.uio_resid = tlen;
+	uio.uio_rw = UIO_READ;
+	uio.uio_segflg = UIO_SYSSPACE;
+	uio.uio_td = curthread;
+
+	return (tmpfs_nocacheread(tobj, idx, offset, tlen, &uio));
+}
 
 static int
 tmpfs_mappedread(vm_object_t vobj, vm_object_t tobj, size_t len, struct uio *uio)
 {
+	struct sf_buf	*sf;
 	vm_pindex_t	idx;
 	vm_page_t	m;
 	vm_offset_t	offset;
 	off_t		addr;
 	size_t		tlen;
+	char		*ma;
 	int		error;
 
 	addr = uio->uio_offset;
@@ -461,33 +520,30 @@ lookupvpg:
 		vm_page_wakeup(m);
 		VM_OBJECT_UNLOCK(vobj);
 		return	(error);
+	} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
+		if (vm_page_sleep_if_busy(m, FALSE, "tmfsmr"))
+			goto lookupvpg;
+		vm_page_busy(m);
+		VM_OBJECT_UNLOCK(vobj);
+		sched_pin();
+		sf = sf_buf_alloc(m, SFB_CPUPRIVATE);
+		ma = (char *)sf_buf_kva(sf);
+		error = tmpfs_nocacheread_buf(tobj, idx, offset, tlen,
+		    ma + offset);
+		if (error == 0) {
+			uio->uio_offset += tlen;
+			uio->uio_resid -= tlen;
+		}
+		sf_buf_free(sf);
+		sched_unpin();
+		VM_OBJECT_LOCK(vobj);
+		vm_page_wakeup(m);
+		VM_OBJECT_UNLOCK(vobj);
+		return	(error);
 	}
 	VM_OBJECT_UNLOCK(vobj);
 nocache:
-	VM_OBJECT_LOCK(tobj);
-	vm_object_pip_add(tobj, 1);
-	m = vm_page_grab(tobj, idx, VM_ALLOC_WIRED |
-	    VM_ALLOC_ZERO | VM_ALLOC_NORMAL | VM_ALLOC_RETRY);
-	if (m->valid != VM_PAGE_BITS_ALL) {
-		if (vm_pager_has_page(tobj, idx, NULL, NULL)) {
-			error = vm_pager_get_pages(tobj, &m, 1, 0);
-			if (error != 0) {
-				printf("tmpfs get pages from pager error [read]\n");
-				goto out;
-			}
-		} else
-			vm_page_zero_invalid(m, TRUE);
-	}
-	VM_OBJECT_UNLOCK(tobj);
-	error = uiomove_fromphys(&m, offset, tlen, uio);
-	VM_OBJECT_LOCK(tobj);
-out:
-	vm_page_lock_queues();
-	vm_page_unwire(m, TRUE);
-	vm_page_unlock_queues();
-	vm_page_wakeup(m);
-	vm_object_pip_subtract(tobj, 1);
-	VM_OBJECT_UNLOCK(tobj);
+	error = tmpfs_nocacheread(tobj, idx, offset, tlen, uio);
 
 	return	(error);
 }
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 11 Xin LI freebsd_committer freebsd_triage 2009-10-08 00:17:33 UTC
State Changed
From-To: open->patched

Patch from gk@ has been applied against -HEAD, MFC reminder. 


Comment 12 Xin LI freebsd_committer freebsd_triage 2009-10-08 00:17:33 UTC
Responsible Changed
From-To: freebsd-fs->delphij

Take.
Comment 13 Andriy Gapon freebsd_committer freebsd_triage 2010-09-15 11:26:05 UTC
r197850 has never been MFC-ed?!

-- 
Andriy Gapon
Comment 14 dfilter service freebsd_committer freebsd_triage 2010-09-24 18:27:04 UTC
Author: delphij
Date: Fri Sep 24 17:26:57 2010
New Revision: 213106
URL: http://svn.freebsd.org/changeset/base/213106

Log:
  MFC r197850:
  
  Add a special workaround to handle UIO_NOCOPY case.  This fixes data
  corruption observed when sendfile() is being used.
  
  Requested by:	avg
  PR:		kern/127213
  Submitted by:	gk

Modified:
  stable/8/sys/fs/tmpfs/tmpfs_vnops.c
Directory Properties:
  stable/8/sys/   (props changed)
  stable/8/sys/amd64/include/xen/   (props changed)
  stable/8/sys/cddl/contrib/opensolaris/   (props changed)
  stable/8/sys/contrib/dev/acpica/   (props changed)
  stable/8/sys/contrib/pf/   (props changed)
  stable/8/sys/dev/xen/xenpci/   (props changed)

Modified: stable/8/sys/fs/tmpfs/tmpfs_vnops.c
==============================================================================
--- stable/8/sys/fs/tmpfs/tmpfs_vnops.c	Fri Sep 24 16:40:46 2010	(r213105)
+++ stable/8/sys/fs/tmpfs/tmpfs_vnops.c	Fri Sep 24 17:26:57 2010	(r213106)
@@ -43,6 +43,8 @@ __FBSDID("$FreeBSD$");
 #include <sys/priv.h>
 #include <sys/proc.h>
 #include <sys/resourcevar.h>
+#include <sys/sched.h>
+#include <sys/sf_buf.h>
 #include <sys/stat.h>
 #include <sys/systm.h>
 #include <sys/unistd.h>
@@ -433,15 +435,72 @@ tmpfs_setattr(struct vop_setattr_args *v
 }
 
 /* --------------------------------------------------------------------- */
+static int
+tmpfs_nocacheread(vm_object_t tobj, vm_pindex_t idx,
+    vm_offset_t offset, size_t tlen, struct uio *uio)
+{
+	vm_page_t	m;
+	int		error;
+
+	VM_OBJECT_LOCK(tobj);
+	vm_object_pip_add(tobj, 1);
+	m = vm_page_grab(tobj, idx, VM_ALLOC_WIRED |
+	    VM_ALLOC_ZERO | VM_ALLOC_NORMAL | VM_ALLOC_RETRY);
+	if (m->valid != VM_PAGE_BITS_ALL) {
+		if (vm_pager_has_page(tobj, idx, NULL, NULL)) {
+			error = vm_pager_get_pages(tobj, &m, 1, 0);
+			if (error != 0) {
+				printf("tmpfs get pages from pager error [read]\n");
+				goto out;
+			}
+		} else
+			vm_page_zero_invalid(m, TRUE);
+	}
+	VM_OBJECT_UNLOCK(tobj);
+	error = uiomove_fromphys(&m, offset, tlen, uio);
+	VM_OBJECT_LOCK(tobj);
+out:
+	vm_page_lock_queues();
+	vm_page_unwire(m, TRUE);
+	vm_page_unlock_queues();
+	vm_page_wakeup(m);
+	vm_object_pip_subtract(tobj, 1);
+	VM_OBJECT_UNLOCK(tobj);
+
+	return (error);
+}
+
+static __inline int
+tmpfs_nocacheread_buf(vm_object_t tobj, vm_pindex_t idx,
+    vm_offset_t offset, size_t tlen, void *buf)
+{
+	struct uio uio;
+	struct iovec iov;
+
+	uio.uio_iovcnt = 1;
+	uio.uio_iov = &iov;
+	iov.iov_base = buf;
+	iov.iov_len = tlen;
+
+	uio.uio_offset = 0;
+	uio.uio_resid = tlen;
+	uio.uio_rw = UIO_READ;
+	uio.uio_segflg = UIO_SYSSPACE;
+	uio.uio_td = curthread;
+
+	return (tmpfs_nocacheread(tobj, idx, offset, tlen, &uio));
+}
 
 static int
 tmpfs_mappedread(vm_object_t vobj, vm_object_t tobj, size_t len, struct uio *uio)
 {
+	struct sf_buf	*sf;
 	vm_pindex_t	idx;
 	vm_page_t	m;
 	vm_offset_t	offset;
 	off_t		addr;
 	size_t		tlen;
+	char		*ma;
 	int		error;
 
 	addr = uio->uio_offset;
@@ -465,33 +524,30 @@ lookupvpg:
 		vm_page_wakeup(m);
 		VM_OBJECT_UNLOCK(vobj);
 		return	(error);
+	} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
+		if (vm_page_sleep_if_busy(m, FALSE, "tmfsmr"))
+			goto lookupvpg;
+		vm_page_busy(m);
+		VM_OBJECT_UNLOCK(vobj);
+		sched_pin();
+		sf = sf_buf_alloc(m, SFB_CPUPRIVATE);
+		ma = (char *)sf_buf_kva(sf);
+		error = tmpfs_nocacheread_buf(tobj, idx, offset, tlen,
+		    ma + offset);
+		if (error == 0) {
+			uio->uio_offset += tlen;
+			uio->uio_resid -= tlen;
+		}
+		sf_buf_free(sf);
+		sched_unpin();
+		VM_OBJECT_LOCK(vobj);
+		vm_page_wakeup(m);
+		VM_OBJECT_UNLOCK(vobj);
+		return	(error);
 	}
 	VM_OBJECT_UNLOCK(vobj);
 nocache:
-	VM_OBJECT_LOCK(tobj);
-	vm_object_pip_add(tobj, 1);
-	m = vm_page_grab(tobj, idx, VM_ALLOC_WIRED |
-	    VM_ALLOC_ZERO | VM_ALLOC_NORMAL | VM_ALLOC_RETRY);
-	if (m->valid != VM_PAGE_BITS_ALL) {
-		if (vm_pager_has_page(tobj, idx, NULL, NULL)) {
-			error = vm_pager_get_pages(tobj, &m, 1, 0);
-			if (error != 0) {
-				printf("tmpfs get pages from pager error [read]\n");
-				goto out;
-			}
-		} else
-			vm_page_zero_invalid(m, TRUE);
-	}
-	VM_OBJECT_UNLOCK(tobj);
-	error = uiomove_fromphys(&m, offset, tlen, uio);
-	VM_OBJECT_LOCK(tobj);
-out:
-	vm_page_lock_queues();
-	vm_page_unwire(m, TRUE);
-	vm_page_unlock_queues();
-	vm_page_wakeup(m);
-	vm_object_pip_subtract(tobj, 1);
-	VM_OBJECT_UNLOCK(tobj);
+	error = tmpfs_nocacheread(tobj, idx, offset, tlen, uio);
 
 	return	(error);
 }
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
Comment 15 Xin LI freebsd_committer freebsd_triage 2011-02-19 13:39:53 UTC
State Changed
From-To: patched->closed

Patch MFC'ed as 213106.