Created attachment 179941 [details] Test for the bug Hello. I've noticed that you can execute a code from ZFS filesystem with exec property set to off with mmap'ing it with PROT_READ | PROT_EXEC protection. If this is not done at mmap time, further calls to mprotect trying to set PROT_EXEC protection will fail. For example, I cannot launch SBCL with core image in my no-exec home directory (because it calls mprotect), but I can execute any code from shared libraries in home directory if I open it with dlopen(). It seems to me as a bug, so I report. I attach a working example to illustrate a problem. It will require some tuning. In the attached archive compile lib.c as a shared library. Then place it in no-exec ZFS filesystem. Then compile test-working.c and test-bug.c, replacing hard-coded values in calls to open() and mmap() (filename and address of return_value. You can get the address with objdump -T). Place binaries somewhere you can execute them. Execute test-working at first. It will fail to work if library is in no-exec FS. Then execute test-bug. It will return the value 4 from the library no matter where it is stored. Even if it is on no-exec FS, it will be executed.
I must add that this is not ZFS specific as it seemed at first, rather vm specific. It's all the same in any file system mounted with -o noexec. You can use libraries with dlopen() or mmap files with PROT_READ | PROT_EXEC
-o noexec only means that execve(2) with a binary living on that filesystem is disallowed, -o noexec does not disable mmaping with PROT_EXEC. So that part of the bug report is not a bug. I am interested in the other part, where mprotect(PROT_EXEC) fails to you. Can you provide a minimal example which demostrate what you are trying to do and failed syscall ? I want a minimal test program in C.
Created attachment 179980 [details] Minimal test (In reply to Konstantin Belousov from comment #2) Oh, I thought noexec means complete 100% protection from any execution ;) But I have some doubts. What does the following code means in sys/kern/vfs_vnops.c ? /* * Ensure that file and memory protections are * compatible. Note that we only worry about * writability if mapping is shared; in this case, * current and max prot are dictated by the open file. * XXX use the vnode instead? Problem is: what * credentials do we use for determination? What if * proc does a setuid? */ mp = vp->v_mount; if (mp != NULL && (mp->mnt_flag & MNT_NOEXEC) != 0) maxprot = VM_PROT_NONE; else maxprot = VM_PROT_EXECUTE; if ((fp->f_flag & FREAD) != 0) maxprot |= VM_PROT_READ; else if ((prot & VM_PROT_READ) != 0) return (EACCES); The source is vn_map() function in source src/sys/kern/vfs_vnops.c A minimal test as you asked. Compile it and launch with any file on no-exec filesystem as its only argument. It tries to do mmap() and mprotect() and returns results
(In reply to shamaz.mazum from comment #3) The fragment you cited is exactly the cause why mprotect(2) call in your test program fails. mprotect(2) checks that new protection is a subset of the maxprot. That said, my opinion is that disallowing PROT_EXEC for mappings from -o noexec mounts is useless. If you determined, there is nothing which could prevent you from mapping anonymous memory, copying data from the file into it, and then executing. OTOH, I admit that there is inconsistency between mmap(2) and mprotect(2), which was introduced by r127187. The patch I attached fixes that, but I wonder would it be more useful to revert the mentioned revision instead.
Created attachment 179995 [details] For MNT_NOEXEC mounts, disallow PROT_EXEC in prot as well.
(In reply to Konstantin Belousov from comment #4) Thanks for the fix. This discrepancy between mmap and mprotect made me think that there was a bug there. If you are interested, I noticed that, as I said before, trying to launch SBCL (Steel Bank Common Lisp) with a core image stored in my home no-exec file system. It failed to start because it maps core image in memory and then calls mprotect for those mappings. But when I opened a shared library using foreign functions interface, it worked perfectly (I believe dynamic linker uses mmap). > If you determined, there is nothing which could prevent you from mapping anonymous memory, copying data from the file into it, and then executing. Yes, you are right. Too bad for my paranoia ;)
A commit references this bug: Author: kib Date: Sun Feb 19 20:51:05 UTC 2017 New revision: 313967 URL: https://svnweb.freebsd.org/changeset/base/313967 Log: Apply noexec mount option for mmap(PROT_EXEC). Right now the noexec mount option disallows image activators to try execve the files on the mount point. Also, after r127187, noexec also limits max_prot map entries permissions for mappings of files from such mounts, but not the actual mapping permissions. As result, the API behaviour is inconsistent. The files from noexec mount can be mapped with PROT_EXEC, but if mprotect(2) drops execution permission, it cannot be re-enabled later. Make this consistent logically and aligned with behaviour of other systems, by disallowing PROT_EXEC for mmap(2). Note that this change only ensures aligned results from mmap(2) and mprotect(2), it does not prevent actual code execution from files coming from noexec mount. Such files can always be read into anonymous executable memory and executed from there. Reported by: shamaz.mazum@gmail.com PR: 217062 Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week Changes: head/sys/fs/devfs/devfs_vnops.c head/sys/kern/vfs_vnops.c
This seems like pointless consistency and seems likely to break existing (perhaps somewhat nonsensical) configurations. At first glance, it seems better for mprotect() to neglect to honor MNT_NOEXEC since, IMO, MNT_NOEXEC is only really enforceable in the image activator.
(In reply to Brooks Davis from comment #8) I already stated that noexec is only meaningful as a protection against image activator, but another argument for consistency is other OSes behaviour. In particular, I verified that Linux behaves in a way requested by the bug reporter, before I committed the r313967.
(In reply to Konstantin Belousov from comment #9) The fact that linux does this is indeed a good argument for doing the same. Thanks for pointing that out.
A commit references this bug: Author: kib Date: Sun Feb 26 11:02:14 UTC 2017 New revision: 314298 URL: https://svnweb.freebsd.org/changeset/base/314298 Log: MFC r313967: Apply noexec mount option for mmap(PROT_EXEC). PR: 217062 Changes: _U stable/11/ stable/11/sys/fs/devfs/devfs_vnops.c stable/11/sys/kern/vfs_vnops.c
There is a commit referencing this PR, but it's still not closed and has been inactive for some time. Closing the PR as fixed but feel free to re-open it if the issue hasn't been completely resolved. Thanks