I use lightweight, small, cloneable FreeBSD systems built from the basic AWS EC2 AMI, first with 12.0-RELEASE and now rebuilt with 12.2-RELEASE. The idea is to keep the root filesystem minimal while the bulk of the system, the /usr tree, is mounted from EFS (NFSv4). The stock AMI required an 8 GB system disk for 12.0 and now 10 GB for 12.2; if I can get by with 1 GB, I save significantly on redundant EBS volumes and have an easier time distributing application software updates to my worker and server farm. I have described the principle here: https://forums.freebsd.org/threads/amazon-aws-ec2-super-light-configuration.69084/

The problem is that no matter what I do, the system locks up after about 4 hours. It is not a kernel panic, it is not associated with any error, and I cannot see it at all: nothing on the console screenshot, nothing in the logs, and nothing in /var/log/messages after a force-stop and restart. Moreover, the AWS EC2 console status checks still report the instance as healthy. But any attempt to do anything hangs. An SSH connection hangs. telnet to port 22 gets no response at all (not even a connection reset), it just hangs. If I happen to be logged in, whatever shell command I am running just hangs, and I can neither break out with CTRL-C nor background it with CTRL-Z. Just hangs. So I have nothing to show you, but I can provide a setup to anyone who wants to see it, and I would appreciate it if someone would help out here.

Here is my /etc/fstab:

# Custom /etc/fstab for FreeBSD VM images
/dev/gpt/rootfs   /        ufs      rw   1 1
/dev/gpt/varfs    /var     ufs      rw   1 1
fdesc             /dev/fd  fdescfs  rw   0 0
proc              /proc    procfs   rw   0 0
/usr              /.usr    nullfs   rw   0 0
fs-xxxxxx.efs.us-xxxxx-x.amazonaws.com:/  /usr  nfs  rw,nfsv4,minorversion=1,oneopenown,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport,late,bg  0 0

It freezes within around 4 hours even when just idling, and heavy work activity does not seem to accelerate the lock-up. It also does not depend on the instance type; I have tried t3.nano, t3.micro, c5.large, and m5a.large, and it happens on all of them after about the same time.
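For anyone trying to reproduce this, here is a sketch of how I test the EFS mount by hand before committing it to /etc/fstab, with the same options as the fstab entry above (fs-xxxxxx and us-xxxxx-x are placeholders; the fstab-only late,bg options are omitted for an interactive mount):

# Mount the EFS share interactively with the same NFS options as the fstab entry.
mount -t nfs -o rw,nfsv4,minorversion=1,oneopenown,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport \
    fs-xxxxxx.efs.us-xxxxx-x.amazonaws.com:/ /usr

# Sanity check that the mount is in place
df -h /usr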
Note, this is not like the ENA kernel panic, nor the other AWS EC2 freeze-on-t3 bug (apparently fixed in 12.1-RELENG), nor does the presence or absence of the "intsmb0: Could not allocate I/O space" error during boot (open since 2018) make any difference. So this is not a duplicate of those bugs.

sysctl -a |... shows:

kern.features.nfsd: 1
kern.features.nfscl: 1
vfs.nfs.downdelayinitial: 12
vfs.nfs.downdelayinterval: 30
vfs.nfs.defect: 0
vfs.nfs.iodmax: 20
vfs.nfs.iodmin: 0
vfs.nfs.iodmaxidle: 120
vfs.nfs.use_buf_pager: 1
vfs.nfs.fileid_maxwarnings: 10
vfs.nfs.diskless_rootpath:
vfs.nfs.diskless_valid: 0
vfs.nfs.nfs_ip_paranoia: 1
vfs.nfs.nfs_directio_allow_mmap: 1
vfs.nfs.nfs_keep_dirty_on_error: 0
vfs.nfs.nfs_directio_enable: 0
vfs.nfs.clean_pages_on_close: 1
vfs.nfs.commit_on_close: 0
vfs.nfs.prime_access_cache: 0
vfs.nfs.access_cache_timeout: 60
vfs.nfs.dssameconn: 0
vfs.nfs.ignore_eexist: 0
vfs.nfs.pnfsiothreads: -1
vfs.nfs.userhashsize: 100
vfs.nfs.debuglevel: 0
vfs.nfs.callback_addr:
vfs.nfs.realign_count: 0
vfs.nfs.realign_test: 0
vfs.nfs.pnfsmirror: 1
vfs.nfs.enable_uidtostring: 0
vfs.nfs.dsretries: 2
vfs.nfs.skip_wcc_data_onerr: 1
vfs.nfs.nfs3_jukebox_delay: 10
vfs.nfs.reconnects: 0
vfs.nfs.bufpackets: 4

I am searching for ways to do some debug logging. Should I enable these?

debug.fail_point.status_nfscl_force_fileid_warning: off
debug.fail_point.nfscl_force_fileid_warning: off

How might I log NFS warnings? The current setup is the unchanged, out-of-the-box /etc/syslog.conf:

# $FreeBSD: releng/12.2/usr.sbin/syslogd/syslog.conf 338146 2018-08-21 17:01:47Z brd $
#
#       Spaces ARE valid field separators in this file. However,
#       other *nix-like systems still insist on using tabs as field
#       separators. If you are sharing this file between systems, you
#       may want to use only tabs as field separators here.
#       Consult the syslog.conf(5) manpage.
*.err;kern.warning;auth.notice;mail.crit                        /dev/console
*.notice;authpriv.none;kern.debug;lpr.info;mail.crit;news.err   /var/log/messages
security.*                                      /var/log/security
auth.info;authpriv.info                         /var/log/auth.log
mail.info                                       /var/log/maillog
cron.*                                          /var/log/cron
!-devd
*.=debug                                        /var/log/debug.log
*.emerg                                         *
# uncomment this to log all writes to /dev/console to /var/log/console.log
# touch /var/log/console.log and chmod it to mode 600 before it will work
#console.info                                   /var/log/console.log
# uncomment this to enable logging of all log messages to /var/log/all.log
# touch /var/log/all.log and chmod it to mode 600 before it will work
#*.*                                            /var/log/all.log
# uncomment this to enable logging to a remote loghost named loghost
#*.*                                            @loghost
# uncomment these if you're running inn
# news.crit                                     /var/log/news/news.crit
# news.err                                      /var/log/news/news.err
# news.notice                                   /var/log/news/news.notice
# Uncomment this if you wish to see messages produced by devd
# !devd
# *.>=notice                                    /var/log/devd.log
!*
include                                         /etc/syslog.d
include                                         /usr/local/etc/syslog.d

In /var/log/messages you see the reboot without any issues reported:

Nov 1 04:00:13 freebsd ec2[777]: #############################################################
...
Nov 1 05:49:13 freebsd su[1602]: ec2-user to root on /dev/pts/5
Nov 1 08:56:09 freebsd syslogd: exiting on signal 15
Nov 1 08:59:01 freebsd syslogd: kernel boot file is /boot/kernel/kernel
Nov 1 08:59:01 freebsd kernel: ---<<BOOT>>---

I am not sure why syslogd is exiting on signal 15; that might have to do with the lock-up, but more likely it is caused by my force-stop action on the AWS console.
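While I look for a better answer, one crude knob I may try is the vfs.nfs.debuglevel sysctl from the list above (picking 4 as the level is just a guess on my part); roughly:

# Crank up the NFS client's kernel debug printfs (0 = off, the default)
sysctl vfs.nfs.debuglevel=4

# The printfs come out via the kernel log, so watch them arrive here
tail -f /var/log/messages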
Here is the earlier lock-up sequence:

Oct 31 23:58:12 freebsd su[888]: ec2-user to root on /dev/pts/2
Oct 31 23:58:23 freebsd su[891]: ec2-user to root on /dev/pts/1
Oct 31 23:58:43 freebsd fsck[861]: /dev/gpt/varfs: 550 files, 1650 used, 125234 free (106 frags, 15641 blocks, 0.1% fragmentation)
Nov 1 03:48:09 freebsd syslogd: exiting on signal 15
Nov 1 03:50:35 freebsd syslogd: kernel boot file is /boot/kernel/kernel
Nov 1 03:50:35 freebsd kernel: ---<<BOOT>>---

About 4 hours, as I estimated. The "syslogd: exiting on signal 15" is so close to the next boot that I don't think it is a herald of the lock-up; rather, it happens when I do the force-stop on the AWS EC2 management dashboard. This also shows that we are not losing log messages because of the lock-up: if the system had anything to say, it would have said it!
Before anyone spends time on this, I am still trying some options. For now I have made 3 modifications to my NFS mount options (the resulting fstab line is sketched at the end of this comment):

1. timeo=150, down from 600 - that alone didn't help
2. oneopenown - that alone didn't help
3. minorversion=1 added - the manual isn't clear whether it is a requirement for oneopenown or whether oneopenown implicitly sets minorversion=1, but since I added it, I have not frozen up for almost 6 hours. That's new.

I also enabled crude NFS logging with:

sysctl vfs.nfs.debuglevel=4

And this is what I see in /var/log/messages:

Nov 1 08:59:23 freebsd su[926]: ec2-user to root on /dev/pts/1
Nov 1 10:28:20 freebsd su[1271]: ec2-user to root on /dev/pts/2
Nov 1 10:28:38 freebsd kernel: clnt call=0
Nov 1 10:28:43 freebsd syslogd: last message repeated 8 times
Nov 1 10:28:43 freebsd kernel: readrpc: aft doiods=5
Nov 1 10:28:43 freebsd kernel: clnt call=0
Nov 1 10:29:14 freebsd syslogd: last message repeated 1125 times
Nov 1 10:31:20 freebsd syslogd: last message repeated 63 times
Nov 1 10:36:05 freebsd syslogd: last message repeated 368 times
Nov 1 10:36:05 freebsd kernel: readrpc: aft doiods=5
Nov 1 10:36:05 freebsd kernel: clnt call=0
...
Nov 1 13:25:37 freebsd kernel: clnt call=0
Nov 1 13:25:37 freebsd kernel: readrpc: aft doiods=5
Nov 1 13:25:37 freebsd kernel: clnt call=0
Nov 1 13:25:37 freebsd syslogd: last message repeated 6 times
Nov 1 13:25:37 freebsd kernel: readrpc: aft doiods=5
Nov 1 13:25:37 freebsd kernel: clnt call=0
Nov 1 13:25:37 freebsd kernel: readrpc: aft doiods=5
Nov 1 13:25:37 freebsd kernel: clnt call=0
Nov 1 13:25:37 freebsd kernel: readrpc: aft doiods=5
Nov 1 13:25:37 freebsd kernel: clnt call=0
Nov 1 13:25:37 freebsd syslogd: last message repeated 6 times
Nov 1 13:25:37 freebsd kernel: readrpc: aft doiods=5
Nov 1 13:25:37 freebsd kernel: clnt call=0
Nov 1 13:25:37 freebsd syslogd: last message repeated 6 times
Nov 1 13:25:37 freebsd kernel: readrpc: aft doiods=5
Nov 1 13:25:37 freebsd kernel: clnt call=0
Nov 1 13:25:37 freebsd kernel: readrpc: aft doiods=5
Nov 1 13:25:37 freebsd kernel: clnt call=0
Nov 1 13:25:56 freebsd syslogd: last message repeated 53 times
Nov 1 13:25:56 freebsd kernel: readrpc: aft doiods=5
Nov 1 13:25:56 freebsd kernel: clnt call=0
Nov 1 13:26:27 freebsd syslogd: last message repeated 80 times
Nov 1 13:26:55 freebsd syslogd: last message repeated 204 times
Nov 1 13:26:55 freebsd kernel: readrpc: aft doiods=5
Nov 1 13:26:55 freebsd kernel: clnt call=0
Nov 1 13:27:36 freebsd syslogd: last message repeated 58 times
Nov 1 13:29:26 freebsd syslogd: last message repeated 87 times
Nov 1 13:30:47 freebsd syslogd: last message repeated 78 times
Nov 1 13:30:47 freebsd kernel: readrpc: aft doiods=5
Nov 1 13:30:47 freebsd kernel: clnt call=0
Nov 1 13:31:21 freebsd syslogd: last message repeated 26 times
Nov 1 13:33:00 freebsd syslogd: last message repeated 58 times
Nov 1 13:43:21 freebsd syslogd: last message repeated 119 times
Nov 1 13:53:06 freebsd syslogd: last message repeated 277 times
Nov 1 14:03:00 freebsd syslogd: last message repeated 263 times
Nov 1 14:13:22 freebsd syslogd: last message repeated 216 times
Nov 1 14:23:06 freebsd syslogd: last message repeated 5110 times
Nov 1 14:33:00 freebsd syslogd: last message repeated 491 times
Nov 1 14:43:22 freebsd syslogd: last message repeated 120 times

Nothing really meaningful to me, but I hope that if it does lock up, I will see something peculiar.

On the other hand, now on c5.large with:

1. minorversion=1
2. sysctl vfs.nfs.debuglevel=4

I have not locked up again.
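For reference, a sketch of the EFS line in /etc/fstab as it stands after the three changes above, assuming everything else is unchanged from my first comment (the EFS hostname is still a placeholder):

# EFS /usr mount after the changes: timeo=150, oneopenown, minorversion=1
fs-xxxxxx.efs.us-xxxxx-x.amazonaws.com:/  /usr  nfs  rw,nfsv4,minorversion=1,oneopenown,rsize=1048576,wsize=1048576,hard,timeo=150,retrans=2,noresvport,late,bg  0 0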
If I can make it a full day, I will back out these 3 changes one by one, first the debug logging, then the minorversion=1, to see when the problem recurs.
My understanding is that EFS should be mounted with oneopenown and minorversion=1.

BTW, on recent EC2 images I have an "efs" special automount target (https://www.daemonology.net/blog/2020-05-31-Some-new-FreeBSD-EC2-features.html) -- not sure if this is useful to you here.

Something you haven't mentioned: is your EFS I/O quota OK? Running out of I/O quota can make a system extremely slow (especially with a small filesystem, which earns I/O tokens slowly).
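One way to check (assuming the filesystem is using the default bursting throughput mode; fs-xxxxxx and the time window are placeholders) is to pull the BurstCreditBalance CloudWatch metric, e.g. with the AWS CLI; just a sketch:

# Burst credit balance for the EFS filesystem over roughly the last day.
# A balance that drains toward zero around the time of the hangs would point at I/O throttling.
aws cloudwatch get-metric-statistics \
    --namespace AWS/EFS \
    --metric-name BurstCreditBalance \
    --dimensions Name=FileSystemId,Value=fs-xxxxxx \
    --start-time 2020-11-01T00:00:00Z \
    --end-time 2020-11-02T00:00:00Z \
    --period 3600 \
    --statistics Minimum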
Adding rmacklem to CC since he's the NFS expert...
Thank you so much for the feedback. I am up over 10 hours now, so it looks like the spell has finally been broken! It may well be that minorversion=1 is necessary for oneopenown to work. I suggest updating the man page. It currently reads:

     minorversion=<value>
             Override the default of 0 for the minor version of the NFS
             Version 4 protocol. The only minor version currently supported
             is 1. This option is only meaningful when used with the nfsv4
             option.

     oneopenown
             Make a minor version 1 of the NFS Version 4 protocol mount use
             a single OpenOwner for all Opens. This may be useful for a
             server with a very low limit on OpenOwners, such as AmazonEFS.
             It can only be used with an NFSv4.1 mount. It may not work
             correctly when Delegations are being issued by a server, but
             note that the AmazonEFS server does not issue delegations at
             this time.

     pnfs    Enable support for parallel NFS (pNFS) for minor version 1 of
             the NFS Version 4 protocol. This option is only meaningful when
             used with the minorversion option.

A clear statement like the one for pnfs ("This option is only meaningful when used with the minorversion option.") in the oneopenown entry would remove all ambiguity.

I do also wonder about the speed compared to a "native" Amazon Linux mount -t efs mount; I feel like it's slower. My throughput allocation utilization is barely a blip, so that is not an issue for me. I personally can't really use your efs automounter because I need precise control over mounting /usr during boot. I wonder if more improvements can be gleaned from the mount_efs Linux code.

Anyway, it looks more and more like this is not a bug, just a lack of clear and conspicuously placed documentation and warnings. This week, if nothing else fails, I will close this issue and update all the places where I had left negative comments. I love FreeBSD after all. Thank you.
> I wonder if more improvements can be gleaned from the mount_efs Linux code.

My impression from the last time I looked at that code is that they're setting a lot of values explicitly, but those values are all the same as our defaults. If you notice any setting changes which could improve FreeBSD's EFS performance, please let me know though -- I'll be happy to adjust our automount defaults.

I've considered creating a "mount_efs" for FreeBSD, but my impression was that for most users the automount option is all they need -- you're clearly an exception to this.
In your initial post you show:

nfsv4,minorversion=1,oneopenown

so why was it hanging then? However, you are correct that oneopenown is needed for EFS, and it only applies to minorversion 1 (or higher for FreeBSD 13, if EFS supports it). I suppose I thought that NFSv4.1 was obvious shorthand for NFS Version 4, minor version 1, but the man page can be updated to clarify that.

Btw, if you do "nfsstat -m" on the client, you can see what mount options are actually being used. (The rsize/wsize values are capped at vfs.maxbcachebuf, which is 64K by default and can be increased up to MAXPHYS.)

Oh, and I find the best way to debug these things is to look at a packet capture in wireshark. You can do the capture via "tcpdump -s 0 -w out.pcap host <server>" on the client and then load out.pcap into wireshark. (Unlike tcpdump, wireshark knows NFS.) A sketch of that workflow is below.

Anyhow, sounds like you have it working now, rick
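A sketch of that check-and-capture workflow, reusing the EFS hostname placeholder from the fstab above; treating vfs.maxbcachebuf as a /boot/loader.conf boot-time tunable is an assumption here:

# See the mount options the kernel actually negotiated (minorversion, oneopenown, rsize/wsize, ...)
nfsstat -m

# Capture the NFS traffic to/from the EFS endpoint, then open out.pcap in wireshark
tcpdump -s 0 -w out.pcap host fs-xxxxxx.efs.us-xxxxx-x.amazonaws.com

# To allow larger NFS I/O sizes, raise the cap; vfs.maxbcachebuf is set at boot,
# so (assumption) it goes in /boot/loader.conf and takes effect after a reboot.
# 131072 (128K) matches the default MAXPHYS mentioned above.
echo 'vfs.maxbcachebuf=131072' >> /boot/loader.conf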
Yes, I apologize; I had edited my options one by one:

1. timeo=150, down from 600 - that alone didn't help
2. oneopenown - that alone didn't help
3. minorversion=1 added - the manual isn't clear whether it is a requirement for oneopenown or whether oneopenown implicitly sets minorversion=1, but since I added it, I have not frozen up for almost 6 hours. That's new.

and I wrote this bug report when I was already at step 3, but not enough time had passed for me to trust that it would now work.
Cool. So is an update to the man page sufficient to resolve this?
Confirmed: 33 hours of uptime without problems. Thank you for being a sounding board so I could solve my problem myself. It may seem like you didn't do much other than read this and give some comments, but without you being there and challenging me to do a decent job of reporting my problem in detail and helping myself, I would have gone another year without figuring this out.
Good. I will put improving the man page on my todo list.
A commit references this bug:

Author: rmacklem
Date: Tue Dec 1 23:33:11 UTC 2020
New revision: 368268
URL: https://svnweb.freebsd.org/changeset/base/368268

Log:
  Improve man page for AmazonEFS mounts.

  PR#250770 was actually just a misunderstanding of what NFS mount options
  are needed for AmazonEFS mounts. This patch attempts to clarify the
  manpage to clarify this.

  This is a content change.

  PR:                     250770
  Reviewed by:            bcr
  MFC after:              1 week
  Differential Revision:  https://reviews.freebsd.org/D27430

Changes:
  head/sbin/mount_nfs/mount_nfs.8
A commit references this bug:

Author: rmacklem
Date: Tue Dec 8 22:37:30 UTC 2020
New revision: 368466
URL: https://svnweb.freebsd.org/changeset/base/368466

Log:
  MFC: r368268
  Improve man page for AmazonEFS mounts.

  PR#250770 was actually just a misunderstanding of what NFS mount options
  are needed for AmazonEFS mounts. This patch attempts to clarify the
  manpage to clarify this.

  This is a content change.

  PR:  250770

Changes:
  _U  stable/12/
  stable/12/sbin/mount_nfs/mount_nfs.8