FreeBSD Bugzilla – Attachment 252988 Details for
Bug 280978
Kernel panics with vfs.nfsd.enable_locallocks=1 and nfs clients doing hdf5 file operations
Steps to reproduce panics
FreeBSD Panic Reproduction-CURRENT.md (text/plain), 5.06 KB, created by
Matthew L. Dailey
on 2024-08-21 15:09:03 UTC
# FreeBSD Panic Reproduction
Instructions for reproducing a FreeBSD kernel panic that seems to be induced by specific hdf5 traffic from a Linux client, coupled with the use of the vfs.nfsd.enable_locallocks sysctl.

# Server
Reproducible from version 13.0-RELEASE through 15.0-CURRENT (as of July 2024). Panics tend to happen faster with kerberized nfs, but it is easier to create a test case with sys nfs. Panics also tend to happen faster with a zfs filesystem, rather than ufs.

For this, I will assume 15.0-CURRENT-20240725-82283cad12a4-271360 with a zfs root filesystem.

## VM Parameters
Testing has been done with the following VM on esxi:
- 8 CPUs
- 16GB RAM
- 100GB HD for OS
- 20GB HD for dumps
- e1000 NIC
- EFI Firmware

## FreeBSD Installation
During installation, these are general instructions we follow:
- Keymap: Default keymap
- Hostname: (appropriate fqdn)
- Distribution Select:
  - kernel-dbg
  - src
- Partitioning: Auto (ZFS)
  - Pool Type/Disks: stripe, da0
  - Pool Name: zroot
  - Force 4K Sectors: yes
  - Encrypt Disks: no
  - Partition Scheme: GPT (BIOS+UEFI)
  - Swap Size: 2G
  - Mirror Swap: no
  - Encrypt Swap: no
- Root password: (password)
- Network
  - Interface: em0 - Manual configuration
  - Configure IPv4: yes
    - DHCP: no
    - IP address: (appropriate address)
    - Subnet Mask: (appropriate subnet)
    - Default Router: (appropriate router)
  - Configure IPv6: no
  - Resolver
    - Search: (appropriate search domain(s))
    - IPv4 DNS #1: (appropriate dns)
    - IPv4 DNS #2: (appropriate dns)
- CMOS Clock set to UTC: yes
- Time Zone:
  - EST/EDT
  - Set date/time: skip
- Boot services:
  - sshd
  - ntpd
  - ntpd_sync_on_start
  - dumpdev
- System Hardening: all disabled
- Add User Accounts: Add appropriate user account, invited to wheel
- Final Configuration: finish
- Manual Configuration: no
- Complete: reboot

I have found that ntpd is sometimes unhappy after the initial install on esxi, defaulting to UTC, but calling
it the local time zone. Check the time after reboot and fix, if necessary, with:
```
service ntpd stop
ntpdate (appropriate ntp server)
service ntpd start
reboot
```

## Set up dump device
Format:
```
gpart create -s GPT /dev/da1
gpart add -t freebsd-swap da1
```

Enable:
```
dumpon off
dumpon /dev/da1p1
```

Add to/edit rc.conf:
```
dumpdev="/dev/da1p1"
```

## Create and export sys mount
```
mkdir /nfstest
chmod 1777 /nfstest
```

## Create /etc/exports (adjust network as appropriate, or restrict to a host):
```
V4: /nfstest -sec=sys -network 10.0.0.0/8
/nfstest -sec=sys -network 10.0.0.0/8
```

## Enable nfs
```
sysrc nfs_server_enable=YES
sysrc nfsv4_server_enable=YES
```

## Set sysctl flags
```
cat << 'EOF' >> /etc/sysctl.conf

# NFS Settings
vfs.nfsd.server_min_nfsvers=4
vfs.nfsd.server_min_minorversion4=1
vfs.nfsd.enable_locallocks=1
EOF
```

## Reboot
Reboot to be sure everything comes up properly, including nfs.

# Client
All client testing has been with Ubuntu 22.04. It's quickest to just do a server install, rather than a full desktop, but either works.

These instructions assume the 22.04.4 server install.
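
Before moving on to the client install, it may be worth confirming on the server that the settings written to `/etc/sysctl.conf` are actually live after the reboot. The following is a minimal sketch, not part of the original procedure: the function name `check_sysctl` and the `SYSCTL` override variable are illustrative assumptions, and the only system command it relies on is `sysctl -n`.

```sh
# check_sysctl NAME EXPECTED
# Compare the live value of a sysctl with the value expected from
# /etc/sysctl.conf. SYSCTL may be set to an alternative command
# (hypothetical hook, useful for trying the function off the server).
check_sysctl() {
    name=$1
    expected=$2
    # Read the current value; treat any read error as a mismatch.
    actual=$(${SYSCTL:-sysctl} -n "$name" 2>/dev/null) || actual="(unreadable)"
    if [ "$actual" = "$expected" ]; then
        echo "ok: $name=$actual"
    else
        echo "MISMATCH: $name=$actual (expected $expected)"
        return 1
    fi
}
```

On the server, this could be called once per setting, for example `check_sysctl vfs.nfsd.enable_locallocks 1`, `check_sysctl vfs.nfsd.server_min_nfsvers 4`, and `check_sysctl vfs.nfsd.server_min_minorversion4 1`.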

## VM Parameters
Testing has been done with the following VM on esxi:
- 16 CPUs
- 8GB RAM
- 20GB HD for OS
- vmxnet3 NIC
- EFI Firmware

## Ubuntu Installation
During installation, these are general instructions we follow:
- Language: English
- Keyboard: English (US)
- Base installation: Ubuntu Server
- Network: Set up as appropriate
  - Testing done with static IP and IPv6 disabled
- Proxy address: blank
- Mirror address: http://us.archive.ubuntu.com/ubuntu/
- Storage configuration: Use entire disk
- Profile setup: appropriate local user and hostname
- Ubuntu Pro: skip
- SSH Setup:
  - Install OpenSSH server: yes
  - Import SSH identity: no
- Server snaps: none

## Update
Make sure everything is updated on the system:
```
apt update
apt dist-upgrade
```

Reboot, if needed, for kernel updates.

## nfs
Install the needed package for nfs:
```
apt install nfs-common
```

Create mountpoint (change servername/ip as appropriate):
```
mkdir /mnt/nfstest
cat << 'EOF' >> /etc/fstab
10.1.2.3:/ /mnt/nfstest nfs nfsvers=4.1,sec=sys,nosuid 0 0
EOF
```

Test the mount:
```
mount /mnt/nfstest
```

## Compile test binary
Using test code from: https://cvw.cac.cornell.edu/parallel-io-libraries/phdf5/hyperslabs-example

Install prerequisites:
```
apt install build-essential openmpi-bin libhdf5-openmpi-dev
```

Download and compile:
```
wget https://cvw.cac.cornell.edu/parallel-io-libraries/phdf5/parallel_write_hslab_contiguous.c
h5pcc -o parallel_write_hslab_contiguous parallel_write_hslab_contiguous.c
```

# Testing
The easiest test is to just run this program in a loop, preferably within screen/tmux to be sure the session stays active.

```
screen
cd /mnt/nfstest/
while [ 1 ]; do mpirun -np 16 ./parallel_write_hslab_contiguous test.hdf5 4096 4096; date; sleep .1; done;
```

With `vfs.nfsd.enable_locallocks=1`, panics generally happen in less than 24 hours.
With `vfs.nfsd.enable_locallocks=0`, we have not experienced any panics with systems up for ~15 days in limited testing.
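
Because a panic can take many hours to appear, it can help to record when each iteration of the test loop finished, so the time of the last successful run is known afterwards. The following is a minimal sketch of such a wrapper, not the procedure used in the original testing: the function name `run_logged`, the `RUN_DELAY` variable, and the log path in the usage note are all illustrative assumptions. Note that a hard NFS mount may simply hang when the server panics, in which case the failure branch never runs and the last "ok" line is the useful marker.

```sh
# run_logged LOGFILE MAX_ITER CMD [ARGS...]
# Run CMD repeatedly, appending one UTC-timestamped line per iteration
# to LOGFILE. Stops after MAX_ITER iterations (0 means no limit) or on
# the first non-zero exit status. Variables are global (plain POSIX sh).
run_logged() {
    logfile=$1
    max_iter=$2
    shift 2
    i=0
    while [ "$max_iter" -eq 0 ] || [ "$i" -lt "$max_iter" ]; do
        i=$((i + 1))
        if "$@"; then
            echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) iteration $i ok" >> "$logfile"
        else
            status=$?
            echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) iteration $i failed (exit $status)" >> "$logfile"
            return 1
        fi
        # Short pause between runs, matching the 'sleep .1' in the loop above.
        sleep "${RUN_DELAY:-0.1}"
    done
}
```

From `/mnt/nfstest/`, this could be invoked as `run_logged /var/tmp/hdf5-loop.log 0 mpirun -np 16 ./parallel_write_hslab_contiguous test.hdf5 4096 4096`. Keeping the log on local disk rather than on the NFS mount means it stays readable if the server goes away.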