If you have a UFS2 filesystem with multiple snapshots on it, and commence several processes of rapid file deletions on that filesystem, the system will hang.

I first produced this problem on a system that had a filesystem with six snapshots enabled on it. I proceeded to start three concurrent `rm` processes of directories with large numbers of files in them (on that filesystem). The system hung as a result.

Further testing showed that in the exact same environment, three such `rm` processes did not hang a system when the filesystem in question only had two snapshots. I incremented the number of snapshots to 3, and the system hung when the same three `rm` processes were performed.

It is possible that any ratio of a large number of snapshots to a large number of intensive file deletions on the filesystem that is snapshotted will cause the system to hang. For instance, one snapshot and ten file deletion processes, or perhaps ten snapshots and two file deletion processes. I have not tested any further combinations than the ones listed above.

I am highly confident that this behavior exists and is reproducible. I have reproduced this behavior on both 6.0 and 6.1.

Fix:

Do not use UFS2 snapshots. It is unreasonable to assume that concurrent rapid file deletions will not occur on a filesystem that has snapshots on it.

How-To-Repeat:

(assume a filesystem mounted on /mnt/data1)

cp -R /usr/src /mnt/data1/test1
cp -R /usr/src /mnt/data1/test2
cp -R /usr/src /mnt/data1/test3
mksnap_ffs /mnt/data1 /mnt/data1/.sync/snap1
mksnap_ffs /mnt/data1 /mnt/data1/.sync/snap2
mksnap_ffs /mnt/data1 /mnt/data1/.sync/snap3
rm -rf /mnt/data1/test1 &
rm -rf /mnt/data1/test2 &
rm -rf /mnt/data1/test3 &

(system will hang)
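The How-To-Repeat steps can be parameterized to try other combinations of snapshot count and deletion processes, which the report leaves untested. The following is a minimal sketch in POSIX sh, not part of the original report. `gen_repro` is a hypothetical helper name, not a FreeBSD utility. It only prints the commands, so nothing runs until the output is deliberately piped to sh on a disposable test machine. Like the original steps, it assumes the .sync directory already exists on the filesystem.

```shell
#!/bin/sh
# Dry-run generator for the How-To-Repeat steps.
# gen_repro MOUNTPOINT NSNAP NRM  (hypothetical helper, for illustration only)
# Prints the commands instead of running them; the report states that
# running them hangs the system.
gen_repro() {
    mnt=$1
    nsnap=$2
    nrm=$3

    # One copy of /usr/src per planned rm process.
    i=1
    while [ "$i" -le "$nrm" ]; do
        echo "cp -R /usr/src $mnt/test$i"
        i=$((i + 1))
    done

    # Snapshots, using the same .sync/snapN paths as the report.
    i=1
    while [ "$i" -le "$nsnap" ]; do
        echo "mksnap_ffs $mnt $mnt/.sync/snap$i"
        i=$((i + 1))
    done

    # Concurrent deletions, each started in the background.
    i=1
    while [ "$i" -le "$nrm" ]; do
        echo "rm -rf $mnt/test$i &"
        i=$((i + 1))
    done
}

# The combination that hung in the report: 3 snapshots, 3 rm processes.
gen_repro /mnt/data1 3 3
```

Calling `gen_repro /mnt/data1 2 3` prints the two-snapshot variant that the report says did not hang, which makes it easy to compare the two cases side by side.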
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
With the described test scenario I was able to reproduce a deadlock consistently on HEAD. The time of the deadlock, however, seems to differ from that described in the PR: on HEAD the deadlock occurs during deletion of the snapshot files.

http://people.freebsd.org/~pho/stress/log/pr-94769.txt

- Peter
I recently started experiencing similar behavior on a 7.1-RELEASE i386 host. In this case the deadlock occurs as soon as sysutils/freebsd-snapshots runs its weekly cron entry:

0 * * * * root periodic-snapshot hourly
0 0 * * * root periodic-snapshot daily
0 0 * * 0 root periodic-snapshot weekly

I am running a GENERIC kernel without quotas. Here's the list of snapshots on the various filesystems:

Filesystem      User   User%     Snap   Snap%  Snapshot
/              267MB   54.0%    704KB    0.1%  daily.0
/              267MB   54.0%    704KB    0.1%  daily.1
/              267MB   54.0%    704KB    0.1%  daily.2
/              267MB   54.0%    704KB    0.1%  daily.3
/              267MB   54.0%    704KB    0.1%  daily.4
/              267MB   54.0%    784KB    0.2%  daily.5
/              267MB   54.0%    784KB    0.2%  daily.6
/              267MB   54.0%    608KB    0.1%  hourly.0
/              267MB   54.0%    640KB    0.1%  hourly.1
/              267MB   54.0%    656KB    0.1%  hourly.2
/              267MB   54.0%    656KB    0.1%  hourly.3
/              267MB   54.0%    704KB    0.1%  weekly.0
/              267MB   54.0%    896KB    0.2%  weekly.1
/              267MB   54.0%    928KB    0.2%  weekly.2
/              267MB   54.0%      1MB    0.2%  weekly.3
/var          1237MB   10.4%     16MB    0.1%  daily.0
/var          1237MB   10.4%     17MB    0.1%  daily.1
/var          1237MB   10.4%     30MB    0.3%  daily.2
/var          1237MB   10.4%     32MB    0.3%  daily.3
/var          1237MB   10.4%     32MB    0.3%  daily.4
/var          1237MB   10.4%     33MB    0.3%  daily.5
/var          1237MB   10.4%    141MB    1.2%  daily.6
/var          1237MB   10.4%      7MB    0.1%  hourly.0
/var          1237MB   10.4%      7MB    0.1%  hourly.1
/var          1237MB   10.4%      7MB    0.1%  hourly.2
/var          1237MB   10.4%      8MB    0.1%  hourly.3
/var          1237MB   10.4%      8MB    0.1%  hourly.4
/var          1237MB   10.4%      8MB    0.1%  hourly.5
/var          1237MB   10.4%      8MB    0.1%  hourly.6
/var          1237MB   10.4%      8MB    0.1%  hourly.7
/var          1237MB   10.4%      8MB    0.1%  hourly.8
/var          1237MB   10.4%      8MB    0.1%  hourly.9
/var          1237MB   10.4%     25MB    0.2%  weekly.0
/usr         87545MB   74.1%    138MB    0.1%  daily.0
/usr         87545MB   74.1%    217MB    0.2%  daily.1
/usr         87545MB   74.1%    234MB    0.2%  daily.2
/usr         87545MB   74.1%    287MB    0.2%  daily.3
/usr         87545MB   74.1%    292MB    0.2%  daily.4
/usr         87545MB   74.1%    278MB    0.2%  daily.5
/usr         87545MB   74.1%    349MB    0.3%  daily.6
/usr         87545MB   74.1%     88MB    0.1%  hourly.0
/usr         87545MB   74.1%    200MB    0.2%  hourly.1
/usr         87545MB   74.1%    864MB    0.7%  weekly.0
/usr         87545MB   74.1%    721MB    0.6%  weekly.1
/usr         87545MB   74.1%    782MB    0.7%  weekly.2
/usr         87545MB   74.1%   4772MB    4.0%  weekly.3

I am eager and willing to assist in any way to investigate and eliminate this issue.

--
Regards,
Doug
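Since the original report ties the hang to the number of snapshots per filesystem, it can help to tally a listing like the one above. This is a rough sketch in POSIX sh and awk, not something from the thread. `count_snapshots` is a hypothetical helper, and it assumes the six-column layout shown above (filesystem first, snapshot name last) arriving on standard input.

```shell
#!/bin/sh
# count_snapshots: hypothetical helper, for illustration only.
# Reads a snapshot listing on stdin and prints "<filesystem> <count>" per line.
# Assumes six whitespace-separated columns with the filesystem in column 1.
count_snapshots() {
    awk '
        $1 == "Filesystem" { next }      # skip the header row
        NF == 6            { n[$1]++ }   # one row per snapshot
        END { for (fs in n) print fs, n[fs] }
    ' | LC_ALL=C sort
}

# Small sample in the same format as the listing above.
count_snapshots <<'EOF'
Filesystem User User% Snap Snap% Snapshot
/ 267MB 54.0% 704KB 0.1% daily.0
/ 267MB 54.0% 704KB 0.1% daily.1
/var 1237MB 10.4% 16MB 0.1% daily.0
EOF
```

Counting the rows of the full listing by filesystem gives 15 snapshots on /, 18 on /var and 13 on /usr. All three are well above the three snapshots that were enough to trigger the hang in the original report.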
For bugs matching the following criteria:

Status: In Progress
Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped