Bug 227740 - concurrent zfs management operations may lead to a race/subsystem locking
Summary: concurrent zfs management operations may lead to a race/subsystem locking
Status: Closed DUPLICATE of bug 226499
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 11.1-STABLE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-fs mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-04-24 13:27 UTC by emz
Modified: 2018-12-07 05:31 UTC (History)
2 users (show)

See Also:


Attachments

Description emz 2018-04-24 13:27:28 UTC
Concurrent zfs management commands may lead to a race / subsystem lockup.

For instance, this is the current state, which has not changed for at least 30 minutes (the system got into it after issuing concurrent zfs commands):

===Cut===
[root@san1:~]# ps ax | grep zfs
    9  -  DL      7:41,34 [zfskern]
57922  -  Is      0:00,01 sshd: zfsreplica [priv] (sshd)
57924  -  I       0:00,00 sshd: zfsreplica@notty (sshd)
57925  -  Is      0:00,00 csh -c zfs list -t snapshot
57927  -  D       0:00,00 zfs list -t snapshot
58694  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
58695  -  D       0:00,00 /sbin/zfs list -t all
59512  -  Is      0:00,02 sshd: zfsreplica [priv] (sshd)
59516  -  I       0:00,00 sshd: zfsreplica@notty (sshd)
59517  -  Is      0:00,00 csh -c zfs list -t snapshot
59520  -  D       0:00,00 zfs list -t snapshot
59552  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59553  -  D       0:00,00 /sbin/zfs list -t all
59554  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59555  -  D       0:00,00 /sbin/zfs list -t all
59556  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59557  -  D       0:00,00 /sbin/zfs list -t all
59558  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59559  -  D       0:00,00 /sbin/zfs list -t all
59560  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59561  -  D       0:00,00 /sbin/zfs list -t all
59564  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59565  -  D       0:00,00 /sbin/zfs list -t all
59570  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59571  -  D       0:00,00 /sbin/zfs list -t all
59572  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59573  -  D       0:00,00 /sbin/zfs list -t all
59574  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
59575  -  D       0:00,00 /sbin/zfs list -t all
59878  -  Is      0:00,02 sshd: zfsreplica [priv] (sshd)
59880  -  I       0:00,00 sshd: zfsreplica@notty (sshd)
59881  -  Is      0:00,00 csh -c zfs list -t snapshot
59883  -  D       0:00,00 zfs list -t snapshot
60800  -  Is      0:00,01 sshd: zfsreplica [priv] (sshd)
60806  -  I       0:00,00 sshd: zfsreplica@notty (sshd)
60807  -  Is      0:00,00 csh -c zfs list -t snapshot
60809  -  D       0:00,00 zfs list -t snapshot
60917  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
60918  -  D       0:00,00 /sbin/zfs list -t all
60950  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
60951  -  D       0:00,00 /sbin/zfs list -t all
60966  -  Is      0:00,02 sshd: zfsreplica [priv] (sshd)
60968  -  I       0:00,00 sshd: zfsreplica@notty (sshd)
60969  -  Is      0:00,00 csh -c zfs list -t snapshot
60971  -  D       0:00,00 zfs list -t snapshot
61432  -  Is      0:00,03 sshd: zfsreplica [priv] (sshd)
61434  -  I       0:00,00 sshd: zfsreplica@notty (sshd)
61435  -  Is      0:00,00 csh -c zfs list -t snapshot
61437  -  D       0:00,00 zfs list -t snapshot
61502  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
61503  -  D       0:00,00 /sbin/zfs list -t all
61504  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
61505  -  D       0:00,00 /sbin/zfs list -t all
61506  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
61507  -  D       0:00,00 /sbin/zfs list -t all
61508  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
61509  -  D       0:00,00 /sbin/zfs list -t all
61510  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
61511  -  D       0:00,00 /sbin/zfs list -t all
61512  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
61513  -  D       0:00,00 /sbin/zfs list -t all
61569  -  I       0:00,01 /usr/local/bin/sudo /sbin/zfs list -t all
61570  -  D       0:00,00 /sbin/zfs list -t all
61851  -  Is      0:00,02 sshd: zfsreplica [priv] (sshd)
61853  -  I       0:00,00 sshd: zfsreplica@notty (sshd)
61854  -  Is      0:00,00 csh -c zfs list -t snapshot
61856  -  D       0:00,00 zfs list -t snapshot
57332  7  D+      0:00,04 zfs rename data/esx/boot-esx03 data/esx/boot-esx03_orig
58945  8  D+      0:00,00 zfs list
62119  3  S+      0:00,00 grep zfs
[root@san1:~]# ps ax | grep ctladm
62146  3  S+      0:00,00 grep ctladm
[root@san1:~]#
===Cut===

This seems to be the operation that locks the system:

zfs rename data/esx/boot-esx03 data/esx/boot-esx03_orig

The dataset info:

===Cut===
# zfs get all data/esx/boot-esx03
NAME                 PROPERTY              VALUE                       SOURCE
data/esx/boot-esx03  type                  volume                      -
data/esx/boot-esx03  creation              Wed Aug  2 15:48 2017       -
data/esx/boot-esx03  used                  8,25G                       -
data/esx/boot-esx03  available             9,53T                       -
data/esx/boot-esx03  referenced            555M                        -
data/esx/boot-esx03  compressratio         1.06x                       -
data/esx/boot-esx03  reservation           none                        default
data/esx/boot-esx03  volsize               8G                          local
data/esx/boot-esx03  volblocksize          8K                          default
data/esx/boot-esx03  checksum              on                          default
data/esx/boot-esx03  compression           lz4                         inherited from data
data/esx/boot-esx03  readonly              off                         default
data/esx/boot-esx03  copies                1                           default
data/esx/boot-esx03  refreservation        8,25G                       local
data/esx/boot-esx03  primarycache          all                         default
data/esx/boot-esx03  secondarycache        all                         default
data/esx/boot-esx03  usedbysnapshots       0                           -
data/esx/boot-esx03  usedbydataset         555M                        -
data/esx/boot-esx03  usedbychildren        0                           -
data/esx/boot-esx03  usedbyrefreservation  7,71G                       -
data/esx/boot-esx03  logbias               latency                     default
data/esx/boot-esx03  dedup                 off                         inherited from data/esx
data/esx/boot-esx03  mlslabel                                          -
data/esx/boot-esx03  sync                  standard                    default
data/esx/boot-esx03  refcompressratio      1.06x                       -
data/esx/boot-esx03  written               555M                        -
data/esx/boot-esx03  logicalused           586M                        -
data/esx/boot-esx03  logicalreferenced     586M                        -
data/esx/boot-esx03  volmode               dev                         inherited from data
data/esx/boot-esx03  snapshot_limit        none                        default
data/esx/boot-esx03  snapshot_count        none                        default
data/esx/boot-esx03  redundant_metadata    all                         default
===Cut===

Since the dataset is only 8G, it is unlikely that renaming it should take this long, especially considering that the disks are idle.

This happened twice in a row, and as a result all zfs/zpool commands stopped working.
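For reference, the trigger is roughly the following pattern of concurrent invocations (a reproduction sketch, not the exact commands from our replication scripts; the real load comes from multiple ssh sessions running them in parallel):

===Cut===
# in one or more sessions: repeated listing, as the replication scripts do
while true; do zfs list -t snapshot > /dev/null; done &
while true; do zfs list -t all > /dev/null; done &

# meanwhile, in another session:
zfs rename data/esx/boot-esx03 data/esx/boot-esx03_orig
===Cut===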

I manually panicked the system to obtain crashdumps.
The crashdumps are located here:

http://san1.linx.playkey.net/r332096M/

along with a brief description and full kernel/module binaries.
Please note that vmcore.0 is from an unrelated panic; the crashdumps for this lockup are numbers 1 (unfortunately, no txt files were saved) and 2.
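For the record, the crashdumps were obtained roughly as follows (a sketch; it assumes a dump device is already configured via dumpon(8), and the /dev/da0p3 device name is hypothetical):

===Cut===
# force a panic from a running system
sysctl debug.kdb.panic=1

# after reboot, the dump is normally extracted by savecore(8) at boot;
# it can also be extracted manually:
savecore /var/crash /dev/da0p3
===Cut===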
Comment 1 emz 2018-05-29 06:39:33 UTC
Got another lockup, induced a manual panic; the crashdump is available via the URL above, number 5 (vmcore.5.gz).
Comment 2 Andriy Gapon freebsd_committer 2018-12-06 13:41:07 UTC
Is this still a problem?
Comment 3 emz 2018-12-06 15:23:53 UTC
Yup, but this particular PR is a duplicate of either https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229958 (or vice versa) or https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=226499. It's highly reproducible: roughly 1 in 5 zfs renames leads to a lockup. I plan to schedule maintenance time on our production site, reproduce this, and get the procstat -kk output of the zfs rename process that you requested in 226499.
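For reference, the plan is to capture the kernel stacks of the stuck process along these lines (the PID is whatever ps reports for the hung rename at the time):

===Cut===
# find the stuck rename and dump the kernel stacks of its threads
ps ax | grep 'zfs rename'
procstat -kk <pid>

# or kernel stacks for all processes, to also catch the lock holder:
procstat -kk -a
===Cut===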
Comment 4 Andriy Gapon freebsd_committer 2018-12-06 17:59:04 UTC
So, maybe let's close this bug and continue in one of the more detailed / researched ones?
Comment 5 emz 2018-12-07 05:30:36 UTC
Yup, that seems reasonable.
Comment 6 emz 2018-12-07 05:31:24 UTC

*** This bug has been marked as a duplicate of bug 226499 ***