I usually keep my / mounted read-only -- mostly for safety. When I need to change some aspect of configuration or add a port, I remount rw, do what needs doing, and then remount back to read-only mode.
This worked for me for many years, but lately I noticed, the remounting back to the ro-mode is taking longer than it used to. Many seconds, especially, when it was active shortly before the remount command.
Today, however, with the freshly rebuilt kernel (FreeBSD 10.4-STABLE #15 r325999M), it takes well over a minute! A minute, during which the machine largely freezes -- for example, making new logins impossible.
I understand the need to flush the unwritten to disk, when switching to ro, but just how much should that take in the scenario I just had:
1048 11:56 mount -orw -u /
1049 11:56 chmod u+s /opt/cyrus/bin/deliver
1050 11:56 mount -oro -u /
? The second remount took well over minute and it seems wrong.
The filesystem is based on SSD and looks like:
/dev/ada0s1a on / (ufs, NFS exported, local, read-only)
`tunefs -p /` reports:
tunefs: POSIX.1e ACLs: (-a) disabled
tunefs: NFSv4 ACLs: (-N) disabled
tunefs: MAC multilabel: (-l) disabled
tunefs: soft updates: (-n) enabled
tunefs: soft update journaling: (-j) disabled
tunefs: gjournal: (-J) disabled
tunefs: trim: (-t) enabled
tunefs: maximum blocks per file in a cylinder group: (-e) 1048576
tunefs: average file size: (-f) 16384
tunefs: average number of files in a directory: (-s) 64
tunefs: minimum percentage of free space: (-m) 8%
tunefs: space to hold for metadata blocks: (-k) 0
tunefs: optimization preference: (-o) time
tunefs: volume label: (-L)
This report is perhaps the third or fourth about the slow sync problem. Unfortunately we have not been able to reproduce the slowdown.
We believe it started with r230553. Note that I carefully use the word 'started' and not 'caused', because I believe that the change in r230553 is required to avoid loss of user data. But it is not clear what makes the sync run for such a long time.
No users who reported the slowdown have provided a useful insight into their configuration. What we need is a description of your filesystem. Ideal would be a dump image that when restored demonstrates the problem.
I don’t know, what to say. I’ve been seeing this problem for a while on all of the machines I manage (that’s about 8).
Besides the tunefs output already included, I’m not sure, what to add. While the mount command is waiting, it is in the biowr-state.
(In reply to Mikhail T. from comment #2)
Being in the biowr state is expected as the filesystem needs to sync out all the changes. The question is why there are so many of them. Randomly guessing, perhaps it decides that there are a large number of access times that need to be written. When upgrading to read-write, include the -o noatime option so there is no attempt to update access times.
It would also be useful to collect the number of I/O (particularly writes) that the downgrade command is doing (time -l if your shell does not otherwise privide the I/O statistics). You can also get an idea of the I/Os being done using iostat.
I think that r230553 counts as a while ago: It was made Wed Jan 25 20:54:09 2012
You may try the patch from https://reviews.freebsd.org/D13197, but I doubt that it would change much for you.
Seeing this manifested pretty bad today - on a freshly-rebuilt world (r338291M). The other thing that changed is the RAM upgrade -- from 10GB to 40GB.
Could it be, the kernel is trying to flush ALL filesystems, rather than just the one being remounted? Because, unlike the / on UFS, other filesystems could be quite busy...
While waiting, I press Ctrl-T once in a while, and get:
root@narawntapu:/etc/mail (1022) mount -oro -u /
load: 0.93 cmd: mount 26321 [biowr] 230.01r 0.00u 9.01s 2% 2224k
load: 0.47 cmd: mount 26321 [biowr] 283.84r 0.00u 10.99s 1% 2224k
load: 0.57 cmd: mount 26321 [biowr] 318.95r 0.00u 12.25s 2% 2224k
The actual change to the /-filesystem is trivial.
(In reply to Kirk McKusick from comment #3)
> include the -o noatime option
No obvious difference:
root@narawntapu:/etc/mail (1024) mount -orw,noatime /
root@narawntapu:/etc/mail (1025) touch q
root@narawntapu:/etc/mail (1026) rm q
root@narawntapu:/etc/mail (1027) mount -oro -u /
load: 0.71 cmd: mount 27675 [biowr] 7.92r 0.00u 0.72s 4% 2224k
load: 0.52 cmd: mount 27675 [biowr] 54.83r 0.00u 2.64s 2% 2224k
load: 0.47 cmd: mount 27675 [biowr] 68.05r 0.00u 3.12s 2% 2224k
load: 0.58 cmd: mount 27675 [biowr] 121.83r 0.00u 5.08s 3% 2224k
load: 0.36 cmd: mount 27675 [biowr] 254.59r 0.00u 9.97s 2% 2224k
> time -l
root@narawntapu:/etc/mail (1034) /usr/bin/time -l mount -oro /
330,18 real 0,00 user 13,01 sys
2212 maximum resident set size
22 average shared memory size
7 average unshared data size
121 average unshared stack size
112 page reclaims
0 page faults
0 block input operations
344120 block output operations
0 messages sent
0 messages received
0 signals received
344178 voluntary context switches
1014 involuntary context switches
The entire FS is:
Filesystem 1M-blocks Used Avail Capacity Mounted on
/dev/ada0s1a 22808 16268 4715 78% /
*what* 344120 "block output operations"?!
(In reply to Mikhail Teterin from comment #7)
Perhaps more interestingly,
> 344178 voluntary context switches
Almost every single one of those bios is written synchronously, it looks like.
How many inodes does the filesystem have? Does mount -oro,noatime -u (in addition to rw,noatime) help in any way?