After upgrading to pkg 1.10, I noticed on my package build machine (bare metal, i7 + SSD) that extracting packages was roughly 100x slower than with pkg 1.9: a 'pkg upgrade' took hours instead of minutes, making it unusable. pkg was sitting in the 'biowr' state, and 'systat -vmstat' showed the disk at '100% busy' even though there were only about 10-50 transactions/sec and less than 1 MB/s of throughput. A 'dd' in a separate terminal can write at 170 MB/s, so the physical disk is not saturated.

truss showed that a lot of time was spent in the ftruncate() call, which was introduced in pkg 1.10 (https://github.com/freebsd/pkg/commit/2085a3e975fc3b1102cbf809f325cb2d394117bc), which explains why the problem only appeared recently. I created a minimal example which exhibits the problem:

________ example.c ________
#include <fcntl.h>
#include <unistd.h>

int main() {
    char* outfile = "woot2";
    int fd = open(outfile, O_WRONLY|O_CREAT|O_EXCL, 0100644);
    ftruncate(fd, 100000);
    close(fd);
}

________ loop.sh ________
#!/bin/sh
/usr/bin/cc -o example example.c || exit 1
/usr/bin/uname -a
/sbin/mount | grep ada
while true; do
    rm -f ./woot2
    sleep 1
    /usr/bin/truss -D -o truss.out ./example
    grep ftruncate truss.out
done
_________________________

This produces the following output on my machine:

FreeBSD builder.dt.lfms.nl 10.3-RELEASE-p18 FreeBSD 10.3-RELEASE-p18 #0: Tue Apr 11 10:31:00 UTC 2017 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
/dev/ada0p2 on / (ufs, local, noatime, journaled soft-updates)
0.439747415 ftruncate(3,0x186a0) = 0 (0x0)
0.431741841 ftruncate(3,0x186a0) = 0 (0x0)
0.000638086 ftruncate(3,0x186a0) = 0 (0x0)
0.000648680 ftruncate(3,0x186a0) = 0 (0x0)
0.477675635 ftruncate(3,0x186a0) = 0 (0x0)
0.000652645 ftruncate(3,0x186a0) = 0 (0x0)
0.467652474 ftruncate(3,0x186a0) = 0 (0x0)
0.000646409 ftruncate(3,0x186a0) = 0 (0x0)
0.466081847 ftruncate(3,0x186a0) = 0 (0x0)
0.432613030 ftruncate(3,0x186a0) = 0 (0x0)
[...]

As can be seen, a significant fraction of the ftruncate() calls take almost half a second to complete, where you would expect this operation to take essentially no time. That explains why pkg, which ftruncates thousands of files, is suddenly so slow.

I have tried disabling SU journaling, and also disabling soft updates entirely. The problem only appears on UFS with journaled soft updates:

mount                           | delay
--------------------------------|--------
journaled soft updates          | PRESENT
soft updates without journaling | absent
no soft updates                 | absent

I have tried booting in single-user mode from various live CDs and letting the loop script run for a longer time. The problem is most severe on FreeBSD 10.3, is absent on FreeBSD 11.0, and seems to be back in much diminished form on a 12.0-CURRENT (Thu Apr 20 05:39:11) image. I took the following measurements:

Kernel                      | slow | fast | slow%
----------------------------|------|------|------
10.3-RELEASE-p18 amd64      |   54 |  138 |  28%
11.0-RELEASE-p1 amd64       |    0 |  299 |   0%
12.0-CURRENT r317181 amd64  |    5 |  224 |   2%

I have tried to reproduce this problem on various VMs, but could not get it to appear. So far, the single bare metal Core i7 + SSD machine is the only one that has it.
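For completeness, here is one way slow/fast counts like those in the table above could be tallied from the truss output. This is only a sketch: the 0.1 s cutoff and the 300 iterations are assumptions for illustration, not the exact method used for the table.

________ count_ftruncate.sh (sketch) ________
#!/bin/sh
# Tally "slow" vs "fast" ftruncate() calls using the same truss-based
# measurement as loop.sh. Run in the same directory as the compiled example.
slow=0
fast=0
i=0
while [ $i -lt 300 ]; do
    rm -f ./woot2
    sleep 1
    /usr/bin/truss -D -o truss.out ./example
    # first field of the ftruncate line is the elapsed time in seconds
    t=$(grep ftruncate truss.out | awk '{print $1; exit}')
    if [ "$(echo "$t > 0.1" | bc)" -eq 1 ]; then
        slow=$((slow + 1))
    else
        fast=$((fast + 1))
    fi
    i=$((i + 1))
done
echo "slow: $slow  fast: $fast"
_________________________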
The disk is:

ada0 at ahcich2 bus 0 scbus2 target 0 lun 0
ada0: <OCZ-VERTEX4 1.5> ACS-2 ATA SATA 3.x device
ada0: Serial Number OCZ-****************
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 244198MB (500118192 512 byte sectors)
ada0: quirks=0x1<4K>
ada0: Previously was known as ad8

The partition table looks fine to me:

=>        34  500118125  ada0  GPT  (238G)
          34          6        - free -  (3.0K)
          40        128     1  freebsd-boot  (64K)
         168  490733432     2  freebsd-ufs  (234G)
   490733600    8388608     3  freebsd-swap  (4.0G)
   499122208     995951        - free -  (486M)

I would like to debug further into the kernel but have little dtrace experience. Any hints would be welcome.
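A possible starting point, sketched with the stock DTrace syscall provider (the dtraceall module needs to be loaded first; I have not verified this on the affected machine):

# load the DTrace modules if not already loaded
kldload dtraceall

# histogram of ftruncate() latency in nanoseconds while loop.sh is running;
# press Ctrl-C to print the aggregation
dtrace -n '
syscall::ftruncate:entry { self->ts = timestamp; }
syscall::ftruncate:return /self->ts/ {
    @lat = quantize(timestamp - self->ts);
    self->ts = 0;
}'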
Do you have TRIM enabled on your volume? If yes, is the problem reproducible after disabling it?
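For reference, a quick way to check might be something like this (device names taken from the report above; the exact output wording may differ):

tunefs -p /dev/ada0p2                      # prints current UFS tuning, including the TRIM (-t) flag
camcontrol identify ada0 | grep -i trim    # whether the SSD advertises TRIM support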
Good idea, I forgot to mention: I had not enabled TRIM. I have now enabled it and cleared the free blocks (fsck_ufs -E). This has made no difference; the results are the same in single-user mode with or without TRIM.
Could you try the same load on your machine with, e.g., a swap-backed md(4) device formatted as UFS with the same tuning (see tunefs(8))? Is the slow-down reproducible there? If not, I wonder if you could replace your disk with a different model and try the same load.
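For example (a sketch only; the md unit, size and mount point below are arbitrary):

mdconfig -a -t swap -s 4g -u 10    # create swap-backed /dev/md10
newfs -j /dev/md10                 # -j enables journaled soft updates
mkdir -p /mnt/mdtest
mount /dev/md10 /mnt/mdtest
( cd /mnt/mdtest && /path/to/loop.sh )   # run the reproducer from the original report there
umount /mnt/mdtest
mdconfig -d -u 10                  # clean up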
I'm having the same issue with a box I upgraded from 10.3 to 11.1 (via 11.0) on a UFS filesystem. I'm pretty sure extracting packages wasn't this slow earlier.
I have the same problem after upgrading from 10.1 to 11.1 - very slow pkg extraction.

LSI RAID1 (mps0), UFS filesystem
pkg-1.10.1
kernel: 11.1-RELEASE-p1 FreeBSD 11.1-RELEASE-p1 #0: Wed Aug 9 11:55:48 UTC 2017 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

camcontrol devlist:
<LSI Logical Volume 3000> at scbus0 target 0 lun 0 (pass0,da0)
I'm also noticing this since upgrading from 10.3-RELEASE to 11.1-RELEASE. All of a sudden, pkg upgrades were unbearably slow when extracting. This is on an ESXi 5.5 host with a fairly old non-SSD RAID 6 datastore. The older FreeBSD VMs on this host don't show the issue when upgrading - only the two FreeBSD 11.1 VMs. 'top -m io' showed pkg using 100% and its state was biowr.

FreeBSD: 11.1-RELEASE-p6
pkg: 1.10.3
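For anyone wanting to watch the same per-process I/O view, something like the following should do (sorting by total I/O is optional):

top -m io -o total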
Could somebody experiencing the problem look at the disk that was originally reported as 100% busy during the test, using `gstat -I 1s -pod`? It would show the distribution of I/O operations between reads, writes, deletes and cache flushes, with counts, sizes and average latency for each. It could be a starting point if the problem is indeed disk-bound. As I understand it, journaled UFS may have started to use cache flushes that were not used before, and I can theoretically imagine scenarios where some disks do not handle them well.
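For example, in a second terminal while loop.sh is running (a sketch; -b requests batch output so it can be redirected to a file):

gstat -b -I 1s -pod > gstat.log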
I just tried to recreate the issue using the script in OP, but my times don't look as bad as Walter's.

FreeBSD site.com 11.1-RELEASE-p4 FreeBSD 11.1-RELEASE-p4 #0: Tue Nov 14 06:12:40 UTC 2017 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
/dev/da0p2 on / (ufs, NFS exported, local, journaled soft-updates)
0.066318722 ftruncate(3,0x186a0) = 0 (0x0)
0.025288527 ftruncate(3,0x186a0) = 0 (0x0)
0.274295490 ftruncate(3,0x186a0) = 0 (0x0)
0.044457881 ftruncate(3,0x186a0) = 0 (0x0)
0.108361900 ftruncate(3,0x186a0) = 0 (0x0)
0.124630942 ftruncate(3,0x186a0) = 0 (0x0)
0.126537198 ftruncate(3,0x186a0) = 0 (0x0)
0.042581028 ftruncate(3,0x186a0) = 0 (0x0)
0.054237172 ftruncate(3,0x186a0) = 0 (0x0)
0.068103733 ftruncate(3,0x186a0) = 0 (0x0)
0.126835073 ftruncate(3,0x186a0) = 0 (0x0)

gstat:

L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d    o/s   ms/o   %busy Name
   0      7      0      0    0.0      7    115   15.4      0      0    0.0      0    0.0     9.9 da0
   0      5      0      0    0.0      5     69   15.5      0      0    0.0      0    0.0     6.9 da0
   0      1      0      0    0.0      1      0    8.7      0      0    0.0      0    0.0     0.9 da0
   0      4      0      0    0.0      4     67   23.4      0      0    0.0      0    0.0     7.7 da0
   0      5      0      0    0.0      5     68   16.0      0      0    0.0      0    0.0     7.1 da0
(In reply to Alexander Motin from comment #7)
AFAIK, the journaling code issues BIO_FLUSH only when debug.softdep.flushcache is set to 1, which it is not by default. I have not been able to reproduce the issue with or without that sysctl set, but it would be interesting to know whether anyone who can reproduce this is setting that sysctl.
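For anyone testing this, the sysctl can be inspected and toggled at runtime, for example:

sysctl debug.softdep.flushcache      # show the current value (0 by default)
sysctl debug.softdep.flushcache=1    # let the journaling code issue BIO_FLUSH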
I'm currently experiencing this again during an upgrade from 11.1 to 11.2. Changing debug.softdep.flushcache from its default 0 appears to have no effect on the package extraction.