Tested on a Xeon-E5 v3 with 64GB memory and 11-current #289783.

When memory utilization is high, the command 'systat -vmstat 1' also consumes more CPU cycles:

    % top
    last pid: 40272;  load averages:  0.32,  4.74,  8.01  up 3+01:19:54  17:23:59
    58 processes:  1 running, 57 sleeping
    CPU:  0.1% user,  0.0% nice,  1.6% system,  0.1% interrupt, 98.3% idle
    Mem: 4509M Active, 52G Inact, 2819M Wired, 1572M Buf, 2930M Free
    Swap: 3598M Total, 3598M Free

      PID USERNAME  THR PRI NICE   SIZE    RES STATE    C   TIME    CPU COMMAND
    46841 root       30  20    0  9248M  7930M kqread   9  20.8H 11.88% bhyve
    49914 jsli        1  23    0 19320K  3884K select   5 134:08  4.79% systat

The cause appears to be reading the sysctl variable vm.vmtotal, which in turn calls vmtotal(). This can be reproduced with the sysctl command:

    % time repeat 100 sysctl vm.vmtotal > /dev/null
    0.055u 8.102s 0:08.19 99.5% 31+175k 0+0io 0pf+0w

Freeing memory does not seem to reduce the load:

    % top
    last pid: 14037;  load averages:  0.15,  3.09,  6.45  up 3+17:45:01  09:49:06
    57 processes:  1 running, 56 sleeping
    CPU:  0.0% user,  0.0% nice,  1.2% system,  0.0% interrupt, 98.8% idle
    Mem: 2060M Active, 45M Inact, 3004M Wired, 1572M Buf, 57G Free
    Swap: 3598M Total, 623M Used, 2975M Free, 17% Inuse

      PID USERNAME  THR PRI NICE   SIZE    RES STATE    C   TIME    CPU COMMAND
    49914 jsli        1  27    0 19320K  2780K select   8 199:48 12.56% systat
     1026 jsli        1  29    0 28116K  8372K select   7  45:39  0.86% tmux

    % time repeat 100 sysctl vm.vmtotal > /dev/null
    0.070u 25.400s 0:26.35 96.6% 28+174k 0+0io 0pf+0w

In contrast, this is the status shortly after reboot:

    % top
    last pid:  1083;  load averages:  0.15,  0.24,  0.12  up 0+00:02:45  09:56:59
    42 processes:  3 running, 39 sleeping
    CPU:  0.0% user,  0.0% nice,  0.1% system,  0.0% interrupt, 99.9% idle
    Mem: 72M Active, 39M Inact, 451M Wired, 11M Buf, 62G Free
    Swap: 3598M Total, 3598M Free

      PID USERNAME  THR PRI NICE   SIZE    RES STATE    C   TIME    CPU COMMAND
     1004 jsli        1  38    0 21972K  4176K CPU6     6   0:00  0.52% tmux
      814 root        1  20    0   213M 14128K select   3   0:00  0.37% nmbd
     1031 jsli        1  20    0 17272K  4628K select   1   0:00  0.16% systat
     1083 jsli        1  38    0  9008K  2464K CPU11   11   0:00  0.05% sh

    % time repeat 100 sysctl vm.vmtotal > /dev/null
    0.055u 0.160s 0:00.20 105.0% 172+238k 0+0io 0pf+0w

Originally discussed in http://lists.freebsd.org/pipermail/freebsd-hackers/2015-October/048483.html
I noticed that if a program calls clock() frequently (clock() in turn calls getrusage()), the whole system becomes slow to respond. For example, we run the word2vec program (http://word2vec.googlecode.com/svn/trunk/word2vec.c) with 32 threads on a 32-core machine, and while it runs all other programs (even single-threaded ones) run an order of magnitude slower than they do without word2vec. I wonder if the reason is the same.
Forgot to mention in my previous comment that removing the clock() call from the word2vec program fixes the issue.