Bug 235556

Summary: rctl(8) pcpu/cputime is too high
Product: Base System
Component: kern
Status: New
Severity: Affects Many People
Priority: ---
Version: CURRENT
Hardware: Any
OS: Any
Reporter: Oleg Ginzburg <olevole>
Assignee: freebsd-bugs (Nobody) <bugs>
CC: allanjude, cyril, markj

Description Oleg Ginzburg 2019-02-06 15:51:17 UTC
This is a duplicate of https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=171811, created at the request of gonzo@ so that the old report from 2012 can be closed.

I don't know whether the original reporter (ben@desync.com) is still around, but I am, so I'll take this over.

All information from the old issue is still relevant, and the problem affects all "supported" versions of FreeBSD as of 2019.

More information on how to reproduce: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=171811#c3
Comment 1 Allan Jude freebsd_committer freebsd_triage 2019-02-07 03:41:04 UTC
Can you freshly describe the problem and provide some reproduction steps?
Comment 2 Oleg Ginzburg 2019-02-07 10:44:54 UTC
RACCT pcpu metrics are incorrect when processes end quickly.

What I expect:
 a maximum value of 100 for a process running on 1 core

What I get:
 100 x 256 (pcpu=25600)

This is not an artificial or abstract case. The behavior is easy to see, for example, during an autotools configure run, which launches a large number of short-lived commands (e.g. env BATCH=no make -C /usr/ports/misc/mc clean configure). This makes it impossible to use any external billing based on RACCT.

How to reproduce (cpuset is used here to put load on only one core, so the jail should show pcpu=100, assuming it does nothing else):

1) Run jail1
2) Execute any fast, lightweight external command (e.g. /bin/ls) in a loop.
To make the effect more convincing, create a trivial utility:

---
/* Trivial program that exits immediately; it needs no headers. */
int main(void)
{
    return 0;
}
---

Write a loop that executes it (the compiled utility is /root/a.out) and place the script in the jail, e.g. as /root/run.sh:
---
#!/bin/sh

while [ 1 ]; do
    /root/a.out > /dev/null
done
---

Run this script inside the jail via cpuset:

cpuset -c -l 0 /bin/sh /root/run.sh


After this, 'top -P' shows something like:
---
182 processes: 2 running, 180 sleeping
CPU 0: 34.1% user,  0.0% nice, 65.9% system,  0.0% interrupt,  0.0% idle
CPU 1:  0.5% user,  0.0% nice,  0.0% system,  0.0% interrupt, 99.5% idle
CPU 2:  3.1% user,  0.0% nice,  1.2% system,  0.0% interrupt, 95.7% idle
CPU 3:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
CPU 4:  1.2% user,  0.0% nice,  0.8% system,  0.0% interrupt, 98.1% idle
CPU 5:  0.8% user,  0.0% nice,  0.4% system,  0.0% interrupt, 98.8% idle
CPU 6:  1.2% user,  0.0% nice,  0.0% system,  0.0% interrupt, 98.8% idle
CPU 7:  0.4% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.2% idle
...

  PID USERNAME       THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
41437 root             1  76    0 11408K  2232K CPU0    0   0:07  12.79% sh
...
---


Only one core is busy. However, looking at RACCT from the host side, we see the following:

freebsd:~ # rctl -u jail:jail1 | grep pcpu
pcpu=25600
freebsd:~ # rctl -u jail:jail1 | grep pcpu
pcpu=25600
freebsd:~ # rctl -u jail:jail1 | grep pcpu
pcpu=25600
Comment 3 cyril 2021-06-03 21:00:11 UTC
I've created a patch here: https://reviews.freebsd.org/D30632 which seems to fix this problem.
Comment 4 Oleg Ginzburg 2021-06-04 09:29:54 UTC
(In reply to cyril from comment #3)

I confirm - everything is fine now.
Tested on: FreeBSD 14.0-CURRENT #0 main-n247127-1976e079544-dirty amd64
Comment 5 cyril 2021-07-13 19:50:14 UTC
(In reply to Oleg Ginzburg from comment #4)

markj discovered that the above patch is actually not correct. I have made another patch that modifies how pcpu is calculated. It is now based on the elapsed cputime value divided by the elapsed realtime value, rather than aggregating the pcpu of all processes in a jail. You can try the patch here: https://reviews.freebsd.org/D30878
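
To illustrate the calculation described above (elapsed cputime divided by elapsed realtime), here is a minimal sketch in C. It is not the code from D30878: the struct, its field names, and the function pcpu_between() are hypothetical and chosen only for illustration, and microsecond units are an assumption.

```c
#include <stdint.h>

/*
 * Hypothetical snapshot of a jail's accumulated usage. The field names are
 * illustrative only; they are not taken from the RACCT sources or D30878.
 */
struct usage_sample {
	uint64_t cputime_us;	/* total CPU time consumed so far, in microseconds */
	uint64_t realtime_us;	/* wall-clock time of the snapshot, in microseconds */
};

/*
 * Percent CPU over the interval between two snapshots:
 * elapsed cputime divided by elapsed realtime, scaled to percent.
 * Returns 0 if no wall-clock time has passed or a counter went backwards.
 */
uint64_t
pcpu_between(const struct usage_sample *prev, const struct usage_sample *cur)
{
	uint64_t dcpu, dreal;

	if (cur->realtime_us <= prev->realtime_us ||
	    cur->cputime_us < prev->cputime_us)
		return (0);

	dcpu = cur->cputime_us - prev->cputime_us;
	dreal = cur->realtime_us - prev->realtime_us;
	return (dcpu * 100 / dreal);
}
```

With this approach, a jail that keeps exactly one core busy accumulates 1,000,000 us of cputime per 1,000,000 us of realtime, which yields 100 no matter how many short-lived processes contributed to that cputime. That matches the expected value from comment 2.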