Bug 171811 - [patch] rctl(8) cputime is too high
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 10.0-CURRENT
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2012-09-20 17:00 UTC by ben
Modified: 2021-06-17 16:42 UTC

See Also:


Attachments
kern_racct.diff (535 bytes, patch)
2012-09-20 17:00 UTC, ben

Description ben 2012-09-20 17:00:22 UTC
rctl's idea of cputime is unreasonably high with lots of process turnover.

Fix: Attached patch seems to help.
How-To-Repeat: 
# jail -c command=sh
jail# while true; do id > /dev/null; done

meanwhile:

# dtrace -n 'rusage:add-cred/args[0]->cr_prison->pr_id != 0 && args[1] == 0/{printf("%d: jail %d cputime %d", pid, args[0]->cr_prison->pr_id, args[2])}'
5  57139                  rusage:add-cred 37375: jail 5 cputime 124211
5  57139                  rusage:add-cred 37375: jail 5 cputime 6330
5  57139                  rusage:add-cred 37375: jail 5 cputime 51237828
5  57139                  rusage:add-cred 37375: jail 5 cputime 173602
5  57139                  rusage:add-cred 37375: jail 5 cputime 6834680
(...)
Comment 1 ben 2012-09-21 08:51:18 UTC
Sorry, please ignore my patch; I guess the problem is just that p_prev_runtime is never initialized.  I'm not sure why it exists, but removing it makes things work as expected.
Comment 2 Edward Tomasz Napierala freebsd_committer freebsd_triage 2014-02-03 11:16:30 UTC
Responsible Changed
From-To: freebsd-bugs->trasz

I'll take it.
Comment 3 Oleg Ginzburg 2017-11-03 22:16:17 UTC
Since this problem has gone unfixed for a long time, the best solution may be to add one more line to the 'BUGS' section of rctl(8): https://man.freebsd.org/rctl/8

At the moment (end of 2017) this problem has not been fixed in any supported FreeBSD version. It makes it impossible to get correct statistics for a jail, and it is dangerous for people who run a billing system based on RACCT.

This problem also affects the 'pcpu' metric (%CPU, in percents of a single CPU core) and is easy to reproduce on a single core:

1) Run jail1
2) Execute any fast/light external command (e.g. /bin/ls) in a loop. Or compile this sample as /root/a.out in the jail:

---
#include <stdio.h>

int main(void)
{
	return 0;
}
---

Write an execution loop and drop it into the jail, e.g. /root/run.sh:
---
#!/bin/sh

while [ 1 ]; do
/root/a.out > /dev/null
done
---

Run this script inside the jail via:

cpuset -c -l 0 /bin/sh /root/run.sh


After this, 'top -P' shows:
---
182 processes: 2 running, 180 sleeping
CPU 0: 34.1% user,  0.0% nice, 65.9% system,  0.0% interrupt,  0.0% idle
CPU 1:  0.5% user,  0.0% nice,  0.0% system,  0.0% interrupt, 99.5% idle
CPU 2:  3.1% user,  0.0% nice,  1.2% system,  0.0% interrupt, 95.7% idle
CPU 3:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
CPU 4:  1.2% user,  0.0% nice,  0.8% system,  0.0% interrupt, 98.1% idle
CPU 5:  0.8% user,  0.0% nice,  0.4% system,  0.0% interrupt, 98.8% idle
CPU 6:  1.2% user,  0.0% nice,  0.0% system,  0.0% interrupt, 98.8% idle
CPU 7:  0.4% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.2% idle
...

  PID USERNAME       THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
41437 root             1  76    0 11408K  2232K CPU0    0   0:07  12.79% sh
...
---


Only one core is busy. However, if we look at RACCT from the host side, we see the following picture:

freebsd:~ # rctl -u jail:jail1 | grep pcpu
pcpu=25600
freebsd:~ # rctl -u jail:jail1 | grep pcpu
pcpu=25600
freebsd:~ # rctl -u jail:jail1 | grep pcpu
pcpu=25600
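
A minimal sketch of how the readings above could be checked automatically. It assumes `rctl -u` keeps printing one `name=value` pair per line, as in the output shown, and that a workload pinned to N cores should not report more than N * 100 for pcpu (pcpu being percent of a single core). The helper names `racct_value` and `pcpu_plausible` are hypothetical and not part of any FreeBSD tool.

```shell
#!/bin/sh
# Hypothetical helper: read "name=value" lines (the format printed by
# `rctl -u` above) on stdin and print the value of one resource.
racct_value() {
    awk -F= -v key="$1" '$1 == key { print $2; exit }'
}

# Hypothetical sanity check, based on the assumption that pcpu is a
# percentage of one core: $1 = reported pcpu, $2 = cores in use.
# Succeeds when the reading is at most 100 per core.
pcpu_plausible() {
    [ "$1" -le $(( $2 * 100 )) ]
}

# On a live host with RACCT/RCTL enabled, this could be used as:
#   v=$(rctl -u jail:jail1 | racct_value pcpu)
#   pcpu_plausible "$v" 1 || echo "suspicious pcpu: $v"
```

With the reading above, `pcpu_plausible 25600 1` fails, which matches the observation that a loop pinned to CPU 0 cannot really be using 256 cores' worth of CPU.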


Unfortunately, this is not an unlikely scenario.
You can see similar behavior in real life very often, for example at the configure stage of a build, when a large number of commands are executed.

Try executing in the jail, for example:

env BATCH=no make -C /usr/ports/misc/mc clean configure

And you will see the statistics problem again.
Comment 4 Allan Jude freebsd_committer freebsd_triage 2017-11-05 01:37:48 UTC
As an aside, you might want to look at /usr/bin/sa for accounting.
Comment 5 Oleg Ginzburg 2017-11-05 10:40:46 UTC
(In reply to Allan Jude from comment #4)

Yes, I know about sa(8), but it has other problems (there is no jail support, only CPU metrics, ...)

Ideally, each component of FreeBSD (jail, racct, ...) would have an active maintainer. But today FreeBSD is a hobby OS with a catastrophically small number of developers; fixes can take several years to arrive (or never arrive).

Therefore, if we cannot fix the bug, it should be described in the man pages.

PS: the openfiles metric also gives abnormally high values (off by a few dozen) via RACCT (compared to fstat / lsof). But without a subsystem maintainer, and without such information being entered into the man page, I'm not sure it makes sense to write a PR for this. Unfortunately, I can only help with testing (from a practical point of view) and with reporting issues, not with fixing them ;-) Thanks.
Comment 6 Allan Jude freebsd_committer freebsd_triage 2017-11-05 15:37:53 UTC
(In reply to olevole from comment #5)
The maintainer of RACCT is still active, just busy.

For the open files count, remember that every socket, pipe, and other special type of file descriptor counts as an open file.

I'll try to get someone to look at this.
Comment 7 ben 2017-11-05 20:39:21 UTC
Be aware that my problem was with the cputime measurement, which seemed to be fixed not long after this bug.  Thanks!
Comment 8 Oleg Ginzburg 2017-11-06 11:29:51 UTC
(In reply to ben from comment #7)

Indeed, there is no problem with cputime now. This PR should probably be closed.
Nevertheless, the problem with pcpu is real.
Comment 9 Oleg Ginzburg 2017-11-06 11:35:08 UTC
(In reply to Allan Jude from comment #6)
Allan, Thanks!

Should I register a new PR? The original problem for which this PR was opened is already fixed. The problem with 'pcpu' is very similar:

'pcpu' values are normal for static (long-running) processes (e.g. install net-p2p/cpuminer and run: minerd --benchmark)

but incorrect for these cases: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=171811#c3
Comment 10 Eitan Adler freebsd_committer freebsd_triage 2018-05-20 23:53:47 UTC
For bugs matching the following conditions:
- Status == In Progress
- Assignee == "bugs@FreeBSD.org"
- Last Modified Year <= 2017

Do
- Set Status to "Open"
Comment 11 Oleksandr Tymoshenko freebsd_committer freebsd_triage 2019-01-19 04:58:57 UTC
(In reply to olevole from comment #9)

Please create a new PR for the pcpu issue so this one can be closed. Thanks.
Comment 12 Oleg Ginzburg 2019-02-06 15:51:50 UTC
done https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235556
Comment 13 Mark Johnston freebsd_committer freebsd_triage 2021-06-17 16:42:53 UTC
The original bug is fixed per comment 7.  There is a patch for PR 235556 which will land shortly.