Bug 254639 - Fork syscall performance
Summary: Fork syscall performance
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: Unspecified
Hardware: Any Any
: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-03-29 13:12 UTC by Kirill
Modified: 2021-03-31 06:04 UTC (History)
1 user (show)

See Also:


Attachments
dtrace stack (337.64 KB, text/plain)
2021-03-31 06:04 UTC, Kirill
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Kirill 2021-03-29 13:12:44 UTC
Hello.

Can someone please explain where is the error? I am running Python code that simulates forking a large process and gives poor performance results. Average 4 seconds.

import time
import os
some_str = ' ' * (100 * 1024 * 1024 * 1024)
while True:
    p = time.time()
    pid = os.fork()
    tt = time.time() - p
    if pid != 0:
        print("%d "%pid + str(tt) )
        time.sleep(1)
    if pid == 0:
        break
Comment 1 Mark Johnston freebsd_committer 2021-03-29 15:26:17 UTC
How much RAM do you have?  Hopefully more than 100GB.

I suspect that we're spending a lot of time in pmap_copy(), which copies page tables into the child process.  It only copies referenced PTEs, but in this case we'll copy all of the page tables backing some_str.

You can try something like this to confirm:

# dtrace -q -n 'fbt::pmap_copy:entry /pid == $target/{self->ts = timestamp} fbt::pmap_copy:return /self->ts/{@ = sum(timestamp - self->ts); self->ts = 0;} tick-1s {printa("spent %@dns copying\n", @); clear(@);}' -c "python test.py"
Comment 2 Kirill 2021-03-29 15:36:42 UTC
I have 192GB. Trace:

dtrace -q -n 'fbt::pmap_copy:entry /pid == $target/{self->ts = timestamp} fbt::pmap_copy:return /self->ts/{@ = sum(timestamp - self->ts); self->ts = 0;} tick-1s {printa("spent %@dns
 copying\n", @); clear(@);}' -c "python test_fork.py"
spent 238724ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
25576 8.959986448287964
spent 525185651ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
spent 279323ns copying
25593 4.342242956161499
spent 434345424ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
spent 352273ns copying
spent 423458987ns copying
spent 0ns copying
25597 7.248447418212891
spent 0ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
25602 4.694051027297974
spent 432109608ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
spent 0ns copying
25619 4.810415983200073
Comment 3 Mark Johnston freebsd_committer 2021-03-30 22:21:45 UTC
If I am reading the output correctly, we are spending hundreds of ms copying page tables, but this is not where the majority of time is spent.

It would also be useful to see output from

# dtrace -n 'profile-997hz /pid == $target/{@[stack()] = count()}' -c "python test.py"
Comment 4 Kirill 2021-03-31 06:04:42 UTC
Created attachment 223724 [details]
dtrace stack

Yep. Somewhere hidden more than 4 seconds. Trace in the attachment.