Hello. Can someone please explain where is the error? I am running Python code that simulates forking a large process and gives poor performance results. Average 4 seconds. import time import os some_str = ' ' * (100 * 1024 * 1024 * 1024) while True: p = time.time() pid = os.fork() tt = time.time() - p if pid != 0: print("%d "%pid + str(tt) ) time.sleep(1) if pid == 0: break
How much RAM do you have? Hopefully more than 100GB. I suspect that we're spending a lot of time in pmap_copy(), which copies page tables into the child process. It only copies referenced PTEs, but in this case we'll copy all of the page tables backing some_str. You can try something like this to confirm: # dtrace -q -n 'fbt::pmap_copy:entry /pid == $target/{self->ts = timestamp} fbt::pmap_copy:return /self->ts/{@ = sum(timestamp - self->ts); self->ts = 0;} tick-1s {printa("spent %@dns copying\n", @); clear(@);}' -c "python test.py"
I have 192GB. Trace: dtrace -q -n 'fbt::pmap_copy:entry /pid == $target/{self->ts = timestamp} fbt::pmap_copy:return /self->ts/{@ = sum(timestamp - self->ts); self->ts = 0;} tick-1s {printa("spent %@dns copying\n", @); clear(@);}' -c "python test_fork.py" spent 238724ns copying spent 0ns copying spent 0ns copying spent 0ns copying spent 0ns copying spent 0ns copying spent 0ns copying spent 0ns copying spent 0ns copying 25576 8.959986448287964 spent 525185651ns copying spent 0ns copying spent 0ns copying spent 0ns copying spent 0ns copying spent 279323ns copying 25593 4.342242956161499 spent 434345424ns copying spent 0ns copying spent 0ns copying spent 0ns copying spent 0ns copying spent 0ns copying spent 0ns copying spent 352273ns copying spent 423458987ns copying spent 0ns copying 25597 7.248447418212891 spent 0ns copying spent 0ns copying spent 0ns copying spent 0ns copying spent 0ns copying 25602 4.694051027297974 spent 432109608ns copying spent 0ns copying spent 0ns copying spent 0ns copying spent 0ns copying spent 0ns copying 25619 4.810415983200073
If I am reading the output correctly, we are spending hundreds of ms copying page tables, but this is not where the majority of time is spent. It would also be useful to see output from # dtrace -n 'profile-997hz /pid == $target/{@[stack()] = count()}' -c "python test.py"
Created attachment 223724 [details] dtrace stack Yep. Somewhere hidden more than 4 seconds. Trace in the attachment.