Bug 253718 - Major issue with zombie processes from standard base system utils
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.0-STABLE
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: freebsd-bugs (Nobody)
Reported: 2021-02-20 11:20 UTC by dmilith
Modified: 2021-02-21 16:54 UTC

Description dmilith 2021-02-20 11:20:28 UTC
Since FreeBSD 13-ALPHA I have been able to reproduce a very nasty bug. I thought it was HardenedBSD-related, but today I also reproduced it on a vanilla version of FreeBSD.

The issue happens when I build software using my "Sofin" script (which is just a plain sh script using sed, awk and other base system utilities to automate the software build process). From htop, the issue looks like this:

http://s.verknowsys.com/122c70f4414e3ccd903d34746d8284e8.png

In this case, two processes turn into zombies and the build is stuck (here the perl build process hung while invoking sed and zsh).
This is how the build looks from my script's side: http://s.verknowsys.com/9fe4981b309b381785e0b4c4ab7d0aea.png (this is how I know it's perl in this case, but I had the same issue building Qemu and other software bundles before).

It's a very severe bug (I would say a critical blocker) that is not that hard to reproduce.
Comment 1 Konstantin Belousov freebsd_committer 2021-02-20 11:23:05 UTC
If it is so easy to reproduce, provide an absolutely minimal reproduction scenario.
Comment 2 dmilith 2021-02-20 23:30:51 UTC
I have provided a whole VM image, because my system (svdOS) is a whole stack of things built on top of FreeBSD, including:

1. Sofin - which requires stuff like /Software/Git and /Software/Zsh on ZFS datasets to run properly.
2. There's a whole ZFS dataset infrastructure provided for Sofin utility… which would take a while to set up manually.
4. There are prepared build-utilities used by build-host (under /Services/Sofin/), which is an even trickier part.
5. Every software bundle is read-only, the system is divided into read-only and writable parts. / is made read-only…

Well, it's quite a lot of custom stuff, but it's all open source and without any magic.
If you'd like to look at the source code, it is open and available here: https://github.com/VerKnowSys/sofin and the system build code is here: https://github.com/VerKnowSys/svdOS - basically shell scripts and a few lines of C code.

So all of that requires a special script to be invoked on a vanilla OS… which would take far too long to explain…


So here is the easiest reproduction path I can provide:

Go to http://software.verknowsys.com/build-host-images/

There you'll find my VMware-exported build-host-vm in standard "ova" format (it's a tarball). It should be possible to convert the vmdk to qcow2 if you need to.
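
For anyone who needs qcow2, the conversion mentioned above could be sketched roughly like this. It is only an illustration: the helper name `ova_to_qcow2` and the file names in the usage comment are hypothetical, and it assumes `tar` and `qemu-img` are installed on the machine doing the conversion.

```shell
# Hypothetical helper (not part of Sofin or svdOS): unpack an OVA, which is
# a plain tar archive, and convert every VMDK disk inside it to qcow2.
ova_to_qcow2() {
    ova=$1       # path to the downloaded .ova file
    outdir=$2    # directory that receives the unpacked and converted files
    mkdir -p "$outdir" || return 1
    tar -xf "$ova" -C "$outdir" || return 1
    for vmdk in "$outdir"/*.vmdk; do
        # Skip the unexpanded pattern when the archive held no .vmdk files.
        [ -e "$vmdk" ] || continue
        qemu-img convert -f vmdk -O qcow2 "$vmdk" "${vmdk%.vmdk}.qcow2" || return 1
    done
}

# Usage (placeholder file names):
#   ova_to_qcow2 build-host-vm.ova ./build-host-vm
```

The original vmdk files are left in place next to the converted images, so nothing is lost if the conversion has to be repeated.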



Here's the reproduction path:

0. Start the vm.
1. Log in as root to the VM over SSH. Use "/'\;[]p-=" as root password.
2. Run these commands:

# Tell Sofin to remove both previously built software bundles:
s rm Imagemagick Qemu

# let's now build both bundles again:
s b Imagemagick Qemu


It will start building Imagemagick and Qemu with all software dependencies step by step… which will take a while.

In my case, the last hang was on the "perl" requirement of the Qemu bundle, but in the build before that it was the "qemu" requirement of the Qemu bundle…

When you notice no progress for a long time, log in over SSH again; htop or top should show you the zombie processes.
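
If htop is not convenient, the same check can be done with plain ps. The snippet below is a minimal sketch, not something taken from Sofin: the function name `list_zombies` is made up for illustration, and it assumes the process state is in the third column, which is the case for the `ps` invocation shown in the usage comment.

```shell
# Hypothetical helper: read ps output on stdin and keep the header line plus
# every row whose STAT column (third field) starts with "Z", i.e. zombies.
list_zombies() {
    awk 'NR == 1 || $3 ~ /^Z/'
}

# Usage on the stuck VM:
#   ps -axo pid,ppid,stat,command | list_zombies
```

The PPID column is the interesting part: a zombie stays in the process table until its parent collects its exit status with wait, so the parent PID points at the process that is actually stuck. On FreeBSD, running `procstat -kk` against that parent PID may help show where it is blocked in the kernel.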


Hope you'll be able to reproduce it as well :)