Bug 263908

Summary: Something spawning many "sh" processes (possibly zfsd) stalled the system (No more processes); would not boot normally afterward
Product: Base System
Reporter: Greg <greg>
Component: bin
Assignee: freebsd-fs (Nobody) <fs>
Status: Closed
Resolution: FIXED
Severity: Affects Only Me
CC: asomers, grahamperrin, leeb
Priority: ---
Keywords: needs-qa, regression
Version: 13.1-STABLE
Hardware: amd64
OS: Any
See Also: https://reviews.freebsd.org/D6793

Description Greg 2022-05-11 00:11:02 UTC
Not sure how, or even if, I should report this. However, I figured I should say something, since the process I am using to install and run 13.1-RC6 is basically the same as what I had going with 13.0, but now with a serious issue. All things being equal, that points to a flaw or difference in 13.1-RC6 compared to 13.0.

Did a fresh install of 13.1-RC6 on Sunday (05/08) evening. Ran into an issue with the MFI driver (reported as bug 263906) but was able to work around it with the MRSAS driver (which I intended to use anyway). Installed common packages for benchmarks. Built a zpool using dRAID out of HDDs, plus a special vdev using a 3-way mirror of SSDs. Applied a mix of system tunables that had been working reliably under 13.0 (can provide if requested). Started a test set of back-to-back fio and iozone benchmarks.

The next morning I went to check results. I found I could not run anything; I was getting "No more processes" from my shell. I left it running, and later Monday evening found I was able to run processes again. But there were more than 37,000 instances of "sh" running, mostly sleeping. I was able to pull /var/log/messages and found:

May  9 20:11:00 freebsd kernel: maxproc limit exceeded by uid 2 (pid 21916); see tuning(7) and login.conf(5)

Results from top at the time:

last pid: 22684;  load averages:  0.26,  0.18,  0.11                                                                                                                          up 0+22:20:59  20:15:46
37976 processes:1 running, 37975 sleeping
CPU:  0.1% user,  0.0% nice,  6.0% system,  0.0% interrupt, 93.8% idle
Mem: 1112K Active, 19G Inact, 8491M Laundry, 2648M Wired, 40K Buf, 817M Free
ARC: 236M Total, 50M MFU, 108M MRU, 2067K Header, 75M Other
     90M Compressed, 222M Uncompressed, 2.46:1 Ratio
Swap: 8192M Total, 2784M Used, 5408M Free, 33% Inuse

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
22684 root          1  20    0    72M    46M CPU1     1   0:16  85.79% top
25011 ntpd          1  20    0    21M  1724K select   3   0:02   0.00% ntpd
 8242 root          1  52    0    13M  2004K wait     1   0:01   0.00% sh

Did a reboot, and it has been all downhill from there. The system will no longer boot, at least not to a login prompt. It stalls at several points during startup: after the USB driver loads, and after starting the network. I can coax it along somewhat with ctrl-c/x/z; the last thing it will do is "Starting devd".

The kernel seems to be running, as it will reboot if you hit ctrl-alt-del, or power down if you tap the power button.

I can get into single-user mode, but I find /var/log is empty.

I let it sit for a while at one point, and over time it displayed a few lines saying it was killing off "sh" processes.

Because I had rebooted several times on the first night, right now I suspect some stock ("out of the box") cron job is running in a loop and creating all the "sh" processes. But I don't have enough detail yet.

Honestly, I am still figuring out how to get the root file system out of read-only mode when booted single-user. I want to comment out everything in /etc/crontab and try booting, to see if one of those jobs is the cause. (Again, all "stock"; I haven't created any custom cron jobs yet.)

Because of the issues with the MFI driver, I did pull the LSI 9361 HBA out of the server. I even destroyed the dRAID pool. That doesn't seem related; the issue persists.

So why am I reporting this as a "bug" when I lack enough detail to confirm the actual issue? Because every single step I took was the same as under 13.0, on the same hardware, which had been 100% stable for 3+ months. All things being equal, there is something "wrong" or "different" in 13.1-RC6 that is breaking my setup.

In the interest of helping rule this out as a flaw in RC6, I am willing to do what I can to troubleshoot further, but I would honestly need more input as to the proper diagnostic steps. I do have a little more time to "play" with this hardware before I have to select a version and put it into production. I was holding out so I could run 13.1 when it goes to release, but if I cannot figure this out I will roll back to 13.0 for production, since that was fully stable.

Please let me know what other details to provide, along with suggestions for troubleshooting and further diagnostics. I'm just looking to contribute to RC6 testing and determine whether this is a bug or a "just me" problem. Thanks!

-Greg-
Comment 1 Greg 2022-05-12 02:08:44 UTC
I have made some progress, and I have to report that something is up with zfsd! I cannot say for sure that it is what was spawning all the "sh" processes, but I suspect that to be the case.

After commenting out the following from /etc/rc.conf I am now able to boot normally:

#zfsd_enable="YES"
#service zfsd start

This was after figuring out how to zfs set readonly=off and mount -a my zroot pool in single-user mode (a rough sketch of those commands follows the list below), and after first trying all of this:

- Comment out everything in /etc/crontab
- Remove all the sysctl and other tunable tweaks I had customized
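
Roughly what I ended up running from the single-user shell (a sketch from memory; the zroot/ROOT/default dataset name is an assumption based on a stock ZFS-on-root install and may differ on other layouts):

zfs set readonly=off zroot/ROOT/default   # make the root dataset writable
zfs mount -a                              # mount the remaining zroot datasets
mount -a                                  # mount anything else listed in /etc/fstab
vi /etc/rc.conf                           # /etc/rc.conf and /etc/crontab can now be edited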

Now it is complaining about my dRAID test pool not being available; that pool appears to still be listed in zpool.cache. But the system was already failing to boot before I pulled one of the HBAs (the LSI 9361 mentioned previously), so this issue with zfsd existed while that pool was still available.

I will double-check, but I am fairly sure there was nothing wrong with that pool. Regardless, I cannot imagine it is intended behavior for zfsd to prevent a system from booting, whatever state the zpools are in, beyond perhaps serious issues with zroot, which doesn't appear to be the case here (it passes a scrub with no issues).

If anyone is interested in gathering more debugging information while I still have the test case and hardware set up for this, please let me know. I am willing to put a little more effort into figuring this out.

Again, the same setup under 13.0 was not having this issue. Same benchmarks run back-to-back for days on end. Same dRAID design. Same use of zfsd.

Thanks!

-Greg-
Comment 2 Graham Perrin freebsd_committer freebsd_triage 2023-04-22 09:08:04 UTC
(In reply to Greg from comment #1)

13.1-RELEASE was announced around four days later. 

If it is not too late to ask: can you recall whether the issue persisted with RELEASE?
Comment 3 Greg 2023-04-30 18:53:24 UTC
(In reply to Graham Perrin from comment #2)

According to my notes, I left these commented out for the remainder of 13.1-RC6 testing:

#zfsd_enable="YES"
#service zfsd start

After 13.1-RELEASE came out and I was ready to move from lab testing to a production setup of this new server, I did a fresh install. I did not run into the issue with zfsd spawning many "sh" processes, and it has never come back. The server has been in production use for almost a year with no real issues. So whatever caused this was resolved by the time 13.1-RELEASE hit.

I did still have this issue:
Bug 263906 - MFI driver fails with "Fatal firmware error" line 1155 in ../../dm/src/dm.c 

Where I had to set:

set hw.mfi.mrsas_enable="1"
set hint.hw.mfi.mrsas_enable="1"

This had not been an issue under 13.0. Outside of that, 13.1 has been very stable and the server has been performing well.
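
(For reference, a sketch of the persistent form of those settings, assuming they belong in /boot/loader.conf as loader tunables rather than being typed as "set" commands at the loader prompt each boot:)

# /boot/loader.conf -- assumed equivalent of the "set" commands quoted above
hw.mfi.mrsas_enable="1"
hint.hw.mfi.mrsas_enable="1"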
Comment 4 Alan Somers freebsd_committer freebsd_triage 2023-05-01 13:00:47 UTC
zfsd never spawns any sh processes, so it can't be the cause of your initial fork bomb.
Comment 5 leeb 2023-05-01 21:11:29 UTC
Expected behaviour.

The service executable sources rc.conf. If a call to the service executable (like the "service zfsd start" line above) is placed in rc.conf itself, sourcing rc.conf runs that call, which sources rc.conf again, so it is invoked recursively. Hence the pile of sh processes.
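
A minimal sketch of that failure mode, using hypothetical /tmp/fake_service and /tmp/fake_rc.conf files rather than the real rc.d machinery, with a depth guard added so the demo stops instead of actually fork-bombing:

cat > /tmp/fake_rc.conf <<'EOF'
# Like the reporter's rc.conf: a knob assignment plus a command.
fake_enable="YES"
/tmp/fake_service start   # the bug: a command inside a file that gets sourced
EOF

cat > /tmp/fake_service <<'EOF'
#!/bin/sh
# Toy stand-in for an rc.d script: it sources its config as shell code,
# the way rc.subr sources /etc/rc.conf.
DEPTH=${DEPTH:-0}
if [ "$DEPTH" -ge 5 ]; then
    echo "recursion depth $DEPTH reached, stopping demo"
    exit 0
fi
export DEPTH=$((DEPTH + 1))
. /tmp/fake_rc.conf        # sourcing the config re-invokes this script
echo "started (depth $DEPTH)"
EOF
chmod +x /tmp/fake_service

/tmp/fake_service start    # every level of the recursion is another sh process

With the guard removed, each level simply spawns another sh and sources the config again, matching the thousands of sleeping sh processes in the original top output. The fix is to keep only the zfsd_enable="YES" knob in rc.conf and drop the "service zfsd start" line, running that command from the command line instead.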