Created attachment 206810 [details]
Various Information On Clean Boot of System
I am not new to Unix, having used it since 1995 and as my only OS since 1999. I am new to FreeBSD, mostly since mid-July 2019, with on/off attempts over the last 15 years.
This problem is serious, but likely only I or a few unique types of users will experience it.
What makes this bug unique from an information gathering and debugging point of view is the complete loss of the keyboard. Complete loss means no input from the keyboard is accepted, nor does any input from the keyboard result in a beep.
At the time of this FreeBSD Kernel condition I had been, and still was, in my XFCE4 DE environment. The mouse was responsive and applications would respond to the mouse, but applications would not respond to any keyboard input. The Ctrl-Alt-Fn keyboard sequence was also unresponsive. For reasons unknown the only XFCE4 function not greyed out was "Log Out".
Various applications were successfully and normally exited via the use of the DE mouse by selecting the application menu to exit the application.
Upon "Log Out" of the XFCE DE the command line appeared, as that is how my system is configured. I did not install SLIM nor any other DE GUI based login manager. The command line would not accept any keyboard input, and likewise there were no error messages nor keyboard beeps.
Two additional observations were also made.
The first and most important: I believe the condition of the keyboard was directly related to the FreeBSD Kernel being in intense swap file activity for about 15 minutes before the base system killed Firefox, which was almost certainly the cause of that activity. Be aware I have far more swap file space than is likely ever to be needed at present, which means the swap file capacity would not have been reached. At the time of the intense swap activity about 4.78GB of swap was in use; as it happens, htop was already running in a terminal, as was XOSview. When the base system killed firefox the swap file use dropped to about 678MB per htop. htop could not be exited via the keyboard "q" normally used to exit htop.
The second occurred by pure chance at the terminal prompt, which was unresponsive to any keyboard input after the successful exit from the XFCE DE. I happened to plug my phone in to charge via the USB cable I leave in the laptop for that purpose. The Android phone in question has no USB access by design of the phone (long story, but Samsung disabled any USB access on this specific version of the phone). Despite the phone not having USB access to files on the phone, the FreeBSD Kernel did issue messages to the terminal when the phone was connected to the USB port. I tried the keyboard again out of curiosity, and found the keyboard was responsive! This allowed me to shut down FreeBSD normally instead of the only other option, which I did not wish to use, being a power off.
Despite my issuing a "poweroff" command, when I powered the laptop on afterwards there appeared to be a number of file system repairs done. I have not been able to find where FreeBSD keeps the messages that flash by while the FreeBSD kernel boots, so I could not save those messages and look at them in more detail.
I had a long career in IT and much of that career was working on bugs that were often very difficult to find, including difficult to duplicate. So I know more information will be needed for when this FreeBSD Kernel bug occurs again. I know it will, but I do not know when.
Among the first sorts of software I like to have on a Unix system, bsdsar was at the top of the list after some basic research. Sadly bsdsar seems to have been an inactive project for a long time. I have been unable to find an alternative to bsdsar that would have at least provided some metrics on the paging activity: its extent, how much, and for how long before firefox was killed by the base system. What I need to know is, given the likely limitation of no keyboard access, what can I do that will collect information about this issue that can be looked at even after I may have to hard power off the laptop instead of using the "shutdown", "poweroff" or "reboot" CLI commands?
Created attachment 206812 [details]
dmesg from just prior to firefox killed by base system, poweroff CLI, to first few messages after Power On laptop
This dmesg covers three periods: from just before the FreeBSD Kernel base system killed firefox, after about 15 minutes of intense swap file activity by firefox; through to when I was able to issue the CLI "poweroff" command, after the phone by chance enabled keyboard access again; and the first few dmesg messages after powering the laptop on again following that "poweroff" command.
I am new to FreeBSD and found the bug was assigned to virtualization@FreeBSD.org because I chose bhyve as "Component". This is not a virtualization bug at all. I was just trying to select a "Component" for the base system and not for sources, which is what some of the "Component" choices indicated. I am going to try "kern" for "Component" even though I never built the FreeBSD kernel from source. I just installed the stable FreeBSD Kernel, which has not been updated since I installed FreeBSD in mid-July 2019.
(In reply to John from comment #2)
Since this is an old'ish stable system would it be possible to reproduce this either on a fresh stable build, or try to reproduce this behavior on a release?
having said that I have certainly seen issues where firefox will swap like crazy and user input will become very slow while a core file for firefox is being generated due to a crash. generally after the intensive i/o is done the system becomes responsive again. also, are you able to ssh into this system when the problem occurs? if so, what processes are running - the systat(1) and procstat(1) man pages will be helpful here.
(In reply to pete from comment #3)
Thanks for reply and information.
I have a few questions.
I installed FreeBSD just after the 11.3 release. I used the USB image and installed all but the sources. I was expecting the base system and kernel updates would occur via updates. So far that appears not to occur assuming no base/kernel updates since the release of 11.3.
I have used pkg update/upgrade that appears to keep the installed packages up to date.
About 4 weeks ago I looked into whether and why the FreeBSD kernel is or is not upgraded. I have not yet determined if there have been base/kernel updates since the 11.3 release. Should I have installed sources? Are sources needed for base/kernel updates? Why, if they are not needed for installing FreeBSD? I chose not to install sources as I did not feel I would have a need to customize the FreeBSD kernel. For many reasons I have had to configure and compile the Linux Kernel many, many times over the past 20 years (usually about 4-8 times a year, sometimes many more). I assumed a stable FreeBSD kernel meant stable, and that stable updates would occur. I sense I am incorrect in that assumption, but I am not certain what the best approach is for FreeBSD kernel updates that does not destabilize my DE and installed packages. In my last attempt at FreeBSD as a test system a couple of years ago, updates to packages and kernel would cause frequent issues and I could not access any X, DE and/or key DE apps for months at a time. One very, very creative effort taking lots of time dug me out of one of those instances of many months of no access beyond CLI after boot. I have always favoured CLI after boot by choice so I can then choose what to do next. Is there a safe approach to updating kernel/base so the packages installed are not out of sync with the kernel/base? If so, where do I look to find this? I have felt safe enough thus far to use FreeBSD as my primary system. The long term ideal is to find a way to create a LiveCD FreeBSD like OliveBSD, for stability, for the handful of systems I have that run 24/7/365 from another Kernel common to LiveCD images.
I believe the core issue is that FreeBSD decided there was too much time and swap activity being used by Firefox, so FreeBSD killed firefox. The core dump, which I have not looked for nor care about, does not take long to create. The system was responsive for that 15+ minutes of intense swap activity, which I am certain was all due to Firefox. As I mentioned, swap file use dropped to basically no swap file space once Firefox was killed by FreeBSD. I have had a few of these events of intense swap file activity for a shorter time period where FreeBSD kills firefox, but this is the first time after such a set of similar events that the keyboard has been dead from the FreeBSD kernel perspective, let alone with that carrying over to the CLI after the DE is shut down via the still active mouse.
For the moment, to keep all things equal, I would like to stay with the version of the FreeBSD kernel I have for one key reason: the fewer variables I change, the easier it is to isolate the issue. I believe that once the issue is isolated it can be determined with ease whether it still exists in more current versions of the FreeBSD Kernel/Base. I favour this approach based on my extensive past professional experience with bugs that are difficult to duplicate and have complex causes: the less that is changed the better, and often such issues become more difficult to duplicate with newer versions, as the underlying bug seems for indirect reasons to become deeper to reach and ergo much more difficult to isolate/duplicate.
I do not know if there are any FreeBSD sandboxes where I can attempt to duplicate the issue, needing only a CLI based environment that allows me to install other CLI applications that may help in duplicating the bug and collecting system information about it. If there are any, that is an option to consider for this issue. This needs to be done on bare metal and not via a VM, for reasons I will skip other than to say that if it was IBM VM then via VM would be just fine, but X86 based VM is nowhere close to the IBM VM of the 1980s, let alone the even more refined IBM VM of the current day. Suffice to say I have lots of systems and VM experience dating back to the 1980s and onward for some years, amongst others.
I looked into and tried systat(1) and procstat(1) as you suggested. systat allows some vmstat information to be displayed, but in an awkward manner. Awkward meaning not compatible with a table or CSV format of data such as dstat (<http://dag.wiee.rs/home-made/dstat/>) or sar in Linux create, and that I suspect bsdsar would also provide. I used sar via sysstat in Linux, and I used dstat a lot of the time in Linux to document what was almost always a similar set of conditions that caused different variations of issues with the Linux Kernel. vmstat -s in my opinion produces the type of information that is for sure needed for this FreeBSD issue, but again not in a table or CSV format that can be iterated over an interval of time.
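As a rough sketch of how the `vmstat -s` text could be turned into one CSV row per sample: the helper names `parse_vmstat_s` and `to_csv_row` are hypothetical, the sample text is illustrative rather than measured, and the counter labels are assumed to match what `vmstat -s` prints on the system in question (each line is assumed to be a count followed by a description).

```python
import csv
import io


def parse_vmstat_s(text):
    """Turn `vmstat -s` style text ("<count> <description>" per line) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        # Keep only lines that look like "<integer> <label>".
        if len(parts) == 2 and parts[0].isdigit():
            counters[parts[1]] = int(parts[0])
    return counters


def to_csv_row(timestamp, counters, fields):
    """Render one CSV line: the timestamp followed by the chosen counters."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(
        [timestamp] + [counters.get(name, "") for name in fields]
    )
    return buf.getvalue()


# Illustrative sample only, not real measurements; labels are assumptions.
SAMPLE = """\
     1000 cpu context switches
       20 swap pager pageins
       30 swap pager pageouts
"""

counters = parse_vmstat_s(SAMPLE)
print(to_csv_row("t0", counters, ["swap pager pageins", "swap pager pageouts"]), end="")
```

The counters are cumulative, so rates over an interval would come from subtracting consecutive rows.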
This means I am still at a basic loss as to how I can collect information proactively, as once the bug occurs I cannot enter any commands at all, including vmstat, systat, procstat, et al., to get a sense of the FreeBSD Kernel bug. It means I have to run commands that can just run and collect information with timestamps proactively, so these are already running and ideally still able to log their information when this FreeBSD Kernel bug occurs.
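One possible shape for such a proactive collector, as a minimal sketch only: it appends a timestamped snapshot of a command's output to a file and calls `fsync` after each write, on the assumption that this gives the data a better chance of surviving a hard power off. The names `log_snapshot` and `run_logger`, the 10 second interval, and the choice of `vmstat -s` as the command are all assumptions for illustration. Under heavy paging the logger itself may be swapped out and miss samples.

```python
import datetime
import os
import subprocess
import time


def log_snapshot(path, text, stamp=None):
    """Append one timestamped snapshot to `path` and push it to disk."""
    if stamp is None:
        stamp = datetime.datetime.now().isoformat(timespec="seconds")
    with open(path, "a") as f:
        f.write(f"=== {stamp} ===\n{text.rstrip()}\n")
        f.flush()
        # Ask the OS to write the data out now rather than leave it cached.
        os.fsync(f.fileno())


def run_logger(path, command=("vmstat", "-s"), interval=10.0):
    """Run `command` every `interval` seconds, logging its output, until killed."""
    while True:
        result = subprocess.run(command, capture_output=True, text=True)
        log_snapshot(path, result.stdout or result.stderr)
        time.sleep(interval)
```

Calling `run_logger("/var/tmp/vmstat.log")` from a terminal or at login, before starting the browser, would leave a file that can be read after the next boot; the path is only an example.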
10 days ago the FreeBSD system went into another intense swap file mode for about 15-20 minutes, again due to firefox. Mouse and keyboard were responsive during this time and, other than the obvious need to swap in other active apps, the response of other apps was reasonable. firefox was not killed by FreeBSD. The keyboard was again the issue, but with a different keyboard symptom. This time "Y" (capital) would not type or display. "y" (lower case) worked just fine.
This was global in that no matter which application I tried "Y" in, it would not display or type. I could not determine for some time if this was a display issue or a keyboard input issue. Some apps in which I tried "Y" included Sylpheed, leafpad, gedit, xfce4-terminal, tty (via Alt-Fx), et al.
A few hours later I received an eMail that, by deduction, happened to have "Y" in the text body. Deduction means the "Y" did not display. I obviously did not type the "Y" in the eMail sent to me. This is suggestive of a display issue with "Y" as the issue caused by this instance of intense swap file activity.
"Y" version of issue not resolved until reboot, but not clear if issue would have resolved on own in some unknown manner.
A quick update.
I have tried using chromium for last week without using firefox.
In summary, chromium tends to use more memory than firefox for the same scope of work. chromium clearly appears able to use at least 3 times the memory of firefox, going beyond the threshold at which firefox causes loss or corruption of keyboard or character buffers/translation tables, i.e. some form of memory/buffer corruption. The swap file paging/page fault rates were far lower, such that the system was not mostly occupied with paging as occurs when using firefox.
This suggests there is some call or calls in firefox, system or otherwise that is causing memory/buffer corruption that may or may not self heal by chance.
The differences in characterization cannot be measured, as I have not been able to find FreeBSD tools that have a similar ability to log the various key metrics as dstat and sysstat in Linux can. To be very clear, I am not bashing FreeBSD; I am simply saying I have not found something that will log system metrics and so enable a clearly more factual presentation of at least the paging element this issue seems related to in some manner.