Bug 230172

Summary: FreeBSD 11.2 fails to boot on Celeron J1900 after upgrade from 11.1
Product: Base System Reporter: Max Burke <maxburke>
Component: kern Assignee: freebsd-bugs mailing list <bugs>
Status: Open ---    
Severity: Affects Some People CC: arthur, decke, dutchman01, elplutoniano, eugen, janm, jeff+freebsd, lapo, mafua, mam, me, miguelmclara, mike, omarandemad, root, ryan, simonp, yasu
Priority: --- Keywords: regression
Version: 11.2-RELEASE   
Hardware: amd64   
OS: Any   
See Also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229235
Attachments:
Description Flags
dmesg output of 11.2 kernel on J1900 (nonworking video output) none

Description Max Burke 2018-07-29 23:47:05 UTC
I am seeing the issue mentioned in this thread: 

https://forums.freebsd.org/threads/freebsd-upgrade-11-1-to-11-2-fails-to-boot-11-2-kernel-no-vb-no-nvidia.66538/

I upgraded from FreeBSD 11.1 to 11.2 via freebsd-update but the system fails to boot. As mentioned in the thread, the boot process stops after the "Booting..." message, before the screen clears and the copyright message is printed.

The system is an ASRock Q1900-ITX; there is no discrete video adapter.
Comment 1 Pavel Minaev 2018-08-02 04:10:42 UTC
I'm also seeing this problem on Thinkpad 11E:

- Celeron N2940
- Intel HD Graphics (Bay Trail)

It's definitely not something with installed packages, custom kernels etc, because I can also reproduce it with install media - USB stick with image for 11.1 boots fine, and the one with 11.2 fails in the same exact way.
Comment 2 Pavel Minaev 2018-08-02 04:19:51 UTC
Same exact problem with FreeBSD-12.0-CURRENT-amd64-20180709-r336134-mini-memstick.img
Comment 3 Max Burke 2018-08-02 06:14:38 UTC
I have tried enabling the verbose boot option with the out-of-the-box 11.2 kernel but this has provided no extra diagnostic information.

(Unless I'm doing it wrong, which may be the case :)
Comment 4 Vicen Dominguez 2018-08-04 16:59:22 UTC
Hey guys!

Just confirming it from Spain. Same system (ASRock Q1900-ITX) and indeed... frozen at boot time.

I don't have any NVIDIA hardware inside, so I see no relation to that driver.
The hang occurs at kernel startup, just after libalias.ko is loaded in my case.

And again, I am coming from 11.1, which had no problems, so sadly it seems to be a regression... :(

Any ideas?
Comment 5 Vicen Dominguez 2018-08-04 17:16:35 UTC
In a new, desperate attempt, I disabled all modules (including zfs) in loader.conf, with the same result: a hang.

I should add that my boot disk is a geom mirror with UFS. Nothing else.

I could not get any extra information with "verbose" boot enabled.
Comment 6 Daniel Mafua 2018-08-06 20:54:01 UTC
I've noticed that many of those experiencing this issue (myself included) seem to have Intel Celeron/Atom processors. I do, however, have a few such boxes that I successfully upgraded to 11.2, although I noticed a line in the logs: "VT: init without driver"

I had a box with an Intel Atom E3845 which would not boot an 11.2 kernel, but on a hunch I unplugged the monitor and then it worked (with the message seen above). Notably, the boxes I upgraded successfully from 11.1 to 11.2 didn't have monitors hooked up to them.
Comment 7 Daniel Mafua 2018-08-07 12:47:19 UTC
I did more testing on my Intel Atom PC. I waited for the machine to fully boot, then logged in via ssh, and everything seemed fine. However, if I plug in a monitor I see it's still stuck at the "booting..." screen before the copyright message.
Comment 8 mam 2018-08-18 09:28:15 UTC
Created attachment 196316 [details]
dmesg output of 11.2 kernel on J1900 (nonworking video output)
Comment 9 mam 2018-08-18 09:31:34 UTC
I can second that. Although the console is totally frozen (after the boot menu), the machine (also Asustec J1900-ITX) boots up and is available via SSH later on.
I have attached the dmesg output, which contains a few warnings. Maybe it can help to fix the problem, because although this is mainly a router, a "last resort console" should be available too if everything else fails...
Comment 10 Daniel Mafua 2018-08-18 14:22:24 UTC
In my case I also noticed that the ttyv* devices are missing in /dev. This probably leads people to believe the system is frozen, as it does not reboot with ctrl+alt+del since the system can't accept user input.

I've been able to work around this problem by setting kern.vty="sc" in /boot/loader.conf
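
A minimal sketch of making that setting persistent without duplicating the line on repeated runs. The helper name set_loader_tunable is hypothetical (it is not a FreeBSD tool), and the sketch assumes assignments in the file start at the beginning of a line:

```sh
#!/bin/sh
# set_loader_tunable FILE NAME VALUE
# Hypothetical helper: replace or append NAME="VALUE" in a loader.conf-style file.
set_loader_tunable() {
    file=$1
    name=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Keep every line except existing assignments of NAME (literal match at column 0).
    if [ -f "$file" ]; then
        awk -v n="$name" 'index($0, n "=") != 1' "$file" > "$tmp"
    fi
    printf '%s="%s"\n' "$name" "$value" >> "$tmp"
    # Write back through the original path so its owner and mode are preserved.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example, as root on an affected machine:
#   set_loader_tunable /boot/loader.conf kern.vty sc
```

Editing /boot/loader.conf by hand works just as well; the function only guards against ending up with two conflicting kern.vty lines.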
Comment 11 mam 2018-08-18 15:17:28 UTC
Great idea! It works!
(lol, it is even in colour now, not B&W like before :-) )

(the video output is not really great, there are often disturbing stripes and flicker on the screen, but who cares, at least a working console!)

:-)
Comment 12 Pavel Minaev 2018-08-19 02:08:57 UTC
I can confirm that changing the vty driver to "sc" fixes boot for me as well (although I also see flicker with it).
Comment 13 Max Burke 2018-08-19 03:11:12 UTC
Adding kern.vty=sc to my /boot/loader.conf worked! My system now boots with 11.2! :-)

(I had tried the monitor trick but the system never came up.)

Is there something that we can do to get a fix imported into the base system now?
Comment 14 miguelmclara 2018-09-11 17:12:00 UTC
Just found the same issue today, after updating to 11.2 on an Intel "Braswell" CPU.

Using "sc" works around the issue, but I actually updated so that I could test the newer DRM code...


Any idea of a possible fix?
Comment 15 root 2018-10-03 20:51:07 UTC
I can confirm this bug. Adding "kern.vty=sc" works. The system is an ASRock D1800B-ITX.
Comment 16 Arthur Chance 2018-10-17 15:48:38 UTC
Just to add another data point. Gigabyte Brix box with Celeron N3000. Booting with kern.vty=vt appears to hang before the graphics mode switch, kern.vty=sc works fine.
Comment 17 Michael Proto 2018-11-01 03:54:26 UTC
Adding a me-too here, just upgraded my Celeron J1900 (ASrock Q1900-ITX) from FreeBSD 10.4 to 11.2 via freebsd-update and ran into this exact issue. Setting kern.vty=sc at the loader prompt allows the machine to complete the boot sequence and adding it to /boot/loader.conf allows a reboot to work without issue.
Comment 18 janm 2018-11-19 10:05:03 UTC
I am seeing this problem in 12.0-BETA4 on a SuperMicro J1900 miniITX based system. Setting the console to "sc" resolves the problem.
Comment 19 Vicen Dominguez 2019-02-05 15:59:46 UTC
I know this bug is about the 11.1 to 11.2 upgrade, but just for your information, the same happens with FreeBSD 12.0-RELEASE upgraded from FreeBSD 11.1.

The system is completely stuck in the boot process, without ssh access.
Comment 20 mam 2019-02-05 16:05:13 UTC
(In reply to Vicen Dominguez from comment #19)
This is not really surprising; the faulty driver has not been repaired or replaced. You should have switched the driver before the update, as suggested in this thread.

But usually the machine, although it looks unresponsive, is accessible from the network... maybe your problem is a bit different / deeper?
Comment 21 Vicen Dominguez 2019-02-06 18:59:52 UTC
(In reply to mam from comment #20)
Indeed mam! changing the driver to "sc" worked for me. As you said, perhaps my problem is a little different because I never could connect to the server via ssh (waiting for 2 hours).

I am lost... no ideas. Anyway, thank you to everybody.
Comment 22 mam 2019-02-06 19:04:56 UTC
But now with the "sc" driver the network works again too???

That would be strange, but maybe now you can find a log entry that sheds some light on your problem.

Anyway, if it works again, all is fine :-)
Comment 23 Ed Maste freebsd_committer 2019-02-06 19:11:15 UTC
(In reply to mam from comment #22)
Well, if vt used to work and now doesn't there's a regression we need to find. It seems there are several different reports in here - can someone confirm that the problem occurs under one (or both) of these cases:

11.1 to 11.2 upgrade with no monitor attached
11.1 to 11.2 upgrade with monitor attached

Also please include the motherboard model, firmware version, and for the second case the monitor interface (VGA/HDMI/etc.)
Comment 24 mam 2019-02-06 19:32:30 UTC
you are funny :-)
How could somebody notice that the console does not work if there is no monitor attached???

As far as it looks for now, it's always that ASRock Q1900-ITX mobo. The connection to the monitor does not matter; there is nothing on either VGA, DVI or HDMI.
It looks like the onboard Intel GPU in this particular processor is not supported.
Comment 25 Ed Maste freebsd_committer 2019-02-06 19:48:57 UTC
(In reply to mam from comment #24)
> How could somebody notice that the console does not work, if there is no
> monitor attached ???

If they're using a serial console.
Comment 26 mam 2019-02-06 19:56:15 UTC
Then they would not notice it either.

It's only the monitor that blocks.

Not sure about the keyboard: because you can't see anything, you can't tell whether typing is responsive either.

Thank god, everything else works; the machine boots up and you can log in over the network.

At the beginning everybody (including me) thought it was a total crash, because the BIOS is OK, then the bootloader shows the cute daemon and lets you change the config and so on. But as soon as you try to boot the kernel, the rotating / | \ ... animation stops and from then on nothing more appears on screen (but the current contents stay visible until the cows come home).
Comment 27 Bernhard Froehlich freebsd_committer 2019-02-06 20:56:55 UTC
I can confirm the issue on a Supermicro X10SBA (FW 1.3a) with FreeBSD 12.0. That board also uses an Intel Celeron J1900.

@ed: If I understand you correctly you want to check if this is a regression between 11.1 and 11.2 so it should be enough to verify by booting those with a usb stick. I should be able to provide that data.
Comment 28 Vicen Dominguez 2019-02-07 10:23:27 UTC
(In reply to mam from comment #22)
Sorry mam, I wasn't very clear. Yes, in a nutshell, with "sc" everything worked again.

I have a little "flicker" in the screen but it doesn't disturb me and I didn't try Xwindows, but it's a home server so the workaround works for me.
Comment 29 mam 2019-02-07 10:33:34 UTC
good :-)

I have the "flicker" too. It's a bit annoying but much better than no picture at all :-)
(No idea about the X stuff either, this is a router, not a desktop)
Comment 30 Pavel Minaev 2019-02-07 18:31:22 UTC
To use X, you'll need to kldload drm_next_kmod. If you do that with sc driver in effect, your screen will go blank the moment you do it. But the system is still alive, and if you then run X, it'll work. I just made a little script that does both.
Comment 31 janm 2019-02-18 11:32:43 UTC
Some notes when 'vt' is used (some of this also in comments above):

* Console output stops when the kernel starts booting.
* If the boot is allowed to complete it is available via ssh.
* /dev/ttyv* device entries are not present.
* Loading /boot/modules/i915kms.ko makes the device entries appear and console output resumes. In my case this is on FreeBSD 12-p3 with port graphics/drm-fbsd12-kmod.
* "kill -HUP 1" restarts getty processes and the console becomes usable.
* If /boot/modules/i915kms.ko is loaded via kld_list in /etc/rc.conf, console output will resume once the module is loaded and the ttyv* getty processes will start correctly.
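
The recovery steps in the notes above can be sketched as a small script to run over ssh. The function names are hypothetical, and the module path is the one from these notes (installed by the graphics/drm-fbsd12-kmod port); on boards such as the X10SBA, loading this module may hang the system instead.

```sh
#!/bin/sh
# count_ttyv DIR: print how many ttyv* entries exist in DIR (normally /dev).
count_ttyv() {
    n=0
    for entry in "$1"/ttyv*; do
        # An unmatched glob stays literal, so check that the entry exists.
        if [ -e "$entry" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# recover_vt_console: hypothetical helper following the steps listed above.
recover_vt_console() {
    if [ "$(count_ttyv /dev)" -eq 0 ]; then
        # Loading the KMS driver makes the ttyv* entries appear.
        kldload /boot/modules/i915kms.ko || return 1
        # Signal init so the getty processes on ttyv* are restarted.
        kill -HUP 1
    fi
}

# Example, as root over ssh on an affected machine:
#   recover_vt_console
```

Only count_ttyv is free of side effects; recover_vt_console loads a kernel module and signals init, so it is meant for the affected machine only.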
Comment 32 janm 2019-02-18 16:16:01 UTC
My comments above were on a Tipro J1900 based touch screen system.

I have also tested on a SuperMicro X10SBA; there the system hangs when loading i915kms.ko, and requires a power cycle.
Comment 33 simonp 2019-03-12 12:20:59 UTC
Just a little clarification:
11.2 does not only fail to boot "after an upgrade";
it simply fails to boot on the J1900, so it can't be installed...

I suggest revising the importance of the problem.
Sorry for having no solution to propose.
Comment 34 mam 2019-03-14 07:45:06 UTC
(In reply to simonp from comment #33)
You can use the same "trick" for a new installation.

Boot from DVD/stick, go into the boot options, change the console driver to sc and continue as usual.

After installation, remember to change it permanently in /boot/loader.conf.

But of course, the issue should have been resolved by now, I have the impression, nobody wants to really implement it into the next distribution.
Comment 35 simonp 2019-03-14 11:51:51 UTC
(In reply to mam from comment #34)
Thank you very much for your reply, works 100%
(you indeed taught an old dog a new trick ;-) ...)

And yes! i agree:
... the issue should have been resolved by now, I have the impression, nobody wants to really implement it into the next distribution.
Comment 36 Ed Maste freebsd_committer 2019-03-14 15:12:10 UTC
(In reply to mam from comment #34)
> But of course, the issue should have been resolved by now, I have the
> impression, nobody wants to really implement it into the next distribution.

Nobody with the skill and availability to fix this issue has the affected hardware.
Comment 37 mam 2019-03-14 16:08:13 UTC
(In reply to Ed Maste from comment #36)
There is no real hardware needed for the fix: just drop the faulty driver and use the working one as the default. Nobody would be harmed and everything would work.

Or, to make it more complex, look at the hardware at boot time and, if a J1900 is found, switch drivers.

But of course, just waiting will solve it too. Someday those affected CPUs will be out of service and nobody will notice it anymore.
Comment 38 Ed Maste freebsd_committer 2019-03-14 17:02:21 UTC
(In reply to mam from comment #37)
> There is no real hardware needed for the fix, just drop the faulty driver and
> use the working one as default. Nobody would be harmed and everything would
> work.

No, these are drivers for the same hardware, so the current, non-legacy driver needs to be fixed to work with this hardware.
Comment 39 Ed Maste freebsd_committer 2019-03-14 17:57:22 UTC
(In reply to Bernhard Froehlich from comment #27)
> @ed: If I understand you correctly you want to check if this is a regression
> between 11.1 and 11.2 so it should be enough to verify by booting those with a
> usb stick. I should be able to provide that data.

That is correct.

We also have snapshots of stable/11 at various points available in http://artifact.ci.freebsd.org/snapshot/stable-11/ which could be used to narrow the range of potential offending commits further.
Comment 40 Eugene Grosbein freebsd_committer 2019-03-16 19:44:42 UTC
Anyone having this problem should blame the manufacturer of their hardware for the garbage in its ACPI tables, which is the root of the problem. ACPI reports that the system has no VGA at all. syscons ignores ACPI, but vt has obeyed it since 11.2 (it ignored it before).

We already have a workaround for this problem that allows using vt(4) driver when old syscons cannot be used, e.g. in UEFI environment. Add this to /boot/loader.conf:

hw.vga.acpi_ignore_no_vga=1

Perhaps, installation media intended for interactive installation should have it by default.
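
To check which of the two workarounds a machine is currently configured with, a small helper can read the values back from loader.conf. This is only a sketch: get_loader_tunable is a hypothetical name, and it assumes one assignment per line, starting at the beginning of the line.

```sh
#!/bin/sh
# get_loader_tunable FILE NAME
# Print the last value assigned to NAME in FILE, with surrounding double
# quotes removed; print nothing if NAME is not set.
get_loader_tunable() {
    awk -v n="$2" '
        index($0, n "=") == 1 {
            v = substr($0, length(n) + 2)
            gsub(/^"|"$/, "", v)
            last = v
        }
        END { if (last != "") print last }
    ' "$1"
}

# Example:
#   get_loader_tunable /boot/loader.conf kern.vty
#   get_loader_tunable /boot/loader.conf hw.vga.acpi_ignore_no_vga
```

An empty result for both names means neither workaround is set in that file.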
Comment 41 Michael Proto 2019-03-19 02:52:42 UTC
Thank you for the update Eugene.

While I have no doubt the ACPI tables on this discount hardware are likely the cause (see below**), unfortunately this particular suggested fix did not work for me.


Intel Celeron J1900 (ASrock Q1900-ITX)

Escape to bootloader on boot

set kern.vty="vt"
set hw.vga.acpi_ignore_no_vga="1"

I still see the same problem, the spinner freezes when loading the kernel until I reboot and either set kern.vty="sc" or disconnect keyboard/monitor.


** Yes, this BIOS is quirky. I have it attached to a KVM via VGA/PS2 and during boot if my KVM is active on this terminal the system will sit on the BIOS "Press a key to enter option" screen forever until I hit ANY KEY, including CTRL/ALT. With the KVM pointed elsewhere it boots normally. I'm going to check my provider (again, likely in futility) for an updated firmware but wanted to throw this in as it is likely a sh*ty BIOS/UEFI being the issue. While I don't expect FreeBSD to have to work around it I'm prepared to do what I can to help debug.
Comment 42 Ryan Moeller 2019-03-21 00:17:25 UTC
(In reply to Michael Proto from comment #41)
That setting is not available in any 11.2 releases yet, but it is available on stable/11 snapshots.
Comment 43 janm 2019-06-04 06:19:53 UTC
An additional datapoint:

Booting with UEFI on the Tipro J1900 system resolves the problem, and kern.vty=vt works fine all the way through. No change on the Supermicro X10SBA, also with a J1900.
Comment 44 Jeff Kletsky 2019-07-17 15:28:42 UTC
Just ran into this (again) after not being able to put off an upgrade on a J1900-based system any longer. Still an issue with 11.1-RELEASE-p(final) to 12.0-RELEASE using freebsd-update -r 12.0-RELEASE fetch / freebsd-update install
Comment 45 Jeff Kletsky 2019-07-17 16:19:30 UTC
As there now appears to be a set of work-arounds, and as this seems to be a common problem on at least the J1900 and possibly other Celeron devices of that era, such as the N3150, I would suggest that this issue and at least a link to the workaround be present in the Release Notes (which I did check prior to upgrading)

https://www.freebsd.org/releases/12.0R/relnotes.html#errata



See further

https://forums.freebsd.org/threads/freebsd-upgrade-11-1-to-11-2-fails-to-boot-11-2-kernel-no-vb-no-nvidia.66538/
Comment 46 Jeff Kletsky 2019-07-19 14:27:23 UTC
To add, this box is *not* using UEFI boot, but using BIOS ("legacy") boot, so the issue is not confined to UEFI boot.

The release notes at https://www.freebsd.org/releases/12.0R/relnotes.html#errata do not indicate that there is any problem, nor do they seem to provide a link to https://www.freebsd.org/releases/12.0R/errata.html. Providing such a link would be valuable. The lead paragraph there,

> [2018-12-11] Some Intel® J1900 systems may hang on boot in UEFI mode. An observed workaround is to set kern.vty=sc at the loader(8) prompt. To have the setting persist after reboot(8), add kern.vty=sc to loader.conf(5).

should be expanded in scope by removal of the text "in UEFI mode"
Comment 47 zain david 2019-07-23 04:34:35 UTC
MARKED AS SPAM
Comment 48 Michael Proto 2019-07-25 07:01:55 UTC
So with 11.3 released, the following /boot/loader.conf settings do indeed work with my J1900-ITX:

kern.vty="vt"
hw.vga.acpi_ignore_no_vga="1"


Thanks to all involved!!