Bug 224069 - X, text console and suspend/resume fail
Summary: X, text console and suspend/resume fail
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 11.1-STABLE
Hardware: Any Any
: --- Affects Many People
Assignee: Bugmeister
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-12-03 23:58 UTC by Stefan B.
Modified: 2025-01-17 15:20 UTC (History)
7 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Stefan B. 2017-12-03 23:58:26 UTC
PROBLEMS:
-Switching between X and text console fails, screen is garbled or no video at all
-Resuming after suspend results either in garbled or no video at all

ANALYSYS:
As I was getting annoyed by so many people affected by the buggy vesa.ko, and felt frustrated that these problems with X, particulartly switching between X and text console and the failure of suspend/resume to restore the screen, I finally took a look into vesa.ko and found immediately a thing very wrong.

In line 515 of /usr/src/sys/dev/fb/vesa.c you see this:
  /* regs.R_DL = STATE_SIZE; */

The DL register contains the subfunction of the BIOS call retrieving the buffer size. It must not be undefined, as it is unpredictable which subfunction will get called.

Thus I wish I knew who has commented out that line for what reason.
I think this justifies an explanation.
Because, the commenting out of this line caused all these problems, which have done big damage to FreeBSD's reputation regarding its usability as desktop OS.


FIX:
Uncomment that line.
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2017-12-04 00:43:52 UTC
(In reply to Stefan B. from comment #0)
> Thus I wish I knew who has commented out that line for what reason.

(Note: jkim is already on the Cc: here, so I don't need to take any action.)

fwiw, there's a way to look that up.

Look up the src file on FreeBSD's svnweb:

  https://svnweb.freebsd.org/base/head/sys/dev/fb/vesa.c

Do an "annotate" based on the latest commit, here rev 326255:

  https://svnweb.freebsd.org/base/head/sys/dev/fb/vesa.c?annotate=326255

Search for the commented out line (shows as 517 there rather than 515):

  https://svnweb.freebsd.org/base/head/sys/dev/fb/vesa.c?annotate=326255#l517

Clicking the line number in that block takes you to:

  https://svnweb.freebsd.org/base/head/sys/dev/fb/vesa.c?r1=198250&r2=198251&

Clicking the 'latest revision' link on the right takes you to:

  https://svnweb.freebsd.org/base/head/sys/dev/fb/vesa.c?revision=198251&view=markup

which is a very large commit from jkim Mon Oct 19 20:58:10 2009 UTC (8 years, 1 month ago).  I have no knowledge of what's all involved there.
Comment 2 Conrad Meyer freebsd_committer freebsd_triage 2017-12-04 00:59:12 UTC
(In reply to Mark Linimon from comment #1)
Yeah, I copied him based on that commit.

Notably, before that commit, the DX register was initialized to that same constant.  Afterwards, the comment containing the initialization of DL (low half of DX) replaced it.
Comment 3 Stefan B. 2017-12-04 02:03:52 UTC
Now that is a bit ironic...

From the last link in Mark's post, I read in jkim's commit note:

   - Fix a long standing bug in state save/restore function.  
     The state buffer pointer should be ES:BX, not ES:DI according
     to VBE 3.0.  If it ever worked, that's because BX was always zero. :-)

The last sentence made me smile, because it may explain why the uninitialized DL which stood for eight years got undetected that long :)

Anyway, people who do much like jkim, do more mistakes than people who do little. Big commits, easier to overlook a detail...
So no problem, and let's thank jkim for his great contributions!

(Thank you Mark for the good explanation how to find out :)
Comment 4 Jung-uk Kim freebsd_committer freebsd_triage 2017-12-04 19:27:39 UTC
You guys are not reading the code right.  x86bios_init_regs(&regs) before the commented line initializes all registers to zero.  In other words, I commented out the line because it is unnecessary, i.e., STATE_SIZE is zero.  I am not really sure what's going on but you can do "sysctl debug.x86bios.call=1" to trace VESA function calls.
Comment 5 Jung-uk Kim freebsd_committer freebsd_triage 2017-12-04 19:32:53 UTC
(In reply to Stefan B. from comment #0)
> Because, the commenting out of this line caused all these problems, which have
> done big damage to FreeBSD's reputation regarding its usability as desktop OS.

I implemented amd64 suspend/resume support.  I don't think I deserve the harsh comment. :-(
Comment 6 Jung-uk Kim freebsd_committer freebsd_triage 2017-12-04 19:45:02 UTC
FYI, switching between X and text console is NOT handled by FreeBSD console drivers.  In fact, X/DRM/KMS drivers are responsible for saving/restoring frame buffer and GPU state.
Comment 7 Jung-uk Kim freebsd_committer freebsd_triage 2017-12-04 21:17:37 UTC
(In reply to Jung-uk Kim from comment #4)
> ... you can do "sysctl debug.x86bios.call=1" to trace VESA function calls.

Do "sysctl debug.x86bios.int=1" to trace VESA interrupt handlers, sorry.
Comment 8 Stefan B. 2017-12-04 21:28:43 UTC
(In reply to Jung-uk Kim from comment #4, #5 and #6)

#4: Didn't know about register zeroing. Such was not common back then when I did VGA programming via hardware and INT10 decades ago using mixed C and assembly back in the 16-bit DOS times.
The sysctl tip is great for debugging!


#5 No, you don't deserve that at all.
When I saw the commit Mark pointed me at, I instantly recognized that it was a lot of work involved. And you know, the bigger the work, the easier it is to overlook a detail. See please my comment #3, too.

And I did not know you were the one who implemented FreeBSD suspend/resume. That was great work, good to know. Thank you!
However, I think the suspend/resume thing has been neglected, and it does not  work for many people. Below I discuss part of possible reasons.
Btw, are you still interested in improving suspend/resume from STR to STD?


#6 That is very interesting, because:
In the FreeBSD forums, there are constant complaints regarding Nvidia cards. 
Garbled display when switching between console and X, and after resume. I have made the discovery that this apparently *only* happens when vesa.ko is present.
And that is in the GENERIC kernel or as loaded module.

In the forums I have talked much of that problem. Apparently all people that followed my advice to build a custom kernel *without* option VESA and switch from vt to sc console (because the default vt console pulls in vesa.ko, if not present in kernel), got rid of these problems.

And, in my naive opinion I think suspend/resume should *not* be broken by just doing kldload vesa.ko. (this is factually the way to reproduce the problem)

So I have the strong feeling that there is a serious problem with the vesa module. But that is just my (possibly misleading) intuition.
Comment 9 Jung-uk Kim freebsd_committer freebsd_triage 2017-12-04 22:22:15 UTC
(In reply to Stefan B. from comment #8)
> Btw, are you still interested in improving suspend/resume from STR to STD?

No, not really.

> In the FreeBSD forums, there are constant complaints regarding Nvidia cards.
...
> So I have the strong feeling that there is a serious problem with the vesa
> module. But that is just my (possibly misleading) intuition.

Ah, NVIDIA...  It is a well known problem for many years.  The root cause is NVIDIA does not really support VESA BIOS re-POST and save/restore state calls.  It just checks few bits and returns immediately if they are set/clear.  I have a feeling that they intentionally did it to make reverse engineering harder.  On top of that, they don't provide any information to improve the situation.  Long time ago, I even bought couple of cheap NVIDIA controllers to fix the problem but soon I gave up because I realized it was too complicated and I wasn't desperate.  So, no, VESA module have no "serious" problem.  Just NVIDIA doesn't want us to mess around with their hardware although we own it.

We can add few knobs in the driver to work around these issues and turn them on by default if NVIDIA controller is found.
Comment 10 Stefan B. 2017-12-04 22:54:19 UTC
(In reply to Jung-uk Kim from comment #9)

> We can add few knobs in the driver to work around these issues 
> and turn them on by default if NVIDIA controller is found.

That would be the right thing to do then.
Would save a lot of bad UX.

And the supporting guys on the forums would be happy if these constantly recurring nvidia problems would be mitigated (at least to a degree).
It's not nice to see people going back to Linux just because the text console is unusable etc.

> The root cause is NVIDIA does not really support VESA BIOS re-POST 
> and save/restore state calls.  It just checks few bits and returns 
> immediately if they are set/clear.

Aww. Didn't know that. You did obviously a lot of research. Kudos!

This is a very bad thing to learn, as this could make STD with Nvidia cards virtually impossible without involving X.

(Btw, do you know whether ATI cards behave better in this regard, supporting state saving/restoring?)
Comment 11 Jung-uk Kim freebsd_committer freebsd_triage 2017-12-04 23:06:06 UTC
(In reply to Stefan B. from comment #10)
> (Btw, do you know whether ATI cards behave better in this regard, supporting
> state saving/restoring?)

Yes.  I've never heard any complaint about ATI/AMD controllers for many years.
Comment 12 Jason W. Bacon freebsd_committer freebsd_triage 2025-01-17 01:02:39 UTC
Is this PR still relevant?

I have a MacBook using vesa that consistently has a black screen after suspend/resume.  System is running, as I can ssh in.
Comment 13 Jason W. Bacon freebsd_committer freebsd_triage 2025-01-17 01:03:59 UTC
See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252241
Comment 14 Stefan B. 2025-01-17 01:49:27 UTC
(In reply to Jason W. Bacon from comment #12)
I didn't follow much the FreeBSD development since mid-2022, as I went to a sabbatical.
When back in mid-2023, I found that all graphics drivers except Intel and AMD had been intentionally made dysfunctional.
The users' pressure was big enough to have at least the nv driver re-enabled, and after being shown that these chipsets are dominating the server market, the guy responsible finally re-instated the Matrox and aspeed drivers, which he originally wanted to have deleted together with all drivers not belonging to the Intel-AMD oligopoly.

The next stages announced were to replace xorg with Wayland, and removing the SC console driver (which would make suspend/resume dysfunctional on Nvidia systems).

That was enough for me.
I abandoned FreeBSD for Linux. I need reliable systems, for servers as well as for workstations. And here electricity is very expensive, and has recently become even more expensive, so S3 suspend is no real option.

There is no way around hibernate, which Linux is mastering since decades.
We do not know whether FreeBSD ever will implement hibernate, even though the basic work already has been implemented by Kim Jung-Uk.

And now to your question:
The VESA module has been removed from the default kernel some times ago in response to this PR, I think it was with 13.0.
Then, the question is, does the MacBook implement a full VESA feature set, or do they leave out the state save/restore functions like Nvidia cards do? I don't know.

I will now unsuscribe to the PR, as I am no longer part of the FreeBSD community.
Comment 15 Jason W. Bacon freebsd_committer freebsd_triage 2025-01-17 15:20:48 UTC
(In reply to Stefan B. from comment #14)

Sorry to hear about your frustrations.  FreeBSD certainly has some limitations, like all OSs, but FWIW, I actually switched from Linux to FreeBSD for reliability reasons.  I spent 20 years professionally managing a variety of Linux distros on a variety of workstation hardware and HPC clusters, and always had problems with systems freezing solid (about once a month on typical hardware).  Running FreeBSD on the same hardware resulted in almost no crashes, and of the few that did happen, most were traced to hardware problems.  This was important to us, since we had users running scientific analyses that could take weeks to complete and could not be easily resumed after a crash.

In my experience, the limitations of FreeBSD tend to be more on the cosmetic side, rather than systemic, and I've always been able to work around them.  That doesn't mean they aren't serious problems for other use cases, so switching to Linux may have been the right move for you.

In any case, sorry to see you go, and thanks for your efforts to improve FreeBSD.  You've undoubtedly left a bit less work for the the rest of us to deal with.