How do you want to handle the new stable branch? There are some significant changes here, like Vulkan support. There's an out-of-tree port version at https://github.com/shkhln/revird-aidivn/blob/afdiuxc/x11/nvidia-driver/Makefile
For Quadro K1100M (GK107GLM) on FreeBSD 14.0-CURRENT, can you tell, is x11/nvidia-driver appropriate? I'm particularly interested in wake from sleep (resume from suspend) when the computer (HP ZBook 17 G2) is docked, with a display on DisplayPort. On one hand: <https://www.nvidia.com/Download/driverResults.aspx/177146/en-us> for 470.57.02 does _not_ list the K1100M, from which a person might assume that a legacy alternative (maybe x11/nvidia-driver-390) is appropriate. On the other hand: <https://www.nvidia.com/en-us/drivers/unix/legacy-gpu/> _also_ does not list K1100M, from which a person might assume that it's appropriate to use the regular (non-legacy) Unified UNIX Graphics Driver.
(In reply to Kevin Bowling from comment #0) Since danfe@ isn't replying... > new stable branch The port is ready for 470 since https://github.com/freebsd/freebsd-ports/commit/d64eb42e5b50c43cec29f672d32f04ddb7d8dca8; it should be enough to do a version bump. > There are some significant changes here like vulkan support. That's ready as well. I have a minor cleanup patch for the corresponding port parts, though: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253285#c2.
(In reply to Kevin Bowling from comment #0) > How do you want to handle the new stable branch? Just by updating the port. Alex is right, it should be straightforward. (In reply to Alex S from comment #2) > I have a minor cleanup patch for the corresponding port parts, though That should be included as well, thank you.
Created attachment 226784 [details]
nvidia 470.57.02

Attaching the combined patch for others to test. With 470 I'm having stability issues during session restore of KDE5 with 4 Firefox windows; the same session is stable on the 460 driver. On a fresh user/session I didn't see the issue, but I only tried one Firefox window without much load. The issue results in a full hang of the system. I haven't started any debugging yet, so I will try to get some data this weekend when I have more time. Any thoughts from KDE?
Thank you.

(In reply to Kevin Bowling from comment #4)

The same experience with 470.57.02 as with 460.84. Initial wake from sleep succeeded. Slept by clicking the button at the SDDM log in dialogue. Subsequent wakes fail, quickly (without appearance of an image) – and hard (no response to keyboard or trackpad input, no response to a normal press on the power button).

If it helps: at the time of the second sleep with 470.57.02 – after the one successful wake – a succession of "… unexpected …" messages scrolled by. So quickly that (sorry) I had no chance to tell _what_ was unexpected.

Loosely speaking (I don't know the technologies), it feels like, maybe, something troublesome is cached, but I have no idea where (or how to clear the cache).

<https://bsd-hardware.info/?probe=2faf8af7be>

% uclcmd get --file /boot/loader.conf screen.font
null
% grep screen.font /boot/loader.conf | grep -v \#
screen.font="8x16"
% sysrc -f /etc/rc.conf kld_list
kld_list: fusefs usbhid nvidia-modeset
% pkg info -x nvidia
linux-nvidia-libs-460.84
nvidia-driver-470.57.02
nvidia-settings-470.42.01
nvidia-xconfig-470.42.01
% uname -KUv
FreeBSD 14.0-CURRENT #103 main-n248269-941650aae97: Wed Jul 28 07:28:47 BST 2021 root@mowa219-gjp4-8570p:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG 1400026 1400026
%
(In reply to Kevin Bowling from comment #4)

> … stability issues during session restore for KDE5 and
> 4 Firefox windows that are stable on 460 driver.

I can't comment on the 460 experience, because I habitually:

1. quit Firefox
2. await disappearance of Firefox windows
3. watch htop until all firefox processes end

– before logging out.

> On a fresh user/session I didn't see the issue but I only tried
> one firefox window without much load. The issue results in a
> full hang of the system.

Exceptionally, I logged out without quitting Firefox (two windows, around 952 tabs, various extensions). For me, too:

* the subsequent log in (without restarting the OS) resulted in a full hang of the OS – no response to input – so I pressed and held the power button.

The subsequent log in (after starting the OS) restored Firefox, and most other windowed applications, without a hang of the OS.
Created attachment 226795 [details] Photograph: OS hung after clicking a site-provided context menu item in Firefox 90.0.2 firefox-90.0.2,2 Right-click (Kensington trackball), release, roll the ball down to the required context menu item, click, the OS hung hard. This photograph shows an arrow pointer. Not reproducible with nvidia-driver-460.84; the gloved hand pointer changes to an arrow pointer within a split-second of clicking the context menu item.
Reported instability to nvidia FreeBSD forum https://forums.developer.nvidia.com/t/instability-with-470-57-02/185212
(In reply to Kevin Bowling from comment #4) I had a similar problem with 470.42.01 and now 470.57.02 shows the same symptom. FWIW, I've been using 465.31 for a while and it's been quite stable.
(In reply to Jung-uk Kim from comment #9)

For me, it does NOT hang at all; instead there are sudden REBOOTS WITHOUT CORE. The situations are quite random:

* Just moving the mouse cursor over a Firefox window.
* While reloading any web page.
* Logging into somewhere.

Not at all reproducible after the sudden reboots. I FEEL 470.42.01 was worse, as it rebooted multiple times a day and I had to go back to 465.31, while 470.57.02 reboots about once in 2 or 3 days. The 465 series was stable, as you wrote.
I have only had 2 or 3 lockups this whole year (with FreeBSD 12.2/13 and every beta version of Nvidia's Unix driver), with no correlation to anything in particular. Either you are being overly dramatic or it's something specific to CURRENT.
(In reply to Alex S from comment #11) There are several reports of unexpected lockups post-465 here, so I'm not sure whom your comment is aimed at, but it won't move things forward the way figuring out what is going on will.
> For Quadro K1100M (GK107GLM) on FreeBSD 14.0-CURRENT, can you tell, is x11/nvidia-driver appropriate?

Yes, use x11/nvidia-driver. The K1100M is a Kepler-series GPU, and 470 is the last driver release to support those.

> There are several reports of unexpected lockups post 465 here

I think these lockups are the result of a panic with the message "Sleep with fops_mtx held in Nvidia driver 470.42.01 on FreeBSD", which we knew about and have a fix for. This should be fixed in the next point release of 470. (Sorry, I thought I had committed it in time for 470.57.) The fix is changing fops_mtx in src/nvidia/nv-freebsd.h to an sx lock.

If someone could get an actual coredump/kernel stacktrace, that would be helpful to confirm they are hitting the same issue.

Not ground truth, but I did a quick spot check with a recent-ish CURRENT, Firefox, KDE5, and the latest internal version of the nvidia driver and couldn't trigger any panics. I'll keep an eye out.
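For anyone trying to capture the coredump requested above, here is a minimal sketch of the usual FreeBSD crash-dump settings. It is an assumption about a typical setup, not something taken from this thread: `dumpdev` and `dumpdir` are standard rc.conf(5) variables, but the swap device must be large enough to hold the dump, and a hard lock that never reaches a panic will not produce one.

```sh
# /etc/rc.conf fragment (assumed setup; adjust to your system).
# Ask the kernel to write a crash dump to the configured swap device on panic.
dumpdev="AUTO"
# Directory where savecore(8) stores the dump on the next boot.
dumpdir="/var/crash"
```

After a panic and reboot, savecore(8) should leave the dump under `/var/crash`; if crashinfo(8) is enabled, the accompanying `core.txt.N` summary there normally contains the kernel stack trace.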
Given the number of reported problems with 470.57.xx, perhaps we could update the port to version 460.91.03 in the meantime, as otis@ had suggested?
Created attachment 226898 [details] nvidia 470.57.02 w/sx locking My machine hard locks so I am unable to see a panic message on the vt or get a core. With Austin's suggestion I am running fops_mtx converted to an sx xlock and it is working well for me. This attachment is a quick and dirty patch just for the main 'nvidia-driver' port so others can confirm it works as well. If it works for others I will do it correctly.
(In reply to Tomoaki AOKI from comment #10) > … sudden REBOOTS WITHOUT CORE … There's a comparable report in the GhostBSD area, below <https://forums.ghostbsd.org/viewtopic.php?p=9893#p9893> (In reply to Kevin Bowling from comment #15) Thanks, I'll try.
Created attachment 226916 [details] Make mtx->sx patches optional for 470.x (In reply to Kevin Bowling from comment #15) The mtx->sx change fixed the instability issue for me. This patch should make it optional for 470.x.
danfe, what do you think about the above? I've been hammering this and cannot see any obvious issues.
(In reply to Kevin Bowling from comment #18) > danfe, what do you think about the above? I'd like to see it reviewed and picked by upstream first, and ideally making another bugfix release in 470.xx branch rather than us having to maintain even more local patches. I'm not sure if we have a direct contact with them now, but perhaps the forum thread mentioned in comment #8 could be updated?
(In reply to Kevin Bowling from comment #15)

> … a quick and dirty patch just for the main 'nvidia-driver' port …

Firefox: no problem at the time of writing.

OS wake from sleep: not working.

% uname -KUv
FreeBSD 14.0-CURRENT #103 main-n248269-941650aae97: Wed Jul 28 07:28:47 BST 2021 root@mowa219-gjp4-8570p:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG 1400026 1400026
% pkg info -x firefox nvidia | sort
firefox-90.0.2,2
linux-nvidia-libs-460.84
nvidia-driver-470.57.02
nvidia-settings-470.42.01
nvidia-xconfig-470.42.01
% kldstat | grep -e nv -e drm
19    1 0xffffffff836ae000   106310 nvidia-modeset.ko
20    1 0xffffffff83800000  1fa0a48 nvidia.ko
%
Thanks for the patches. The patched driver looks to be working fine at the moment but, as I wrote previously, I would need at least 4 or 5 more days (maybe 1 or 2 weeks to be sure) to confirm. Tested mainly on stable/13 and partially on main, amd64.

BTW, I experienced my first hang (previously, all were sudden reboots) under the conditions below:

* The patched nvidia-driver had just been built and installed, but I had not yet rebooted.
* I clicked a tab in Firefox, and then the system hung.
* NumLock and Shift-Caps still toggled the keyboard LEDs.
* Ctrl-Alt-BS, Ctrl-Alt-Del, Ctrl-Alt-Fn and a short press of the power switch didn't work. Only a long press of the power switch (forced power-off) worked.
* No core left.

So this hang should be a problem of the unpatched nvidia-driver.
No sudden reboot or hang so far. I'll keep watching and report back if either happens.
(In reply to Graham Perrin from comment #16) This was reported by me. For both 460.x and 470.x the symptom is insta-reboot; this may or may not be a property of GhostBSD - my gut feeling is however that it may also be hardware dependent whether you see a hang or a reboot. However: This *did* happen with 460 too; but not as reproducible or frequent as with 470. The patch suggested below - is it reasonable to assume that this was a problem present-but-less-frequent in the older drivers too?
FYI, 470.63.01 was released today.
Like Alex said, 470.63 is out. I verified that it has fixes for both known panics. It should be good to update the port to that version. Please let me know if there are more stability problems. > This *did* happen with 460 too; but not as reproducible or frequent This is probably not the fops_mtx panic, but the other one from nvidia-modeset. I only got that to reproduce a couple times when I was using multiple monitors, and haven't seen anyone else run into it. My money is on that being your issue.
(In reply to Austin Shafer from comment #25)

> Like Alex said, 470.63 is out. I verified that it has fixes for both known panics. It should be good to update the port to that version. Please let me know if there are more stability problems.

I'll test this on my workstation as soon as I'm able.

>> This *did* happen with 460 too; but not as reproducible or frequent
> This is probably not the fops_mtx panic, but the other one from nvidia-modeset. I only got that to reproduce a couple times when I was using multiple monitors, and haven't seen anyone else run into it. My money is on that being your issue.

It could be; I have a monitor with a ridiculous resolution - LG 34WK95U at 5120x2160. This is more than most dual-display configurations used to have until very recently, so I'm not surprised if it tickles similar bugs.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=77a2452a895b80a7136d1453ce4ad5fe00b58773

commit 77a2452a895b80a7136d1453ce4ad5fe00b58773
Author: Kevin Bowling <kbowling@FreeBSD.org>
AuthorDate: 2021-08-11 02:47:16 +0000
Commit: Kevin Bowling <kbowling@FreeBSD.org>
CommitDate: 2021-08-11 02:47:16 +0000

    x11/{linux-nvidia-libs,nvidia-driver}: Update to 470.63.01

    This is the new stable branch and adds support for Vulkan.

    See https://www.nvidia.com/Download/driverResults.aspx/177146/en-us and
    https://www.nvidia.com/download/driverResults.aspx/179601/en-us for
    additional changes.

    PR: 257456
    Approved by: danfe

 x11/linux-nvidia-libs/Makefile | 2 +-
 x11/linux-nvidia-libs/distinfo | 6 +++---
 x11/nvidia-driver/Makefile | 2 +-
 x11/nvidia-driver/distinfo | 6 +++---
 4 files changed, 8 insertions(+), 8 deletions(-)
Thanks, Austin and danfe for your work!
Thanks for the update, albeit technically I did not approve it.
(In reply to Alexey Dokuchaev from comment #29) Sorry for the misunderstanding; I read comments 3 and 19 as "go for it once it's ready".
root@mowa219-gjp4-zbook:~ # pkg info -x nvidia
nvidia-driver-470.63.01_1
nvidia-xconfig-470.42.01
root@mowa219-gjp4-zbook:~ # sysrc kld_list
kld_list: nvidia-modeset
root@mowa219-gjp4-zbook:~ #

Are these backtraces significant? At the tail of /var/log/messages

----

…
nvidia0: <Unknown> on vgapci0
vgapci0: child nvidia0 requested pci_enable_io
vgapci0: child nvidia0 requested pci_enable_io
nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 470.63.01 Tue Aug 3 20:24:32 UTC 2021
acpi_wmi0: <ACPI-WMI mapping> on acpi0
acpi_wmi0: Embedded MOF found
ACPI: \134_SB.WMID.WQAB: 1 arguments were passed to a non-method ACPI object (Buffer) (20210730/nsarguments-361)
acpi_wmi1: <ACPI-WMI mapping> on acpi0
acpi_wmi1: Embedded MOF found
ACPI: \134_SB.PCI0.WMI1.WQXM: 1 arguments were passed to a non-method ACPI object (Buffer) (20210730/nsarguments-361)
ichsmb0: <Intel Lynx Point SMBus controller> port 0xef80-0xef9f mem 0xd2137000-0xd21370ff at device 31.3 on pci0
smbus0: <System Management Bus> on ichsmb0
iwm0: <Intel(R) Dual Band Wireless AC 7260> at device 0.0 on pci4
iwm0: hw rev 0x140, fw ver 17.352738.0, address ⋯
wlan0: Ethernet address: ⋯
lo0: link state changed to UP
em0: link state changed to UP
wlan0: link state changed to UP
Security policy loaded: MAC/ntpd (mac_ntpd)
ACPI Warning: \134_SB.PCI0.PEGP.DGFX._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20210730/nsarguments-212)
acquiring duplicate lock of same type: "os.lock_mtx"
 1st os.lock_mtx @ nvidia_os.c:882
 2nd os.lock_mtx @ nvidia_os.c:882
stack backtrace:
#0 0xffffffff80c90dd1 at witness_debugger+0x71
#1 0xffffffff80bfc554 at __mtx_lock_flags+0x94
#2 0xffffffff8424f00b at os_acquire_spinlock+0x1b
#3 0xffffffff83f4c4bc at _nv035262rm+0xc
acquiring duplicate lock of same type: "fops_sx"
 1st fops_sx @ nvidia_subr.c:400
 2nd fops_sx @ nvidia_subr.c:1051
stack backtrace:
#0 0xffffffff80c90dd1 at witness_debugger+0x71
#1 0xffffffff80c2a9a7 at _sx_xlock+0x67
#2 0xffffffff8425281f at nv_add_mapping_context_to_file+0x7f
#3 0xffffffff8419285d at _nv036018rm+0x59d
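To pull only the lock-order reports like the two above out of a long log, a small helper along these lines may be handy. It is a sketch only: the function name `witness_reports` and the default of 7 follow-up lines are assumptions for illustration, and it assumes the log has one kernel message per line.

```sh
#!/bin/sh
# Print every "acquiring duplicate lock of same type" report from a log file,
# together with the N lines that follow it (lock sites and stack backtrace).
# Usage: witness_reports [logfile] [lines_after]
witness_reports() {
    logfile=${1:-/var/log/messages}   # default path as used in this thread
    context=${2:-7}                   # hypothetical default: header + 7 lines
    awk -v n="$context" '
        /acquiring duplicate lock of same type/ { left = n + 1 }
        left > 0 { print; left-- }
    ' "$logfile"
}
```

For example, `witness_reports /var/log/messages` prints each report with its two lock sites and four backtrace frames, which is easier to attach to a PR than the whole log.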
(In reply to Kevin Bowling from comment #8) Would someone like to update the topic in the NVIDIA forum? (OT: I tried, failed repeatedly, to sign in with Google.) (In reply to Graham Perrin from comment #1) > I'm particularly interested in wake from sleep (resume from suspend) > when the computer (HP ZBook 17 G2) is docked, with a display on DisplayPort. Without docking the notebook – with power and Ethernet cables alone attached: * still, there's failure to wake from sleep. Not worth me reporting the bug, because I'll no longer use this computer.