Bug 280772 - x11/nvidia-driver: Update to 550.107.02 with x11/linux-nvidia-libs and related DRM ports
Status: New
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-ports-bugs (Nobody)
URL:
Keywords:
Depends on: 279539
Blocks:
Reported: 2024-08-12 13:29 UTC by Tomoaki AOKI
Modified: 2024-08-27 13:57 UTC
CC List: 5 users

See Also:
junchoon: maintainer-feedback? (danfe)
junchoon: maintainer-feedback? (x11)


Attachments
Patch to upgrade to 550.107.02 (5.74 KB, patch)
2024-08-12 13:34 UTC, Tomoaki AOKI
no flags

Description Tomoaki AOKI 2024-08-12 13:29:41 UTC
Update x11/nvidia-driver, x11/linux-nvidia-libs, graphics/nvidia-drm-510-kmod, graphics/nvidia-drm-515-kmod and graphics/nvidia-drm-61-kmod to 550.107.02, the latest release on the production branch of the NVIDIA proprietary driver.

Release Highlights:
  nvidia-driver: https://www.nvidia.com/Download/driverResults.aspx/230359/en-us/
  linux-nvidia-libs: https://www.nvidia.com/Download/driverResults.aspx/230357/en-us/

Note that stable/14 additionally requires the patch attached to Bug 279539 to fix the build. It is intentionally excluded here because it is a prerequisite.
Comment 1 Tomoaki AOKI 2024-08-12 13:34:55 UTC
Created attachment 252707
Patch to upgrade to 550.107.02

Patch to upgrade to 550.107.02.
It also includes changes that allow building the 560 series Beta Branch driver, which adds two new kernel modules (GSP firmware). See the README of that branch [1].

[1] https://us.download.nvidia.com/XFree86/FreeBSD-x86_64/560.31.02/README/gsp.html
Comment 2 Tomoaki AOKI 2024-08-12 13:44:23 UTC
Additional note.
I don't have any GPU that supports GSP firmware, so these firmware modules are untested.
The patch simply installs them for 560 series drivers and later.

The firmware modules added for the 560 drivers are:
  /boot/modules/nvidia_gsp_ga10x_fw.ko
  /boot/modules/nvidia_gsp_tu10x_fw.ko


Tested with Quadro P1000 (notebook) on ThinkPad P52, iGPU disabled, xorg.

Tested on stable/14 for 550.107.02, 555.58.02, 560.28.03 and 560.31.02
at commit 644d81447118692ced65bc63829998150a646bec, amd64

Tested on main for 550.107.02, 555.58.02 and 560.31.02
at commit d349bd35330d3ec7ce1d3e7d6c2d6fc1f6a95704, amd64


Not tested with Wayland, as I don't currently use any compositors.
Comment 3 Chad Jacob Milios 2024-08-12 20:27:09 UTC
(In reply to Tomoaki AOKI from comment #1)

when you say "includes fixes to allow 560 series" am i to understand your intention is that i twiddle the NVIDIA_DISTVERSION myself then? i will try your patches and let you know how it goes.

I have an MSI GeForce RTX 4090 Suprim that only runs FreeBSD, currently 13.3-RELEASE-p5 but i could be coaxed to try 13.4 or 14.1. How likely is this to brick my GPU? (1 being probably and 5 being certainly lol)

Do you think i will be okay with the Linuxulator bits having my DEFAULT_VERSIONS= linux=rl9 ?

you are a god, sir. how might i help you further this cause until FreeBSD natively supports CUDA and UVM as well or better than Linux? when it comes to kernel drivers i wouldnt know how to tie my own shoes.
Comment 4 Tomoaki AOKI 2024-08-12 23:17:59 UTC
(In reply to Chad Jacob Milios from comment #3)
Nice! Latest GPU architecture, Ada Lovelace!

> when you say "includes fixes to allow 560 series" am i to understand your intention is that i twiddle the NVIDIA_DISTVERSION myself then? i will try your patches and let you know how it goes.

Yes. You must specify which DISTVERSION you want.
Without an override, this patch builds 550.107.02.
Technically, NVIDIA_DISTVERSION is assigned with "=", which sets it unconditionally.
DISTVERSION, on the other hand, is assigned with "?=", which uses the right-hand value only if the variable is not yet set, so it can be overridden.
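
The difference between the two operators can be seen in isolation. The sketch below is only an illustration: the three-line Makefile is a hypothetical stand-in for Makefile.version, the helper name show_distversion is made up, and an environment variable stands in for a value set earlier (for example in /etc/make.conf). It assumes some make (BSD or GNU) is in PATH.

```sh
#!/bin/sh
# Hypothetical stand-in for the port's version logic, not the real Makefile.version.
demo_dir=$(mktemp -d)
cat > "$demo_dir/Makefile" <<'EOF'
NVIDIA_DISTVERSION = 550.107.02
DISTVERSION ?= ${NVIDIA_DISTVERSION}
show: ; @echo ${DISTVERSION}
EOF

# show_distversion VAR=value ... : run the demo Makefile with extra environment.
show_distversion() {
    env "$@" make -f "$demo_dir/Makefile" show
}

# "?=" leaves an already-set DISTVERSION alone:
#   show_distversion DISTVERSION=560.31.02          # prints 560.31.02
# "=" replaces an already-set NVIDIA_DISTVERSION:
#   show_distversion NVIDIA_DISTVERSION=560.31.02   # prints 550.107.02
# Remove the scratch directory afterwards with: rm -rf "$demo_dir"
```

This is why overriding DISTVERSION from outside works, while overriding NVIDIA_DISTVERSION the same way does not.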

For example, I have the configuration below for 560.31.02 in my /etc/make.conf.
It works fine for x11/nvidia-driver and x11/linux-nvidia-libs. However, for reasons I couldn't determine, graphics/nvidia-drm-*-kmod and graphics/nvidia-drm-kmod always pick the version from x11/nvidia-driver/Makefile.version (although "NO_CHECKSUM= YES" does take effect).

     ===== Quote =====
NVIDIA_OVERRIDE_VERSION= 560.31.02

.if ${.CURDIR:M/usr/ports/x11/nvidia-driver} && defined (NVIDIA_OVERRIDE_VERSION)
  DISTVERSION=	${NVIDIA_OVERRIDE_VERSION}
  NO_CHECKSUM=	YES
.endif

.if ${.CURDIR:M/usr/ports/x11/linux-nvidia-libs} && defined (NVIDIA_OVERRIDE_VERSION)
  DISTVERSION=	${NVIDIA_OVERRIDE_VERSION}
  NO_CHECKSUM=	YES
.endif

.if ${.CURDIR:M/usr/ports/graphics/nvidia-drm-*-kmod} && defined (NVIDIA_OVERRIDE_VERSION)
  DISTVERSION=	${NVIDIA_OVERRIDE_VERSION}
  NO_CHECKSUM=	YES
.endif

.if ${.CURDIR:M/usr/ports/graphics/nvidia-drm-kmod} && defined (NVIDIA_OVERRIDE_VERSION)
  DISTVERSION=	${NVIDIA_OVERRIDE_VERSION}
  NO_CHECKSUM=	YES
.endif

     ===== End quote =====

Strangely, if I build the graphics/nvidia-drm-*-kmod ports with DISTVERSION overridden to 560.31.02 but without editing x11/nvidia-driver/Makefile.version (left at 550.107.02), `% pkg version -v | grep nvidia-drm-61-kmod` shows

     ===== Quote =====
nvidia-drm-61-kmod-560.31.02_1     =   up-to-date with index
     ===== End Quote =====

but what is actually installed is nvidia-drm-61-kmod-550.107.02, so it fails to work because of the version mismatch.


> How likely is this to brick my GPU? (1 being probably and 5 being certainly lol)

First of all, do not load GPU-related modules via /boot/loader.conf[.local].
Doing so could brick your PC when something goes wrong.
If you load GPU-related modules via the kld_list variable in /etc/rc.conf[.local], don't worry. If your PC seems to have become a brick, press Ctrl-Alt-Del and wait in front of your PC, then enter single-user mode when the loader menu is displayed.
From there you can rebuild/reinstall a safer version of the drivers. (You would need to remount the filesystems read-write, just as for installworld in single-user mode.)
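
To check which of the two loading methods a machine currently uses, something like the following could help. It is a rough sketch, not part of the patch: the function name nvidia_in_loader_conf is made up, and it assumes the usual loader.conf spelling of module lines such as nvidia-drm_load="YES".

```sh
#!/bin/sh
# Print nvidia module load lines found in a loader.conf-style file.
# Commented-out lines are ignored; a missing file counts as "nothing found".
nvidia_in_loader_conf() {
    grep -E '^[[:space:]]*nvidia[A-Za-z0-9_-]*_load=' "$1" 2>/dev/null
}

# Warn if either loader configuration file loads nvidia modules at early boot.
for conf in /boot/loader.conf /boot/loader.conf.local; do
    if nvidia_in_loader_conf "$conf" >/dev/null; then
        echo "$conf loads nvidia modules at boot; consider kld_list instead"
    fi
done
```

If it reports anything, the corresponding lines can be removed from the loader configuration and the module added to kld_list instead, for example with `sysrc -f /etc/rc.conf.local kld_list+="nvidia-drm"`.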

If the 560 series drivers expose problems, you can still try the 555 series New Feature Branch drivers. The 555 series doesn't have GSP firmware, so it is expected to be less problematic.


> Do you think i will be okay with the Linuxulator bits having my DEFAULT_VERSIONS= linux=rl9 ?

Not sure. I haven't even tried rl9, only c7.
And IIRC/IIUC, the rl9 ports are largely restructured compared with c7.


> how might i help you further this cause until FreeBSD natively supports CUDA and UVM as well or better than Linux?

Unfortunately, CUDA and UVM are not stated to be supported even on 560.31.02.
At least, modules/components needed for them, which Linux version has, are not built on FreeBSD yet.
Comment 5 Chad Jacob Milios 2024-08-15 02:46:46 UTC
(In reply to Tomoaki AOKI from comment #4)

i tried patch #252707 as is first. it works well in so much as it is a drop in replacement upgrade for the 550.54.14 i had. it runs glxgears and foobillard under kde5 plasma after logging in with lightdm. i tested it for all of 25 seconds. LGTM

> Yes. You must specify which DISTVERSION you want.

then i edited Makefile.version to 560.31.02 and ran make -C /usr/ports/{x11/{nvidia-driver;linux-nvidia-libs};graphics/nvidia-drm-510-kmod} {makesum;package;{de;re}install} then reboot and everything also works beautifully fine immediately. i've been running for a couple days now on 560.31.02 flawlessly.
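
for anyone reading along, the shorthand above is not literal shell syntax. spelled out, one reading of it is the loop below. this is only a sketch: the function name and the MAKE_CMD dry-run switch are made up for illustration, and the per-port target order is an assumption. by default it just prints the commands; set MAKE_CMD=make (as root) to really run them.

```sh
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
MAKE_CMD=${MAKE_CMD:-echo make}
PORTSDIR=${PORTSDIR:-/usr/ports}

rebuild_nvidia_ports() {
    for port in x11/nvidia-driver x11/linux-nvidia-libs graphics/nvidia-drm-510-kmod; do
        for target in makesum package deinstall reinstall; do
            # Stop at the first failing step rather than installing a partial set.
            $MAKE_CMD -C "$PORTSDIR/$port" "$target" || return 1
        done
    done
}

rebuild_nvidia_ports
```

makesum comes first because the distinfo checksums in the tree still describe the old version.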

by flawlessly i mean my windowing system is dependable. i've been underutilizing my 4090 for a couple of days like its 2012 integrated graphics while i'm trying out this driver because i dont even know how i would go about telling this thing to break a sweat under FreeBSD. the fans have comfortably stayed literally off the whole time on my GPU when in FreeBSD no matter how many glxgears windows i open and nvidia-smi says its using like 27 watts out of 480.

i must confess my 4090 typically hangs out sadly in pptdevs under the cruel heel of Linux. i had used the drivers in ports before just to configure and test out the capability but FreeBSD usually shows my host console through the cpu/mobo graphics while some bhyve vm takes over the 4090.

i dont "game" so i dont even own a modern commercial game that i could load up on wine-proton or something. i'm all ears if anyone has any ideas what i can play with just to test out the card and drivers natively on FreeBSD. believe me i will be right there to make sure the fans kick on right away too. it seems that while there is no driver talking to the card yet it idles at a minimum fan speed but when the driver gets ahold of it and sees that i dont have any real work for it they go off off.

i am experimenting with emulators/libc6-shim and nvidia-smi does report CUDA 12.6 support using your new modded patch while it reported 12.4 in both current ports and with the attached patch unmodded. (all using an rl9 compat)

the mere mention of CUDA where once was N/A is as far as i got. so far i havent actually properly compiled nor configured any CUDA workloads using FreeBSD yet, mostly for lack of trying for any more than seven minutes.

its my next off hours project. ive heard its been done using the NVidia CUDA SDK, Linuxulator and FreeBSD native driver from ports. just no ones done UVM yet, only GPU VRAM.

i am guessing the two new .ko's on the plist are for GSP's on prior hardware generations and the GSP bits for Ada Lovelace are in the main .ko??

nvidia_gsp_{ga,tu}10x_fw.ko both do get installed, of course, but my first instinct was to not call upon them in any way. According to kldstat, neither did nvidia{-drm,,-modeset}.ko.

HOWEVER nvidia-smi -q indeed reports "GSP Firmware Version: 560.31.02". I don't know how I'd go about actually testing the functions that rely on it specifically.

in ALL [non-pptdev] configurations i've only ever simply added nvidia-drm to rc.conf.local->kld_list and added Device.{Driver,BusID} to xorg.conf. (i.o.w. i changed nothing config-wise from what i'd been using day to day out of the ports tree)

now, while using the 560 driver for days kldstat does NOT list any gsp_*_fw.ko's. does that track with what you're expecting? are they unneeded at all or maybe i never used the feature that might auto-load one/both?
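
for reference, a check along these lines avoids reading the whole kldstat list by eye. it is a sketch under one assumption: that the module file name is the last column of each kldstat line. the function name is made up.

```sh
#!/bin/sh
# Print the file names of loaded GSP firmware modules, given kldstat output on stdin.
loaded_gsp_firmware() {
    awk '$NF ~ /^nvidia_gsp_.*_fw\.ko$/ { print $NF }'
}

# Only query the running kernel where kldstat exists (i.e. on FreeBSD).
if command -v kldstat >/dev/null 2>&1; then
    kldstat | loaded_gsp_firmware
fi
```

empty output means neither nvidia_gsp_ga10x_fw.ko nor nvidia_gsp_tu10x_fw.ko is currently loaded as a separate module.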

> Unfortunately, CUDA and UVM are not stated to be supported even on 560.31.02. At least, modules/components needed for them, which Linux version has, are not built on FreeBSD yet.

are these needed modules/components NVidia driver bits or FreeBSD kernel bits? whose palms do i gotta grease? where can we start? what books do we gotta read? lol
Comment 6 Tomoaki AOKI 2024-08-15 13:46:27 UTC
(In reply to Chad Jacob Milios from comment #5)
Thanks for your feedback!

About CUDA:
Unfortunately, CUDA libraries such as libcuda.so.560.31.02 are not provided for the FreeBSD native driver. They are only available as Linux-compatible libraries via the x11/linux-nvidia-libs port.

About kernel modules:
Currently, nvidia-peermem.ko and nvidia-uvm.ko, which the Linux drivers have, are not provided for FreeBSD (checked on https://us.download.nvidia.com/XFree86/Linux-x86_64/560.31.02/README/installedcomponents.html). As this is NVIDIA's proprietary driver, it is purely up to NVIDIA whether to provide them for native FreeBSD.

About firmware modules:
Firmware for GSP should not be included in any of nvidia.ko, nvidia-modeset.ko or nvidia-drm.ko. The Linux driver seems to be the same here, except that its files have the extension *.bin instead of our *_fw.ko and no "nvidia_" prefix (checked by simply running `make extract` on x11/linux-nvidia-libs, with x11/nvidia-driver/Makefile.version edited).
Unfortunately, NVIDIA doesn't document which module supports which GPUs.
Possibly, whichever is relevant would be auto-loaded when actually required.

I know it could be risky, so I won't push you, but if you're OK with it, what happens when you manually kldload either of the firmware modules?
As nvidia_gsp_ga10x_fw.ko is much larger than nvidia_gsp_tu10x_fw.ko and "tu" suggests "Turing" to me, I assume nvidia_gsp_ga10x_fw.ko would be the better fit.
Comment 7 Chad Jacob Milios 2024-08-16 18:08:07 UTC
(In reply to Tomoaki AOKI from comment #6)

i tried kldloading those both and everything seems just fine. they hang out there in kldstat just fine but i cant seem to see a difference either way. so i unloaded them and restarted again. i will try to see if theres any difference with or without them loaded or if one ends up getting loaded on its own eventually

i'm going to figure out some gaming experience one way or another through freebsd and this driver tonight. something that gets my fans going
Comment 8 Tomoaki AOKI 2024-08-17 13:20:58 UTC
(In reply to Chad Jacob Milios from comment #7)

Thanks for your bravery!

For GSP, what's the output of `nvidia-smi -q | grep GSP`?
For me, as my GPU is too old (Pascal microarchitecture), the output is as below.

% nvidia-smi -q | grep GSP
    GSP Firmware Version                  : N/A

If your output is not N/A, the firmware should be loaded and recognized.
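
The same check can be scripted. This is a small sketch, assuming the "GSP Firmware Version : value" line format shown above; the function names are made up for illustration.

```sh
#!/bin/sh
# Read `nvidia-smi -q` output on stdin and print the GSP firmware version field.
gsp_firmware_version() {
    sed -n 's/^[[:space:]]*GSP Firmware Version[[:space:]]*:[[:space:]]*//p' | head -n 1
}

# Turn the field into a one-line verdict; a missing line is treated like N/A.
gsp_status() {
    version=$(gsp_firmware_version)
    case "$version" in
        ""|N/A) echo "GSP firmware not in use" ;;
        *)      echo "GSP firmware loaded: $version" ;;
    esac
}

# Only query the GPU where nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -q | gsp_status
fi
```

On my Pascal GPU this should report that GSP firmware is not in use.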
Comment 9 Tomoaki AOKI 2024-08-25 12:28:18 UTC
Just an FYI.
The 560 series has now become the New Feature Branch, as of 560.35.03.
I haven't tested it yet; I just found it on the NVIDIA driver pages [1] [2] [3].

[1] https://www.nvidia.com/en-us/drivers/unix/

[2] https://www.nvidia.com/en-us/drivers/details/230920/

[3] https://www.nvidia.com/en-us/drivers/details/230918/
Comment 10 Chad Jacob Milios 2024-08-27 13:57:06 UTC
(In reply to Tomoaki AOKI from comment #8)

when i set NVIDIA_DISTVERSION = 560.31.02 in Makefile.version

jedi@rick:~ % nvidia-smi -q | grep GSP
    GSP Firmware Version                  : 560.31.02

i believe previously that line was entirely absent

also, CUDA version increases from 12.4 to 12.6 with said update as reported by `nv-sglrun nvidia-smi` (of course it's N/A without nv-sglrun which came from libc6-shim-20240512 i.e. emulators/libc6-shim)
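
to compare the reported CUDA version before and after an update without eyeballing the banner, something like this could be used. it is a sketch that assumes the banner contains the text "CUDA Version: X.Y"; the function name is made up.

```sh
#!/bin/sh
# Read nvidia-smi banner output on stdin and print the CUDA version number.
# Prints nothing when the field is missing or reads N/A.
cuda_version() {
    sed -n 's/.*CUDA Version:[[:space:]]*\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Only run where both the shim wrapper and nvidia-smi are installed.
if command -v nv-sglrun >/dev/null 2>&1 && command -v nvidia-smi >/dev/null 2>&1; then
    nv-sglrun nvidia-smi | cuda_version
fi
```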