The Nvidia 580* appears to be broken on FreeBSD 15.0. I am using latest ports and ports-kmods. I'm using `kld_list=nvidia-drm`. Trying to use seatd to launch labwc says it finds no devices. Trying to use startx to launch twm says it cannot run in framebuffer mode. The chip is an RTX A1000 and I can see it in sysctl.
(In reply to Alexander Ziaee from comment #0) Related packages can be out of sync if you're using the official kmod repo (FreeBSD-ports-kmods). Only x11/nvidia-kmod is built there for now; bapt@ is willing to work on it, but is currently too busy to take time for it. The work bapt@ requested from us is already done and landed. So you need to confirm that the nvidia driver versions of all the related ports are in sync. Given the above, I think graphics/nvidia-drm-66-kmod is the one that is out of sync. These need to be in sync:
*x11/nvidia-kmod
*x11/nvidia-driver
*graphics/nvidia-drm-*-kmod (graphics/nvidia-drm-66-kmod should be chosen for 15.0)
and optionally
*x11/linux-nvidia-libs
No worries about graphics/nvidia-drm-kmod, as it's just a metaport that chooses which variant (510, 515, 61, 66 or latest) to pull in depending on OSVERSION and ARCH.
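One way to check the sync described above is to compare the versions pkg reports. This is only a sketch under stated assumptions, not something from the ports tree: `check_nvidia_sync` is a hypothetical helper that reads "name version" pairs, strips a trailing `_N` port revision and a trailing seven-digit OSVERSION component (the shape seen in versions such as 580.126.18.1500506), and compares what is left.

```sh
#!/bin/sh
# check_nvidia_sync (hypothetical helper): read "name version" pairs on stdin
# and report whether the NVIDIA driver packages share one driver version.
# Assumption: kmod package versions end in a 7-digit OSVERSION component.
check_nvidia_sync() {
    awk '
    $1 ~ /^nvidia-(driver|kmod)(-devel)?$/ ||
    $1 ~ /^linux-nvidia-libs(-devel)?$/ ||
    $1 ~ /^nvidia-drm-.*kmod(-devel)?$/ {
        v = $2
        sub(/_[0-9]+$/, "", v)    # drop the port revision, e.g. "_1"
        sub(/\.[0-9][0-9][0-9][0-9][0-9][0-9][0-9]$/, "", v)    # drop OSVERSION
        printf "%s %s\n", $1, v
        if (!(v in seen)) { seen[v] = 1; n++ }
    }
    END {
        if (n == 1) { print "in sync"; exit 0 }
        print "OUT OF SYNC (or no NVIDIA packages found)"
        exit 1
    }'
}
```

On an installed system it could be fed with `pkg query -a '%n %v' | check_nvidia_sync`. Legacy branches (304, 340, 390, 470) have their own package names and are deliberately not matched here.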
I found that a bunch of stuff like nvidia-smi wasn't installed anymore, and nvidia-settings was installed but didn't find anything, until I (re)installed the nvidia-driver package. I have this installed now on -HEAD:
adrian@francine:~ % pkg info | grep nvidia
nvidia-driver-580.126.09 NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-drm-66-kmod-580.126.09.1600011 NVIDIA DRM Kernel Module
nvidia-kmod-580.126.09.1600011 kmod part of NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-settings-580.126.09 Display Control Panel for X NVidia driver
nvidia-xconfig-580.126.09 Tool to manipulate X configuration files for the NVidia driver
Without nvidia-driver-580.126.09 installed the tooling just didn't work. With it re-installed? It works. I did scrape the nvidia stuff clean and reinstall it recently, so I guess I missed that? But shouldn't nvidia-driver be a dep somewhere?
I have a successful start (according to the Xorg.0.log) but the resulting :0 server doesn't answer to clients. Clients just sit there. Screen has a hung text vt with cursor at 0x0. Tested on 15-stable and 16-current, identical.
(In reply to Adrian Chadd from comment #2) When splitting x11/nvidia-kmod* out of x11/nvidia-driver*, the requirement from bapt@, as the kmod builder manager, was not to depend on anything heavy to build like x11-servers/xorg-server, because of the quite limited builder resources. To meet the requirement, we finally decided to:
*Split the kmod-only parts of x11/nvidia-driver* out into the corresponding x11/nvidia-kmod*
*Switch the dependency of graphics/nvidia-drm-{510|515|61|66}-kmod from x11/nvidia-driver to x11/nvidia-kmod
This required a tricky solution. As graphics/nvidia-drm-{510|515|61|66}-kmod are no longer allowed to depend on x11/nvidia-driver (they would be rejected by the kmods builder), x11/nvidia-driver is no longer pulled in by them. To resolve this, we made the graphics/nvidia-drm-kmod port, which is the metaport that pulls in the proper-for-base graphics/nvidia-drm-{510|515|61|66}-kmod, depend on x11/nvidia-driver to pull it in. Following the procedure in the Handbook, the graphics/nvidia-drm-kmod metaport should pull in what's needed. In the older way, installing x11/nvidia-driver* automatically pulls in the corresponding x11/nvidia-kmod*. Because of the above restrictions, directly installing graphics/nvidia-drm-{510|515|61|66|latest}-kmod{-devel} without graphics/nvidia-drm-kmod{-devel} leaves x11/nvidia-driver{-devel} as a missing dependency. Note that support for the latest variant of graphics/drm-*-kmod and for the devel variants of the nvidia drivers was introduced after the split.
(In reply to Martin Cracauer from comment #3) How did you install them? From pkg using both FreeBSD-ports and FreeBSD-ports-kmods? From pkg using FreeBSD-ports alone? Or building locally via ports? I've not encountered it while testing builds from ports, both on bare metal (stable/15 and main) and in local poudriere builds (stable/15 only). The GPU is an RTX A400 and the iGPU is not used (the monitor is connected via DP on the RTX A400 and the iGPU is disabled via UEFI firmware config). `pkg search` currently gives the results below (on stable/15).
# pkg search -r FreeBSD-ports-kmods nvidia
nvidia-kmod-580.126.18.1500506 kmod part of NVidia graphics card binary drivers for hardware OpenGL rendering
# pkg search -r FreeBSD-ports nvidia
libva-nvidia-driver-0.0.15 NVDEC-based backend for VAAPI
linux-nvidia-libs-580.126.18 NVidia graphics libraries and programs (Linux version)
linux-nvidia-libs-304-304.137 NVidia graphics libraries and programs (Linux version)
linux-nvidia-libs-340-340.108 NVidia graphics libraries and programs (Linux version)
linux-nvidia-libs-390-390.157 NVidia graphics libraries and programs (Linux version)
linux-nvidia-libs-470-470.256.02 NVidia graphics libraries and programs (Linux version)
linux-nvidia-libs-devel-590.48.01 NVidia graphics libraries and programs (Linux version)
nvidia-driver-580.126.18 NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-driver-304-304.137_11 NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-driver-340-340.108_5 NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-driver-390-390.157_1 NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-driver-470-470.256.02_2 NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-driver-devel-590.48.01 NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-drm-515-kmod-580.126.18.1500068 NVIDIA DRM Kernel Module
nvidia-drm-515-kmod-devel-590.48.01.1500068 NVIDIA DRM Kernel Module
nvidia-drm-61-kmod-580.126.18.1500068 NVIDIA DRM Kernel Module
nvidia-drm-61-kmod-devel-590.48.01.1500068 NVIDIA DRM Kernel Module
nvidia-drm-66-kmod-580.126.18.1500068 NVIDIA DRM Kernel Module
nvidia-drm-66-kmod-devel-590.48.01.1500068 NVIDIA DRM Kernel Module
nvidia-drm-kmod-580.126.18 NVIDIA DRM Kernel Module
nvidia-drm-kmod-devel-590.48.01 NVIDIA DRM Kernel Module
nvidia-drm-latest-kmod-580.126.18.1500068 NVIDIA DRM Kernel Module
nvidia-drm-latest-kmod-devel-590.48.01.1500068 NVIDIA DRM Kernel Module
nvidia-kmod-580.126.18.1500068 kmod part of NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-kmod-304-304.137.1500068 kmod part of NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-kmod-340-340.108.1500068 kmod part of NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-kmod-390-390.157.1500068 kmod part of NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-kmod-470-470.256.02.1500068 kmod part of NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-kmod-devel-590.48.01.1500068_1 kmod part of NVidia graphics card binary drivers for hardware OpenGL rendering
nvidia-settings-580.126.18 Display Control Panel for X NVidia driver
nvidia-texture-tools-2.1.2 Texture Tools with support for DirectX 10 texture formats
nvidia-xconfig-580.126.18 Tool to manipulate X configuration files for the NVidia driver
nvidia_gpu_prometheus_exporter-g20181028_37 NVIDIA GPU Prometheus exporter
py311-nvidia_ml_py-13.590.48 Python bindings to the NVIDIA Management Library
xlibre-nvidia-driver-580.126.18 NVidia graphics card binary drivers for hardware OpenGL rendering
xlibre-nvidia-driver-304-304.137_11 NVidia graphics card binary drivers for hardware OpenGL rendering
xlibre-nvidia-driver-340-340.108_5 NVidia graphics card binary drivers for hardware OpenGL rendering
xlibre-nvidia-driver-390-390.157_1 NVidia graphics card binary drivers for hardware OpenGL rendering
xlibre-nvidia-driver-470-470.256.02_2 NVidia graphics card binary drivers for hardware OpenGL rendering
xlibre-nvidia-driver-devel-590.48.01 NVidia graphics card binary drivers for hardware OpenGL rendering
(In reply to Tomoaki AOKI from comment #5) I don't remember how I installed them. What do you want me to test?
(In reply to Martin Cracauer from comment #6) The most likely situation is that you've upgraded during the (unavoidable) time window in which the FreeBSD-ports-kmods repo alone has the upgraded version, and you've used pkg with both the FreeBSD-ports repo and the FreeBSD-ports-kmods repo enabled. In that case, try upgrading again (limiting it to the FreeBSD-ports repo alone would be safer here) or build all the nvidia things from ports.
(In reply to Tomoaki AOKI from comment #7) Can you give me a list of ports you want me to compile for this test?
(In reply to Martin Cracauer from comment #8) Listed in Comment #1.
> most likely situation is that you've upgraded the (unavoidable)
> time window that FreeBSD-ports-kmods repo alone has upgraded version
How can this be? The FreeBSD 15 series is on its first release, and this is the problem that creating FreeBSD-ports-kmods was supposed to solve, right?
> you've used pkg with both FreeBSD-ports repo and FreeBSD-ports-kmods repo used
They are both enabled out of the box, so this is the expected configuration, right? I thought FreeBSD-ports-kmods will take precedence over FreeBSD-ports.
Are the packages in kmod-repo and ports-repo up to date? Surely this can be fixed by running `pkg update -f && pkg upgrade -f` on the relevant packages?
My 15-stable machine was on quarterly. I switched to head and got
nvidia-driver: 580.119.02_1 -> 580.126.18 [FreeBSD-ports]
nvidia-kmod: 580.119.02.1500068_1 -> 580.126.18.1500506
However, behavior is the same.
(In reply to Adrian Chadd from comment #11) Unfortunately, it's out of our (the x11 nvidia team's) control. The FreeBSD-ports builders and the FreeBSD-ports-kmods builders are (AFAIK) independent servers with different configurations / scales, so time lags are unavoidable even after bapt@ addresses the current issues.
(In reply to Martin Cracauer from comment #12) Seeing your hard volunteer work on the forums, I think it's unlikely, but are you sure you restarted the system after upgrading? IIRC, at least the kmods from the graphics/drm-*-kmod ports required by the corresponding graphics/nvidia-drm-*-kmod{-devel} crash on unloading, except during a sane shutdown process. So restarting would be mandatory here, as graphics/nvidia-drm-*-kmod{-devel} borrows parts of its code from graphics/drm-*-kmod.
One more thing to confirm: does kldstat sanely show all 3 kmods (plus the ones from graphics/drm-*-kmod)? An example from my system (-devel variants, though), including related ones that were pulled in (possibly including noise):
Id Refs Address Size Name
1 203 0xffffffff80200000 2350ed0 kernel
(snip)
32 2 0xffffffff84ece000 1462f8 nvidia-modeset.ko
33 2 0xffffffff85200000 5c1fc60 nvidia.ko
34 2 0xffffffff85015000 31240 linux.ko
35 2 0xffffffff85047000 6d98 mqueuefs.ko
36 6 0xffffffff8504e000 cdc8 linux_common.ko
37 1 0xffffffff8505b000 14a88 nvidia-drm.ko
38 1 0xffffffff85070000 8b190 drm.ko
39 1 0xffffffff850fc000 22b8 iic.ko
40 1 0xffffffff850ff000 4120 linuxkpi_video.ko
41 2 0xffffffff85104000 7360 dmabuf.ko
42 1 0xffffffff8510c000 3378 lindebugfs.ko
43 1 0xffffffff85110000 2d300 linux64.ko
(snip to the end)
If all of the above is working fine, there should be a /dev/dri/ containing
card0 (symlink to ../drm/0)
renderD128 (symlink to ../drm/128)
These symlinks are specific to cases where nvidia-drm.ko is in use. If you're not using nvidia-drm.ko, the modules below wouldn't be shown unless you manually specified them to load.
37 1 0xffffffff8505b000 14a88 nvidia-drm.ko
38 1 0xffffffff85070000 8b190 drm.ko
39 1 0xffffffff850fc000 22b8 iic.ko
40 1 0xffffffff850ff000 4120 linuxkpi_video.ko
41 2 0xffffffff85104000 7360 dmabuf.ko
42 1 0xffffffff8510c000 3378 lindebugfs.ko
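The kldstat check above can be scripted. The following is a minimal sketch, assuming the module name is the last column of each kldstat line as in the listing above; `missing_nvidia_kmods` is a hypothetical helper name, not an existing tool.

```sh
#!/bin/sh
# missing_nvidia_kmods (hypothetical helper): read kldstat output on stdin and
# print each of the three NVIDIA modules that is not listed.
# Returns 0 when all three are present, 1 otherwise.
missing_nvidia_kmods() {
    listing=$(cat)
    status=0
    for mod in nvidia.ko nvidia-modeset.ko nvidia-drm.ko; do
        # Compare against the last field only, so nvidia.ko does not
        # accidentally match nvidia-drm.ko or nvidia-modeset.ko.
        if ! printf '%s\n' "$listing" |
            awk -v m="$mod" '$NF == m { found = 1 } END { exit !found }'; then
            echo "missing: $mod"
            status=1
        fi
    done
    return $status
}
```

A possible use is `kldstat | missing_nvidia_kmods && ls -l /dev/dri`, which lists the DRM device nodes only when all three modules are loaded.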
Yes, I did reboot. I have difficulties compiling. Trying to compile x11/nvidia-driver I wanted to pkg-install build-time dependencies, which came to 606 pkgs and multiple SAT solver errors. Need more time.
I recompiled
- x11/nvidia-kmod
- x11/nvidia-driver
- graphics/nvidia-drm-kmod
Same result: the X11 server starts but then hangs with 100% of one CPU. nvidia-smi also hangs with 100% CPU. 15-stable, ports tree at 6b50d3885e31662c439e89f43c74617100ce0bd0 (main).
BTW, the hanging X11 server also prevents reboot(8) from succeeding. Reset never happens after printing uptime.
(In reply to Martin Cracauer from comment #15) That would be because you've switched from quarterly to latest, wouldn't it?
(In reply to Martin Cracauer from comment #16) On switching from the quarterly to the latest ports tree, you may need to rebuild the corresponding graphics/drm-*-kmod (on stable/15, it should be graphics/drm-66-kmod) to be safe, if it hasn't been rebuilt / reinstalled. The same goes for x11-servers/xorg-server (or, if you prefer Xlibre, x11-servers/xlibre-server; in that case, the x11/nvidia-driver port should be built with the xlibre flavor instead of the default xorg). I'm not sure when it was, but at some point x11-servers/xorg-server needed to be rebuilt to run sanely after upgrading x11/nvidia-driver (at that time, the kmod hadn't been split out yet). It may be excessive, but to be safer, also rebuild graphics/libglvnd, graphics/eglexternalplatform, graphics/egl-wayland, graphics/egl-x11, graphics/libdrm and graphics/mesa-libs. Possibly the basic X11-related libs, too.
(In reply to Martin Cracauer from comment #17) This reminds me of a mix-up between old and new libraries. It happened to me before when I hadn't cleared /usr/local/lib/compat/pkg/: old libraries there were picked up instead of the correct ones. I experienced it with Qt5, too. In this case, if there are libraries with "nvidia" or "egl" in their filenames, removing them (or moving them somewhere /libexec/ld-elf.so.1 doesn't look) could fix the issue.
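To see whether any such leftovers exist, something like the sketch below could be used. `find_stale_nvidia_libs` is a hypothetical helper; it only lists candidates and deletes nothing, and the case-insensitive match on "egl" is an assumption that may also report unrelated libraries.

```sh
#!/bin/sh
# find_stale_nvidia_libs (hypothetical helper): list files or symlinks whose
# names contain "nvidia" or "egl" under the pkg compat directory.
# The directory can be overridden by passing it as the first argument.
find_stale_nvidia_libs() {
    dir=${1:-/usr/local/lib/compat/pkg}
    # A missing directory simply means there is nothing to report.
    [ -d "$dir" ] || return 0
    find "$dir" ! -type d \( -iname '*nvidia*' -o -iname '*egl*' \) -print
}
```

Running `find_stale_nvidia_libs` with no argument inspects /usr/local/lib/compat/pkg; any paths it prints are candidates for the removal or relocation suggested above.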
I tried to reproduce this but still cannot. Both Xorg (with MATE DE + Compiz) and Wayland (with Wayfire, which I've tried before) started up fine on 580.126.18. The GPU is an RTX A400 (Ampere generation), on stable/15 at base commit a9f454a9c79810d60261d03dbec73c29396bf128, amd64. The CPU is a Core i9-12900H with the iGPU disabled via UEFI settings, and the monitor is connected via miniDP on the A400 card. The ports tree is on the main branch at ports commit f076286b8aaa4277e975abad13183c43432d7d4f. What I did was override the version as below and rebuild x11/nvidia-kmod-devel, x11/nvidia-driver-devel, x11/linux-nvidia-libs-devel, graphics/nvidia-drm-66-kmod-devel and graphics/nvidia-drm-kmod-devel using pkg_replace. I restarted the whole OS and tried starting Wayfire as the user I created for trying Wayland compositors (currently partially configured for Wayfire); it started up fine. I restarted the whole OS again, installed labwc and partially rewrote the Wayfire startup script to invoke labwc instead; it starts fine. I'm not sure why, but as the console (vty0) stops working normally after a Wayland compositor exits, I always restart once I've tried any compositor. Note that I kept each compositor running for at least a few minutes to see whether it crashes, and neither crashed while running. I started Xorg after a restart as my usual user; it starts and runs fine. I then commented out the NVIDIA_OVERRIDE_VERSION= line below in /etc/make.conf, rebuilt the above-mentioned ports to restore 590.48.01 (the current -devel version), restarted, and I'm here now as usual.
===== Applicable configs in /etc/make.conf =====
NVIDIA_OVERRIDE_VERSION= 580.126.18
.if ${.CURDIR:M/usr/ports/x11/nvidia-driver*} && defined (NVIDIA_OVERRIDE_VERSION)
DISTVERSION= ${NVIDIA_OVERRIDE_VERSION}
NO_CHECKSUM= YES
.endif
.if ${.CURDIR:M/usr/ports/x11/nvidia-kmod*} && defined (NVIDIA_OVERRIDE_VERSION)
DISTVERSION= ${NVIDIA_OVERRIDE_VERSION}
NO_CHECKSUM= YES
.endif
.if ${.CURDIR:M/usr/ports/x11/linux-nvidia-libs*} && defined (NVIDIA_OVERRIDE_VERSION)
DISTVERSION= ${NVIDIA_OVERRIDE_VERSION}
NO_CHECKSUM= YES
.endif
## graphics/nvidia-drm-*-kmod supports 550 series and above only.
## Don't attempt to override before 550 series!
.if ${.CURDIR:M/usr/ports/graphics/nvidia-drm-*-kmod*} && defined (NVIDIA_OVERRIDE_VERSION)
NVIDIA_DISTVERSION= ${NVIDIA_OVERRIDE_VERSION}
NO_CHECKSUM= YES
.endif
===== End quotes =====
I did force reinstall all packages. startx now just gives me a black screen. I am running FreeBSD 15.0-RELEASE-p3 (1500068) with the following binary packages:
nvidia-driver 580.126.18
nvidia-drm-66-kmod 580.126.18.1500068
nvidia-kmod 580.126.18.1500068
I install via `pkg ins nvidia-drm-kmod`, as I told users to do in the handbook. kldstat shows nvidia.ko, nvidia-modeset.ko, and nvidia-drm.ko loaded.
(In reply to Alexander Ziaee from comment #22) As I've written in my previous post, I cannot reproduce the issue (though I'm not using the official pkg). Your A1000 is of the same generation as my A400, just at a larger scale. So the GPU itself should be OK unless it's somehow broken. (This assumes you're connecting your monitor directly to your A1000, not to the outputs on the motherboard that are tied to the iGPU alone.) And as you've mentioned a forced reinstall of all pkgs, I think your graphics/drm-66-kmod installation should be OK, too. If there were any mismatches, the installed nvidia-drm.ko should NOT work, as it attempts to load some of the kmods from the corresponding graphics/drm-*-kmod install. In any case, if this kind of mismatch existed, nothing tied to the NVIDIA GPU would appear in /dev/dri. So, assuming everything above is fine, what needs to be reconfirmed is:
*Confirm there are no active nvidia*_load= lines in your /boot/loader.conf. If the installation has been continuously upgraded (including GPUs), remnants from when nvidia.ko was small enough may still exist and do harm.
*Confirm a hw.nvidiadrm.modeset=1 line exists in your /boot/loader.conf. Beware of typos.
*Confirm there is NO hw.nvidiadrm.fbdev=1 line in your /boot/loader.conf. This doesn't work on FreeBSD, which lacks the required support on the graphics/drm-*-kmod side (SimpleDRM).
*Check for typos in kmod names in /etc/rc.conf{.local}.
*Try changing the hw.nvidia.registry.EnableGpuFirmware= line in your /boot/loader.conf between 0 and 1 and see whether anything differs. Note that this shouldn't be set to 1 on pre-Turing GPUs.
*If the tunables above (yes, all of them are tunables!) are mistakenly set in /etc/sysctl.conf, move them to /boot/loader.conf. If they are set in /etc/sysctl.conf, there could be a race between setting them and the loading of nvidia{|-drm|-modeset}.ko.
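The first three /boot/loader.conf checks in the list above can be sketched as a small script. This is an illustration only: `check_nvidia_loader_conf` is a hypothetical helper, and it assumes settings are written one per line as `name=value` or `name="value"`, with `#` starting a comment line.

```sh
#!/bin/sh
# check_nvidia_loader_conf (hypothetical helper): warn about the three
# loader.conf conditions listed above. Returns 0 when no warning is printed.
check_nvidia_loader_conf() {
    conf=${1:-/boot/loader.conf}
    status=0
    # Drop comment lines so that only active settings are examined.
    active=$(grep -v '^[[:space:]]*#' "$conf" || true)
    if printf '%s\n' "$active" |
        grep -Eq '^[[:space:]]*nvidia[^=]*_load='; then
        echo "warning: active nvidia*_load= line found"
        status=1
    fi
    if ! printf '%s\n' "$active" |
        grep -Eq '^[[:space:]]*hw\.nvidiadrm\.modeset="?1"?([[:space:]]|$)'; then
        echo "warning: hw.nvidiadrm.modeset=1 is not set"
        status=1
    fi
    if printf '%s\n' "$active" |
        grep -Eq '^[[:space:]]*hw\.nvidiadrm\.fbdev="?1"?([[:space:]]|$)'; then
        echo "warning: hw.nvidiadrm.fbdev=1 is set"
        status=1
    fi
    return $status
}
```

Calling `check_nvidia_loader_conf` with no argument inspects /boot/loader.conf. The rc.conf typo check, the EnableGpuFirmware experiment and the /etc/sysctl.conf check are left out because they need manual judgment.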