Bug 282308

Summary: nvidia-driver: NVIDIA MEM resource alloc failed
Product: Ports & Packages
Component: Individual Port(s)
Version: Latest
Hardware: amd64
OS: Any
Status: New ---
Severity: Affects Only Me
Priority: ---
Reporter: christian <contato>
Assignee: freebsd-ports-bugs (Nobody) <ports-bugs>
CC: ashafer, junchoon

Description christian 2024-10-24 20:20:17 UTC
I'm sorry, but I have tried everything without success. I edited /boot/loader.conf on FreeBSD 14 and now on 15-CURRENT (on CURRENT my integrated graphics work), but my NVIDIA RTX 3050 6GB card still does not work.
dmesg output:
nvidia0: <NVIDIA GeForce RTX 3050 6GB Laptop GPU> on vgapci0
vgapci0: child nvidia0 requested pci_enable_io
vgapci0: 0x1000000 bytes of rid 0x10 res 3 failed (0, 0xffffffffffffffff).
nvidia0: NVRM: NVIDIA MEM resource alloc failed, BAR0 @ 0x10.
nvidia0: NVRM: NVIDIA hardware alloc failed.
device_attach: nvidia0 attach returned 6

My /etc/rc.conf:
# cat /etc/rc.conf
hostname="legion5"
keymap="br.kbd"
wlans_iwlwifi0="wlan0"
ifconfig_wlan0="WPA  DHCP"
ifconfig_wlan0_ipv6="inet6 accept_rtadv"
sshd_enable="YES"
moused_enable="YES"
powerd_enable="YES"
# Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
dumpdev="AUTO"
kld_list="i915kms"
linux_enable="YES"
dbus_enable="YES"
utility_enable="YES"

/boot/loader.conf
hw.pci.allow_unsupported_io_range=1

Output of pciconf -lv | grep -B3 VGA:
    vendor     = 'Intel Corporation'
    device     = 'Raptor Lake-P [UHD Graphics]'
    class      = display
    subclass   = VGA
--
    vendor     = 'NVIDIA Corporation'
    device     = 'GN20-P0-R-K2 [GeForce RTX 3050 6GB Laptop GPU]'
    class      = display
    subclass   = VGA


I am using nvidia-driver-550.120.
I tried both the binary package and building from ports, without success. Sorry for my bad English.
Comment 1 Tomoaki AOKI 2024-10-25 10:35:01 UTC
How do you load the nvidia driver? Via /boot/loader.conf[.local]? Or via /etc/rc.conf[.local], with the relevant line simply missing from what you posted?

If the former, your driver could be only partially loaded and not working normally.
(The memory area that the loader allocates for the kernel and kernel modules before starting the kernel is quite limited.)

If the latter, but nvidia-modeset is actually missing from kld_list, try adding it first.
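
A minimal sketch of that /etc/rc.conf change, assuming the Intel iGPU driver should stay loaded alongside nvidia-modeset on a hybrid laptop. kld_list takes a space-separated list of modules, and a later kld_list= assignment replaces an earlier one, so both modules go on a single line. The module order shown is an assumption, not something confirmed in this report:

```shell
# /etc/rc.conf fragment (illustrative; the module order is an assumption)
kld_list="i915kms nvidia-modeset"
```

Running sysrc kld_list+=nvidia-modeset as root should have the same effect without editing the file by hand.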

If the latter, and nvidia-modeset is actually added to kld_list somewhere after the kld_list="i915kms" line, then trying the latest Production branch driver, 550.127.05, or the latest BETA branch driver, 565.57.01, by overriding the driver version and disabling the checksum, could help.

There is my PR, Bug 282312 – x11/nvidia-driver: Update to 550.127.05 with x11/linux-nvidia-libs and related DRM ports [1]. This is needed to try 565.57.01 if you also want the x11/linux-nvidia-libs and graphics/nvidia-drm-[510|515|61]-kmod ports.

[1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=282312
Comment 2 christian 2024-10-25 11:59:17 UTC
Sorry, my rc.conf does have kld_list="nvidia-modeset". I also tried nvidia-drm and plain nvidia, and none of them works. The screen goes black and freezes.
(In reply to Tomoaki AOKI from comment #1)
Comment 3 christian 2024-10-25 12:11:19 UTC
I used these guides:
https://wiki.freebsd.org/Graphics
https://badland.io/prime-configuration.md

I tried all the options, but I don't think the problem is the driver. Is there a kernel tunable that controls memory allocation for the card?

Thank you for helping me.

(In reply to Tomoaki AOKI from comment #1)
Comment 4 christian 2024-10-25 12:15:51 UTC
I tried it in a terminal, but the command printed this error, so I don't know whether the option actually takes effect:
# sysctl hw.pci.allow_unsupported_io_range=1
sysctl: unknown oid 'hw.pci.allow_unsupported_io_range'
Comment 5 christian 2024-10-25 14:23:47 UTC
I compiled nvidia-driver 565.57.01, but it did not work.
nvidia0: <NVIDIA GeForce RTX 3050 6GB Laptop GPU> on vgapci0
vgapci0: child nvidia0 requested pci_enable_io
vgapci0: 0x1000000 bytes of rid 0x10 res 3 failed (0, 0xffffffffffffffff).
nvidia0: NVRM: NVIDIA MEM resource alloc failed, BAR0 @ 0x10.
nvidia0: NVRM: NVIDIA hardware alloc failed.
device_attach: nvidia0 attach returned 6
nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  565.57.01  Thu Oct 10

suggestions?
thanks!
Comment 6 christian 2024-10-25 15:41:22 UTC
my dmesg output
1st os.rwlock_sx @ nvidia_os.c:782
2nd os.rwlock_sx @ nvidia_os.c:782
Comment 7 Tomoaki AOKI 2024-10-25 16:09:41 UTC
(In reply to christian from comment #4)
That tunable was removed long ago [2].

[2] https://cgit.freebsd.org/src/commit/?id=e4b59fc50065e183c020898b461f49b7ba1483bd


author	Warner Losh <imp@FreeBSD.org>	2004-01-11 06:52:31 +0000
committer	Warner Losh <imp@FreeBSD.org>	2004-01-11 06:52:31 +0000
commit	e4b59fc50065e183c020898b461f49b7ba1483bd (patch)

Add support for subtractive decoding bridges. These bridges pass all
signals to addresses to the child busses.  Typically, ProgIf of 1
means a subtractive bridge.  However, Intel has a whole lot of ones
with a ProgIf of 80 that are also subtractive.  We cope with these
bridges too.  This eliminates hw.pci.allow_unsupported_io_range
because that had almost the same effect as these patches (almost means
'buggy').  Remove the bogus checks for ISA bus locations: these cycles
aren't special and are only passed by transparent bridges.

We allow any range to succeed.  If the range is a superset of the
range that's decoded, trim the resource to that range.  Otherwise,
pass the range unchanged.  This will change the location that PC Card
and CardBus cards are attached.  This might bogusly cause some
overlapping allocation that wasn't present before, but the overlapping
fixes need to be in the pci level.

There's also a few formatting changes here.

Notes:
    svn path=/head/; revision=124365
Comment 8 Tomoaki AOKI 2024-10-25 16:29:34 UTC
(In reply to christian from comment #5)
My rather wild guess was that the GSP firmware, which was introduced starting with the 560 series of drivers, could be managing the resources used by the GPU, and your GPU is new enough to have a GSP in it.
The GSP firmware is contained in x11/nvidia-driver and should be loaded automatically for supported GPUs. (My Pascal-generation GPU is too old to even test this, as it has no GSP.)

Other possibilities I can think of for now:
 * If possible, disable the iGPU (in the CPU) via UEFI firmware (or legacy BIOS if applicable).
 * Disable devices you don't actually need via UEFI firmware / legacy BIOS.
 * Change the VRAM size configured in UEFI firmware / legacy BIOS.

And if you're using graphics/nvidia-drm-[510|515|61]-kmod in conjunction with x11/nvidia-driver, the legacy way of overriding the driver version is insufficient.
Neither Austin Shafer (the developer of the nvidia DRM ports) nor I could determine why, but the graphics/nvidia-drm-[510|515|61]-kmod ports require x11/nvidia-driver/Makefile.version to be edited for the version you want (here, 565.57.01).
Without it, /dev/dri/* would not appear on boot and the DRM driver would not work.
Comment 9 christian 2024-10-25 17:58:51 UTC
(In reply to Tomoaki AOKI from comment #8)
Thanks, I did that.
Added to distinfo:
SHA256 (NVIDIA-FreeBSD-x86_64-565.57.01.tar.xz) = 808227e4250d019501834f536157e2a3271cc16d58862a73d20d315862c2425e
SIZE (NVIDIA-FreeBSD-x86_64-565.57.01.tar.xz) = 21610082

Makefile.version

# NVIDIA Distversion
#
# This will be included from x11/nvidia-driver and the nvidia-drm po
NVIDIA_DISTVERSION = 565.57.01

make install succeeds, but the driver still does not work.
nvidia0: <NVIDIA GeForce RTX 3050 6GB Laptop GPU> on vgapci0
vgapci0: child nvidia0 requested pci_enable_io
vgapci0: 0x1000000 bytes of rid 0x10 res 3 failed (0, 0xffffffffffffffff).
nvidia0: NVRM: NVIDIA MEM resource alloc failed, BAR0 @ 0x10.
nvidia0: NVRM: NVIDIA hardware alloc failed.
device_attach: nvidia0 attach returned 6
nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  565.57.01  Thu Oct 10
Comment 10 christian 2024-10-25 18:01:15 UTC
(In reply to Tomoaki AOKI from comment #8)
I will try that. Thanks.
Comment 11 christian 2024-10-25 18:04:22 UTC
My laptop is a Legion 5 with a PRIME setup and an RTX 3050 with 6GB of VRAM. The iGPU cannot be disabled.
Comment 12 christian 2024-10-25 18:42:25 UTC
(In reply to Tomoaki AOKI from comment #8)
Driver 470 (470.161.03_1) does not work either. :(

nvidia0: <Unknown> on vgapci0
vgapci0: child nvidia0 requested pci_enable_io
nvidia0: NVRM: NVIDIA MEM resource alloc failed, BAR0 @ 0x10.
nvidia0: NVRM: NVIDIA hardware alloc failed.
device_attach: nvidia0 attach returned 6
nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  470.161.03  Wed Oct 19 00:01:15 UTC 2022

With nvidia-drm-kmod:

acquiring duplicate lock of same type: "os.rwlock_sx"
 1st os.rwlock_sx @ nvidia_os.c:782
 2nd os.rwlock_sx @ nvidia_os.c:782
stack backtrace:
#0 0xffffffff80bc761c at witness_debugger+0x6c
#1 0xffffffff80b5c7cd at _sx_xlock+0x5d
#2 0xffffffff84bef012 at os_acquire_rwlock_write+0x32
#3 0xffffffff8490a5f0 at _nv044172rm+0x10
nvidia0: <NVIDIA GeForce RTX 3050 6GB Laptop GPU> on vgapci0
vgapci0: child nvidia0 requested pci_enable_io
vgapci0: 0x1000000 bytes of rid 0x10 res 3 failed (0, 0xffffffffffffffff).
nvidia0: NVRM: NVIDIA MEM resource alloc failed, BAR0 @ 0x10.
nvidia0: NVRM: NVIDIA hardware alloc failed.
device_attach: nvidia0 attach returned 6
nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  550.120  Fri Sep 13 09:32:47 UTC 2024

Sorry, I keep trying, but nothing works...
Comment 13 Austin Shafer 2024-10-26 04:46:11 UTC
I have not seen this before. The odd thing is that I do actually have a Legion 5 with FreeBSD on it, but have never seen an issue like this. The only difference from yours is that I have an AMD CPU, so maybe that's part of the reason.

If it happens on older drivers like 470 that should rule out GSP.

> vgapci0: 0x1000000 bytes of rid 0x10 res 3 failed (0, 0xffffffffffffffff).

This line seems to indicate that something in the kernel's vgapci driver is going wrong. I would guess something is up with the way it is accessing the PCI bus. I'm not sure what that would be, though. You might want to test with a Linux live USB to quickly verify that it's not a problem with the hardware.
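
The numbers in that line can be decoded by hand. A small sketch, assuming the usual FreeBSD conventions that rid 0x10 is the config-space offset of the first base address register (BARs sit at 0x10 + 4*n) and that resource type 3 is SYS_RES_MEMORY on amd64. The variable names are only illustrative:

```shell
# Values copied from the failing vgapci0 line in the dmesg output.
size_hex=0x1000000   # "0x1000000 bytes"
rid_hex=0x10         # "rid 0x10"

# Convert the requested size to MiB and the rid to a BAR index.
# Assumption: BAR n lives at config-space offset 0x10 + 4*n.
size_mib=$(( $size_hex / 1024 / 1024 ))
bar_index=$(( ($rid_hex - 0x10) / 4 ))

echo "BAR${bar_index}: ${size_mib} MiB memory window requested"
```

Read this way, the kernel was asked for a 16 MiB memory window for BAR0 at any address (0 to 0xffffffffffffffff) and could not provide one, which matches the NVRM message about BAR0 and the attach error 6 (ENXIO).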
Comment 14 christian 2024-10-28 13:26:54 UTC
Hello, and apologies for the delay in my response. I tested the GPU on both Linux and Windows, and it works as expected on both platforms. This confirms that the hardware is functioning correctly, so it seems the issue is specific to FreeBSD.

Thank you for your patience and support.
Comment 15 christian 2024-10-28 13:31:12 UTC
(In reply to Austin Shafer from comment #13)
In Brazil I could not find a model with an AMD CPU, only Intel. :(
FreeBSD 15 supports the integrated graphics card, and I am using it. I also tested 13 and 14, and both give the error.