Created attachment 188856: patch to install libnvidia-fatbinaryloader
Recent versions of nvidia-driver install a 32-bit libcuda.so, which can be used for CUDA. However, this library depends on libnvidia-fatbinaryloader.so, which is shipped but not installed. A simple patch is attached to resolve this issue.
Can you try the patch in bug 217901? It reworks the installation of the Linux libraries.
(In reply to Tijl Coosemans from comment #1) I've tried your patch, and Google Earth works fine. The CUDA test program deviceQueryDrv can find libcuda.so and libnvidia-fatbinaryloader.so, so your patch fixes this PR. However, cuInit() still returns 999, so other problems remain before we can use CUDA.
CUDA is unlikely to work without the nvidia-uvm kernel module from the Linux driver package.
(In reply to Henry Hu from comment #2)
Can you run the CUDA test program using ktrace? That should give us some more information about what it tries to do.

    ktrace -i -f /where/you/want/ktrace.out testprogram
    kdump -H -f /where/you/want/ktrace.out > /where/you/want/ktrace.txt

Then attach ktrace.txt to this bug (compressed with bzip2 or something if it's too big).
Created attachment 199156: ktrace log
It seems to be accessing /dev/nvidia-uvm.
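For anyone repeating this check, here is a minimal sketch of how the device nodes touched by a program could be pulled out of the kdump text file. It assumes that name lookups appear as NAMI records with the path in double quotes, as in typical kdump output. The function name list_dev_paths is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# List the distinct /dev paths that appear in NAMI (name lookup) records
# of a kdump text file.
# Assumption: records look like   ... NAMI  "/dev/nvidiactl"
# list_dev_paths is a hypothetical helper name, not an existing command.
list_dev_paths() {
    grep -oE 'NAMI[[:space:]]+"/dev/[^"]*"' "$1" |
        sed -E 's/^NAMI[[:space:]]+"(.*)"$/\1/' |
        LC_ALL=C sort -u
}
```

Usage would be something like `list_dev_paths /where/you/want/ktrace.txt`. A path that shows up here but does not exist on the system is a likely candidate for the failure.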
(In reply to Henry Hu from comment #5) Right, nvidia-uvm seems to be open source so it should be possible to port it (perhaps using linuxkpi in base and linuxkpi_gplv2 in graphics/drm-devel-kmod), but I don't have time for that right now.
Created attachment 199178: patch
I noticed that nvidia-uvm also has an unsupported mode that is trivial to port, so here's a new version of the patch for x11/nvidia-driver. It should now install a dummy nvidia-uvm kernel module that you can load with kldload nvidia-uvm. You will probably also need to adjust the permissions on /dev/nvidia-uvm. Please give it a try. If it doesn't work, create another ktrace.
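A rough sketch of the load-and-check step described above. The helper check_dev_access and its output strings are made up for illustration, and the 0666 mode in the comments is only an assumption; choose whatever mode or group fits your setup.

```shell
#!/bin/sh
# Report whether a device node exists and is readable and writable by the
# current user. check_dev_access is a hypothetical helper, not part of the port.
check_dev_access() {
    dev=$1
    if [ ! -e "$dev" ]; then
        echo "missing"      # module not loaded, or node not created
    elif [ -r "$dev" ] && [ -w "$dev" ]; then
        echo "ok"
    else
        echo "no-access"    # node exists, but permissions need adjusting
    fi
}

# Example, run as root (the mode is an assumption, not a recommendation):
#   kldload nvidia-uvm
#   chmod 0666 /dev/nvidia-uvm
#   check_dev_access /dev/nvidia-uvm
```

Run the check as the user who will start the CUDA program, since root passes the access tests regardless of the mode.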
I tried unsupported mode before (https://github.com/shkhln/nvshim/blob/master/src/libc/sys/ioctl.c#L15) and I think it is, well, actually unsupported. Too lazy to set up a proper Linux system for testing.
> I tried unsupported mode before (https://github.com/shkhln/nvshim/blob/master/src/libc/sys/ioctl.c#L15) and I think it is, well, actually unsupported.
Ok, turns out I'm just dumb. Disregard that.
Comment on attachment 199178: patch
I've uploaded a new patch to bug 217901 addressing issues with the ioctl handler.
The port of nvidia-uvm is incomplete. It doesn't handle ioctl calls from Linux programs yet.
Patch4 in bug 217901 contains an updated nvidia-uvm module (still unsupported mode). Please give it a try and provide another ktrace if it doesn't work.
There are quite a few stubs in nvidia.ko, some of them might be required for CUDA. For example, running matrixMul from CUDA SDK and glxgears with this dtrace script:

    #!/usr/sbin/dtrace -s

    nvidia:*:entry,
    nvidia-modeset:*:entry
    /execname == "matrixMul"/
    {
        @counts[probefunc] = min(1);
    }

    nvidia:*:entry,
    nvidia-modeset:*:entry
    /execname != "matrixMul"/
    {
        @counts[probefunc] = min(0);
    }

Gives me:

    os_lock_user_pages 1
(In reply to Alex S from comment #13) Does the CUDA program get further now that /dev/nvidia-uvm exists? Or do you see this without /dev/nvidia-uvm as well?
(In reply to Tijl Coosemans from comment #14)
Are you able to test 390.87 yourself?

> Or do you see this without /dev/nvidia-uvm as well?
Without /dev/nvidia-uvm I see CUDA trying to pass an error code (-2) into ioctl call: https://forums.freebsd.org/threads/linux-binary-compatibility-nvidia-drivers-and-cuda-for-blender.65065/#post-382015.

> Does the CUDA program get further now that /dev/nvidia-uvm exists?
Please note that I'm on 11.2-RELEASE and I'm not currently able to test your patches. Other that that, with "unsupported mode UVM" matrixMul sample prints some vague "all CUDA-capable devices are busy or unavailable" error message. Replacing "return NV_ERR_NOT_SUPPORTED" with "return NV_OK" (without proper implementation) in os_lock_user_pages and os_unlock_user_pages seems to trick it into actually reading some (garbage) data. Dtrace reports nv_register_user_pages and nv_unregister_user_pages being called.
(In reply to Alex S from comment #15) > that that * than that
(In reply to Alex S from comment #15) I cannot test 390, only 304. And I'm only interested in enabling linux64 in the nvidia-driver to make linux-c7 the default. The nvidia-driver is the last blocker for that. If I can get CUDA working at the same time that would be a nice bonus, but it's not a priority for me. I can take a look at os_lock_user_pages and friends, but no promises. If you modified your 11.2 kernel as in bug 206711 you should be able to test my patch for x11/nvidia-driver.
linux-nvidia-libs now installs libnvidia-fatbinaryloader.so, and linux-c7 is now the default. Can this be closed?
(In reply to Gleb Popov from comment #18) I think that it can be closed.
(In reply to Alex S from comment #13) For the record, upstream was kind enough to implement os_lock_user_pages (and related functions) in 495.29.05, so now we only have to deal with the UVM stuff.