with compat.linux.osrelease=2.6.16 and linux_base-f8, almost every 3d linux application crashes when using the closed source nvidia driver. after switching to graphics/linux_dri, which replaces the nvidia linux version of libGL.so.1, the error disappears. it seems the 2.6 linuxulator is missing a vital syscall (or doesn't fully support it) that the nvidia version of libGL.so.1 requires. switching to compat.linux.osrelease=2.4.2 and replacing linux_base-f8 with linux_base-fc4 resolves the problem.

here are 2 excerpts from a linux_kdump:

dump from unreal tournament 2004 demo:
---
1180 ut2004-bin RET close 0
1180 ut2004-bin CALL linux_brk(0xae5c000)
1180 ut2004-bin RET linux_brk 182829056/0xae5c000
1180 ut2004-bin CALL linux_getpid
1180 ut2004-bin RET linux_getpid 1180/0x49c
1180 ut2004-bin CALL linux_getpid
1180 ut2004-bin RET linux_getpid 1180/0x49c
1180 ut2004-bin CALL linux_getpid
1180 ut2004-bin RET linux_getpid 1180/0x49c
1180 ut2004-bin CALL linux_sys_futex(0x2b406e30,0x81,0x7fffffff,0,0x49c,0x7)
1180 ut2004-bin RET linux_sys_futex 1
1180 ut2004-bin PSIG SIGSEGV caught handler=0x874bd50 mask=0x0 code=0x0
1180 ut2004-bin CALL linux_fstat64(0x1,0xbfbfa9e8,0x28fe8ff4)
1180 ut2004-bin UNKNOWN(8)
1180 ut2004-bin RET linux_fstat64 0
1180 ut2004-bin CALL linux_mmap2(0,0x1000,0x3,0x22,0xffffffff,0)
1180 ut2004-bin RET linux_mmap2 688971776/0x2910e000
1180 ut2004-bin CALL write(0x1,0x2910e000,0x25)
1180 ut2004-bin GIO fd 1 wrote 37 bytes "Signal: SIGSEGV [segmentation fault] "
1180 ut2004-bin RET write 37/0x25
1180 ut2004-bin CALL write(0x1,0x2910e000,0xa)
1180 ut2004-bin GIO fd 1 wrote 10 bytes "Aborting. "
1180 ut2004-bin RET write 10/0xa
1180 ut2004-bin CALL write(0x1,0x2910e000,0x1)
1180 ut2004-bin GIO fd 1 wrote 1 byte " "
1180 ut2004-bin RET write 1
1180 ut2004-bin CALL write(0x1,0x2910e000,0x1)
1180 ut2004-bin GIO fd 1 wrote 1 byte " "
1180 ut2004-bin RET write 1
1180 ut2004-bin CALL write(0x1,0x2910e000,0x31)
1180 ut2004-bin GIO fd 1 wrote 49 bytes "Crash information will be saved to your logfile. "
1180 ut2004-bin RET write 49/0x31
1180 ut2004-bin CALL linux_sys_futex(0x28feba34,0x81,0x7fffffff,0,0xbfbfab14,0xbfbfaaec)
1180 ut2004-bin RET linux_sys_futex 1
1180 ut2004-bin CALL linux_sys_futex(0x28e8eb48,0x81,0x7fffffff,0,0xbfbfaa30,0xbfbfa93c)
1180 ut2004-bin RET linux_sys_futex 1
1180 ut2004-bin CALL write(0x4,0x937c3c8,0xc)
---

dump from quake 4 demo:
---
1285 quake4.x86 RET close 0
1285 quake4.x86 CALL linux_getpid
1285 quake4.x86 RET linux_getpid 1285/0x505
1285 quake4.x86 CALL linux_getpid
1285 quake4.x86 RET linux_getpid 1285/0x505
1285 quake4.x86 CALL linux_getpid
1285 quake4.x86 RET linux_getpid 1285/0x505
1285 quake4.x86 CALL linux_sys_futex(0x2dbece30,0x81,0x7fffffff,0,0x505,0x7)
1285 quake4.x86 RET linux_sys_futex 1
1285 quake4.x86 PSIG SIGSEGV caught handler=0x8254b10 mask=0x0 code=0x0
1285 quake4.x86 CALL linux_sys_futex(0x286cd620,0x81,0x7fffffff,0,0x505,0xbfbfc51c)
1285 quake4.x86 RET linux_sys_futex 1
1285 quake4.x86 CALL write(0x1,0x283dd000,0x22)
1285 quake4.x86 GIO fd 1 wrote 34 bytes "signal caught: Segmentation fault "
1285 quake4.x86 RET write 34/0x22
1285 quake4.x86 CALL write(0x1,0x283dd000,0xa)
1285 quake4.x86 GIO fd 1 wrote 10 bytes "si_code 1 "
1285 quake4.x86 RET write 10/0xa
1285 quake4.x86 CALL write(0x1,0x283dd000,0x1c)
1285 quake4.x86 GIO fd 1 wrote 28 bytes "Trying to exit gracefully.. "
1285 quake4.x86 RET write 28/0x1c
1285 quake4.x86 CALL write(0x1,0x283dd000,0x2e)
1285 quake4.x86 GIO fd 1 wrote 46 bytes "--------------- BSE Shutdown ---------------- "
1285 quake4.x86 RET write 46/0x2e
1285 quake4.x86 CALL write(0x1,0x283dd000,0x2e)
1285 quake4.x86 GIO fd 1 wrote 46 bytes "--------------------------------------------- "
1285 quake4.x86 RET write 46/0x2e
1285 quake4.x86 CALL write(0x1,0x283dd000,0x35)
1285 quake4.x86 GIO fd 1 wrote 53 bytes "WARNING: rvServerScanGUI::Clear() - invalid scanGUI "
1285 quake4.x86 RET write 53/0x35
1285 quake4.x86 CALL munmap(0x2d0ee000,0x101000)
1285 quake4.x86 RET munmap 0
1285 quake4.x86 CALL munmap(0x2d1ef000,0x101000)
---

for a discussion concerning this problem please take a look at the following thread:
http://lists.freebsd.org/pipermail/freebsd-current/2009-March/004563.html

i'm not sure the linux_kdump excerpts document the actual problem. if a complete dump (~40MB) or a different excerpt is required, please drop me a note.

i've also applied the futex patch, yet that didn't solve the issue. here's a linux_kdump from the quake 4 demo after applying the patch:
---
1837 quake4.x86 CALL linux_sys_futex(0x2dbece30,0x81,0x7fffffff,0,0x72d,0x7)
1837 quake4.x86 RET linux_sys_futex 0
1837 quake4.x86 PSIG SIGSEGV caught handler=0x8254b10 mask=0x0 code=0x0
1837 quake4.x86 CALL linux_sys_futex(0x286ce620,0x81,0x7fffffff,0,0x72d,0xbfbfc4fc)
1837 quake4.x86 RET linux_sys_futex 0
1837 quake4.x86 CALL write(0x1,0x283dd000,0x22)
1837 quake4.x86 GIO fd 1 wrote 34 bytes "signal caught: Segmentation fault "
1837 quake4.x86 RET write 34/0x22
1837 quake4.x86 CALL write(0x1,0x283dd000,0xa)
1837 quake4.x86 GIO fd 1 wrote 10 bytes "si_code 1 "
1837 quake4.x86 RET write 10/0xa
1837 quake4.x86 CALL write(0x1,0x283dd000,0x1c)
1837 quake4.x86 GIO fd 1 wrote 28 bytes "Trying to exit gracefully.. "
---
cheers.
here's an entire linux_kdump from a little linux game called gridwars. it's a lot smaller than those produced by unreal tournament 2004 or quake 4 so it should be easier to find the problem. cheers.
Can you test with graphics/linux-f8-dri? WBR -- bsam
thanks, but that's not really my goal. installing linux-f8-dri overwrites the nvidia libraries. i'm able to run linux 3d apps after installing the linux-dri port, but i want to run games with the nvidia libraries, which are highly optimized for nvidia graphics cards. somebody needs to fix the linuxulator, because it is obviously buggy, at least when emulating the 2.6 linux kernel. cheers. alex
I have the same problem here.
It was working on 6-STABLE and 7-STABLE using linux_base-fc4 and compat.linux.osrelease: 2.4.2.
It never worked on 7-STABLE with linux_base-fc6/linux_base-f8 and compat.linux.osrelease: 2.6.16.
And the same setup is not working on 8-CURRENT either.

I've tried with linux-enemyterritory but I'm getting:
...loading libGL.so.1: Received signal 11, exiting...
Segmentation fault: 11

In dmesg I'm getting the following 2 lines:
pid 26151 (et.x86), uid 1001: exited on signal 11
linux_sys_futex: unknown op 800164673

As the OP said, it's working using libGL.so.1 from linux-f8-dri, but with very bad performance.

$ uname -a
FreeBSD satanasso.local.net 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Sun May 10 16:18:47 CEST 2009 root@satanasso.local.net:/usr/obj/usr/src/sys/SATANASSO i386

$ pkg_info -Ix linux nvidia
linux-enemyterritory-2.60b Wolfenstein: Enemy Territory (Linux version)
linux-f8-dri-7.0.2 Mesa libGL runtime libraries and DRI drivers (Linux Fedora
linux-f8-expat-2.0.1 Linux/i386 binary port of Expat XML-parsing library (Linux
linux-f8-fontconfig-2.4.2 An XML-based font configuration API for X Windows (Linux Fe
linux-f8-xorg-libs-7.3_2 Xorg libraries (Linux Fedora 8)
linux_base-f8-8_11 Base set of packages needed in Linux mode (for i386/amd64)
nvidia-driver-180.44 NVidia graphics card binary drivers for hardware OpenGL ren

$ sysctl -a compat
compat.linux.oss_version: 198144
compat.linux.osrelease: 2.6.16
compat.linux.osname: Linux
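The working and failing combinations reported so far pair each emulated kernel version with a linux_base port. A minimal sketch of that pairing follows; `base_for_osrelease` and `show_switch` are hypothetical helper names, and the script only prints the sysctl command instead of running it.

```shell
#!/bin/sh
# Sketch only: pairs each compat.linux.osrelease value with the linux_base
# port it was used with in this thread. Helper names are hypothetical.
base_for_osrelease() {
    case "$1" in
        2.4.2)  echo "linux_base-fc4" ;;
        2.6.16) echo "linux_base-f8" ;;   # linux_base-fc6 was also tried with 2.6.16
        *)      echo "unknown"; return 1 ;;
    esac
}

# Print (do not execute) the command that selects the emulation mode.
show_switch() {
    base=$(base_for_osrelease "$1") || return 1
    echo "sysctl compat.linux.osrelease=$1   # pair with $base"
}

show_switch 2.4.2
show_switch 2.6.16
```

Running the printed sysctl command as root changes the emulated kernel version; the matching linux_base port still has to be installed separately.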
As asked by Chagin Dmitry...

> hmm, please, make a trace by ktrace or truss.

You can find the full dump here:
http://filebin.ca/owgdhn/l_kdmp.bz2

And these are some lines from the end:
49332 et.x86 CALL linux_getpid
49332 et.x86 RET linux_getpid 49332/0xc0b4
49332 et.x86 CALL linux_modify_ldt(0x11,0xbfbfdaf4,0x10)
49332 et.x86 RET linux_modify_ldt 666/0x29a
49332 et.x86 PSIG SIGSEGV caught handler=0x808c720 mask=0x0 code=0x0
49332 et.x86 CALL linux_fstat64(0x1,0xbfbfd13c,0x2847aff4)
49332 et.x86 UNKNOWN(8)
49332 et.x86 RET linux_fstat64 0
49332 et.x86 CALL linux_mmap2(0,0x1000,0x3,0x22,0xffffffff,0)
49332 et.x86 RET linux_mmap2 760414208/0x2d530000
49332 et.x86 CALL write(0x1,0x2d530000,0x1f)
49332 et.x86 GIO fd 1 wrote 31 bytes "Received signal 11, exiting... "
49332 et.x86 RET write 31/0x1f
49332 et.x86 CALL linux_sys_futex(0x2847c0b0,0x2fb18b41,0x1,0x2847b4c0,0xd,0xbfbfd81c)
49332 et.x86 RET linux_sys_futex -1 errno 38 Socket operation on non-socket
49332 et.x86 PSIG SIGSEGV SIG_DFL
49332 et.x86 NAMI "et.x86.core"

Hope it will help. Please ask if you need more info.
Thanks
Barbara
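The second argument of linux_sys_futex is the futex op, so the traces can be read a little further. The sketch below is only a reading aid, not part of any patch: it prints an op in hex and splits off the private flag, using the FUTEX_WAKE and FUTEX_PRIVATE_FLAG values from Linux's `<linux/futex.h>`. `decode_futex_op` is a hypothetical helper name.

```shell
#!/bin/sh
# Values as defined in Linux's <linux/futex.h>.
FUTEX_WAKE=1
FUTEX_PRIVATE_FLAG=128

# Print a futex op as "<hex> cmd=<n> private=<0|1>".
decode_futex_op() {
    op=$(( $1 ))
    cmd=$(( op & ~FUTEX_PRIVATE_FLAG ))            # op with the private flag cleared
    priv=$(( (op & FUTEX_PRIVATE_FLAG) != 0 ))     # 1 if the private flag is set
    printf '0x%x cmd=%d private=%d\n' "$op" "$cmd" "$priv"
}

decode_futex_op 0x81        # prints: 0x81 cmd=1 private=1
decode_futex_op 800164673   # prints: 0x2fb18b41 cmd=800164673 private=0
```

Read this way, 0x81 in the earlier traces is FUTEX_WAKE with the private flag set, and the "unknown op 800164673" from dmesg is the same value as the 0x2fb18b41 passed in the last linux_sys_futex call above, which is not a valid futex command.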
this problem report can be closed! all the linux 3d applications crashed because the wrong linux library was shipped with the nvidia freebsd driver. the fix will be in one of the next driver releases. for a quick fix do the following:

1. go to ftp://download.nvidia.com/XFree86/Linux-x86/ and enter the directory named after the release of the nvidia driver you are currently using (`sysctl hw.nvidia.version`).
2. download the file NVIDIA-Linux-x86-XXX-pkg0.run
3. sh NVIDIA-Linux-x86-XXX-pkg0.run -x
4. cp -pR NVIDIA-Linux-x86-XXX-pkg0/usr/lib/tls/libnvidia-tls.so.XXX \
   /compat/linux/usr/lib

(XXX being the release you're running.)

this should fix the issue and let you run linux 3d apps with compat.linux.osrelease set to 2.6.16 and a linux_base port newer than fc4.

for more information have a look at this thread:
http://www.nvnews.net/vbulletin/showthread.php?t=129584

cheers.
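Steps 3 and 4 above can be sketched as a small dry-run helper that fills in the release number. `print_tls_fix` is a hypothetical name; it only echoes the commands and changes nothing, and the version used in the example call is simply the nvidia-driver version listed earlier in this thread.

```shell
#!/bin/sh
# Dry-run sketch: print the extract and copy commands for one driver release.
# print_tls_fix is a hypothetical helper; nothing is downloaded or copied.
print_tls_fix() {
    ver="$1"
    pkg="NVIDIA-Linux-x86-${ver}-pkg0"
    echo "sh ${pkg}.run -x"
    echo "cp -pR ${pkg}/usr/lib/tls/libnvidia-tls.so.${ver} /compat/linux/usr/lib"
}

# 180.44 is the driver version shown in the pkg_info output earlier.
print_tls_fix 180.44
```

Review the printed commands before running them by hand; the release passed in has to match the installed driver, which `sysctl hw.nvidia.version` reports.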
i talked to zander, who is responsible for the freebsd nvidia driver, and he said the following about this PR: "the two libnvidia-tls libraries support different TLS models: the one currently shipped with the NVIDIA FreeBSD graphics driver supports the old-style TLS model, the tls/ one the new ELF TLS model. The crashes you were seeing were not due to a problem with the Linux emulation layer. Future NVIDIA FreeBSD graphics driver releases will automatically determine which library to install." so even if the modify_ldt() linux syscall isn't implemented properly, this PR is not related to it. oh...btw: there have been some changes to modify_ldt() in HEAD. i think the linux test project's tests for that syscall now pass. cheers.
Responsible Changed From-To: freebsd-bugs->freebsd-emulation Apparently this is actually a problem in our linuxulator, involving the threading model used. Submitter will provide more details shortly.
it took some time to fully identify the cause of the problems reported in this PR. please disregard all previous comments trying to describe the problem! they merely dealt with symptoms and not the actual cause! they're superseded by this comment!

1. although the problem report deals with a segfault related to a linux lib supplied with the nvidia closed source freebsd driver, the problem isn't limited to this specific linux lib.

2. the problem should occur with any linux binary/lib which was built under/for a linux version which uses one of the old linux threading models. this comment from http://wiki.freebsd.org/linux-kernel provides a short description of the problem:

"Linux has gone through two threading model changes. If a Linux application or library has been linked against the old pthreads without fast TLS support or pthreads with internal TLS support libraries it will segfault."

a detailed description of the threading situation under linux as well as under freebsd can be found in this thread:
http://lists.freebsd.org/pipermail/freebsd-threads/2003-June/000530.html

3. the nvidia closed source drivers no longer suffer from the problem described in this PR. during installation of the driver an application is run which detects the linux kernel version and decides whether libnvidia-tls.so (old threading model) or libnvidia-tls.so (new threading model) needs to be installed. the old threading model is used on linux kernel < 2.6, the new one on >= 2.6. the symptoms described in this PR were caused by this libnvidia-tls.so the whole time and NOT by libGL.so (it's merely linked against libnvidia-tls.so). the following short statement by zander@nvidia.com is added as a reference:

"the two libnvidia-tls libraries support different TLS models: the one currently shipped with the NVIDIA FreeBSD graphics driver supports the old-style TLS model, the tls/ one the new ELF TLS model. The crashes you were seeing were not due to a problem with the Linux emulation layer. Future NVIDIA FreeBSD graphics driver releases will automatically determine which library to install."

4. right now the only way to run linux bins/libs which were built against a linux kernel with an old threading model is to alter compat.linux.osrelease and revert to 2.4 linux emulation mode.

5. what needs to be done to solve this PR is to determine the threading model of a bin/lib and
a) figure out a way to execute it under 2.6 linux emulation or
b) issue a warning and abort execution.

right now this PR should be considered a 2.6.16 emulation stopper: it makes it impossible to remove the 2.4.2 emulation legacy code, since that would prevent certain bins/libs from running at all.

alex
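Point 5 asks for a way to determine the threading model of a bin/lib. One possible heuristic, which is an assumption of this sketch and not something proposed in the PR, is to look for a PT_TLS program header: objects that define ELF TLS variables carry one, and `readelf -l` lists it as a segment of type TLS. The absence of PT_TLS does not prove that an object uses an old threading model, so this can only serve as a first hint. `has_pt_tls` and `check_elf_tls` are hypothetical helper names.

```shell
#!/bin/sh
# Heuristic sketch: succeed if the readelf program-header listing on stdin
# contains a segment whose type column is "TLS" (a PT_TLS header).
has_pt_tls() {
    awk '$1 == "TLS" { found = 1 } END { exit !found }'
}

# Hypothetical wrapper: inspect one ELF file and report what was found.
check_elf_tls() {
    if readelf -lW "$1" 2>/dev/null | has_pt_tls; then
        echo "$1: PT_TLS present (defines ELF TLS variables)"
    else
        echo "$1: no PT_TLS found"
    fi
}
```

A call such as `check_elf_tls /compat/linux/usr/lib/libnvidia-tls.so.XXX` (XXX being the driver release) would then give a quick indication for the library discussed in this PR.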
Ok, from what I've understood it should be a linuxulator problem.
Certainly it's because of my ignorance, but I'm a little confused, because from what I've tested in the past (after the post by zander on the nvidia forum) and also from what I've got from your words (@ 2,3), the tls version should work, am I wrong?

The problem I'm facing is that now it's not working, so I made some tests (*) with wolfsp (games/rtcw) and the latest version of different major versions of the nvidia driver:

180.60 -> it doesn't work, it works after replacing libnvidia-tls.so (note that it's the same major version for which zander suggested the fix)
185.18.36 -> it works, no workaround required (nvidia fixed it in new versions?)
190.53 -> it doesn't work - even replacing libnvidia-tls.so
195.22 (ports)(**) -> it doesn't work - same as above (***)

So I'm wondering why it stopped working between 185 and 190? Shouldn't it be working with the tls version?
Is it an nvidia fault that should be reported, or are versions >185 exposing new "bugs" in the linuxulator, or is it because of changes in the linuxulator having a bad impact on >185, ...?

Sorry but my English is not good, so I hope I don't get misunderstood.
If you need more tests, kdump, or anything else, I will be happy to help.
Sorry again and thank you for the patience...
Barbara

(*)
# uname -a
FreeBSD satanasso.local.net 8.0-STABLE FreeBSD 8.0-STABLE #0: Fri Jan 1 18:47:59 CET 2010 root@satanasso.local.net:/usr/obj/usr/src/sys/SATANASSO i386
# sysctl compat.linux.osrelease
compat.linux.osrelease: 2.6.16
# pkg_info -Ix linux_base
linux_base-f10-10_2

(**)
wolfsp doesn't work *anymore* on RELENG_7 either, with linux_base-fc-4_15 and compat.linux.osrelease: 2.4.2. In July it was working.
Anyway, just to add more confusion, linux-enemyterritory is working!!!(?) (not tested on RELENG_8).

(***)
...loading libGL.so.1: QGL_Init: Can't load libGL.so.1 from /etc/ld.so.conf or current dir: /usr/local/share/rtcw/libGL.so.1: cannot open shared object file: No such file or directory
i remember having a similar problem a while ago. it seems some games from id software use a hardcoded libGL.so path. please check whether the attached script solves the problem. cheers. alex p.s.: please keep in mind that the nvidia driver performs some checks in places like /compat/linux/usr/{local|X11R6} and removes any graphics libs it finds in those locations. that way nvidia wants to make sure that no existing graphics libs conflict with their libs. this means you have to re-run the script every time you re-install the nvidia driver.
> i remember having a similar problem a while ago. it seems some games from id
> software use a hardcoded libGL.so path. please try if the attached script
> solves the problem.
>
> cheers.
> alex

Yes, I know that perfectly: http://www.freebsd.org/cgi/query-pr.cgi?pr=118230. As you can see, I was the one who reported it. Thank you anyway.
The recent answer to that PR, now more than 2 years old, was the reason to do some tests and to report the failures here.
But that wasn't the problem; in fact rtcw and linux-enemyterritory never required that fix.

As wolfsp is working with 185.18.36 and not with 190.53, I was able to start it (on both RELENG_7 and RELENG_8) with nvidia-driver-195.22 from ports, setting the generated extension string to a pre-190 version:
$ __GL_ExtensionStringVersion=18999 wolfsp
Sorry for all the noise about that.
Maybe this should be added to the rtcw pkg-message.in; I will ask the maintainer.

Anyway, doing some more tests, it seems that linux-doom3 and linux-quake4, both working in the past, are now failing on RELENG_7. But I want to check again to make sure that the ports are still installed correctly.
Then I tried installing linux-doom3 on RELENG_8 and surprisingly it works perfectly! I'll try with linux-quake4 as soon as I can.
If someone needs it, I have ktrace/linux_kdump output collected on RELENG_7 that I can upload to the web.

Thanks
Barbara
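The environment-variable workaround above can be wrapped so it does not have to be typed on every start. This is a minimal sketch; `run_with_old_extensions` is a hypothetical function name, and 18999 is the value reported in the message above.

```shell
#!/bin/sh
# Sketch: run any command with the NVIDIA extension string limited to a
# pre-190 version. The function name is hypothetical.
run_with_old_extensions() {
    env __GL_ExtensionStringVersion=18999 "$@"
}

# Demonstration: show that the variable reaches the child process.
run_with_old_extensions sh -c 'echo "$__GL_ExtensionStringVersion"'   # prints: 18999
```

With this in place the game would be started as `run_with_old_extensions wolfsp`; the variable is set only for that one process and leaves the rest of the session untouched.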
For anyone still interested, linux-quake4 works on RELENG_8. It just needs some "updated" workarounds.
On RELENG_7, both linux-doom3 and linux-quake4 are working, but they need some "new and updated" workarounds too.
For details, look in my PR, ports/118230.

Best Regards
Barbara
State Changed From-To: open->suspended Suspend this PR for now. This can't be easily fixed. The Linux 2.6.x emulation layer is missing support for pre 2.6.x TLS models.
Linuxulator 2.6 by now works fine with recent versions of x11/nvidia-driver (with LINUX option enabled.)