Bug 181144 - x11/nvidia-driver: CURRENT coredumps after starting X
Summary: x11/nvidia-driver: CURRENT coredumps after starting X
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: Alexey Dokuchaev
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-08-08 16:10 UTC by O. Hartmann
Modified: 2013-08-22 08:11 UTC
CC List: 0 users

Description O. Hartmann 2013-08-08 16:10:00 UTC
Installing or updating to the most recent x11/nvidia-driver port (both 319.32 and 325.15) brings the system down immediately when X is started.

CURRENT r253990 has no problems, even with the most recent nVidia GPU driver blob, 325.15. With the most recent CURRENT (r254090), the box core dumps immediately after booting, as soon as the system loads the kernel module nvidia.ko.

I have to load nvidia.ko via /etc/rc.conf.local, since loading the kernel module from /boot/loader.conf very often results in an immediate crash.

How-To-Repeat: Update to the most recent (r254090) sources of FreeBSD CURRENT, then load nvidia.ko and start X.
Comment 1 Edwin Groothuis freebsd_committer freebsd_triage 2013-08-08 16:10:08 UTC
Responsible Changed
From-To: freebsd-ports-bugs->danfe

Over to maintainer (via the GNATS Auto Assign Tool)
Comment 2 david 2013-08-09 20:56:05 UTC
My experience was a little different.

I managed to get head/i386 @r254135 built and booting ... by removing
the "options DEBUG_MEMGUARD" from my kernel.

However, that merely prevented a (very!) early panic, and got me to the
point where trying to start xdm with the x11/nvidia-driver as the
display driver causes an immediate reboot (no crash dump, despite
'dumpdev="AUTO"' in /etc/rc.conf).  No drop to debugger, either.

So I don't get a core dump.

Booting & starting xdm with the nv driver works.

However, the panic with DEBUG_MEMGUARD may offer a clue.  Unfortunately,
it's early enough that screen lock/scrolling doesn't work, and I only
had the patience to write down part of the panic information.  (This is
on my laptop; no serial console, AFAICT -- and no device to capture the
output if I did, since I'm not at home.)

The top line of the screen (at the panic) reads:

s/kern/subr_vmem.c:1050

The backtrace has the expected stuff near the top (about kbd, panic, and
memguard stuff); just below that is:

vmem_alloc(c1226100,6681000,2,c1820cc0,3b5,...) at 0xc0ac5673=vmem_alloc+0x53/frame 0xc1820ca0

Caveat: that was hand-transcribed from the screen to paper, then
hand-transcribed from paper to this email message.  And my highest grade
in "Penmanship" was a D+.

Be that as it may, here's the relevant section of subr_vmem.c with line
numbers (cut/pasted, so tabs get munged):

   1039 /*
   1040  * vmem_alloc: allocate resource from the arena.
   1041  */
   1042 int
   1043 vmem_alloc(vmem_t *vm, vmem_size_t size, int flags, vmem_addr_t *addrp)
   1044 {
   1045         const int strat __unused = flags & VMEM_FITMASK;
   1046         qcache_t *qc;
   1047 
   1048         flags &= VMEM_FLAGS;
   1049         MPASS(size > 0);
   1050         MPASS(strat == M_BESTFIT || strat == M_FIRSTFIT);
   1051         if ((flags & M_NOWAIT) == 0)
   1052                 WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL, "vmem_alloc");
   1053
   1054         if (size <= vm->vm_qcache_max) {
   1055                 qc = &vm->vm_qcache[(size - 1) >> vm->vm_quantum_shift];
   1056                 *addrp = (vmem_addr_t)uma_zalloc(qc->qc_cache, flags);
   1057                 if (*addrp == 0)
   1058                         return (ENOMEM);
   1059                 return (0);
   1060         }
   1061
   1062         return vmem_xalloc(vm, size, 0, 0, 0, VMEM_ADDR_MIN, VMEM_ADDR_MAX,
   1063             flags, addrp);
   1064 }


This is at r254025.

Peace,
david
-- 
David H. Wolfskill				david@catwhisker.org
Taliban: Evil men with guns afraid of truth from a 14-year old girl.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
Comment 3 danfe 2013-08-11 12:24:17 UTC
I've contacted Jeff about r254025, and he indeed confirmed that calling
kva_free(...) instead of kmem_free(kernel_arena, ...) is wrong.  There is
another patch submitted to address this: http://pastebin.com/RELMnQxY

As you can see, the difference is kmem_arena vs. kernel_arena as the first
parameter of kmem_ functions.  I will take a look to see which one is
actually correct.  Jeff said it should be kernel_arena, so I wonder where
the kmem_arena comes from?..
Comment 4 O. Hartmann 2013-08-11 13:14:09 UTC
On Sun, 11 Aug 2013 18:24:17 +0700
Alexey Dokuchaev <danfe@nsu.ru> wrote:

> I've contacted Jeff about r254025, and he indeed confirmed that
> calling kva_free(...) instead of kmem_free(kernel_arena, ...) is
> wrong.  There is another patch submitted to address this:
> http://pastebin.com/RELMnQxY
> 
> As you can see, the difference is kmem_arena vs. kernel_arena as the
> first parameter of kmem_ functions.  I will take a look to see which
> one is actually correct.  Jeff said it should be kernel_arena, so I
> wonder where the kmem_arena comes from?..


As I wrote in my posting/PR, I have taken the working solution without
questioning it - I'm no developer. I was asked to file a PR, so I
followed up the existing one.

Please feel free to correct the patch as you see fit. I'll try swapping
kmem_arena for kernel_arena and see whether it works.

Oliver

Comment 5 sean_bruno 2013-08-11 16:22:58 UTC
Confirmed.  The update to the Makefile does indeed restore nvidia-driver
after the arena updates to head.
Comment 6 O. Hartmann 2013-08-12 10:26:21 UTC
On Sun, 11 Aug 2013 18:24:17 +0700
Alexey Dokuchaev <danfe@nsu.ru> wrote:

> I've contacted Jeff about r254025, and he indeed confirmed that
> calling kva_free(...) instead of kmem_free(kernel_arena, ...) is
> wrong.  There is another patch submitted to address this:
> http://pastebin.com/RELMnQxY
> 
> As you can see, the difference is kmem_arena vs. kernel_arena as the
> first parameter of kmem_ functions.  I will take a look to see which
> one is actually correct.  Jeff said it should be kernel_arena, so I
> wonder where the kmem_arena comes from?..



The patch at pastebin looks much cleaner and more informative to me.
Comment 7 danfe 2013-08-12 10:39:08 UTC
On Mon, Aug 12, 2013 at 11:26:21AM +0200, O. Hartmann wrote:
> On Sun, 11 Aug 2013 18:24:17 +0700
> Alexey Dokuchaev <danfe@nsu.ru> wrote:
> > I've contacted Jeff about r254025, and he indeed confirmed that
> > calling kva_free(...) instead of kmem_free(kernel_arena, ...) is
> > wrong.  There is another patch submitted to address this:
> > http://pastebin.com/RELMnQxY
> > 
> > As you can see, the difference is kmem_arena vs. kernel_arena as the
> > first parameter of kmem_ functions.  I will take a look to see which
> > one is actually correct.  Jeff said it should be kernel_arena, so I
> > wonder where the kmem_arena comes from?..
> 
> The patch at pastebin looks much cleaner and more informative to me.

I've pinged jeff@ about it, and am currently awaiting an answer from him.
Comment 8 John Baldwin freebsd_committer freebsd_triage 2013-08-19 18:30:40 UTC
The pastebin patch is correct.  kernel_arena corresponds to the old kernel_map 
and is what should be used.  kmem_arena corresponds to the old kmem_map and is 
used for malloc(9).  Please commit the pastebin patch so we can get this 
package fixed.

-- 
John Baldwin
Comment 9 dfilter service freebsd_committer freebsd_triage 2013-08-20 04:22:03 UTC
Author: danfe
Date: Tue Aug 20 03:21:50 2013
New Revision: 325027
URL: http://svnweb.freebsd.org/changeset/ports/325027

Log:
  Fix NVidia drivers correctly after KVA space allocation API changes in
  recent -CURRENT (after r254025).  Previously it would immediately core
  dump upon loading of nvidia.ko.
  
  PR:		ports/181144 (fix suggested in the audit trail)
  Reviewed by:	jhb
  Timeout from:	jeff (no cookie)

Modified:
  head/x11/nvidia-driver/Makefile

Modified: head/x11/nvidia-driver/Makefile
==============================================================================
--- head/x11/nvidia-driver/Makefile	Tue Aug 20 02:01:23 2013	(r325026)
+++ head/x11/nvidia-driver/Makefile	Tue Aug 20 03:21:50 2013	(r325027)
@@ -149,6 +149,11 @@ post-patch: .SILENT
 	${REINPLACE_CMD} -E 's/(VM_OBJECT_)(UN)?(LOCK)/\1W\2\3/' \
 		${WRKSRC}/src/nvidia_subr.c
 .endif
+# Adjust kmem(9) calls after FreeBSD src SVN r254025
+.if ${OSVERSION} > 1000040
+	${REINPLACE_CMD} -e '/kmem_/s/kernel_map/kernel_arena/' \
+		${WRKSRC}/src/nvidia_subr.c
+.endif
 # Fix stack buffer overflow in nvidia_sysctl_bus_type()
 .if ${NVVERSION} < 3192300
 	${REINPLACE_CMD} -E '/bus_type\[4\]/d ; \
@@ -156,11 +161,6 @@ post-patch: .SILENT
 		/return SYSCTL_OUT\(req, bus_type/d' \
 			${WRKSRC}/src/nvidia_sysctl.c
 .endif
-# Catch up with KVA space allocation API changes in recent -CURRENT
-.if ${OSVERSION} > 1000040
-	${REINPLACE_CMD} -e 's/kmem_free(kernel_map,/kva_free(/ ; \
-		/kmem_alloc_contig/s/map/arena/' ${WRKSRC}/src/nvidia_subr.c
-.endif
 # Process OPTIONS
 .if ${PORT_OPTIONS:MFREEBSD_AGP}
 	${REINPLACE_CMD} -E 's/undef (NV_SUPPORT_OS_AGP)/define \1/' \
Comment 10 Alexey Dokuchaev freebsd_committer freebsd_triage 2013-08-22 08:11:19 UTC
State Changed
From-To: open->closed

Committed as r325027.
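
For anyone wanting to see what the two post-patch rules in the diff above do, here is a minimal sketch that runs both sed expressions on two made-up source lines.  The sample lines and the variable names (sample, old, new) are assumptions for illustration only; they are not copied from nvidia_subr.c, and plain sed is used here in place of ${REINPLACE_CMD}.

```shell
# Two hypothetical lines standing in for kmem_ calls in nvidia_subr.c
# (illustrative only; not the real driver source).
sample='kmem_free(kernel_map, va, size);
va = kmem_alloc_contig(kernel_map, size);'

# Rule removed by r325027: rewrites kmem_free(kernel_map, ...) into kva_free(...)
old=$(printf '%s\n' "$sample" |
    sed -e 's/kmem_free(kernel_map,/kva_free(/ ; /kmem_alloc_contig/s/map/arena/')

# Rule added by r325027: keeps the kmem_ calls and only swaps the first argument
new=$(printf '%s\n' "$sample" |
    sed -e '/kmem_/s/kernel_map/kernel_arena/')

printf 'old rule:\n%s\n\nnew rule:\n%s\n' "$old" "$new"
```

With the old rule the free call becomes kva_free(...), which Jeff confirmed is wrong; with the new rule both calls keep their names and take kernel_arena, matching what John Baldwin describes in comment 8.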