When I use ports/emulators/qemu with ports/emulators/kqemu-kmod under an AMD64 SMP kernel, the host panics (trap 12, supervisor read, page not present). It works fine under i386 SMP or AMD64 UP. I have tried qemu/kqemu binaries from pkg_add and compiled from source, as well as the latest qemu-devel.

How-To-Repeat: Start qemu-system-x86_64 with kqemu (either with -kernel-kqemu or without -no-kqemu). The system will panic immediately.
I tested i386 SMP and AMD64 UP on the exact same system, not on different systems. I also tried AMD64 SMP with machdep.hlt_cpus=2 (to halt the second CPU and leave just the first running), and it still crashed.
Responsible Changed From-To: freebsd-ports-bugs->nox Over to maintainer
State Changed From-To: open->feedback

Hmm, a backtrace may be useful. (This may be a little tricky since kqemu is a kld; maybe you can use the scripts in src/tools/debugscripts or, failing that, the KDB_TRACE kernel option.)
I just ran another test with a debug kernel (GENERIC SMP plus KDB, KDB_TRACE, DDB, GDB) and got another kernel panic, trap 12. The instruction pointer was 0xffffffff804383f2.

nm -n /boot/debug/kernel | grep ffffffff804383 gives:
======================================================================
ffffffff80438300 T taskqueue_create_fast
ffffffff80438320 T taskqueue_enqueue_fast
ffffffff80438330 t taskqueue_fast_enqueue
ffffffff80438350 t taskqueue_fast_run
ffffffff80438370 t taskqueue_define_fast
ffffffff804383d0 T userret
======================================================================

I ran kgdb on the vmcore that was generated:
======================================================================
Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x202
fault code = supervisor read, page not present
instruction pointer = 0x8:0xffffffff804383f2
stack pointer = 0x10:0xffffffffb38f5ba0
frame pointer = 0x10:0xffffffffb38f5d10
code segment = base rx0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 818 (qemu-system-x86_64)
panic: from debugger
cpuid = 0
KDB: stack backtrace:
Uptime: 3m48s
Dumping 4078 MB (3 chunks)
  chunk 0: 1MB (155 pages) ... ok
  chunk 1: 3310MB (847280 pages) 3294 3278 [...] 30 14 ... ok
  chunk 2: 768MB (196608 pages) 753 737 [...] 17 1

#0 doadump () at pcpu.h:172
172 __asm __volatile("movq %%gs:0,%0" : "=r" (td));
(kgdb) getsyms
During symbol reading, Incomplete CFI data; unspecified registers at 0xffffffff8040e0dc.
Id Refs Address    Size   Name
 1    7 0x80100000 9bbec8 kernel
 2    4 0xb6624000 8472   netgraph.ko
 3    1 0xb662d000 12fd   ng_ether.ko
 4    1 0xb662f000 2da9   ng_pppoe.ko
 5    1 0xb6632000 1bad   ng_socket.ko
 6    1 0xb6634000 4a07   aio.ko
 7    1 0xb6639000 276da  kqemu.ko
Select the list above with the mouse, paste into the screen
and then press ^D. Yes, this is annoying.
 1    7 0x80100000 9bbec8 kernel
 2    4 0xb6624000 8472   netgraph.ko
 3    1 0xb662d000 12fd   ng_ether.ko
 4    1 0xb662f000 2da9   ng_pppoe.ko
 5    1 0xb6632000 1bad   ng_socket.ko
 6    1 0xb6634000 4a07   aio.ko
 7    1 0xb6639000 276da  kqemu.ko
add symbol table from file "/usr/obj/usr/src/sys/DEBUG/modules/usr/src/sys/modules/aio/aio.ko.debug" at
        .text_addr = 0xb6634000
        .data_addr = 0xb6634000
        .bss_addr = 0xb6634000
add symbol table from file "/usr/obj/usr/src/sys/DEBUG/modules/usr/src/sys/modules/netgraph/ether/ng_ether.ko.debug" at
        .text_addr = 0xb662d000
        .data_addr = 0xb662d000
        .bss_addr = 0xb662d000
add symbol table from file "/usr/obj/usr/src/sys/DEBUG/modules/usr/src/sys/modules/netgraph/netgraph/netgraph.ko.debug" at
        .text_addr = 0xb6624000
        .data_addr = 0xb6624000
        .bss_addr = 0xb6624000
add symbol table from file "/usr/obj/usr/src/sys/DEBUG/modules/usr/src/sys/modules/netgraph/pppoe/ng_pppoe.ko.debug" at
        .text_addr = 0xb662f000
        .data_addr = 0xb662f000
        .bss_addr = 0xb662f000
add symbol table from file "/usr/obj/usr/src/sys/DEBUG/modules/usr/src/sys/modules/netgraph/socket/ng_socket.ko.debug" at
        .text_addr = 0xb6632000
        .data_addr = 0xb6632000
        .bss_addr = 0xb6632000
(kgdb) where
#0 doadump () at pcpu.h:172
During symbol reading, Incomplete CFI data; unspecified registers at 0xffffffff8040e0dc.
#1 0xffffffff8040e735 in boot (howto=0x104) at /usr/src/sys/kern/kern_shutdown.c:409
#2 0xffffffff8040ee45 in panic (fmt=0xffffff00c3965000 "°6\211¿") at /usr/src/sys/kern/kern_shutdown.c:565
#3 0xffffffff801b0312 in db_panic (addr=0x0, have_addr=0x0, count=0x0, modif=0x0) at /usr/src/sys/ddb/db_command.c:438
#4 0xffffffff801b0855 in db_command_loop () at /usr/src/sys/ddb/db_command.c:350
#5 0xffffffff801b277d in db_trap (type=0xb38f5930, code=0x0) at /usr/src/sys/ddb/db_main.c:222
#6 0xffffffff8042e329 in kdb_trap (type=0xc, code=0x0, tf=0xffffffffb38f5af0) at /usr/src/sys/kern/subr_kdb.c:473
#7 0xffffffff80650975 in trap_fatal (frame=0xffffffffb38f5af0, eva=0xffffff00c3965000) at /usr/src/sys/amd64/amd64/trap.c:651
#8 0xffffffff80650d03 in trap_pfault (frame=0xffffffffb38f5af0, usermode=0x0) at /usr/src/sys/amd64/amd64/trap.c:573
#9 0xffffffff80650f5d in trap (frame=
      {tf_rdi = 0xffffff012f655720, tf_rsi = 0x4, tf_rdx = 0x46, tf_rcx = 0xffffffff8063c05b,
       tf_r8 = 0xffffffff8094e768, tf_r9 = 0xffffff012f655720, tf_rax = 0x2, tf_rbx = 0xf4240,
       tf_rbp = 0xffffffffb38f5d10, tf_r10 = 0xffffff012b39e108, tf_r11 = 0x2, tf_r12 = 0xffffff012f655720,
       tf_r13 = 0xffffffffb38f5bd0, tf_r14 = 0x0, tf_r15 = 0xffffffff801c4cd0, tf_trapno = 0x4,
       tf_addr = 0x2, tf_flags = 0xfffffffd, tf_err = 0x0, tf_rip = 0xffffffff804383f2, tf_cs = 0x8,
       tf_rflags = 0x10282, tf_rsp = 0xffffffffb38f5bb0, tf_ss = 0xffffffff806468c8})
    at /usr/src/sys/amd64/amd64/trap.c:352
#10 0xffffffff80640eca in lapic_handle_timer (frame=
      {cf_rdi = 0xffffffff8094e768, cf_rsi = 0xffffff012f655720, cf_rdx = 0x2, cf_rcx = 0xf4240,
       cf_r8 = 0xffffffffb38f5d10, cf_r9 = 0xffffff012b39e108, cf_rax = 0x2, cf_rbx = 0xffffff012f655720,
       cf_rbp = 0xffffffffb38f5bd0, cf_r10 = 0x0, cf_r11 = 0xffffffff801c4cd0, cf_r12 = 0x4,
       cf_r13 = 0x2, cf_r14 = 0xfffffffd, cf_r15 = 0x0, cf_rip = 0xffffffff806468c8, cf_cs = 0x8,
       cf_rflags = 0x202, cf_rsp = 0xffffffffb38f5bd0, cf_ss = 0x10})
    at /usr/src/sys/amd64/amd64/local_apic.c:657
#11 0xffffffff8063c05b in Xcpustop () at apic_vector.S:282
#12 0xffffffff806468c8 in mp_grab_cpu_hlt () at /usr/src/sys/amd64/amd64/mp_machdep.c:1226
#13 0x000000000000000c in __set_modmetadata_set_sym__mod_metadata_md_aio ()
Cannot access memory at address 0x8
(kgdb) quit
======================================================================
On Thu, Jun 07, 2007 at 07:10:14PM +0000, Allan Jude wrote:
>[...]
> got another kernel panic, trap 12, instruction pointer was:
> 0xffffffff804383f2

Hmm, can you do an `i li *0xffffffff804383f2' in kgdb?
Line 82 of "/usr/src/sys/kern/subr_trap.c" starts at address 0xffffffff804383f2 <userret+34> and ends at 0xffffffff804383f5 <userret+37>.

before:
Line 81 of "/usr/src/sys/kern/subr_trap.c" starts at address 0xffffffff804383d0 <userret> and ends at 0xffffffff804383f2 <userret+34>.

after:
Line 81 of "/usr/src/sys/kern/subr_trap.c" starts at address 0xffffffff804383f5 <userret+37> and ends at 0xffffffff804383f8 <userret+40>.
I recreated it again, and the 'stopped at' in the kernel panic is:

userret+0x22 movq 0(%rdi),%rbx
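For reference, the symbol+offset reported here can be cross-checked against the nm -n listing quoted earlier in this PR. The following is only a minimal sketch: the helper names parse_nm and resolve are made up for illustration, and it assumes the grep output shows every symbol near the faulting address (kgdb's userret+34 supports that).

```python
import bisect

# Symbol lines copied from the `nm -n /boot/debug/kernel` output in this PR.
NM_OUTPUT = """\
ffffffff80438300 T taskqueue_create_fast
ffffffff80438320 T taskqueue_enqueue_fast
ffffffff80438330 t taskqueue_fast_enqueue
ffffffff80438350 t taskqueue_fast_run
ffffffff80438370 t taskqueue_define_fast
ffffffff804383d0 T userret
"""


def parse_nm(text):
    """Return (address, name) pairs sorted by address (hypothetical helper)."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:  # address, type letter, name
            symbols.append((int(fields[0], 16), fields[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) for the last symbol at or below addr, else None."""
    starts = [start for start, _ in symbols]
    idx = bisect.bisect_right(starts, addr) - 1
    if idx < 0:
        return None
    start, name = symbols[idx]
    return name, addr - start


name, offset = resolve(parse_nm(NM_OUTPUT), 0xffffffff804383f2)
print(f"{name}+{offset:#x}")  # userret+0x22
```

0x22 is 34 decimal, which agrees with the <userret+34> that kgdb printed for the same address.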
On Fri, Jun 08, 2007 at 03:10:10PM +0000, Allan Jude wrote:
> I recreated it again, and the 'stopped at' in the kernel panic is:
>
> userret+0x22 movq 0(%rdi),%rbx

Ok, so apparently userret was called with a bogus td arg. Can you find out from where? (There should be a return address on the stack; userret here starts with a sub $0x28,%rsp (hmm, no frame pointer?), so add that or whatever yours subtracts.)

Btw,

> fault virtual address = 0x202
> fault code = supervisor read, page not present
>[...]
> #9 0xffffffff80650f5d in trap (frame=
> {tf_rdi = 0xffffff012f655720, tf_rsi = 0x4, tf_rdx = 0x46, tf_rcx
>[...]

shouldn't tf_rdi here be rdi at the time of the fault, i.e. 0x202? Anyone know why it's different?

Also, as mentioned above, userret doesn't save a frame pointer here (rbp), and indeed,

> 0xffffff012f655720, tf_rax = 0x2, tf_rbx = 0xf4240, tf_rbp =
> 0xffffffffb38f5d10, tf_r10 = 0xffffff012b39e108, tf_r11 = 0x2, tf_r12 =
>[...]
> tf_rflags = 0x10282, tf_rsp = 0xffffffffb38f5bb0, tf_ss =

tf_rbp seems to be way off compared to tf_rsp. Are parts of the kernel now compiled with -fomit-frame-pointer? (Even for a debug kernel?) This may explain why we don't see who called userret in the kgdb backtrace...
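The "add that to rsp" step can be written out as a small calculation. This is a rough sketch under stated assumptions, not a verified procedure: it takes the "stack pointer" value from the panic message, assumes the 0x28 frame size applies to the reporter's build (it may not), and assumes userret adjusted the stack only through that one sub before faulting. The function name return_address_slot is made up for illustration.

```python
MASK64 = (1 << 64) - 1


def return_address_slot(rsp_at_fault, frame_size):
    """Address that should hold the caller's return address, assuming the
    only stack adjustment so far was `sub $frame_size,%rsp`."""
    return (rsp_at_fault + frame_size) & MASK64


rsp = 0xffffffffb38f5ba0  # "stack pointer" from the panic message (assumed valid)
frame_size = 0x28         # from `sub $0x28,%rsp`; check your own disassembly
slot = return_address_slot(rsp, frame_size)

# Command to try in kgdb to read the 8-byte value stored in that slot.
print(f"x/gx {slot:#x}")  # x/gx 0xffffffffb38f5bc8
```

The value read from that slot could then be passed to `info symbol` in kgdb to see which function it falls in. If the build's userret pushes registers or subtracts a different amount, the offset has to be adjusted accordingly.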
Can you please check if this is still a problem with the current port? (It may have been caused by the kld not being compiled with SMP defined.)
State Changed From-To: feedback->closed Feedback timeout (> 6 months).
State Changed From-To: closed->suspended Assignee notes that the problem really still exists, but is very difficult to reproduce. Reopen.
nox 2008-05-01 13:29:16 UTC

  FreeBSD ports repository

  Modified files:
    emulators/kqemu-kmod Makefile
  Added files:
    emulators/kqemu-kmod/files patch-common-Makefile patch-tssworkaround
  Log:
  - Add a workaround for the amd64 SMP shared gdt issue that caused the
    host panics - longer explanation in this post:
    http://docs.freebsd.org/cgi/mid.cgi?20080501101951.GA30274 [1]
  - Get rid of superfluous "kqemu " in IGNORE message when kernel source
    is missing
  - Pass down DEBUG_FLAGS to the build
  - Bump PORTREVISION

  PR: ports/113430 [1]

  Revision  Changes  Path
  1.23      +4 -2    ports/emulators/kqemu-kmod/Makefile
  1.1       +22 -0   ports/emulators/kqemu-kmod/files/patch-common-Makefile (new)
  1.1       +70 -0   ports/emulators/kqemu-kmod/files/patch-tssworkaround (new)
_______________________________________________
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "cvs-all-unsubscribe@freebsd.org"
State Changed From-To: suspended->closed Workaround committed. Thanks!
nox 2008-05-12 19:09:52 UTC

  FreeBSD ports repository

  Modified files:
    emulators/kqemu-kmod Makefile
    emulators/kqemu-kmod/files patch-tssworkaround
  Log:
  - Fix multiple qemu processes on amd64 SMP by actually using separate
    per-cpu gdts (the previous fix was only stable for one qemu process
    at a time)
    Relevant thread:
    http://lists.freebsd.org/pipermail/freebsd-emulation/2008-May/004902.html
  - Bump PORTREVISION

  PR: ports/113430

  Revision  Changes  Path
  1.25      +1 -1    ports/emulators/kqemu-kmod/Makefile
  1.3       +49 -8   ports/emulators/kqemu-kmod/files/patch-tssworkaround