I'm using ports/net/mpd4 to set up a netgraph ng0 device for my PPPoE DSL connection. The firewall is PF with ALTQ enabled.

When I kill the mpd4 process and then restart pf using 'sh /etc/rc.d/pf restart', I get a 'fatal trap 12'. This doesn't happen if ALTQ is DISABLED in the KERNEL config!

Restarting pf without killing/restarting mpd4 is no problem. A line disconnect/reconnect (getting a new IP) with a pf restart call from the linkup script is no problem either.

- This is reproducible nearly every time.
- The same happens if mpd4 is started again before the restart of pf.
- It seems to happen after 'pfctl -f /etc/pf.conf' is called in the 'enable' or 'reload' part of the script.
- It doesn't matter whether 'reload' or 'restart' is used with /etc/rc.d/pf.
- Kernel compile-time optimization options don't matter.
- The version of mpd doesn't matter; this has happened since the old mpd3 ports.

This problem has existed since I created this setup in the FreeBSD 5.X days, when ALTQ was integrated into the tree. Since then I've used 3 different mainboard/CPU/RAM/NIC/DSL-modem combinations. This bug is really hardware independent!

With the help of the good FreeBSD debugging docs I've managed to get a kernel dump and some lines out of the debugger. Tell me if you need more.

Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x104
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc05d77dc
stack pointer           = 0x28:0xd6c72944
frame pointer           = 0x28:0xd6c72958
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 1924 (pfctl)
trap number             = 12
panic: page fault
cpuid = 0
GEOM_MIRROR: Device gm0: rebuilding provider ad0 stopped.
Uptime: 5m55s
Dumping 511 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 511MB (130816 pages) 496 480 464 448 432 416 400 384 368 352 336 320 304 288 272 256 240 224 208 192 176 160 144 128 112 96 80 64 48 32 16

#0  doadump () at pcpu.h:165
165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) list *0xc05d77dc
0xc05d77dc is in _mtx_lock_sleep (../../../kern/kern_mutex.c:548).
543              * If the current owner of the lock is executing on another
544              * CPU, spin instead of blocking.
545              */
546             owner = (struct thread *)(v & MTX_FLAGMASK);
547     #ifdef ADAPTIVE_GIANT
548             if (TD_IS_RUNNING(owner)) {
549     #else
550             if (m != &Giant && TD_IS_RUNNING(owner)) {
551     #endif
552                     turnstile_release(&m->mtx_object);
(kgdb) backtrace
#0  doadump () at pcpu.h:165
#1  0xc05e3ac9 in boot (howto=260) at ../../../kern/kern_shutdown.c:409
#2  0xc05e3e96 in panic (fmt=0xc08006a1 "%s") at ../../../kern/kern_shutdown.c:565
#3  0xc07ce8cc in trap_fatal (frame=0xd6c72904, eva=0) at ../../../i386/i386/trap.c:837
#4  0xc07cdf84 in trap (frame=
      {tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi = -1004787840, tf_esi = 4, tf_ebp = -691590824, tf_isp = -691590864, tf_ebx = -1014618868, tf_edx = 6, tf_ecx = -1004787840, tf_eax = 1, tf_trapno = 12, tf_err = 0, tf_eip = -1067616292, tf_cs = 32, tf_eflags = 65538, tf_esp = -1014618868, tf_ss = 0})
    at ../../../i386/i386/trap.c:270
#5  0xc07b791a in calltrap () at ../../../i386/i386/exception.s:139
#6  0xc05d77dc in _mtx_lock_sleep (m=0xc386250c, tid=3290179456, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:546
#7  0xc0469619 in priq_class_destroy (cl=0xc3d156c0) at ../../../contrib/altq/altq/altq_priq.c:416
#8  0xc04691d6 in priq_clear_interface (pif=0xc3e9d100) at ../../../contrib/altq/altq/altq_priq.c:252
#9  0xc0469025 in priq_remove_altq (a=0x6) at ../../../contrib/altq/altq/altq_priq.c:161
#10 0xc046d2af in altq_remove (a=0x6) at ../../../contrib/altq/altq/altq_subr.c:647
#11 0xc04a7d77 in pf_commit_altq (ticket=1) at ../../../contrib/pf/net/pf_ioctl.c:1122
#12 0xc04abfd4 in pfioctl (dev=0xc392ad00, cmd=4, addr=0x3 <Address 0x3 out of bounds>, flags=3, td=0x1) at ../../../contrib/pf/net/pf_ioctl.c:3055
#13 0xc0578559 in devfs_ioctl_f (fp=0xc42200d8, com=3222029394, data=0xc41df180, cred=0xc41a9b80, td=0xc41c2780) at ../../../fs/devfs/devfs_vnops.c:479
#14 0xc060f0fb in ioctl (td=0xc41c2780, uap=0xd6c72d04) at file.h:264
#15 0xc07cecd3 in syscall (frame=
      {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = -1077945268, tf_esi = 0, tf_ebp = -1077945256, tf_isp = -691589788, tf_ebx = -1077942704, tf_edx = 134737920, tf_ecx = 0, tf_eax = 54, tf_trapno = 0, tf_err = 2, tf_eip = 672815751, tf_cs = 51, tf_eflags = 582, tf_esp = -1077945300, tf_ss = 59})
    at ../../../i386/i386/trap.c:983
#16 0xc07b796f in Xint0x80_syscall () at ../../../i386/i386/exception.s:200
#17 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) up 7
#7  0xc0469619 in priq_class_destroy (cl=0xc3d156c0) at ../../../contrib/altq/altq/altq_priq.c:416
416             IFQ_LOCK(cl->cl_pif->pif_ifq);
(kgdb) list
411     #ifdef __NetBSD__
412             s = splnet();
413     #else
414             s = splimp();
415     #endif
416             IFQ_LOCK(cl->cl_pif->pif_ifq);
417
418     #ifdef ALTQ3_CLFIER_COMPAT
419             /* delete filters referencing to this class */
420             acc_discard_filters(&cl->cl_pif->pif_classifier, cl, 0);
(kgdb) up
#8  0xc04691d6 in priq_clear_interface (pif=0xc3e9d100) at ../../../contrib/altq/altq/altq_priq.c:252
252                     priq_class_destroy(cl);
(kgdb) up
#9  0xc0469025 in priq_remove_altq (a=0x6) at ../../../contrib/altq/altq/altq_priq.c:161
161             (void)priq_clear_interface(pif);
(kgdb) up
#10 0xc046d2af in altq_remove (a=0x6) at ../../../contrib/altq/altq/altq_subr.c:647
647                     error = priq_remove_altq(a);
(kgdb) up
#11 0xc04a7d77 in pf_commit_altq (ticket=1) at ../../../contrib/pf/net/pf_ioctl.c:1122
1122                            err = altq_remove(altq);
(kgdb) up
#12 0xc04abfd4 in pfioctl (dev=0xc392ad00, cmd=4, addr=0x3 <Address 0x3 out of bounds>, flags=3, td=0x1) at ../../../contrib/pf/net/pf_ioctl.c:3055
3055                            if ((error = pf_commit_altq(ioe.ticket)))
(kgdb) up
#13 0xc0578559 in devfs_ioctl_f (fp=0xc42200d8, com=3222029394, data=0xc41df180, cred=0xc41a9b80, td=0xc41c2780) at ../../../fs/devfs/devfs_vnops.c:479
479             error = dsw->d_ioctl(dev, com, data, fp->f_flag, td);
(kgdb) quit

**** kernel config ****

machine         i386
cpu             I686_CPU
ident           XXX

makeoptions     DEBUG=-g  # Build kernel with gdb(1) debug symbols

options         SCHED_4BSD  # 4BSD scheduler
options         PREEMPTION  # Enable kernel thread preemption
options         INET  # InterNETworking
options         INET6  # IPv6 communications protocols
options         FFS  # Berkeley Fast Filesystem
options         SOFTUPDATES  # Enable FFS soft updates support
options         UFS_ACL  # Support for access control lists
options         UFS_DIRHASH  # Improve performance on big directories
options         MD_ROOT  # MD is a potential root device
options         NFSCLIENT  # Network Filesystem Client
options         NFSSERVER  # Network Filesystem Server
options         NFS_ROOT  # NFS usable as /, requires NFSCLIENT
options         MSDOSFS  # MSDOS Filesystem
options         CD9660  # ISO 9660 Filesystem
options         PROCFS  # Process filesystem (requires PSEUDOFS)
options         PSEUDOFS  # Pseudo-filesystem framework
options         GEOM_GPT  # GUID Partition Tables.
options         COMPAT_43  # Compatible with BSD 4.3 [KEEP THIS!]
options         COMPAT_FREEBSD4  # Compatible with FreeBSD4
options         COMPAT_FREEBSD5  # Compatible with FreeBSD5
options         SCSI_DELAY=5000  # Delay (in ms) before probing SCSI
options         KTRACE  # ktrace(1) support
options         SYSVSHM  # SYSV-style shared memory
options         SYSVMSG  # SYSV-style message queues
options         SYSVSEM  # SYSV-style semaphores
options         _KPOSIX_PRIORITY_SCHEDULING  # POSIX P1003_1B real-time extensions
options         KBD_INSTALL_CDEV  # install a CDEV entry in /dev
options         AHC_REG_PRETTY_PRINT  # Print register bitfields in debug
                                      # output. Adds ~128k to driver.
options         AHD_REG_PRETTY_PRINT  # Print register bitfields in debug
                                      # output. Adds ~215k to driver.
options         ADAPTIVE_GIANT  # Giant mutex is adaptive.
options         DEVICE_POLLING

options         NETGRAPH
options         NETGRAPH_ETHER
options         NETGRAPH_SOCKET
options         NETGRAPH_PPP
options         NETGRAPH_PPPOE
options         NETGRAPH_IFACE
options         NETGRAPH_BPF
options         NETGRAPH_TCPMSS
options         NETGRAPH_VJC

options         GEOM_MIRROR
options         COMPAT_LINUX
options         LINPROCFS
device          cpufreq

device          pf
device          pflog
device          pfsync

options         ALTQ
options         ALTQ_CBQ
options         ALTQ_RED
options         ALTQ_RIO
options         ALTQ_HFSC
options         ALTQ_CDNR
options         ALTQ_PRIQ

device          drm
options         MSGBUF_SIZE=40960
options         VESA
options         VGA_WIDTH90
options         SC_PIXEL_MODE
options         SC_HISTORY_SIZE=1000  # number of history buffer lines
options         PANIC_REBOOT_WAIT_TIME=10000

# To make an SMP kernel, the next two lines are needed
options         SMP  # Symmetric MultiProcessor Kernel
device          apic  # I/O APIC

device          acpi

# Bus support. Do not remove isa, even if you have no isa slots
device          isa
device          pci

# Floppy drives
device          fdc

# ATA and ATAPI devices
device          ata
device          atadisk  # ATA disk drives
device          ataraid  # ATA RAID drives
device          atapicd  # ATAPI CDROM drives
device          atapifd  # ATAPI floppy drives
device          atapist  # ATAPI tape drives
options         ATA_STATIC_ID  # Static device numbering

# SCSI peripherals
device          scbus  # SCSI bus (required for SCSI)
device          ch  # SCSI media changers
device          da  # Direct Access (disks)
device          sa  # Sequential Access (tape etc)
device          cd  # CD
device          pass  # Passthrough device (direct SCSI access)
device          ses  # SCSI Environmental Services (and SAF-TE)

# atkbdc0 controls both the keyboard and the PS/2 mouse
device          atkbdc  # AT keyboard controller
device          atkbd  # AT keyboard
device          psm  # PS/2 mouse
device          kbdmux  # keyboard multiplexer

device          vga  # VGA video card driver
device          splash  # Splash screen and screen saver support

# syscons is the default console driver, resembling an SCO console
device          sc

device          agp  # support several AGP chipsets

# Floating point support - do not disable.
device          npx

device          pmtimer

# Serial (COM) ports
device          sio  # 8250, 16[45]50 based serial ports

# Parallel port
device          ppc
device          ppbus  # Parallel port bus (required)
device          lpt  # Printer
device          plip  # TCP/IP over parallel
device          ppi  # Parallel port interface device
#device         vpo  # Requires scbus and da

# If you've got a "dumb" serial or parallel PCI card that is
# supported by the puc(4) glue driver, uncomment the following
# line to enable it (connects to the sio and/or ppc drivers):
#device         puc

device          em  # Intel PRO/1000 adapter Gigabit Ethernet Card

# PCI Ethernet NICs that use the common MII bus controller code.
# NOTE: Be sure to keep the 'device miibus' line in order to use these NICs!
device          miibus  # MII bus support
device          fxp  # Intel EtherExpress PRO/100B (82557, 82558)

# Pseudo devices.
device          loop  # Network loopback
device          mem  # Memory and kernel memory devices
device          io  # I/O device
device          random  # Entropy device
device          ether  # Ethernet support
device          sl  # Kernel SLIP
device          ppp  # Kernel PPP
device          tun  # Packet tunnel.
device          tap  # for openvpn
device          pty  # Pseudo-ttys (telnet etc)
device          md  # Memory "disks"
device          gif  # IPv6 and IPv4 tunneling
device          faith  # IPv6-to-IPv4 relaying (translation)

# The `bpf' device enables the Berkeley Packet Filter.
# Be aware of the administrative consequences of enabling this!
# Note that 'bpf' is required for DHCP.
device          bpf  # Berkeley packet filter

# USB support
device          uhci  # UHCI PCI->USB interface
device          ohci  # OHCI PCI->USB interface
device          ehci  # EHCI PCI->USB interface (USB 2.0)
device          usb  # USB Bus (required)
#device         udbp  # USB Double Bulk Pipe devices
device          ugen  # Generic
device          uhid  # "Human Interface Devices"
device          ukbd  # Keyboard
device          ulpt  # Printer
device          umass  # Disks/Mass storage - Requires scbus and da
device          ums  # Mouse
#device         ural  # Ralink Technology RT2500USB wireless NICs
#device         urio  # Diamond Rio 500 MP3 player
#device         uscanner  # Scanners

******** extract from pf.conf *********

### variables ###
Intern = "em0"
Extern = "ng0"
IntNet = "10.0.0.0/24"
Loop = "lo0"

### options ###
set loginterface $Extern
set optimization aggressive
scrub in on $Extern all
scrub out on $Extern all fragment reassemble random-id max-mss 1440

### ALTQ ####
altq on $Extern priq bandwidth 1000Kb queue { q_pri, q_higher, q_def }
queue q_pri priority 15 priq
queue q_higher priority 5 priq
queue q_def priority 0 priq(default)

### NAT & fwd ###
nat on $Extern from $IntNet to any -> $Extern static-port
rdr-anchor redirect

### filter ###
block log on $Extern
block quick on $Extern inet6
pass quick on $Loop all
pass quick on $Intern all
block in log quick on $Extern inet proto tcp from any to any flags FUP/FUP
block in log quick on $Extern inet proto tcp from any to any flags SF/SFRA
block in log quick on $Extern inet proto tcp from any to any flags /SFRA
antispoof quick for $Extern inet
block log quick on $Extern inet proto tcp from any to any port { 135, 137, 138, 139, 445, 593 }
block log quick on $Extern inet proto udp from any to any port { 135, 137, 138, 445 }
block quick on $Extern inet proto pfsync
pass in quick on $Extern inet proto icmp all icmp-type 8 code 0 keep state queue q_def
pass in quick on $Extern inet proto icmp all icmp-type { 0 , 3 , 11 } keep state queue q_def

### dynamic rules ###
pass out quick on $Extern inet proto tcp from 10.0.0.13 to any flags S/SA keep state queue (q_def, q_pri)
pass out quick on $Extern inet proto {udp, icmp} from 10.0.0.13 to any keep state queue ( q_def )
pass out quick on $Extern inet proto tcp from $Extern to any flags S/SA keep state queue (q_higher, q_pri)
pass out quick on $Extern inet proto {udp, icmp} from $Extern to any keep state queue ( q_higher )

How-To-Repeat:
- setup mpd4 to create a ng0 (netgraph) device
- start pf with altq enabled. Use altq rules.
- kill or restart mpd4
- execute 'sh /etc/rc.d/pf restart' or 'sh /etc/rc.d/pf reload'

Fatal trap 12: page fault while in kernel mode
Responsible Changed
From-To: freebsd-bugs->freebsd-pf

This seems more like something for the pf group.
First, I would suggest using ALTQ with mpd not on ng0 but on the real physical interface (for example fxp0, xl0) that netgraph/mpd is using.

On the other hand, I also have trouble using ALTQ with mpd, but I'm using mpd for a 3G connection (based on a tty device, not a NIC).

Leaving ALTQ rules for the ng0 interface out of pf.conf (not using ALTQ on ng0) doesn't produce a fatal trap 12. So disabling ALTQ in your kernel is not the only workaround. You may still use ALTQ on your internal NIC without a trap.

Unlike you, I always get a kernel trap when reloading pf rules with ALTQ on ng0 (whether the pf rules are reloaded by a script or manually). This also occurs while the ng0 interface is still there, and in my experience it's not related to a reload of mpd.
I use ALTQ primarily for prioritizing TCP ACKs. Tell me if I'm wrong, but I think it is not possible to prioritize TCP ACKs in encapsulated PPPoE data on the 'real' interface.

Bandwidth limiting on ng0 works great if I leave some bandwidth for the PPPoE overhead.

Besides this, I can't currently limit the real interface, because the DSL modem is connected in another room on the main LAN. I don't have a dedicated NIC for the modem.

Boris
Boris,

On 12/06/06 20:17, Boris S. wrote:
> I use ALTQ primarily for priorizing tcp acks.
> Tell me if I'm wrong, but I think it is not possible to priorize TCP
> ACKS on encapsulated PPPoE data on the 'real' interface.

You can do this, for example:

altq on xl0 .... queue blabla
...
pass out on ng0 all queue(blablabla)

> Bandwidth limiting on ng0 works great if I left some bandwidth for the
> PPPoE overhead.
>
> Beside this, I can't currently limit the real interface, because the
> dsl-modem is connected in another room on the main LAN. I don't have a
> dedicated NIC for the modem.

As I understand it, your NAT gateway has just one NIC and you're using a PPPoE pass-thru capable router?

If so, you may still be able to use one queue for local traffic and one queue for external traffic (and sub-queues of both, of course) on your NIC. But that's a question of personal taste. If ALTQ works for you your way, I would not bother changing it.

Greetings,

Volker
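
A minimal pf.conf sketch of the layout Volker describes, with the elided parts filled in purely for illustration. The interface name xl0, the queue names (borrowed from the original report) and the bandwidth figure are assumptions, not values confirmed anywhere in this thread. Whether the queue assignment made on ng0 is still honoured once the packet leaves through the physical NIC is the open question in this exchange, so treat this as something to test rather than a known-good configuration.

```
# Hypothetical: xl0 stands for the physical NIC that carries the PPPoE frames.
altq on xl0 priq bandwidth 1000Kb queue { q_pri, q_def }
queue q_pri priority 15
queue q_def priority 0 priq(default)

# The rules still match on ng0. With two queue names, pf uses the second
# one for TCP ACKs without payload and for packets with TOS lowdelay.
pass out on ng0 inet proto tcp from any to any flags S/SA keep state queue (q_def, q_pri)
pass out on ng0 inet proto { udp, icmp } from any to any keep state queue q_def
```

Note that in a single-NIC setup the altq bandwidth would presumably apply to everything leaving xl0, local LAN traffic included, which is the limitation Boris raises above.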
Volker wrote:
> As I understand your NAT gateway has just one NIC and you're using a
> PPPoE pass-thru capable router?

Nearly: it's a DSL modem, not a router. It speaks only PPPoE on the Ethernet side (acting like a router with PPPoE pass-thru).

> If so, you may still be able to use one queue for local traffic and
> one queue for external traffic (and sub-queues of both of course) on
> your NIC. But that's a question of personal taste. If ALTQ works for
> you your way, I would not effort a change.

I'll probably not change, but I'm open to alternative configurations.

Boris
Okay, this is highly untested and certainly needs more work, but I don't have a crashbox set up right now, so if you could give it a try we might be getting somewhere quick. Please turn on misc debugging (pfctl -xm). This also might be a way to use ALTQ on not yet created interfaces, though this needs even more work. Report back if this changes anything. If you get a crash I'd like to see a dump and dmesg if possible. Thanks a lot. -- Max
Max Laier wrote:
> Okay, this is highly untested and certainly needs more work, but I don't
> have a crashbox set up right now, so if you could give it a try we might
> be getting somewhere quick.
>
> Please turn on misc debugging (pfctl -xm).
>
> This also might be a way to use ALTQ on not yet created interfaces, though
> this needs even more work.
>
> Report back if this changes anything. If you get a crash I'd like to see
> a dump and dmesg if possible.

This test patch works great! I've connected, disconnected, restarted and reloaded many times in random order, and nothing bad happens!

If I kill my mpd4 (without touching pf) I get the debug log:

pf: remove altq ng0. ...22 22 22

I always get "22 22 22". No other numbers after several restarts of mpd4, pf and FreeBSD.

Thank you for your prompt investigation!

Boris
mlaier      2008-03-29 00:24:36 UTC

  FreeBSD src repository

  Modified files:
    contrib/pf/pfctl     pfctl_altq.c pfctl_qstats.c
    sys/contrib/pf/net   pf_if.c pf_ioctl.c pfvar.h
  Log:
  Make ALTQ cope with disappearing interfaces (particularly common with mpd
  and netgraph in general). This also allows to add queues for an interface
  that is not yet existing (you have to provide the bandwidth for the
  interface, however).

  PR:             kern/106400, kern/117827
  MFC after:      2 weeks

  Revision  Changes    Path
  1.10      +12 -0     src/contrib/pf/pfctl/pfctl_altq.c
  1.7       +26 -0     src/contrib/pf/pfctl/pfctl_qstats.c
  1.15      +6 -0      src/sys/contrib/pf/net/pf_if.c
  1.31      +116 -2    src/sys/contrib/pf/net/pf_ioctl.c
  1.17      +7 -0      src/sys/contrib/pf/net/pfvar.h
_______________________________________________
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "cvs-all-unsubscribe@freebsd.org"
Here are MFC patches for RELENG_6 and RELENG_7, please test and report back, thanks!

http://people.freebsd.org/%7Emlaier/pf.dyn_altq.R6.diff
http://people.freebsd.org/%7Emlaier/pf.dyn_altq.R7.diff

-- Max
State Changed
From-To: open->feedback

MFC patches need testing, thanks.
OK, thank you! I'm going to test it in the next few days on two machines (6-STABLE and 7-STABLE).

Boris
Feedback for RELENG_7:

I currently can't reproduce this bug on RELENG_7. I tried mpd4 and mpd5. It seems to have been fixed some other way. I applied the patch anyway and tried to trigger the bug, but it doesn't happen. Everything is working fine so far!

I'll try the patch on RELENG_6 in the next few days. On RELENG_6 I can reproduce this bug reliably.

Boris
As I said, the bug was always reproducible on a RELENG_6 server. I applied the patch, and now I can kill/restart mpd and pf without a crash. I tried it many times and in random order.

My RELENG_7 server has now been running this patch for several days without a problem.

Problem solved! THANK YOU VERY MUCH!

Boris
mlaier      2008-04-12 18:26:48 UTC

  FreeBSD src repository

  Modified files:        (Branch: RELENG_7)
    contrib/pf/pfctl     pfctl_altq.c pfctl_qstats.c
    sys/contrib/pf/net   pf_if.c pf_ioctl.c pfvar.h
  Log:
  MFC: Make ALTQ cope with disappearing interfaces (particularly common with
  mpd and netgraph in general). This also allows to add queues for an
  interface that is not yet existing (you have to provide the bandwidth for
  the interface, however).

  PR:             kern/106400, kern/117827
  Tested by:      Florian Smeets, Boris S.

  Revision  Changes    Path
  1.9.2.1   +13 -1     src/contrib/pf/pfctl/pfctl_altq.c
  1.6.10.1  +27 -1     src/contrib/pf/pfctl/pfctl_qstats.c
  1.11.2.3  +7 -1      src/sys/contrib/pf/net/pf_if.c
  1.28.2.2  +117 -3    src/sys/contrib/pf/net/pf_ioctl.c
  1.16.2.1  +8 -1      src/sys/contrib/pf/net/pfvar.h
mlaier      2008-04-12 19:52:13 UTC

  FreeBSD src repository

  Modified files:        (Branch: RELENG_6)
    contrib/pf/pfctl     pfctl_altq.c pfctl_qstats.c
    sys/contrib/pf/net   pf_if.c pf_ioctl.c pfvar.h
  Log:
  MFC: Make ALTQ cope with disappearing interfaces (particularly common with
  mpd and netgraph in general). This also allows to add queues for an
  interface that is not yet existing (you have to provide the bandwidth for
  the interface, however).

  PR:             kern/106400, kern/117827
  Tested by:      Florian Smeets, Boris S.

  Revision  Changes    Path
  1.7.2.2   +13 -1     src/contrib/pf/pfctl/pfctl_altq.c
  1.6.2.1   +27 -1     src/contrib/pf/pfctl/pfctl_qstats.c
  1.10.2.1  +7 -1      src/sys/contrib/pf/net/pf_if.c
  1.20.2.6  +117 -3    src/sys/contrib/pf/net/pf_ioctl.c
  1.11.2.3  +8 -1      src/sys/contrib/pf/net/pfvar.h
State Changed
From-To: feedback->closed

Committed to RELENG_6 and _7. Thanks for testing.