Bug 207080 - pfctl crash when load pf.conf, libc/resolv problem ?
Summary: pfctl crash when load pf.conf, libc/resolv problem ?
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 10.3-STABLE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-pf mailing list

Reported: 2016-02-10 15:54 UTC by fabrice.bruel
Modified: 2020-02-13 07:50 UTC
CC List: 3 users

Attachments
pf.conf file (32.89 KB, text/plain), 2016-02-10 15:54 UTC, fabrice.bruel
valgrind output (31.27 KB, text/plain), 2016-02-15 09:24 UTC, fabrice.bruel
Valgrind output in 10.3-STABLE (32.33 KB, text/plain), 2016-08-26 09:06 UTC, fabrice.bruel
Valgrind output in 10.3-STABLE with debug (32.39 KB, text/plain), 2016-08-26 13:48 UTC, fabrice.bruel
Truss output with a burning pfctl in background (716.12 KB, text/plain), 2016-08-29 14:52 UTC, fabrice.bruel

Description fabrice.bruel 2016-02-10 15:54:20 UTC
Created attachment 166833
pf.conf file

Hello

I'm using FreeBSD 9-STABLE as a firewall with pf.

# uname -a
FreeBSD FreeBSD 9.3 9.3-STABLE FreeBSD 9.3-STABLE #0 r294729: Tue Jan 26 22:00:32 CET 2016     root@9_STABLE:/usr/obj/usr/src/sys/FBSD9PF  amd64

With a specific pf.conf file (attached to this report), pfctl -f pf.conf sometimes crashes with:
pfctl: failed to create table __automatic_4130873d_220 in : Cannot allocate memory
Segmentation fault: 11 (core dumped)

I know my pf.conf file is poorly written and not optimized, but its syntax is valid.
To reproduce the bug reliably, run the following with the attached pf.conf:
while true;do pfctl -f pf.conf;done
and wait a few minutes.
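
A slightly more informative variant of the loop above stops at the first failure and reports how many iterations it took. This is only a sketch: repeat_until_fail is a hypothetical helper name, not part of pfctl, and the iteration limit in the usage line is arbitrary.

```shell
#!/bin/sh
# Sketch: rerun a command until it fails, then report the iteration
# number and the exit status. repeat_until_fail is a hypothetical helper.
repeat_until_fail() {
    max=$1          # upper bound on the number of attempts
    shift           # the remaining arguments are the command to run
    i=0
    while [ "$i" -lt "$max" ]; do
        i=$((i + 1))
        status=0
        "$@" || status=$?   # capture the status without tripping 'set -e'
        if [ "$status" -ne 0 ]; then
            echo "iteration $i failed with exit status $status"
            return "$status"
        fi
    done
    echo "no failure in $max iterations"
}

# Usage (assumed invocation, run as root on the affected machine):
#   repeat_until_fail 100000 pfctl -f pf.conf
```

In sh, a command killed by a signal is reported with an exit status above 128, so a segmentation fault (signal 11) would typically show up here as status 139.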

I've tried to understand the core file, but I'm new to gdb, so here is exactly what I did:

# gdb
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".
(gdb) core pfctl.core
Core was generated by `pfctl'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000800cfe6e6 in ?? ()
(gdb) add-symbol-file /usr/lib/debug/lib/libc.so.7.debug 0x0000000800cfe6e6
add symbol table from file "/usr/lib/debug/lib/libc.so.7.debug" at
        .text_addr = 0x800cfe6e6
(y or n) y
Reading symbols from /usr/lib/debug/lib/libc.so.7.debug...done.
(gdb) bt
#0  0x0000000800cfe6e6 in .text ()
#1  0x0000000000000001 in ?? ()
#2  0x0000000000639668 in ?? ()
#3  0x00007fffffffd870 in ?? ()
#4  0x0000000801400000 in ?? ()
#5  0x0000000800000001 in ?? ()
#6  0x00000008018009d0 in ?? ()
#7  0x00000000ffffffff in ?? ()
#8  0x00000008014045d0 in ?? ()
#9  0x00000000ffffffff in ?? ()
#10 0x0000000801402ad0 in ?? ()
#11 0x00000008ffffffff in ?? ()
#12 0x00000008014024d0 in ?? ()
#13 0x00000008ffffffff in ?? ()
#14 0x00000008014021d0 in ?? ()
#15 0x00000000ffffffff in ?? ()
#16 0x0000000801401ed0 in ?? ()
#17 0x00007fffffffffff in ?? ()
#18 0x0000000801401a50 in ?? ()
#19 0x0000000800000001 in ?? ()
#20 0x0000000801401a50 in ?? ()
#21 0x0000000000000017 in ?? ()
#22 0x00007fffffffd5e0 in ?? ()
#23 0x0000000800d6dc29 in __printf_render_int (io=0x7, pi=0x6394b0, arg=<value optimized out>) at /usr/src/lib/libc/stdio/xprintf_int.c:422
#24 0x0000000800faab40 in ?? ()
#25 0x00007fffffffd33b in ?? ()
#26 0x0000000800d06eca in files_rpcent (retval=0x800cfc36f, mdata=<value optimized out>, ap=<value optimized out>) at /usr/src/lib/libc/rpc/getrpcent.c:317
#27 0x65726168732f6c61 in ?? ()
#28 0x62696c2f736c6e2f in ?? ()
#29 0x0074616300432f63 in ?? ()
#30 0x00007fffffffd400 in ?? ()
#31 0x0000000800652c00 in ?? ()
#32 0x00007fffffffd410 in ?? ()
#33 0x00007fffffffd3b0 in ?? ()
#34 0x0000000000000000 in ?? ()
(gdb) add-symbol-file /usr/lib/debug/lib/libc.so.7.debug 0x00007fffffffd3b0
add symbol table from file "/usr/lib/debug/lib/libc.so.7.debug" at
        .text_addr = 0x7fffffffd3b0
(y or n) y
Reading symbols from /usr/lib/debug/lib/libc.so.7.debug...done.
(gdb) bt
#0  0x0000000800cfe6e6 in .text ()
#1  0x0000000000000001 in ?? ()
#2  0x0000000000639668 in ?? ()
#3  0x00007fffffffd870 in wcsxfrm_l (dest=0x7fffffffd0b0, src=0x7fffffffd0d0, len=6526232, locale=<value optimized out>) at /usr/src/lib/libc/string/wcsxfrm.c:126
#4  0x0000000000000002 in ?? ()
#5  0x0000000000000002 in ?? ()
#6  0x0000000800faab40 in ?? ()
#7  0x0000000800faab40 in ?? ()
#8  0x0000000800faab40 in ?? ()
#9  0x00007fffffffd33b in ?? ()
#10 0x0000000800d06eca in files_rpcent (retval=0x800d06eca, mdata=<value optimized out>, ap=<value optimized out>) at /usr/src/lib/libc/rpc/getrpcent.c:317
#11 0x0000000800d83e3e in __res_pquery (statp=0x7fffffffd320, msg=<value optimized out>, len=<value optimized out>, file=0x800cfc11a) at /usr/src/lib/libc/resolv/res_debug.c:305
#12 0x0000000000000000 in ?? ()
(gdb) 


If my use of gdb is correct, the problem seems to be in /usr/src/lib/libc/resolv/res_debug.c?

I can send the core file, but it is 14 MB.
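
One caveat about the session above: add-symbol-file expects the address at which the library's .text section is loaded, not the faulting program counter or a stack address, so the function names it resolved may be misleading. A possible alternative, sketched below, is to give gdb both the executable and the core file so that it locates the shared libraries itself. core_backtrace is a hypothetical helper, GDB is an illustrative override variable, and /sbin/pfctl in the usage line is assumed to be the binary that produced the core.

```shell
#!/bin/sh
# Sketch: print a backtrace from a core file in gdb batch mode.
# core_backtrace is a hypothetical helper; set GDB to use another gdb binary.
core_backtrace() {
    exe=$1      # executable that produced the core
    core=$2     # core file to inspect
    cmds=$(mktemp /tmp/gdbcmds.XXXXXX) || return 1
    # gdb commands: backtrace, then the list of loaded shared libraries
    printf 'bt\ninfo sharedlibrary\n' > "$cmds"
    status=0
    "${GDB:-gdb}" -batch -x "$cmds" "$exe" "$core" || status=$?
    rm -f "$cmds"
    return "$status"
}

# Usage (assumed paths):
#   core_backtrace /sbin/pfctl pfctl.core
```

The command file is used instead of command-line gdb commands so that the sketch does not depend on options that an older gdb may lack.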

Thanks for your help
Fabrice
Comment 1 Kristof Provost freebsd_committer 2016-02-12 13:05:11 UTC
I've had a quick look at this, and I think there are two problems.

The first is 'pfctl: failed to create table __automatic_4130873d_220 in : Cannot allocate memory'.
For some reason the kernel is unable to create this table. That might be simple memory pressure (i.e. a combination of memory use and memory fragmentation).

The second is the crash of pfctl. That looks like heap corruption as a result of incorrect handling of the error from the kernel.
For that one rebuilding world with 'DEBUG_FLAGS=-g' and running pfctl in valgrind is quite useful.

I've had a quick test on 10 as well, and I've been unable to reproduce the problem there.
Comment 2 fabrice.bruel 2016-02-15 09:24:15 UTC
Created attachment 167016
valgrind output
Comment 3 fabrice.bruel 2016-02-15 09:25:26 UTC
Hello,

I've rebuilt world with DEBUG_FLAGS=-g in /etc/make.conf.

I then ran pfctl with my pf.conf under valgrind; the output is in the attached file (valgrind.output).

For information, PF is compiled into the kernel with the following options:

# pf options
device          pf
device          pflog
device          pfsync
# altq(9). Enable the base part of the hooks with the ALTQ option.
# Individual disciplines must be built into the base system and can not be
# loaded as modules at this point. In order to build a SMP kernel you must
# also have the ALTQ_NOPCC option.
options         ALTQ
options         ALTQ_CBQ        # Class Based Queueing
options         ALTQ_RED        # Random Early Drop
options         ALTQ_RIO        # RED In/Out
options         ALTQ_HFSC       # Hierarchical Packet Scheduler
options         ALTQ_CDNR       # Traffic conditioner
options         ALTQ_PRIQ       # Priority Queueing
options         ALTQ_NOPCC      # Required for SMP build
options         ALTQ_DEBUG



Thanks
Fabrice
Comment 4 Kristof Provost freebsd_committer 2016-02-15 09:41:30 UTC
Yeah, so this:
==17184==    by 0x404B46: pfctl_rules (pfctl.c:1486)
==17184==    by 0x406DA7: main (pfctl.c:2378)
==17184==  Address 0x6aa8a08 is 56 bytes inside a block of size 64 free'd
==17184==    at 0x4C1E2DC: free (in /usr/local/lib/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==17184==    by 0x4210A0: superblock_free (pfctl_optimize.c:1640)
==17184==    by 0x4233BE: pfctl_optimize_ruleset (pfctl_optimize.c:357)
==17184==    by 0x40453B: pfctl_load_ruleset (pfctl.c:1297)
==17184==    by 0x404B46: pfctl_rules (pfctl.c:1486)
==17184==    by 0x406DA7: main (pfctl.c:2378)

Is likely the reason your pfctl segfaults. There's a use after free. It's not the direct cause though, that's the kernel rejecting your rules.

Would it be possible to upgrade the machine to stable/10? It looks like the problem is fixed there.
Comment 5 fabrice.bruel 2016-02-17 10:19:53 UTC
Hello,

This messy pf.conf has been loading in a loop for the last 24 hours on FreeBSD 10-STABLE without any problem.

So I think I need to migrate.

Thanks for your help
Fabrice
Comment 6 fabrice.bruel 2016-08-26 09:02:15 UTC
Hello,

I was too hasty: the problem has not disappeared in 10-STABLE, but it is harder to reproduce.

Currently, pfctl does not crash outright, but it can consume all available CPU. I am still using the same messy pf.conf.

I am attaching the new valgrind output, taken on:
# uname -a
FreeBSD FBSD10STABLE 10.3-STABLE FreeBSD 10.3-STABLE #2 r304805: Thu Aug 25 16:38:19 CEST 2016     root@FBSD10STABLE:/usr/obj/usr/src/sys/FBSD10PF  amd64

Thanks for your help
Fabrice
Comment 7 fabrice.bruel 2016-08-26 09:06:22 UTC
Created attachment 174090
Valgrind output in 10.3-STABLE
Comment 8 fabrice.bruel 2016-08-26 09:13:42 UTC
Sorry, I forgot DEBUG_FLAGS; the new valgrind output will follow in a few minutes.
Comment 9 fabrice.bruel 2016-08-26 13:48:32 UTC
Created attachment 174097
Valgrind output in 10.3-STABLE with debug
Comment 10 Kristof Provost freebsd_committer 2016-08-28 16:47:39 UTC
Valgrind is not really producing anything useful here.
It'd be interesting to see what pfctl is doing when it gets stuck using a lot of CPU time.
Did truss show anything interesting?
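
To see what a spinning pfctl is doing, truss can also attach to a process that is already running rather than starting a new one. The sketch below picks the pfctl process with the highest CPU share from ps output read on standard input. busiest_pid is a hypothetical helper, and the output file name in the usage lines is arbitrary.

```shell
#!/bin/sh
# Sketch: read "PID %CPU COMMAND" lines (with one header line) on stdin
# and print the PID of the busiest process whose command name matches $1.
# busiest_pid is a hypothetical helper.
busiest_pid() {
    awk -v name="$1" '
        # skip the header; keep the matching row with the highest %CPU
        NR > 1 && $3 == name && ($2 + 0 >= max) { max = $2 + 0; pid = $1 }
        END { if (pid != "") print pid }
    '
}

# Usage (assumed invocation on the affected machine, as root):
#   pid=$(LC_ALL=C ps -axo pid,pcpu,comm | busiest_pid pfctl)
#   [ -n "$pid" ] && truss -o truss.out -p "$pid"
```

LC_ALL=C is set in the usage line so that ps prints the CPU share with a decimal point, which is what the awk comparison assumes. If truss attached this way records no system calls at all, that would suggest the process is looping in userland rather than in the kernel.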
Comment 11 fabrice.bruel 2016-08-29 14:46:46 UTC
Hello,

OK, if I run truss pfctl.conf.anon, the output looks normal to me, at my newbie level.

So I first ran a script that calls pfctl many times, until one pfctl process was burning CPU. Then I ran truss pfctl.conf.anon again.

The output is attached here.

Hth
Thanks
Fabrice
Comment 12 fabrice.bruel 2016-08-29 14:52:12 UTC
Created attachment 174193
Truss output with a burning pfctl in background