Summary: | pfctl: Cannot allocate memory (after a time) | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Base System | Reporter: | tech-lists | ||||||
Component: | bin | Assignee: | Mark Johnston <markj> | ||||||
Status: | Closed FIXED | ||||||||
Severity: | Affects Many People | CC: | diego, email, freebsdbugzilla, jc, jjasen, kp, markj, olivierw1+bugzilla-freebsd, pi, r4, tech-lists, vegeta | ||||||
Priority: | --- | ||||||||
Version: | CURRENT | ||||||||
Hardware: | arm64 | ||||||||
OS: | Any | ||||||||
URL: | https://lists.freebsd.org/archives/freebsd-pf/2021-December/000163.html | ||||||||
Attachments: |
|
Description
tech-lists
2021-12-14 02:24:09 UTC
also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=259798 some more stats 1. when it gets the error, before reboot # vmstat -m | grep -E 'pf|Size' Type InUse MemUse Requests Size(s) pfil 11 1K 19 64,128 tcpfunc 1 1K 1 64 pf_hash 5 11524K 5 2048 pf_ifnet 7 4K 55 256,2048 pf_rule 21 42K 57 2048 pf_osfp 1191 123K 3573 64,128 pf_table 6 12K 62 2048 pfs_nodes 20 8K 20 384 pfs_vncache 1 16K 4 32,16384 2. after reboot, after running: # doas -u _pfbadhost pf-badhost -O freebsd Password: pf-badhost 1552 - - Using experimental "aggy" aggregator... 1 table created. 18912 addresses added. pf-badhost 1620 - - IPv4 addresses in table: 619200855 but before loading pf.conf: # vmstat -m | grep -E 'pf|Size' Type InUse MemUse Requests Size(s) pfil 3 1K 3 128 tcpfunc 1 1K 1 64 pf_hash 5 11524K 5 2048 pf_ifnet 7 4K 10 256,2048 pf_table 1 2K 2 2048 pfs_nodes 20 8K 20 384 pfs_vncache 1 16K 1 16384 after loading pf.conf # vmstat -m | grep -E 'pf|Size' Type InUse MemUse Requests Size(s) pfil 11 1K 11 64,128 tcpfunc 1 1K 1 64 pf_hash 5 11524K 5 2048 pf_ifnet 7 4K 25 256,2048 pf_rule 19 38K 19 2048 pf_osfp 1191 123K 1191 64,128 pf_table 6 12K 15 2048 pfs_nodes 20 8K 20 384 pfs_vncache 1 16K 1 16384 Let's start by figuring out which ioctl fails to allocate memory. Run your pfctl command under truss and attach the output to the bug please. Created attachment 230110 [details]
truss output of pfctl -evf /etc/pf.conf
(In reply to tech-lists from comment #3) That's strange. DIOCRINADEFINE does allocate memory, but the first (and largest) allocation is done with M_WAITOK and cannot fail (or produce ENOMEM). The alternative sources seem to mostly be very small routine allocations that I'd only expect to fail when the system is completely out of memory. Can you confirm what revision of main you're running, and then show the output of `dtrace -n 'fbt:pf:pfr_ina_define:return { printf("%x %x", arg0, arg1); }'` run during a call to the failing pfctl command. You are running CURRENT, right? 13.0 does use M_NOWAIT for the large allocation, but that was fixed in stable/13 back in June. (In reply to Kristof Provost from comment #4) version is FreeBSD 14.0-CURRENT #1 main-n251261-25d0ccbe101: Thu Dec 2 20:22:08 GMT 2021 kernel looks like this: [...] cpu ARM64 ident RPI4B include "std.arm64" include "std.dev" include "std.broadcom" include "../../conf/std.nodebug" # enable Pf without ALTQ (HFSC) device pf device pflog # forward packets without decrementing # the time to live (TTL) counter options IPSTEALTH options FUSEFS device filemon # needed in order to use WITH_META_MODE=yes in /etc/src-env.conf You said: [...] "Can you confirm what revision of main you're running, and then show the output of `dtrace -n 'fbt:pf:pfr_ina_define:return { printf("%x %x", arg0, arg1); }'` run during a call to the failing pfctl command." [...] Do you mean run it like this: # pfctl `dtrace -n 'fbt:pf:pfr_ina_define:return { printf("%x %x", arg0, arg1); }'` -evf /etc/pf.conf ? The results I get from that are: "dtrace: invalid probe specifier fbt:pf:pfr_ina_define:return { printf("%x %x", arg0, arg1); }: probe description fbt:pf:pfr_ina_define:return does not match any probes" (In reply to tech-lists from comment #5) No, run the dtrace command in one terminal and then (after dtrace has started) the pfctl command in another. If you have pf built into the kernel (why?) you'll want to probe fbt:kernel:pfr_ina_define:return instead. (In reply to Kristof Provost from comment #6) # dtrace -n 'fbt:kernel:pfr_ina_define:return { printf("%x %x", arg0, arg1); }' dtrace: description 'fbt:kernel:pfr_ina_define:return ' matched 1 probe CPU ID FUNCTION:NAME 0 36537 pfr_ina_define:return c ffffa000c1e55460 0 36537 pfr_ina_define:return 0 0 0 36537 pfr_ina_define:return 0 ffffa001d5139d80 0 36537 pfr_ina_define:return 0 0 0 36537 pfr_ina_define:return 0 ffffa001977e9000 0 36537 pfr_ina_define:return 0 ffffa0001a8a3401 0 36537 pfr_ina_define:return 0 ffffa0016fb04a80 if I try the command (pfctl -f /etc/pf.conf) repeatedly the values of the probe change a bit: 2 36537 pfr_ina_define:return c ffffa000c1e48be0 2 36537 pfr_ina_define:return 0 0 2 36537 pfr_ina_define:return 0 ffffa0017783fa80 2 36537 pfr_ina_define:return 0 0 2 36537 pfr_ina_define:return 0 ffffa001df262a80 2 36537 pfr_ina_define:return 0 ffffa0001a8a3401 2 36537 pfr_ina_define:return 0 ffffa0014e050a80 each time in the pf window I get this result: # pfctl -f /etc/pf.conf /etc/pf.conf:18: cannot define table pfbadhost: Cannot allocate memory pfctl: Syntax error in config file: pf rules not loaded if I comment out the big table and its clock rules from pf.conf, it loads without error: # pfctl -f /etc/pf.conf # in the probe window: # dtrace -n 'fbt:kernel:pfr_ina_define:return { printf("%x %x", arg0, arg1); }' dtrace: description 'fbt:kernel:pfr_ina_define:return ' matched 1 probe CPU ID FUNCTION:NAME 0 36537 pfr_ina_define:return 0 0 0 36537 pfr_ina_define:return 0 ffffa00184267780 0 36537 pfr_ina_define:return 0 0 0 36537 pfr_ina_define:return 0 ffffa001c64ef780 0 36537 pfr_ina_define:return 0 ffffa0001a8a3401 0 36537 pfr_ina_define:return 0 ffffa001c0d94c00 You asked: > why? maybe this is no longer the case, but I thought in-kernel was "faster" than dynamic or via a kld particularly on low power devices should be "block" rather than "clock" > 2 36537 pfr_ina_define:return c ffffa000c1e48be0
> 2 36537 pfr_ina_define:return 0 0
What? That's ... that can't be right. The first number should be the offset in the function where we returned from. Both 0xc and 0x0 make no sense there. The second value is the return value. That should be zero or an error number. Not a pointer like we're clearly getting here.
Sigh, it looks like Dtrace's fbt is buggy on aarch64, and indeed if I'm reading sys/cddl/dev/fbt/aarch64/fbt_isa.c correctly we're passing x[0] / x[1] in the return probe, rather than the return offset and the return value. So we should look for the return value in arg0, and indeed we see a return value of 0x0c (or 12, ENOMEM).
So we now know at least that the big allocation succeeds, but that pfr_ina_define() or one of the functions it calls fail to allocate memory. Let's dig a bit more.
Try:
dtrace -n 'fbt:kernel:pfr_create_ktable:return { printf("%x", arg0); }' -n 'fbt:kernel:pf_find_or_create_kruleset:return { printf("%x", arg0); }' -n 'fbt:kernel:pfr_create_kentry:return { printf("%x", arg0); }'
OK - two tests, first one with the large table commented out (so pfctl -f returns without error), second one with it uncommented 1. # dtrace -n 'fbt:kernel:pfr_create_ktable:return { printf("%x", arg0); }' -n 'fbt:kernel:pf_find_or_create_kruleset:return { printf("%x", arg0); }' -n 'fbt:kernel:pfr_create_kentry:return { printf("%x", arg0); }' dtrace: description 'fbt:kernel:pfr_create_ktable:return ' matched 1 probe dtrace: description 'fbt:kernel:pf_find_or_create_kruleset:return ' matched 1 probe dtrace: description 'fbt:kernel:pfr_create_kentry:return ' matched 1 probe CPU ID FUNCTION:NAME 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 13251 pfr_create_ktable:return 1 1 13251 pfr_create_ktable:return ffffa000133a6000 1 13251 pfr_create_ktable:return 1 1 13251 pfr_create_ktable:return ffffa00080bc2000 1 13259 pfr_create_kentry:return ffffa0011698b1e0 1 13259 pfr_create_kentry:return ffffa0011698b280 1 13259 pfr_create_kentry:return ffffa0011698b320 1 13259 pfr_create_kentry:return ffffa0011698b3c0 1 13259 pfr_create_kentry:return ffffa0011698b460 1 13259 pfr_create_kentry:return ffffa0011698b500 1 13259 pfr_create_kentry:return ffffa0011698b5a0 1 13259 pfr_create_kentry:return ffffa0011698b640 1 13259 pfr_create_kentry:return ffffa0011698b6e0 1 13259 pfr_create_kentry:return ffffa0011698b780 1 13259 pfr_create_kentry:return ffffa0011698b820 1 13259 pfr_create_kentry:return ffffa0011698b8c0 1 13259 pfr_create_kentry:return ffffa0011698b960 1 13259 pfr_create_kentry:return ffffa0011698ba00 1 13259 pfr_create_kentry:return ffffa0011698baa0 1 13259 pfr_create_kentry:return ffffa0011698bb40 1 13251 pfr_create_ktable:return 1 1 13251 pfr_create_ktable:return ffffa001d822a000 1 13251 pfr_create_ktable:return 1 1 13251 pfr_create_ktable:return ffffa0004a9b1800 1 13259 pfr_create_kentry:return ffffa0011698bbe0 1 13259 pfr_create_kentry:return ffffa0011698bc80 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 13251 pfr_create_ktable:return 1 1 13251 pfr_create_ktable:return ffffa001d71a2800 1 13251 pfr_create_ktable:return 1 1 13251 pfr_create_ktable:return ffffa001d71a2000 1 13259 pfr_create_kentry:return ffffa0011698bd20 1 13259 pfr_create_kentry:return ffffa0011698bdc0 1 13251 pfr_create_ktable:return 1 1 13251 pfr_create_ktable:return ffffa00108f7e800 1 13259 pfr_create_kentry:return ffffa0011698be60 1 13259 pfr_create_kentry:return ffffa0011698bf00 1 13259 pfr_create_kentry:return ffffa0011698a000 2. pfctl -f /etc/pf.conf with the large table enabled, so # pfctl -f /etc/pf.conf /etc/pf.conf:18: cannot define table pfbadhost: Cannot allocate memory pfctl: Syntax error in config file: pf rules not loaded # dtrace -n 'fbt:kernel:pfr_create_ktable:return { printf("%x", arg0); }' -n 'fbt:kernel:pf_find_or_create_kruleset:return { printf("%x", arg0); }' -n 'fbt:kernel:pfr_create_kentry:return { printf("%x", arg0); }' dtrace: description 'fbt:kernel:pfr_create_ktable:return ' matched 1 probe dtrace: description 'fbt:kernel:pf_find_or_create_kruleset:return ' matched 1 probe dtrace: description 'fbt:kernel:pfr_create_kentry:return ' matched 1 probe CPU ID FUNCTION:NAME 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 13251 pfr_create_ktable:return 1 2 13251 pfr_create_ktable:return ffffa0001a8a2800 2 13259 pfr_create_kentry:return ffffa0011698cdc0 2 13259 pfr_create_kentry:return ffffa000c1e48640 2 13259 pfr_create_kentry:return ffffa000c1e485a0 2 13259 pfr_create_kentry:return ffffa00116988e60 2 13259 pfr_create_kentry:return ffffa000c1e48500 2 13259 pfr_create_kentry:return ffffa000c1e48780 2 13259 pfr_create_kentry:return ffffa000c1e48aa0 2 13259 pfr_create_kentry:return ffffa000c1e488c0 2 13259 pfr_create_kentry:return ffffa000c1e48a00 2 13259 pfr_create_kentry:return ffffa000c1e48960 2 13259 pfr_create_kentry:return ffffa000c1e48820 2 13259 pfr_create_kentry:return ffffa000c1e48b40 2 13259 pfr_create_kentry:return ffffa000c1e486e0 2 13259 pfr_create_kentry:return ffffa0011698b0a0 2 13259 pfr_create_kentry:return ffffa000c1e48be0 2 13259 pfr_create_kentry:return ffffa000c1e553c0 2 13259 pfr_create_kentry:return ffffa0011698ce60 2 13259 pfr_create_kentry:return ffffa0011698cd20 2 13259 pfr_create_kentry:return ffffa0011698cf00 2 13259 pfr_create_kentry:return ffffa0011698b000 2 13259 pfr_create_kentry:return ffffa000c1e55280 2 13259 pfr_create_kentry:return ffffa000c1c24280 2 13259 pfr_create_kentry:return ffffa000c1cab780 2 13259 pfr_create_kentry:return ffffa000c1c25780 2 13259 pfr_create_kentry:return ffffa000c1e555a0 2 13259 pfr_create_kentry:return ffffa000c1eaff00 2 13259 pfr_create_kentry:return ffffa000c1b5c0a0 2 13259 pfr_create_kentry:return ffffa000c1e55640 2 13259 pfr_create_kentry:return ffffa000c1a80dc0 2 13259 pfr_create_kentry:return ffffa000c1e55320 2 13259 pfr_create_kentry:return 0 2 13251 pfr_create_ktable:return 1 2 13251 pfr_create_ktable:return ffffa0001a8a2800 2 13251 pfr_create_ktable:return 1 2 13251 pfr_create_ktable:return ffffa00044ed5800 2 13259 pfr_create_kentry:return ffffa0011698cdc0 2 13259 pfr_create_kentry:return ffffa000c1e48640 2 13259 pfr_create_kentry:return ffffa000c1e485a0 2 13259 pfr_create_kentry:return ffffa00116988e60 2 13259 pfr_create_kentry:return ffffa000c1e48500 2 13259 pfr_create_kentry:return ffffa000c1e48780 2 13259 pfr_create_kentry:return ffffa000c1e48aa0 2 13259 pfr_create_kentry:return ffffa000c1e488c0 2 13259 pfr_create_kentry:return ffffa000c1e48a00 2 13259 pfr_create_kentry:return ffffa000c1e48960 2 13259 pfr_create_kentry:return ffffa000c1e48820 2 13259 pfr_create_kentry:return ffffa000c1e48b40 2 13259 pfr_create_kentry:return ffffa000c1e486e0 2 13259 pfr_create_kentry:return ffffa0011698b0a0 2 13259 pfr_create_kentry:return ffffa000c1e48be0 2 13259 pfr_create_kentry:return ffffa000c1e553c0 2 13251 pfr_create_ktable:return 1 2 13251 pfr_create_ktable:return ffffa00008fcb800 2 13251 pfr_create_ktable:return 1 2 13251 pfr_create_ktable:return ffffa000f7955000 2 13259 pfr_create_kentry:return ffffa0011698ce60 2 13259 pfr_create_kentry:return ffffa0011698cd20 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 13251 pfr_create_ktable:return 1 2 13251 pfr_create_ktable:return ffffa000f7956000 2 13251 pfr_create_ktable:return 1 2 13251 pfr_create_ktable:return ffffa0011697b800 2 13259 pfr_create_kentry:return ffffa0011698cf00 2 13259 pfr_create_kentry:return ffffa0011698b000 2 13251 pfr_create_ktable:return 1 2 13251 pfr_create_ktable:return ffffa00116977000 2 13259 pfr_create_kentry:return ffffa000c1e55280 2 13259 pfr_create_kentry:return ffffa000c1c24280 2 13259 pfr_create_kentry:return ffffa000c1cab780 this time, I couldn't reload pf even with the big table commented out again. it won't load any tables at all. I'll have to restart the machine. # pfctl -f /etc/pf.conf /etc/pf.conf:23: cannot define table rfc6890: Cannot allocate memory /etc/pf.conf:25: cannot define table gooDNS4: Cannot allocate memory /etc/pf.conf:26: cannot define table gooDNS6: Cannot allocate memory /etc/pf.conf:27: cannot define table friends: Cannot allocate memory pfctl: Syntax error in config file: pf rules not loaded I had to kill -9 the dtrace to run it again: CPU ID FUNCTION:NAME 0 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 0 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 0 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 0 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 0 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 0 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 0 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 0 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 0 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 0 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 0 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 0 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 0 13251 pfr_create_ktable:return 1 0 13251 pfr_create_ktable:return ffffa00116984000 0 13251 pfr_create_ktable:return 1 0 13251 pfr_create_ktable:return ffffa0007d6b7800 0 13259 pfr_create_kentry:return 0 0 13251 pfr_create_ktable:return 1 0 13251 pfr_create_ktable:return ffffa0007d6b7800 0 13251 pfr_create_ktable:return 1 0 13251 pfr_create_ktable:return ffffa000957a3800 0 13259 pfr_create_kentry:return 0 0 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 0 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 0 13251 pfr_create_ktable:return 1 0 13251 pfr_create_ktable:return ffffa000957a3800 0 13251 pfr_create_ktable:return 1 0 13251 pfr_create_ktable:return ffffa0001a8a4000 0 13259 pfr_create_kentry:return 0 0 13251 pfr_create_ktable:return 1 0 13251 pfr_create_ktable:return ffffa000957a3800 0 13259 pfr_create_kentry:return 0 (re-ran the pfctl command again) 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 13251 pfr_create_ktable:return 1 2 13251 pfr_create_ktable:return ffffa00080bc2000 2 13251 pfr_create_ktable:return 1 2 13251 pfr_create_ktable:return ffffa000d24d6000 2 13259 pfr_create_kentry:return 0 2 13251 pfr_create_ktable:return 1 2 13251 pfr_create_ktable:return ffffa000d24d6000 2 13251 pfr_create_ktable:return 1 2 13251 pfr_create_ktable:return ffffa00116976800 2 13259 pfr_create_kentry:return 0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 2 13251 pfr_create_ktable:return 1 2 13251 pfr_create_ktable:return ffffa00116976800 2 13251 pfr_create_ktable:return 1 2 13251 pfr_create_ktable:return ffffa0001a83d000 2 13259 pfr_create_kentry:return 0 2 13251 pfr_create_ktable:return 1 2 13251 pfr_create_ktable:return ffffa00116976800 2 13259 pfr_create_kentry:return 0 after reboot, before loading pf: # doas -u _pfbadhost pf-badhost -O freebsd Password: pf-badhost 1594 - - Using experimental "aggy" aggregator... 1 table created. 17842 addresses added. pf-badhost 1662 - - IPv4 addresses in table: 619194384 [dtrace] loads of this, maybe over a thousand lines. it fills the standard screen(8) scroll-back buffer: 1 13259 pfr_create_kentry:return ffffa00150ffb3c0 1 13259 pfr_create_kentry:return ffffa00150ffb320 1 13259 pfr_create_kentry:return ffffa00150ffb280 [...] 1 13259 pfr_create_kentry:return ffffa00158828780 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 36465 pf_find_or_create_kruleset:return ffff000041aef0e0 1 13251 pfr_create_ktable:return 1 1 13251 pfr_create_ktable:return ffffa00158826000 1 13251 pfr_create_ktable:return 1 1 13251 pfr_create_ktable:return ffffa00158827800 1 13259 pfr_create_kentry:return ffffa001588286e0 1 13259 pfr_create_kentry:return ffffa00158828640 1 13259 pfr_create_kentry:return ffffa001588285a0 (ends) but the rules load successfully. later on when the cronjob runs to refresh the table, it'll succeed once maybe twice then pf will bail. I'm not sure if the following information is of any use, but thought I'd mention it in case it is: 1. with more or less the same config, but configured to block many more addresses (so the table is many times larger i guess), but on amd64 (freebsd bhyve guest) running 13.0-p5, the command runs without error: pf-badhost 87438 - - Using experimental "aggy" aggregator... 2182 addresses added. 2854 addresses deleted. pf-badhost 87506 - - IPv4 addresses in table: 1044433099 but in dmesg there is the message that pf states limit has been reached. The vm still works/passes traffic though. The vm has 8GB vram. I've not tried increasing that yet. here is pfctl -si # pfctl -si Status: Enabled for 38 days 17:47:57 Debug: Urgent Interface Stats for vtnet0 IPv4 IPv6 Bytes In 2872664371 0 Bytes Out 1616348865 0 Packets In Passed 5573603 0 Blocked 414006 0 Packets Out Passed 6592735 0 Blocked 13559 0 State Table Total Rate current entries 10 searches 12593897 3.8/s inserts 863091 0.3/s removals 863081 0.3/s Counters match 1290749 0.4/s bad-offset 0 0.0/s fragment 0 0.0/s short 0 0.0/s normalize 6 0.0/s memory 13471 0.0/s bad-timestamp 0 0.0/s congestion 0 0.0/s ip-option 0 0.0/s proto-cksum 0 0.0/s state-mismatch 14 0.0/s state-insert 0 0.0/s state-limit 0 0.0/s src-limit 0 0.0/s synproxy 0 0.0/s map-failed 0 0.0/s (In reply to tech-lists from comment #12) Okay, so we're failing allocation in pfr_create_kentry(), which means either the V_pfr_kentry_z or V_pfr_kentry_counter_z zones. After a failure take `vmstat -z | grep pf`, and see if bumping the `set limit table-entries` number helps. this might be a while as the time taken for it to fail has been variable (In reply to Kristof Provost from comment #14) ok it's failing now: # vmstat -z | grep pf pf mtags: 48, 0, 0, 0, 0, 0, 0, 0 pf tags: 104, 0, 0, 0, 0, 0, 0, 0 pf states: 312, 100000, 22, 212, 13790, 0, 0, 0 pf state keys: 88, 0, 22, 760, 13790, 0, 0, 0 pf source nodes: 136, 10000, 0, 0, 0, 0, 0, 0 pf table entry counters: 64, 0, 0, 0, 0, 0, 0, 0 pf table entries: 160, 25400000, 24700, 5400, 58807, 3, 0, 0 pf frags: 248, 0, 0, 0, 0, 0, 0, 0 pf frag entries: 40, 5000, 0, 0, 0, 0, 0, 0 this is with: in /etc/sysctl.conf: # net.pf.request_maxcount: 25400000 and in /etc/pf.conf: set limit table-entries 25400000 so, doubling it now: sysctl net.pf.request_maxcount=50800000 net.pf.request_maxcount: 25400000 -> 50800000 and in /etc/pf.conf set limit table-entries 50800000 pfctl: Cannot allocate memory. # vmstat -z | grep pf pf mtags: 48, 0, 0, 0, 0, 0, 0, 0 pf tags: 104, 0, 0, 0, 0, 0, 0, 0 pf states: 312, 100000, 24, 210, 14066, 0, 0, 0 pf state keys: 88, 0, 24, 758, 14066, 0, 0, 0 pf source nodes: 136, 10000, 0, 0, 0, 0, 0, 0 pf table entry counters: 64, 0, 0, 0, 0, 0, 0, 0 pf table entries: 160, 25400000, 24700, 5400, 58807, 4, 0, 0 pf frags: 248, 0, 0, 0, 0, 0, 0, 0 pf frag entries: 40, 5000, 0, 0, 0, 0, 0, 0 pf state scrubs: 40, 0, 0, 0, 0, 0, 0, 0 might be of use: # pfctl -si Status: Enabled for 0 days 11:52:43 Debug: Urgent State Table Total Rate current entries 25 searches 1282643 30.0/s inserts 14095 0.3/s removals 14070 0.3/s Counters match 21054 0.5/s bad-offset 0 0.0/s fragment 0 0.0/s short 0 0.0/s normalize 0 0.0/s memory 0 0.0/s bad-timestamp 0 0.0/s congestion 0 0.0/s ip-option 0 0.0/s proto-cksum 0 0.0/s state-mismatch 0 0.0/s state-insert 0 0.0/s state-limit 0 0.0/s src-limit 0 0.0/s synproxy 0 0.0/s map-failed 0 0.0/s # vmstat -m | grep -E 'pf|Size' Type InUse MemUse Requests Size(s) pfil 11 1K 11 64,128 tcpfunc 1 1K 1 64 pf_hash 5 11524K 5 2048 pf_ifnet 7 4K 40 256,2048 pf_rule 19 38K 38 2048 pf_osfp 1191 123K 2382 64,128 pf_table 6 12K 27 2048 pfs_nodes 20 8K 20 384 pfs_vncache 1 16K 46 32,16384 (In reply to tech-lists from comment #18) > pf table entries: 160, 25400000, 24700, 5400, 58807, 4, 0, 0 Okay, so that's the relevant line. Limit of 25400000, 24700 used, 5400 free. We also see 4 failed, which matches the failures you're seeing, but I'm not clear on why we'd get allocation failures when we're so far from the limit. That can happen if the constructor fails, but there's no constructor for pf table entries, so that's not it. The only other reason I can see for it is that we're failing to find free memory to allocate from. So we're still at "Your system is out of memory" rather than anything else. (In reply to Kristof Provost from comment #19) > The only other reason I can see for it is that we're failing to find free > memory to allocate from. So we're still at "Your system is out of memory" > rather than anything else. I've not rebooted the system yet. here's top output: last pid: 50141; load averages: 0.81, 0.35, 0.27 up 0+14:30:53 14:49:30 67 processes: 1 running, 66 sleeping CPU: 0.0% user, 0.0% nice, 0.2% system, 0.0% interrupt, 99.8% idle Mem: 9252K Active, 44M Inact, 139M Laundry, 3161M Wired, 40K Buf, 4461M Free ARC: 1352M Total, 507M MFU, 353M MRU, 132K Anon, 8512K Header, 483M Other 507M Compressed, 740M Uncompressed, 1.46:1 Ratio Swap: 12G Total, 12G Free ran the script again and it failed again. top output after this: last pid: 50441; load averages: 0.18, 0.26, 0.25 up 0+14:37:06 14:55:43 28 processes: 1 running, 27 sleeping CPU: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle Mem: 12M Active, 45M Inact, 139M Laundry, 3122M Wired, 40K Buf, 4494M Free ARC: 1352M Total, 508M MFU, 353M MRU, 8516K Header, 483M Other 507M Compressed, 741M Uncompressed, 1.46:1 Ratio Swap: 12G Total, 12G Free Why would it report out-of-memory with over 4GB available? maybe worthwhile to mention I have in /boot/loader.conf: kern.maxdsiz=4294967296 this was much smaller by default on this arch than expected when I set this value. I cannot add any ip to any table while it's like this. This means programs like sshguard (which monitors auth.log) cannot add ip addresses to its table on-the-fly, which means they are no longer functional: # service sshguard status sshguard is running as pid 1293. # pfctl -t sshguard -T add 1.0.0.1 pfctl: Cannot allocate memory. (In reply to tech-lists from comment #22) Yeah, I'm aware of the consequences of this allocation failure. I'm just struggling to work out why it's failing to allocate. There's no obvious reason for it, and it's clearly failing in the memory allocator, not pf itself. Let's at least confirm expectations. Does the number of failed allocation attempts in `vmstat -z` for pf table entries (at 4 when you reported it) increase on every failed attempt to update a table? (In reply to Kristof Provost from comment #23) it's gone up to 8 so far: pf table entries: 160, 25400000, 24700, 5400, 58807, 8, 0, 0 i'll run the script again: pf table entries: 160, 25400000, 24700, 5400, 58807, 9, 0, 0 can I restart it or would you like me to do some more testing? I'll want more testing later, but I need to do some digging first and won't have time to do that today. Go ahead and restart it. It seems to occur sufficiently frequently that it shouldn't take very long to do further testing when I know what to check for. Summarising some private debugging: The "pfctl: Cannot allocate memory." originates in a failed uma_zalloc() in pfr_create_kentry() on the V_pf_kentry_z ("pf table entries") zone. Confirmed by dtrace (which also shows unexplained issues, with some probes not firing when they should). There's no obvious reason why this allocation should fail. There's sufficient free memory and we're far away from the configured limit. The current suspicion is a platform specific (this appears to only manifest on arm64) allocator issue, but combined with the dtrace issues it's hard to say. I don't think it's an exclusive issue of arm64 platform. I am facing exactly this issue on a 13.0-RELEASE-p3 running at AWS EC2 instance in an amd64 i386 platform. (In reply to Diego Linke from comment #28) Have you checked the vmstat -z output as well? (In reply to Kristof Provost from comment #29) Yes, I did. Please see below: # vmstat -z | grep -E 'pf|ITEM' ITEM SIZE LIMIT USED FREE REQ FAILSLEEP XDOMAIN pf mtags: 48, 0, 0, 84, 2057, 0, 0, 0 pf tags: 104, 0, 0, 0, 0, 0, 0, 0 pf states: 296, 100000, 29, 153, 653447, 0, 0, 0 pf state keys: 88, 0, 29, 339, 653447, 0, 0, 0 pf source nodes: 136, 10000, 6, 23, 29737, 0, 0, 0 pf table entry counters: 64, 0, 0, 0, 0, 0, 0, 0 pf table entries: 160, 1000000, 1, 24,57544806,15982, 0, 0 pf frags: 248, 0, 0, 0, 0, 0, 0, 0 pf frag entries: 40, 5000, 0, 0, 0, 0, 0, 0 pf state scrubs: 40, 0, 0, 0, 0, 0, 0, 0 # vmstat -m | grep -E 'pf|Size' Type InUse MemUse Requests Size(s) pfs_nodes 20 8K 20 384 pfs_vncache 1 8K 1 8192 pfil 11 1K 11 64,128 tcpfunc 1 1K 1 64 pf_temp 0 0K 2323 32,64 pf_hash 5 11524K 5 2048 pf_ifnet 21 6K 80 128,256,2048 pf_osfp 1191 123K 2382 64,128 pf_rule 44 44K 44 1024 pf_table 16 32K 28317 2048 # sysctl net.pf.request_maxcount net.pf.request_maxcount: 65535 # sysctl net.pf.request_maxcount=50800000 net.pf.request_maxcount: 65535 -> 50800000 # pfctl -f /etc/pf.conf /etc/pf.conf:21: cannot define table fireholL1: Cannot allocate memory /etc/pf.conf:22: cannot define table fireholL2: Cannot allocate memory /etc/pf.conf:23: cannot define table fireholL3: Cannot allocate memory /etc/pf.conf:24: cannot define table fireholWEB: Cannot allocate memory /etc/pf.conf:25: cannot define table normshield: Cannot allocate memory /etc/pf.conf:26: cannot define table ipblacklistcloud: Cannot allocate memory /etc/pf.conf:27: cannot define table Webbots: Cannot allocate memory /etc/pf.conf:28: cannot define table haley_ssh: Cannot allocate memory /etc/pf.conf:29: cannot define table bi_any_1_7d: Cannot allocate memory pfctl: Syntax error in config file: pf rules not loaded I wonder whether it would help if --purely as a hack / experiment -- the limits on uma zones that pf uses were removed. (In reply to Andriy Gapon from comment #31) How could that be done? (In reply to tech-lists from comment #32) Comment out all calls to uma_zone_set_max() under sys/netpfil/pf. (In reply to Kristof Provost from comment #27) Do memory allocations need to be done with M_NOWAIT if they are performed in a context of a call (ioctl) from userland? I think that it's not impossible that there could be some transient condition (plus maybe a bug) resulting in M_NOWAIT giving up too easily. (In reply to Andriy Gapon from comment #31) This should prevent the table limit from being set, so it's a little narrower than just disabling the limit for everything: diff --git a/sys/netpfil/pf/pf_table.c b/sys/netpfil/pf/pf_table.c index 4cfe5d61e83e..859d5ad79775 100644 --- a/sys/netpfil/pf/pf_table.c +++ b/sys/netpfil/pf/pf_table.c @@ -209,7 +209,6 @@ pfr_initialize(void) V_pfr_kentry_z = uma_zcreate("pf table entries", sizeof(struct pfr_kentry), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); - V_pf_limits[PF_LIMIT_TABLE_ENTRIES].zone = V_pfr_kentry_z; V_pf_limits[PF_LIMIT_TABLE_ENTRIES].limit = PFR_KENTRY_HIWAT; } diff --git a/sbin/pfctl/pfctl.c b/sbin/pfctl/pfctl.c index a0eec1b09289..22c689934c2d 100644 --- a/sbin/pfctl/pfctl.c +++ b/sbin/pfctl/pfctl.c @@ -1918,6 +1918,9 @@ pfctl_load_limit(struct pfctl *pf, unsigned int index, unsigned int limit) { struct pfioc_limit pl; + if (index == PF_LIMIT_TABLE_ENTRIES) + return (0); + memset(&pl, 0, sizeof(pl)); pl.index = index; pl.limit = limit; (In reply to Andriy Gapon from comment #34) The allocation is done with the PF_RULES write lock held, and that's not a sleepable lock. We need to hold the read lock for rules evaluation, so sleeping with the write lock held would just halt traffic to/from the machine. (In reply to Kristof Provost from comment #36) Could the allocation be done in advance, before the lock is taken? If that's technically messy to do, then I recall that UMA has some sort of a reservation mechanism. But, actually, not sure if it's suitable for the pf's use case. (In reply to Andriy Gapon from comment #37) Realistically no. It's pretty deep down the code path and would massively complicate dealing with the ioctl. It would also be a largely pointless exercise, because we do a lot of uma_zalloc(M_NOWAIT) in the data path as well (e.g. for fragment handling, or to tag packets or for state creation). (In reply to Kristof Provost from comment #35) Would you like me to apply the patch and reboot? (In reply to tech-lists from comment #39) Yes please. Let's see if that makes a difference or not. While we're at it, it'd be useful for someone who has an affected amd64 machine to try this dtrace script: #!/usr/sbin/dtrace -s BEGIN { self->in_call = 0; } fbt:pf:pfr_create_kentry:entry { self->in_call = 1; printf("In"); } fbt:pf:pfr_create_kentry:return { self->in_call = 0; printf("=> %x (@%x)\n", arg1, arg0); } fbt:pf:pfr_create_kentry:return / arg1 == 0 / { stack(); } fbt:kernel:uma_zalloc_arg:return / self->in_call == 1 / { printf("=> %x (@%x)", arg1, arg0); } fbt:kernel:cache_alloc_retry:entry / self->in_call == 1 / { printf("=> %x (@%x)", arg1, arg0); } fbt:kernel:cache_alloc:entry / self->in_call == 1 / { printf("=> %x (@%x)", arg1, arg0); } fbt:kernel:zone_alloc_item:entry / self->in_call == 1 / { printf("=> %x (@%x)", arg1, arg0); } fbt:kernel:uma_small_alloc:return / self->in_call == 1 / { printf("=> %x (@%x)", arg1, arg0); } Once the problem occurs there should be output like 'pfr_create_kentry:return => 0 (@120)'. Please attach the full log, because there's going to be more information (hopefully!) in there. (In reply to Kristof Provost from comment #40) The patch doesn't apply: # patch < pfpatch2.patch Hmm... Looks like a unified diff to me... The text leading up to this was: -------------------------- |diff --git a/sys/netpfil/pf/pf_table.c b/sys/netpfil/pf/pf_table.c |index 4cfe5d61e83e..859d5ad79775 100644 |--- a/sys/netpfil/pf/pf_table.c |+++ b/sys/netpfil/pf/pf_table.c -------------------------- Patching file sys/netpfil/pf/pf_table.c using Plan A... patch: **** malformed patch at line 12: diff --git a/sbin/pfctl/pfctl.c b/sbin/pfctl/pfctl.c Looks like it might need a newline between the first patch and the second, so tried that: # patch < pfpatch2.patch Hmm... Looks like a unified diff to me... The text leading up to this was: -------------------------- |diff --git a/sys/netpfil/pf/pf_table.c b/sys/netpfil/pf/pf_table.c |index 4cfe5d61e83e..859d5ad79775 100644 |--- a/sys/netpfil/pf/pf_table.c |+++ b/sys/netpfil/pf/pf_table.c -------------------------- Patching file sys/netpfil/pf/pf_table.c using Plan A... Hunk #1 failed at 209. 1 out of 1 hunks failed--saving rejects to sys/netpfil/pf/pf_table.c.rej Hmm... The next patch looks like a unified diff to me... The text leading up to this was: -------------------------- |diff --git a/sbin/pfctl/pfctl.c b/sbin/pfctl/pfctl.c |index a0eec1b09289..22c689934c2d 100644 |--- a/sbin/pfctl/pfctl.c |+++ b/sbin/pfctl/pfctl.c -------------------------- Patching file sbin/pfctl/pfctl.c using Plan A... Hunk #1 failed at 1918. 1 out of 1 hunks failed--saving rejects to sbin/pfctl/pfctl.c.rej Hmm... Ignoring the trailing garbage. done This is on: % gvs srctree last updated: Fri Dec 24 11:59:43 2021 +0000 version: 251915 i'll try just manually editing it Created attachment 230375 [details]
no table limit patch
This is the same patch, but not mangled by being copy/pasted into and out of the bugzilla comment field.
It's also trivial to apply it manually, because it's a very simple patch.
(In reply to Kristof Provost from comment #40) Hi, please find below the dtrace output, I hope it will be useful. # dtrace -s dtrace_pf.sh -c "pfctl -f /etc/pf.conf" dtrace: script 'dtrace_pf.sh' matched 9 probes /etc/pf.conf:21: cannot define table fireholL1: Cannot allocate memory /etc/pf.conf:22: cannot define table fireholL2: Cannot allocate memory /etc/pf.conf:23: cannot define table fireholL3: Cannot allocate memory /etc/pf.conf:24: cannot define table fireholWEB: Cannot allocate memory /etc/pf.conf:25: cannot define table normshield: Cannot allocate memory /etc/pf.conf:26: cannot define table ipblacklistcloud: Cannot allocate memory /etc/pf.conf:27: cannot define table Webbots: Cannot allocate memory /etc/pf.conf:28: cannot define table haley_ssh: Cannot allocate memory /etc/pf.conf:29: cannot define table bi_any_1_7d: Cannot allocate memory pfctl: Syntax error in config file: pf rules not loaded dtrace: pid 56813 exited with status 1 CPU ID FUNCTION:NAME 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e root@server5:~# mv dtrace_pf.sh dtrace.script root@server5:~# dtrace -s dtrace.script -c "pfctl -f /etc/pf.conf" dtrace: script 'dtrace.script' matched 9 probes /etc/pf.conf:21: cannot define table fireholL1: Cannot allocate memory /etc/pf.conf:22: cannot define table fireholL2: Cannot allocate memory /etc/pf.conf:23: cannot define table fireholL3: Cannot allocate memory /etc/pf.conf:24: cannot define table fireholWEB: Cannot allocate memory /etc/pf.conf:25: cannot define table normshield: Cannot allocate memory /etc/pf.conf:26: cannot define table ipblacklistcloud: Cannot allocate memory /etc/pf.conf:27: cannot define table Webbots: Cannot allocate memory /etc/pf.conf:28: cannot define table haley_ssh: Cannot allocate memory /etc/pf.conf:29: cannot define table bi_any_1_7d: Cannot allocate memory pfctl: Syntax error in config file: pf rules not loaded CPU ID FUNCTION:NAME 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e dtrace: pid 63447 exited with status 1 (In reply to Diego Linke from comment #44) Ops, I apologize, I pasted more than necesary in the comment above. Please find the correct output. This is a FreeBSD 13 RELEASE amd64 # dtrace -s dtrace.script -c "pfctl -f /etc/pf.conf" dtrace: script 'dtrace.script' matched 9 probes /etc/pf.conf:21: cannot define table fireholL1: Cannot allocate memory /etc/pf.conf:22: cannot define table fireholL2: Cannot allocate memory /etc/pf.conf:23: cannot define table fireholL3: Cannot allocate memory /etc/pf.conf:24: cannot define table fireholWEB: Cannot allocate memory /etc/pf.conf:25: cannot define table normshield: Cannot allocate memory /etc/pf.conf:26: cannot define table ipblacklistcloud: Cannot allocate memory /etc/pf.conf:27: cannot define table Webbots: Cannot allocate memory /etc/pf.conf:28: cannot define table haley_ssh: Cannot allocate memory /etc/pf.conf:29: cannot define table bi_any_1_7d: Cannot allocate memory pfctl: Syntax error in config file: pf rules not loaded CPU ID FUNCTION:NAME 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e 0 74366 pfr_create_kentry:entry In 0 21147 cache_alloc_retry:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21145 cache_alloc:entry => fffff80006ac1e00 (@fffff80006ac1c80) 0 21202 zone_alloc_item:entry => 0 (@fffff80006ac1c80) 0 74367 pfr_create_kentry:return => 0 (@12e) 0 74367 pfr_create_kentry:return pf.ko`pfr_ina_define+0x6d6 pf.ko`pfioctl+0x49bb kernel`devfs_ioctl+0xc7 kernel`vn_ioctl+0x1a4 kernel`devfs_ioctl_f+0x1e kernel`kern_ioctl+0x26d kernel`sys_ioctl+0xf6 kernel`amd64_syscall+0x755 kernel`0xffffffff8106227e dtrace: pid 63447 exited with status 1 (In reply to Kristof Provost from comment #43) Thanks - tried first with manually applying, and got a build underway before i saw yr update (will a kernel build/install be sufficient?) (In reply to Diego Linke from comment #45) Okay, that tells us at least that I used the entry probe in a few places where I meant to use the return probe. But also that it looks like in cache_alloc_retry() we failed to find a bucket with free space, and then presumably also failed zone_alloc_item(). That still looks for all the world like you're out of memory. Here's an updated dtrace script: #!/usr/sbin/dtrace -s BEGIN { self->in_call = 0; } fbt:pf:pfr_create_kentry:entry { self->in_call = 1; printf("In"); } fbt:pf:pfr_create_kentry:return { self->in_call = 0; printf("=> %x (@%x)\n", arg1, arg0); } fbt:pf:pfr_create_kentry:return / arg1 == 0 / { stack(); } fbt:kernel:uma_zalloc_arg:return / self->in_call == 1 / { printf("=> %x (@%x)", arg1, arg0); } fbt:kernel:cache_alloc_retry:return / self->in_call == 1 / { printf("=> %x (@%x)", arg1, arg0); } fbt:kernel:cache_alloc:return / self->in_call == 1 / { printf("=> %x (@%x)", arg1, arg0); } fbt:kernel:zone_alloc_item:return / self->in_call == 1 / { printf("=> %x (@%x)", arg1, arg0); } fbt:kernel:uma_small_alloc:return / self->in_call == 1 / { printf("=> %x (@%x)", arg1, arg0); } fbt:kernel:zone_alloc_limit_hard:return / self->in_call == 1 / { printf("=> %x (@%x)", arg1, arg0); } fbt:kernel:zone_free_limit:entry / self->in_call == 1 / { printf("In"); } (In reply to tech-lists from comment #46) You'll also want to update pfctl. cd /usr/src/sbin/pfctl && make && sudo make install should be sufficient for that. (In reply to Kristof Provost from comment #47) Yes, this is small EC2 Virtual Machine. This used to work fine with FreeBSD 12 and pf. # top -b | head -8 last pid: 22974; load averages: 0.81, 0.81, 0.76 up 16+01:51:17 12:33:44 66 processes: 2 running, 63 sleeping, 1 zombie CPU: 0.8% user, 0.0% nice, 0.2% system, 0.0% interrupt, 99.0% idle Mem: 42M Active, 179M Inact, 15M Laundry, 609M Wired, 103M Buf, 79M Free ARC: 98M Total, 4310K MFU, 32M MRU, 1824K Anon, 2034K Header, 57M Other 7726K Compressed, 31M Uncompressed, 4.12:1 Ratio Swap: 512M Total, 38M Used, 474M Free, 7% Inuse # vmstat procs memory page disks faults cpu r b w avm fre flt re pi po fr sr ad0 xb5 in sy cs us sy id 1 0 0 1.0G 79M 237 1 0 0 286 39 0 0 41 377 215 1 0 99 # swapinfo Device 1K-blocks Used Avail Capacity /dev/md99 524288 39088 485200 7% Please find below the output with the latest dtrace script. I think there is a syntax error on it: # dtrace -s dtrace.script -c "pfctl -f /etc/pf.conf" dtrace: failed to compile script dtrace.script: line 1: probe description fbt:kernel:zone_alloc_limit_hard:return does not match any probes /etc/pf.conf:21: cannot define table fireholL1: Cannot allocate memory /etc/pf.conf:22: cannot define table fireholL2: Cannot allocate memory /etc/pf.conf:23: cannot define table fireholL3: Cannot allocate memory /etc/pf.conf:24: cannot define table fireholWEB: Cannot allocate memory /etc/pf.conf:25: cannot define table normshield: Cannot allocate memory /etc/pf.conf:26: cannot define table ipblacklistcloud: Cannot allocate memory /etc/pf.conf:27: cannot define table Webbots: Cannot allocate memory /etc/pf.conf:28: cannot define table haley_ssh: Cannot allocate memory /etc/pf.conf:29: cannot define table bi_any_1_7d: Cannot allocate memory pfctl: Syntax error in config file: pf rules not loaded (In reply to Diego Linke from comment #49) So that looks a lot more like a machine that's at the limits of its free memory, in which case what you're seeing isn't actually an error but things working as expected. The probe exists on stable/13. Apparently not in 13.0. At this point the first thing you need to try is to run this problem on a machine with more memory. (In reply to Kristof Provost from comment #50) Yes it does make sense. I just reboot the machine and the PF is working fine now. I was avoiding to make it to troubleshoot the issue better. The machine HW config was always the same, it's called my attention that this started to happen just after the upgrade to 13.0. My last question is I had 79 Mb in free memory plus ~476 Mb in swap space. I wondering why PF as the last resource doesn't use swap memory to load the tables. (In reply to Kristof Provost from comment #48) ok the rpi4 is back up, with a new (current, GENERIC, so debug) kernel [...] pf-badhost 1405 - - Using experimental "aggy" aggregator... 1 table created. 15680 addresses added. pf-badhost 1473 - - IPv4 addresses in table: 618966639 # pfctl -e -f /etc/pf.conf pf enabled # service sshguard status sshguard is running as pid 1119. [...] (In reply to Diego Linke from comment #51) > My last question is I had 79 Mb in free memory plus ~476 Mb in swap space. I wondering why PF as the last resource doesn't use swap memory to load the tables. Kernel memory, including that allocated by pf, will not get swapped out. We wouldn't want to have to wait to process a packet until we'd read the relevant pf table back from swap space. (In reply to Kristof Provost from comment #53) Yes, I does make sense. I thought that I had free memory enough to load the tables without swap, the tables are not so big. Thank you very much I will continue to monitor it :-) (In reply to Kristof Provost from comment #48) Looks like your patch may have fixed the issue. I'll try making pfhost load bigger tables to make sure. So far, and it does this every 3 hrs approx: [1800hrs approx] pf-badhost 2232 - - Using experimental "aggy" aggregator... 15 addresses added. pf-badhost 2300 - - IPv4 addresses in table: 618966654 [2100hrs approx] pf-badhost 2852 - - Using experimental "aggy" aggregator... 23 addresses added. 1 addresses deleted. pf-badhost 2920 - - IPv4 addresses in table: 618966677 I'll check again later, then if all ok, make pfbadhost get more IPs still ok # doas -u _pfbadhost pf-badhost -O freebsd Password: pf-badhost 4113 - - Using experimental "aggy" aggregator... 24504 addresses added. 4155 addresses deleted. pf-badhost 4181 - - IPv4 addresses in table: 1063826457 # pfctl -f /etc/pf.conf # pfctl -f /etc/pf.conf # service sshguard status sshguard is running as pid 3001. Hi, Reporting that the rpi4 arm64.aarch64 is still running ok. I did however encounter a similar error with an amd64-based bhyve vm. This vm is unmodified in comparison to the arm64.aarch64 device. It runs 13.0-p5 and updates via freebsd-update. The symptoms were a bit different though, in that: 1. the script ran without error 2. no services were accessible. No ping, smtp, nothing, from anywhere. (this is the best failure mode honestly. The worst nightmare would be not-to-load-leaving-everything-open). I checked the tables for blocked IPs and made sure my accessing IP wasn't listed and it wasn't. 3. on reboot, services didn't come back. I had to access via the console. I was able to bring services back by disabling pf, updating the ruleset then re-enabling it. maybe this would need a different ticket & diags I am not sure if I am fighting the same problem. See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=259798 Feel free to link the tickets if relevant. (In reply to jjasen from comment #58) This is on AMD64, FreeBSD 13.0-p4. (In reply to Diego Linke from comment #54) Just reporting that today I faced this issue again in another machine, after the upgrade from 12 to 13.0. Yes, this machine also doesn't have a lot of memory but I never faced this issue on 12. I think something changed in either PF or how FreeBSD 13.0 deal with memory allocation. (In reply to Diego Linke from comment #54) Just reporting that today I faced this issue again in another machine, after the upgrade from 12 to 13.0. Yes, this machine also doesn't have a lot of memory but I never faced this issue on 12. I think something changed in either PF or how FreeBSD 13.0 deal with memory allocation. (In reply to Diego Linke from comment #61) Hi, What do you have for the following sysctls: kern.maxdsiz net.pf.request_maxcount ? (In reply to tech-lists from comment #62) net.pf.request_maxcount is not relevant for this issue. The allocation fails after that check. I'm less familiar with kern.maxdsiz, but it also appears to not be relevant to this issue. (In reply to tech-lists from comment #62) kern.maxdsiz: 34359738368 net.pf.request_maxcount: 65535 I'm observing the same thing happening on amd64. In the beginning I thought it might be due to my own patches on pf. Then I thought it only happens on systems which are high on inactive memory (my systems behave differently regarding inactive, some have lot of free memory, some bloat the inactive memory, no idea why) which gave me the idea that maybe inactive memory can't be freed when allocating memory with M_NOWAIT. However today I finally got the same issue on a system running a nonpatched 13.0-RELEASE GENERIC kernel on a system without bloated inactive memory. [17:02:34] al-router01 ~/ # vmstat -z | grep pf pf mtags: 48, 0, 0, 2520,12630497, 0, 0, 0 pf tags: 104, 0, 0, 0, 0, 0, 0, 0 pf states: 296, 4000000, 7038, 39177,3739030613, 0, 0, 0 pf state keys: 88, 0, 7038, 85008,3739030613, 0, 0, 0 pf source nodes: 136, 400000, 0, 0, 0, 0, 0, 0 pf table entry counters: 64, 0, 0, 0, 0, 0, 0, 0 pf table entries: 160, 1000000, 320952, 310548,259049528,180913, 0, 0 pf frags: 248, 0, 0, 96,10783117, 0, 0, 0 pf frag entries: 40, 32000, 0, 606,50033469, 0, 0, 0 pf state scrubs: 40, 0, 0, 0, 0, 0, 0, 0 So there is a limit of 1M entries, 320k are used, 310k are free. When I reload pf.conf, I need double the entries, yes? So the new pf.conf won't load. But where is the missing 370k entries? writing to say the pf on the rpi4 is still working without error % uptime 10:50p.m. up 12 days, 7:04, 7 users, load averages: 5.11, 4.85, 4.65 Same problem here with 'pfctl: Cannot allocate memory.', it's reported by fail2ban (anyway same for blacklistd). 2022-02-09 19:34:30,354 fail2ban.utils [64280]: ERROR 8021b2730 -- exec: pfctl -a f2b/ssh-pf -t f2b-ssh-pf -T add 45.9.20.25 2022-02-09 19:34:30,354 fail2ban.utils [64280]: ERROR 8021b2730 -- stderr: 'pfctl: Cannot allocate memory.' 2022-02-09 19:34:30,354 fail2ban.utils [64280]: ERROR 8021b2730 -- killed with signal 127 (return code: 255) # freebsd-version -uk 13.0-RELEASE-p4 13.0-RELEASE-p4 "pfctl -T del IP" still usable, but add new rule is impossible until reboot :( # vmstat -m | grep -E 'pf|Size' Type InUse MemUse Requests Size(s) pfs_nodes 20 8K 20 384 pfs_vncache 1 128K 1 pfil 11 1K 11 64,128 tcpfunc 1 1K 1 64 pf_temp 0 0K 55 32 pf_hash 5 11524K 5 2048 pf_ifnet 19 7K 171 256,2048 pf_osfp 1191 123K 3573 64,128 pf_rule 269 181K 341 128,1024 pf_table 11 22K 24200 2048 # vmstat -z | grep pf pf mtags: 48, 0, 0, 84, 55, 0, 0, 0 pf tags: 104, 0, 0, 0, 0, 0, 0, 0 pf states: 296, 100000, 40, 2703, 4287549, 0, 0,2135254 pf state keys: 88, 0, 58, 6106, 4592659, 0, 0,2280096 pf source nodes: 136, 10000, 0, 0, 0, 0, 0, 0 pf table entry counters: 64, 0, 0, 0, 200, 0, 0, 0 pf table entries: 160, 200000, 152, 48, 488,7420, 0, 0 pf frags: 248, 0, 0, 16, 15, 0, 0, 0 pf frag entries: 40, 5000, 0, 101, 40, 0, 0, 0 pf state scrubs: 40, 0, 0, 0, 0, 0, 0, 0 # sysctl -a | grep net.pf net.pf.rule_tag_hashsize: 128 net.pf.request_maxcount: 65535 net.pf.source_nodes_hashsize: 32768 net.pf.states_hashsize: 131072 # pfctl -si Status: Enabled for 93 days 22:56:25 Debug: Urgent State Table Total Rate current entries 840 searches 5992674224 738.2/s inserts 4288356 0.5/s removals 4287516 0.5/s Counters match 1169829912 144.1/s bad-offset 0 0.0/s fragment 0 0.0/s short 0 0.0/s normalize 87 0.0/s memory 0 0.0/s bad-timestamp 0 0.0/s congestion 0 0.0/s ip-option 2 0.0/s proto-cksum 0 0.0/s state-mismatch 1206 0.0/s state-insert 0 0.0/s state-limit 0 0.0/s src-limit 0 0.0/s synproxy 0 0.0/s map-failed 0 0.0/s # top -b | head -8 last pid: 20669; load averages: 0.08, 0.11, 0.09 up 93+22:59:55 18:11:49 160 processes: 1 running, 158 sleeping, 1 zombie CPU: 0.3% user, 0.0% nice, 0.1% system, 0.0% interrupt, 99.6% idle Mem: 90M Active, 829M Inact, 506M Laundry, 59G Wired, 2173M Free ARC: 52G Total, 28G MFU, 22G MRU, 3368K Anon, 281M Header, 1419M Other 48G Compressed, 61G Uncompressed, 1.27:1 Ratio Swap: 46G Total, 1939M Used, 44G Free, 4% Inuse (In reply to Jean-Claude MICHOT from comment #67) Is this on amd64 or arm64.aarch64 hardware? (In reply to tech-lists from comment #68) 13.0-RELEASE-p4 #0: Tue Aug 24 07:33:27 UTC 2021 root@amd64-builder.daemonology.net:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64 running on Xeon X5670 (PowerEdge R610), with 65 GB of memory JC I'm seeing the problem on stable/13-n249464-d0199f27c06 (AMD64) now Shall I make another ticket for AMD64 specifically? (In reply to Kristof Provost from comment #43) The patch here also works on AMD64 (In reply to Kristof Provost from comment #48) Will your patch be included in the coming release 13.1 ? (In reply to Laurent Frigault from comment #72) No, because it's not a fix but a debugging step. Could anyone seeing this problem please attach output from "sysctl vm.uma.pf_table_entries" at some point after the allocation failure occurs? (In reply to Mark Johnston from comment #74) Never mind, I'm able to reproduce this locally. On my test system we have: vm.uma.pf_table_entries_3.limit.max_items: 2000000 vm.uma.pf_table_entries_3.limit.items: 17592185933560 So we have a limit accounting bug, ok. This is on a NUMA system, let's see if it happens on a single-domain system too. (In reply to Mark Johnston from comment #75) The problem has to do with the reimplementation of zone limits in 13.0. pf may allocate items from the table entry zone before a limit is set, and these items won't be counted. When a limit is set, and the uncounted items are freed (as can happen if there's a memory shortage), then they end up underflowing our counter. uma_zone_set_max() is missing an assertion that would have caught this. pf also needs to set a dummy limit on the table entry zone at initialization time. I'm not sure if other pf zones might have the same problem. I'll work on a patch. A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=7d1ab866911a2b29e041d64bc83a93638533f957 commit 7d1ab866911a2b29e041d64bc83a93638533f957 Author: Mark Johnston <markj@FreeBSD.org> AuthorDate: 2022-03-30 19:41:44 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2022-03-30 19:41:44 +0000 pf: Initialize the table entry zone limit at initialization time The limit may later be updated by the "set limit" directive in pf.conf. UMA does not permit a limit to be set on a zone after any items have been allocated from a zone. Other UMA zones used by pf do not appear to be susceptible to this problem: they either set a limit at zone creation time or never set one at all. PR: 260406 Reviewed by: kp MFC after: 3 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34713 sys/netpfil/pf/pf_table.c | 1 + 1 file changed, 1 insertion(+) (In reply to commit-hook from comment #77) excellent. will this also land in 13.1-R, 13.0-R and 12.3-R ? (In reply to tech-lists from comment #78) It will appear in 13.1, won't appear in 12.3 since the bug isn't present there, and we can issue an EN for 13.0 if there's desire for that. (In reply to Mark Johnston from comment #79) you said: > we can issue an EN for 13.0 if there's desire for that. yes please. I have a load of vms on 13.0-p6 and am nervous of upgrading to 13.1-R until I can trust it more. I'm sure it's not just me who is in this position. (In reply to tech-lists from comment #80) Same for me, that will be great for 13.0-pX A commit in branch stable/13 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=1487f84223ce1341ebdfc87e50fbcddedc6d1de8 commit 1487f84223ce1341ebdfc87e50fbcddedc6d1de8 Author: Mark Johnston <markj@FreeBSD.org> AuthorDate: 2022-03-30 19:41:44 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2022-04-02 15:18:15 +0000 pf: Initialize the table entry zone limit at initialization time The limit may later be updated by the "set limit" directive in pf.conf. UMA does not permit a limit to be set on a zone after any items have been allocated from a zone. Other UMA zones used by pf do not appear to be susceptible to this problem: they either set a limit at zone creation time or never set one at all. PR: 260406 Reviewed by: kp Sponsored by: The FreeBSD Foundation (cherry picked from commit 7d1ab866911a2b29e041d64bc83a93638533f957) sys/netpfil/pf/pf_table.c | 1 + 1 file changed, 1 insertion(+) A commit in branch releng/13.1 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=367d0c7d6617c8ef316b71e420b8a7a7a7e66a8a commit 367d0c7d6617c8ef316b71e420b8a7a7a7e66a8a Author: Mark Johnston <markj@FreeBSD.org> AuthorDate: 2022-03-30 19:41:44 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2022-04-04 14:04:14 +0000 pf: Initialize the table entry zone limit at initialization time The limit may later be updated by the "set limit" directive in pf.conf. UMA does not permit a limit to be set on a zone after any items have been allocated from a zone. Other UMA zones used by pf do not appear to be susceptible to this problem: they either set a limit at zone creation time or never set one at all. Approved by: re (gjb) PR: 260406 Reviewed by: kp Sponsored by: The FreeBSD Foundation (cherry picked from commit 7d1ab866911a2b29e041d64bc83a93638533f957) (cherry picked from commit 1487f84223ce1341ebdfc87e50fbcddedc6d1de8) sys/netpfil/pf/pf_table.c | 1 + 1 file changed, 1 insertion(+) (In reply to tech-lists from comment #80) Ok, I'll stage a 13.0 erratum notice for this bug. A commit in branch releng/13.0 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=5b789e0c92a7c363b36111b1f75519f2acd21f97 commit 5b789e0c92a7c363b36111b1f75519f2acd21f97 Author: Mark Johnston <markj@FreeBSD.org> AuthorDate: 2022-04-05 23:26:02 +0000 Commit: Ed Maste <emaste@FreeBSD.org> CommitDate: 2022-04-05 23:26:02 +0000 pf: Initialize the table entry zone limit at initialization time The limit may later be updated by the "set limit" directive in pf.conf. UMA does not permit a limit to be set on a zone after any items have been allocated from a zone. Other UMA zones used by pf do not appear to be susceptible to this problem: they either set a limit at zone creation time or never set one at all. PR: 260406 Reviewed by: kp Sponsored by: The FreeBSD Foundation (cherry picked from commit 7d1ab866911a2b29e041d64bc83a93638533f957) (cherry picked from commit 1487f84223ce1341ebdfc87e50fbcddedc6d1de8) Approved by: so Security: FreeBSD-EN-22:15.pf sys/netpfil/pf/pf_table.c | 1 + 1 file changed, 1 insertion(+) I still have this problem on 13.3-p2 kern.maxdsiz=34G # pfctl -f /etc/pf.conf /etc/pf.conf:28: cannot define table rkn_blocked: Cannot allocate memory pfctl: Syntax error in config file: pf rules not loaded table has about 150k ips |