259798 – relayd: fatal: sync_table: cannot set address list: Cannot allocate memory

Status:	Closed FIXED

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	misc
Version:	13.0-RELEASE
Hardware:	amd64 Any

Importance:	--- Affects Some People
Assignee:	freebsd-bugs (Nobody)

URL:
Keywords:

Depends on:
Blocks:

Reported:	2021-11-12 14:40 UTC by jjasen
Modified:	2022-12-16 13:40 UTC
CC List:	2 users

See Also:	243561

Attachments
relayctl show redirect output (759 bytes, text/plain) 2021-11-18 17:48 UTC, jjasen
relayd.conf (2.20 KB, text/plain) 2021-11-18 17:49 UTC, jjasen
vmstat -m output soon after a relayd crash (10.27 KB, text/plain) 2021-11-18 17:50 UTC, jjasen
vmstat-s output (1.48 KB, text/plain) 2021-11-18 19:06 UTC, jjasen

Description jjasen 2021-11-12 14:40:48 UTC

We run pf-based firewalls leveraging CARP and pfsync, and maintain load-balanced services on them using relayd.

Yesterday, relayd failed on the backup firewall, and would not restart via service commands. However, it would run with relayd -d (and relayd -dv).

States on the firewalls were nominal, as was memory usage. I'm not sure whether the fault lies in relayd, the kernel's /dev/pf, or something else.

Nov 11 15:01:58 backup-firewall relayd[80498] startup
Nov 11 15:02:13 backup-firewall relayd[80499] fatal: sync_table: cannot set address list: Cannot allocate memory
Nov 11 15:02:13 backup-firewall relayd[80500] hce exiting, pid 80500
Nov 11 15:02:13 backup-firewall relayd[80501] relay exiting, pid 80501
Nov 11 15:02:13 backup-firewall relayd[80503] relay exiting, pid 80503
Nov 11 15:02:13 backup-firewall relayd[80504] relay exiting, pid 80504
Nov 11 15:02:13 backup-firewall relayd[80505] ca exiting, pid 80505
Nov 11 15:02:13 backup-firewall relayd[80502] ca exiting, pid 80502
Nov 11 15:02:13 backup-firewall relayd[80506] ca exiting, pid 80506
Nov 11 15:02:13 backup-firewall relayd[80498] lost child: pfe exited abnormally
Nov 11 15:02:13 backup-firewall relayd[80498] parent terminating, pid 80498

Comment 1 tech-lists 2021-11-12 22:19:25 UTC

Hi,

Are you able to run something like memtest on that firewall? Something that would stress test the RAM.

At first glance it looks like it might be a hardware problem ("cannot allocate memory" despite low memory usage).

Comment 2 jjasen 2021-11-13 00:48:02 UTC

(In reply to tech-lists from comment #1)
I think memory issues would have come up and been a lot more disruptive during its upgrade to FreeBSD 13.0, or within the period it was being proofed out as the primary firewall.

It really feels like my issue and https://lists.freebsd.org/archives/freebsd-pf/2021-October/000136.html are looking at the same problem from different aspects.

Comment 3 tech-lists 2021-11-13 01:29:08 UTC

(In reply to jjasen from comment #2)

I also use pf-badhost but have seen no issues.

Among other machines, it's running on a Raspberry Pi 4 (8GB) on stable/13, which also has net.pf.request_maxcount=400000 set as per https://geoghegan.ca/pfbadhost.html#instructions. In /var/log/messages, there are lines like:

Nov 11 00:00:20 REDACTED unbound-adblock[30209]:  Changes (+/-):  +7 Domain total :  128951
Nov 12 00:00:11 REDACTED pf-badhost[43205]:  IPv4 addresses in table:  620442279

In a similar context (not with pf-badhost), on a different amd64 machine (also 8GB) running 12.0 or 12.1, where the maxcount value was set in boot/loader.conf, I ran up against the default limit (65536, I think) and had to manually set it to something like 254000.

But I got an error message that was sufficiently descriptive to allow me to solve the problem. IIRC it actually said that maxcount needed to be increased.

Unfortunately, the error your system is reporting isn't as descriptive.
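
For reference, a minimal sketch of how the tunable discussed above is set at boot. The 400000 value is the one quoted from the pf-badhost instructions; whether that figure suits any other system is an assumption to check against the size of the tables being loaded.

```
# /boot/loader.conf
# Raise the per-request element limit for pf (value taken from the
# pf-badhost instructions cited above; adjust to your own table sizes).
net.pf.request_maxcount=400000
```

Running sysctl net.pf.request_maxcount reports the value currently in effect.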

Comment 4 jjasen 2021-11-18 17:48:14 UTC

Created attachment 229578
relayctl show redirect output

relayctl show redirect output

Comment 5 jjasen 2021-11-18 17:49:10 UTC

Created attachment 229579
relayd.conf

relayd.conf file in use on the two PF firewalls in question.

Comment 6 jjasen 2021-11-18 17:50:13 UTC

Created attachment 229580
vmstat -m output soon after a relayd crash

vmstat -m output soon after a relayd crash

Comment 7 jjasen 2021-11-18 19:06:22 UTC

Created attachment 229582
vmstat-s output

vmstat -s from a broken pf firewall

Comment 8 jjasen 2021-11-18 19:13:01 UTC

As an update, this now seems to be a general problem with pf. I have two systems where relayd will not reload after start, and attempting to reload rules via pfctl results in:

etc/pf/tables.conf:938: cannot define table clients: Cannot allocate memory
... <ad nauseam>
pfctl: Syntax error in config file: pf rules not loaded



Some have suggested that low memory is causing the issue. These systems each have 32GB of RAM and dual multi-core Intel E5-2697 v2 processors.

Comment 9 tech-lists 2021-12-14 02:26:19 UTC

(In reply to jjasen from comment #0)
Hi,

I'm seeing the problem in a different context (arm64 on recent -current). Please see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260406

Comment 10 tech-lists 2022-03-06 00:30:04 UTC

(In reply to jjasen from comment #8)

Hi,

I'm now seeing this problem on amd64, like you, on stable/13-n249464-d0199f27c06 (built 1st March 2022).

Comment 11 tech-lists 2022-03-06 00:38:10 UTC

(In reply to jjasen from comment #8)

I've applied the patch from https://bugs.freebsd.org/bugzilla/attachment.cgi?id=230375&action=diff, recompiled, and rebooted. So far the problem hasn't reappeared.

Comment 12 jjasen 2022-12-16 13:40:14 UTC

I believe this was addressed in a 13.0 errata release, but I forget which one.