Bug 219815 - ipfw stops working when more than one tables is used
Status: Closed Unable to Reproduce
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 10.3-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
 
Reported: 2017-06-06 05:32 UTC by ecsd
Modified: 2017-06-07 02:52 UTC (History)

Description ecsd 2017-06-06 05:32:06 UTC
This is not bug 165939. I do these things before doing anything with tables:

After the original fw flush:

sysctl kern.ipc.soacceptqueue=2048
ipfw table all flush { as recommended in 165939.}

The behavior is bizarre. I started working with tables, loaded addresses into them, added rules via programs, and the rules seemed to work ... mostly. I had a rule for table 86 to deny ip from any to any, and despite adding addresses to the table, some of those addresses still got through. When I added a lower numbered rule "deny ip from blah to any" that did not use tables, then the firewall stopped the packets.

Ever since a reboot and a system crash (due to a failing disk), I can add ONE TABLE that works, but despite adding addresses (that I know connect to the machine) to other tables, they NEVER TRIGGER. The crash did not injure system software.

I originally manually (a) loaded the tables, then (b) issued the ipfw command to effect a rule against the table. After a reboot before the crash, not only would the table rules refuse to trigger, higher-numbered firewall rules also refused to trigger, causing havoc with my mail system, where the tables formerly firewalled off intense bad guy traffic.

I then surmised that perhaps if I got rid of all the table-based firewall rules first, then loaded tables, then introduced rules to use them, that might fake the error out, but no. Only the first table rule shows hits and the rest pretend they see nothing at all. It also appears that whatever this is stops the firing of any later-numbered rules.

Some of the tables are very large, but I phased them in from small to large, testing between each load, and it seems that as soon as a 2nd table is introduced, things break. I am stymied because before the reboot and later disk crash (not affecting the system main disk) the tables were working more than not, apart from the indication of a failure starting to occur when newly added "block all"-rule addresses were not being blocked.

Some of the trouble tickets made reference to ipfw.conf and directories where firewall control information is stored but man and apropos know nothing about it, nor does the online documentation. Since things were working for a while, I thought I had smashed a limit (when I converted the tables back into "deny" rules there were 12,800 of them.) But nothing says there is any upper limit to the content of a table (or any sysctl thingy to affect it.) Also I never used a table numbered higher than 120 - the system said the limit was 127, I think, while the man page says the limit is 1024, and then I found I could set the limit with sysctl.

I marked this as "affects many people" because if they see what I see ... then it does.

I see a couple trouble tickets that suggest similar funky behavior, but the contexts are different and I see nothing that says higher-numbered rules are ignored after tables enter the picture; I see "later rules don't /load/", not "don't fire". Also, ipfw complains when the scripts try to re-enter a subnet already in the table, the message is

ipfw: setsockopt(IP_FW_TABLE_XADD): File exists

but I assume that is not an issue. However, I get that message even when the rules are sorted and run through uniq to eliminate duplicates. While the tables were working I observed that X/N did not collide with X/M for N != M. The one hint of something going awry in the system guts is a curious message "No prev search." that occurs sometimes after the "add" of many many addresses, after which the add script seems to die.

I do
ipfw table N add X1
ipfw table N add X2
...
ipfw add # deny ip from table\(N\) to any

and when the "No prev search." message appears, the command to add the rule never executes.
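The load sequence above can be sketched as a small /bin/sh function that names each address ipfw refuses and signals failure to the caller, so the rule is added only after a clean load. This is a minimal sketch, not a tested fix: `load_table` and the `IPFW` override variable are hypothetical names, the input is assumed to be a file with one address or subnet per line, and `sort -u` removes only textually identical lines (it does not treat X and X/32 as the same entry). It is also an assumption that the "No prev search." text comes from the invoking shell rather than from ipfw; running the load under /bin/sh would help rule that out.

```shell
# Hypothetical helper: load_table TABLE FILE
# IPFW may be overridden (for example IPFW=echo) for a dry run.
IPFW=${IPFW:-ipfw}

load_table() {
    table=$1
    file=$2
    # Drop exact duplicate lines, then add the entries one by one.
    sort -u "$file" | {
        failed=0
        while read -r addr; do
            # Skip blank lines.
            [ -n "$addr" ] || continue
            if ! $IPFW table "$table" add "$addr"; then
                # Name the entry that ipfw rejected.
                echo "table $table: failed to add $addr" >&2
                failed=$((failed + 1))
            fi
        done
        # The function succeeds only if every add succeeded.
        [ "$failed" -eq 0 ]
    }
}
```

A caller could then write `load_table 86 /path/to/list && ipfw add 3000 deny ip from 'table(86)' to any`, where the path is a placeholder and the rule number is taken from the listing in this report.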

Has anyone else observed similar behavior? Does anyone know how to make it stop (short of migrating to 11.0 from 10.3, if possible)? Are there any patches? Yada etcetera.

==

As to suggestions:

* It would be nice to be able to tell ipfw to list the rules in the order they were added to the table. As is they are listed numerically and there is no choice about the order.

* When ipfw complains 'XADD: file exists', it would be nice to see the address being complained of being duplicated.

* A program call ''system("ipfw -q table # add address/mask");'' still complains about XADD dups on stderr; I thought '-q' was supposed to suppress that.

* it would be /real/ cool to have an option for 'ipfw table list', to have it output only the largest or smallest containing rules, and/or to have 'ipfw table add' delete a larger or smaller subnet upon insertion, e.g. the table contains

X/24
X/16

and the modified list command would only report X/16 since X/24 is entirely subsumed by it; or vice-versa. The modified add would delete supersets or subsets on insertion, i.e. inserting X/16 when X/24 is already present deletes X/24 and adds X/16; or vice-versa. This is one of those things "left as an exercise for the reader", I think.
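Until ipfw offers such an option, the "report only the largest containing entry" idea can be approximated outside ipfw by filtering the address list before it is loaded. The following is a sketch under stated assumptions: `prune_subsumed` is a hypothetical name, the input is IPv4 only with one entry per line as either addr or addr/len, and the comparison is quadratic in the number of entries. Exact duplicates are not removed (pipe through `sort -u` first if needed).

```shell
# Hypothetical filter: read IPv4 entries on stdin and print only those
# that are not contained in a shorter-prefix entry from the same input.
prune_subsumed() {
    awk '
    # Convert a dotted quad to a 32-bit number.
    function ip2int(s,   o) {
        split(s, o, ".")
        return ((o[1] * 256 + o[2]) * 256 + o[3]) * 256 + o[4]
    }
    # Network number of addr when truncated to a prefix of the given length.
    function netnum(addr, bits) {
        return int(addr / (2 ^ (32 - bits)))
    }
    NF == 0 { next }
    {
        cnt++
        n = split($1, part, "/")
        plen[cnt] = (n > 1) ? part[2] + 0 : 32
        ip[cnt] = ip2int(part[1])
        text[cnt] = $1
    }
    END {
        for (i = 1; i <= cnt; i++) {
            covered = 0
            for (j = 1; j <= cnt; j++) {
                # Entry j covers entry i if j is broader and both share
                # the same network number at the prefix length of j.
                if (plen[j] < plen[i] &&
                    netnum(ip[i], plen[j]) == netnum(ip[j], plen[j])) {
                    covered = 1
                    break
                }
            }
            if (!covered)
                print text[i]
        }
    }'
}
```

For the X/24 and X/16 case above, feeding both entries through the filter leaves only the /16; surviving entries keep their input order.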

ipfw show, shows me (excerpt):

...
01000  59589   5790205 allow ip from table(93) to any
01000  37042   6535481 allow ip from any to table(93)
01001      0         0 allow ip from table(94) to any dst-port 25
03000      0         0 deny ip from table(86) to any
03001      0         0 deny ip from table(101) to any dst-port 20-25,110,...
03001      0         0 deny ip from table(101) to any dst-port 20-25,110,...
03004      0         0 deny ip from table(103) to any dst-port 20-25,110,...
03005      0         0 deny ip from table(59) to any dst-port 25
03008      0         0 deny ip from table(102) to any dst-port 20-22,110,...
03009      0         0 deny ip from table(53) to any dst-port 25
03010      0         0 deny ip from table(80) to any dst-port 80,443
03011      0         0 deny ip from table(50) to any dst-port 25
03012      0         0 deny ip from table(58) to any dst-port 25
03013      0         0 deny ip from table(51) to any dst-port 25
03014      0         0 deny ip from table(21) to any dst-port 20,21
03015      0         0 deny ip from table(54) to any dst-port 25
03016      0         0 deny ip from table(61) to any dst-port 25
03017      0         0 deny ip from table(52) to any dst-port 25
03018      0         0 deny ip from table(20) to any dst-port 143,993
03019      0         0 deny ip from table(98) to any dst-port 20-22,110,...
03020      0         0 deny ip from table(11) to any dst-port 25
03021      0         0 deny ip from table(55) to any dst-port 25
03022      0         0 deny ip from table(10) to any dst-port 25
03023      0         0 deny ip from table(64) to any dst-port 25
05000     10       428 deny ip from 1.0.0.0/8 to any dst-port 20-25,110,...
...

Table 93 (first added) is the only table that works. Table 94 should have registered hits and does not, and for sure addresses were connecting that should have added hits to the 3xxx series rules; and rule 5000 ceased to increment after the table rules were added, despite knowing for sure that more hits should have accrued to it.

For jollies to know, table 100 refers to all of CN, and contains some 7700 entries. 103 refers to all of VN, for 550 entries.
Comment 1 Andrey V. Elsukov freebsd_committer freebsd_triage 2017-06-06 12:55:59 UTC
First of all, check your rules. You can use the `log` keyword and enable verbose logging via syslog, or use tcpdump, to see which packets were hit by a rule.
We use ipfw with thousands of rules, hundreds of tables and hundreds of thousands of addresses. It works as expected.
Comment 2 ecsd 2017-06-06 20:58:06 UTC
I am glad to hear there are not volume restrictions, but I fail to see what logging has to do with rules failing to fire when traffic that would trigger them is known for a certainty to have entered the machine. I could say "log deny" but if the rule never fires, then - ? And this issue asks what is wrong that adding as much as a 2nd table to the mix causes the firewall to start failing past the point (sequence number) where the 2nd table reference is made.
Comment 3 Andrey V. Elsukov freebsd_committer freebsd_triage 2017-06-06 21:22:19 UTC
(In reply to ecsd from comment #2)
> I am glad to hear there are not volume restrictions, but I fail to see what
> logging has to do with rules failing to fire when traffic that would trigger
> them is known for a certainty to have entered the machine. I could say "log
> deny" but if the rule never fires, then - ? And this issue asks what is
> wrong that adding as much as a 2nd table to the mix causes the firewall to
> start failing past the point (sequence number) where the 2nd table reference
> is made.

You can add the `log` option to the `allow` rules. I suspect your first `allow` rules do match the packets that you want to be matched by `deny` rules.
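As an illustration of that suggestion, the commands below re-add the two rule 1000 entries from the `ipfw show` excerpt with `log` enabled and check one address against table 93. This is a sketch only: the rule number and table come from the excerpt, 192.0.2.10 is a placeholder address, and deleting rule 1000 on a live machine briefly removes those allow rules, so it should be tried from the console. Because ipfw stops at the first matching rule, any packet whose destination is in table 93 is accepted by `allow ip from any to table(93)` and never reaches the 3xxx rules; if table 93 happens to contain the machine's own addresses (an assumption, the report does not say), that alone would explain the zero counters.

```shell
# Send ipfw log records to syslog instead of the ipfw0 pseudo-interface.
sysctl net.inet.ip.fw.verbose=1

# Replace both rule 1000 entries with logging versions.
ipfw delete 1000
ipfw add 1000 allow log ip from 'table(93)' to any
ipfw add 1000 allow log ip from any to 'table(93)'

# Ask whether a given address is present in table 93.
# 192.0.2.10 is a placeholder; substitute a local or offending address.
ipfw table 93 lookup 192.0.2.10
```

If the logged entries for rule 1000 include the traffic expected at rules 3000 and above, the rule order rather than the tables would be the thing to change.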
Comment 4 ecsd 2017-06-06 21:50:39 UTC
I understand perfectly what you are saying. You are simply mistaken. I know exactly what sort of traffic I expect. The source is high volume, and while of a stochastic nature as regards source address, very predictable (e.g. People's Republic of China generates so many hits per hour.) The rule sits there waiting for any such traffic. The rules are simply not firing. I am reporting a bug. It is a BUG, not an error in user observation or understanding. I am waiting to hear if anyone has experienced anything similar or if anyone can suggest something to try within the kernel to change the behavior (i.e. make it stop malfunctioning.)