Bug 191975 - [ng_iface] [regression] in 10.0: cannot contact local services
Status: Closed Feedback Timeout
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 10.0-RELEASE
Hardware: Any
OS: Any
Importance: Normal Affects Some People
Assignee: Eugene Grosbein
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-07-19 23:58 UTC by dgilbert
Modified: 2018-11-06 14:28 UTC
CC: 2 users

See Also:


Attachments
netstat -rn on the computer with this ticket's problem (14.46 KB, text/plain)
2014-08-30 21:06 UTC, dgilbert

Description dgilbert 2014-07-19 23:58:06 UTC
On a machine connected to the server via an interface like:

ng2: flags=88d1<UP,POINTOPOINT,RUNNING,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1436
        inet 66.96.31.6 --> 66.96.16.50 netmask 0xffffffff
        inet6 fe80::219:b9ff:fef9:b9e7%ng2 prefixlen 64 scopeid 0x8
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

Which also has:

root@owl:/usr/local/etc/mpd5 # ifconfig bge0.401
bge0.401: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=3<RXCSUM,TXCSUM>
        ether 00:19:b9:f9:b9:e7
        inet 66.96.16.3 netmask 0xfffffff0 broadcast 66.96.16.15
        inet6 fe80::219:b9ff:fef9:b9e7%bge0.401 prefixlen 64 scopeid 0x4
        inet6 2001:1928::3 prefixlen 80
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
        vlan: 401 parent interface: bge0
root@owl:/usr/local/etc/mpd5 # ifconfig bge0
bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
        ether 00:19:b9:f9:b9:e7
        inet 172.17.14.2 netmask 0xffffff00 broadcast 172.17.14.255
        inet6 fe80::219:b9ff:fef9:b9e7%bge0 prefixlen 64 scopeid 0x1
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active

The following works:

[1:49:349]root@strike:/mnt/usr/local/share/asterisk> ping 66.96.31.6
PING 66.96.31.6 (66.96.31.6): 56 data bytes
64 bytes from 66.96.31.6: icmp_seq=0 ttl=64 time=5.939 ms
64 bytes from 66.96.31.6: icmp_seq=1 ttl=64 time=7.179 ms
^C
--- 66.96.31.6 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 5.939/6.559/7.179/0.620 ms
[1:50:350]root@strike:/mnt/usr/local/share/asterisk> ping 66.96.16.3
PING 66.96.16.3 (66.96.16.3): 56 data bytes
64 bytes from 66.96.16.3: icmp_seq=0 ttl=64 time=6.136 ms
64 bytes from 66.96.16.3: icmp_seq=1 ttl=64 time=8.619 ms
^C
--- 66.96.16.3 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 6.136/7.377/8.619/1.242 ms

The following do not:

ssh 66.96.31.6 or ssh 66.96.16.3

or connecting to any other service running locally on the machine.

It's worth noting that machines connected to the bge0.401 vlan or the
internet in general can ssh to both addresses.

Environment:
System: FreeBSD yak.eicat.ca 10.0-RELEASE-p6 FreeBSD 10.0-RELEASE-p6 #0 r268353: Mon Jul 7 13:16:17 EDT 2014 root@yak.eicat.ca:/usr/obj/usr/src/sys/YAK amd64


As above.  The machine is an X3210 Xeon with 4 GB of RAM and two bge(4)-class Ethernet
interfaces.  Quagga is running.

How-To-Repeat:
I use mpd5 to terminate L2TP tunnels full of PPPoE sessions from subscribers.
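
Roughly, the server side looks something like this (a sketch from memory using mpd5's stock directives; the pool and addresses here are illustrative, not the production ones):

startup:
default:
        load l2tp_server
l2tp_server:
        set ippool add pool1 192.0.2.10 192.0.2.100
        create bundle template B
        set ipcp ranges 192.0.2.1/32 ippool pool1
        create link template L l2tp
        set link action bundle B
        set link enable incoming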

Fix:
none known.
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2014-07-20 18:21:09 UTC
Over to maintainers.
Comment 2 dgilbert 2014-07-21 19:56:35 UTC
I've done some additional work on the problem.  One thing I have done is to enable the rc.conf variable

ipv6_activate_all_interfaces="YES"

... which changes the ifconfig for the ngX interfaces to say:

ng11: flags=88d1<UP,POINTOPOINT,RUNNING,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1436
        inet 66.96.31.6 --> 66.96.16.50 netmask 0xffffffff
        inet6 fe80::219:b9ff:fef9:b9e7%ng11 prefixlen 64 scopeid 0x17
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

(this removes IFDISABLED ... which shouldn't matter for IPv4, but I'm working on IPv6 anyway).
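
For a single interface the same flag can presumably be cleared by hand with the stock inet6 option syntax:

ifconfig ng11 inet6 -ifdisabled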

The 2nd thing I've done is compare the 9.1 and 10.0 versions of ng_iface.c.  The main difference appears to be a rewrite of ng_iface_output() to make its third argument a constant.  Reading this, I'm unsure whether the handling of dst->sa_family could be causing my problem.
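
(For reference, the two versions can be compared along these lines — a sketch, with the repository paths assumed:

svn diff https://svn.freebsd.org/base/releng/9.1/sys/netgraph/ng_iface.c \
         https://svn.freebsd.org/base/releng/10.0/sys/netgraph/ng_iface.c
)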
Comment 3 dgilbert 2014-08-19 00:47:39 UTC
I continue to try to eke out what's happening here.  I had an idea: why don't I create a firewall rule:

rdr on ng1 inet proto tcp from any to 66.96.16.3 port = 2222 -> 66.96.16.3 port 22

and then I can try this.  Well...

[2:54:354]root@owl:~> pfctl -vs nat
No ALTQ support in kernel
ALTQ related functions disabled
rdr on ng1 inet proto tcp from any to 66.96.16.3 port = 2222 -> 66.96.16.3 port 22
  [ Evaluations: 118329    Packets: 7         Bytes: 356         States: 1     ]
  [ Inserted: uid 0 pid 43426 State Creations: 1     ]
[2:55:355]root@owl:~> netstat -an | grep 22
tcp4       0      0 66.96.16.3.22          66.96.16.11.53211      ESTABLISHED
tcp4       0      0 *.22                   *.*                    LISTEN
tcp6       0      0 *.22                   *.*                    LISTEN

so... PF sees the SYN packets, but the local TCP stack does not.
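
One way to narrow down where they vanish (a sketch — interface and ports taken from the rdr rule above):

tcpdump -ni ng1 'tcp port 2222 or tcp port 22'

If the SYNs show up there while netstat -an never shows a SYN_RCVD socket, the packets are being dropped somewhere between interface input and tcp_input().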

Sigh.  Help?
Comment 4 dgilbert 2014-08-19 02:17:49 UTC
This is to say that a host connecting through an ng_iface interface can access the rest of the network, but cannot access the host on which the ng_iface resides.  _And_ this is a regression in 10.0.

OK.  This is a _really_ interesting example.  There are two MPD servers: A and B.  A is .1 (both v4 and v6) and B is .3 (both v4 and v6).  The only difference is that B has gif interfaces to give v6 services to mpd-connected clients.

If mpd client is connected to A:

WORKS: ssh -4 ...3
WORKS: ssh -6 ..:3
BROKE: ssh -4 ...1
WORKS: ssh -6 ..:1

If mpd client is connected to B:

BROKE: ssh -4 ...3
BROKE: ssh -6 ..:3
WORKS: ssh -4 ...1
WORKS: ssh -6 ..:1

i.e.: if the packet path includes ngX on the host in question, it fails.

mpd -> ngX -> gif -> ssh -> fail
mpd -> ngX -> gif -> otherhost -> ssh -> success

mpd -> ngX -> otherhost -> gif -> ssh -> success
mpd -> ngX -> otherhost -> gif -> otherhost -> ssh -> success
Comment 5 Gleb Smirnoff freebsd_committer freebsd_triage 2014-08-26 13:11:43 UTC
This looks more like a pf issue, not ng_iface.
Comment 6 dgilbert 2014-08-28 04:21:39 UTC
I'd like to quickly respond before things veer away.  The problem exists when pf is not even loaded or enabled.  I brought in some pf examples to try to discern which parts of the stack "see" the packet... but pf is definitely not part of the problem.
Comment 7 dgilbert 2014-08-28 04:23:18 UTC
I can also add that I have an application that uses if_tun interfaces and they don't exhibit the ng_iface problems.  Nor do the gif interfaces exhibit the problem when the packet has not arrived on an ng_iface.
Comment 8 Gleb Smirnoff freebsd_committer freebsd_triage 2014-08-29 10:20:52 UTC
Then this might be an artifact of quagga.  Can you show 'netstat -rn'?

P.S. I've been using mpd5 client and server on FreeBSD head for many years, and didn't encounter the problems you describe, either during the 10-CURRENT lifetime or in 11-CURRENT.
Comment 9 dgilbert 2014-08-30 21:06:38 UTC
Created attachment 146574 [details]
netstat -rn on the computer with this ticket's problem

As requested.  The particular client may not be logged into _this_ server (there are two), but this netstat -rn is of a sane size (and the problem is still exhibited).  The other server has a full BGP table (and thus a 500k-line netstat -rn).

In the output, routes to bge0.401 are to the outside world.  66.96.16.3 is the other mpd machine --- so you see a number of host routes for those.  ngX are obviously the mpd links.  16.11 is one of the core BGP routers.  The services that were uncontactable from mpd were either aliases on lo0 or the primary address on bge0.401.
Comment 10 dgilbert 2014-08-30 21:07:44 UTC
(In reply to Gleb Smirnoff from comment #8)

> P.S. I'm using mpd5 client and server on FreeBSD head for many years, and
> didn't encounter the problems you describe neither in 10-CURRENT lifetime,
> nor in 11-CURRENT.

... do you also use quagga, or no?
Comment 11 dgilbert 2014-11-10 06:47:42 UTC
Additional information from a new test: I set up a 10.1-RC3 host in VMware and duplicated it, then configured one as a PPPoE server and the other as a PPPoE client.  They do _not_ replicate this problem.

Remaining possible culprits:

 - l2tp in conjunction with this setup
 - quagga

I still have the suspicion that something is marking the packets' mbufs in some way that convinces the ip_input layer to ignore them.
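
A sketch of a probe that could test that suspicion (assumes a kernel with CTF data so the fbt arguments are typed; the field names are the stock mbuf ones):

dtrace -n 'fbt::ip_input:entry {
    printf("%s flags %x csum %x",
        stringof(args[0]->m_pkthdr.rcvif->if_xname),
        args[0]->m_flags, args[0]->m_pkthdr.csum_flags);
}'

Comparing the output for packets arriving on ngX against the same packets arriving on bge0.401 would show whether the mbufs really are marked differently.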
Comment 12 dgilbert 2016-07-05 06:33:37 UTC
So... just for clarification and history: PR 154557 (still open, but it sounds like it should be closed) says much the same thing, _except_ that setting tcpmssfix in mpd5's config fixed it.  Note also that that report was against FreeBSD 8.2.
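
(For the record, the knob in question is mpd5's iface-layer option — quoted from memory, so treat it as a sketch: set iface enable tcpmssfix)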

My router's history: started on 7.x, which worked (although there was a memory leak in netgraph or mpd5 for most of 7.x).  FreeBSD 8.x: still worked.  Upgraded to 9.x around 8.x EOL and started having problems.  Upgraded to 10.0 to see if that fixed it... then 10.1, 10.2, and now 10.3.

Also: tcpmssfix is not required, as neither host blocks ICMP (even though tcpmssfix is turned on; it makes TCP setup faster anyway).

Just adding all this for completeness... problem still persists and somewhat responsible people at BSDCan couldn't crack it.
Comment 13 Eugene Grosbein freebsd_committer freebsd_triage 2017-11-06 13:16:09 UTC
Is this problem still relevant?

If so, I would advise you to update to 10.4 or 11.1 and to update mpd5 itself too, as some pretty nasty bugs in both the base system and mpd5 were fixed recently.

If this does not help, I will need you to run tcpdump so we can determine which part is at fault here.
Comment 14 dgilbert 2018-06-20 18:02:08 UTC
It's still relevant.  I brought it up at BSDCan and got some dtrace scripts written for it.  I intend to test with 11.4 in a cluster of bhyve instances shortly.
Comment 15 Eugene Grosbein freebsd_committer freebsd_triage 2018-08-18 11:27:18 UTC
Still waiting for some outputs using recent versions.