Summary: | vimage ALPHA6 [ipfw will not kldload] | ||
---|---|---|---|
Product: | Base System | Reporter: | Joe Barbish <qjail1> |
Component: | kern | Assignee: | Bjoern A. Zeeb <bz> |
Status: | Closed Overcome By Events | ||
Severity: | Affects Only Me | CC: | bz |
Priority: | --- | Keywords: | vimage |
Version: | CURRENT | ||
Hardware: | Any | ||
OS: | Any |
Description
Joe Barbish
2016-07-04 12:58:39 UTC
Can you please try to compile ipfilter into the kernel and see if all your problems go away? It seems ipfilter (like multicast) is exhausting the per-VNET module data area.

Let me state here what I think you are trying to say: that the host ipfilter firewall is being kldloaded when the ipfilter statements are read in the host's rc.conf file. This is normal behavior. You think that compiling ipfilter into the kernel along with vimage will allow the host to run ipfilter and enable vnet/vimage jails to kldload ipfw, or pf, when their statements are read in the rc.conf file in the vnet/vimage jail.

What's being tested here is whether a vnet/vimage jail can run a different firewall than the one the host is running. Testing has demonstrated that this is not possible so far, and that is the basis of this PR. In the case of ipfw, testing has shown that compiling ipfw into the kernel will allow the host and vnet/vimage jails to run the ipfw firewall. What is desired is ipfw being kldloaded by the host and the vnet/vimage jails also using that host-kldloaded ipfw module. This is the next test I will run. Requiring the desired firewall to be compiled into the kernel along with vimage is not a solution.

No, what I am saying is that for what is compiled into the kernel, we know the global space needed to duplicate state with each VNET instance, and thus it is a "static constant at compile time". For modules we need to reserve space, because at load time (either at boot if loaded by the loader, or in whatever way using kldload after init is run) we have to determine the amount of memory needed for the state of each VNET. We cannot reserve an endless amount of memory for that, and for the last few years this was set to two pages per VNET (plus a tiny bit of roundup memory depending on the static global state).
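To make the space budget concrete, here is a rough arithmetic sketch in sh. It assumes a 4096-byte page (typical for amd64, but an assumption here), so "two pages per VNET" comes to 8192 bytes; the small roundup mentioned above is ignored. The function name `vnet_modspace_fits` and the module sizes passed to it are hypothetical and for illustration only.

```sh
#!/bin/sh
# Rough budget check for the per-VNET module data area.
# Assumption: 4096-byte pages, two pages reserved per VNET (roundup ignored).
PAGE_SIZE=4096
VNET_MODPAGES=2

# vnet_modspace_fits SIZE...
# Prints the total requested bytes and the budget.
# Returns 0 if the total fits into the budget, non-zero otherwise.
vnet_modspace_fits() {
    budget=$((PAGE_SIZE * VNET_MODPAGES))
    total=0
    for size in "$@"; do
        total=$((total + size))
    done
    echo "requested=${total} budget=${budget}"
    [ "${total}" -le "${budget}" ]
}
```

For example, `vnet_modspace_fits 3000 3000 3000` prints `requested=9000 budget=8192` and returns non-zero: three hypothetical modules of that size would not fit together even though each one fits on its own.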
If the amount of virtualized space needed by all modules for their global state exceeds this, they will (if properly written) fail to load, or they will be loaded but not function (ideally the load should fail). Thus compiling modules into the kernel, e.g., firewalls or multicast routing or virtual network devices, which tend to need quite a bit of virtualized global state space, will help to avoid that general problem. Unfortunately there is no way to automatically increase that per-module region at run time. I was pondering adding a loader tunable so people could enlarge it, but that turned out to be not trivial either. We could add a kernel option so people could override the default w/o patching vnet.c. See https://svnweb.freebsd.org/base/head/sys/net/vnet.c?annotate=302054#l165

The problem with ipfilter seems to be that on amd64, without checking #ifdefs, the size of ipf_main_softc_t alone is 2280. While that should easily fit, in combination with other modules it might not anymore; hence my question to try to compile it into the kernel and see if things just work then, before we go off into a long debug session for possible other causes of error.

To address your other comment: if the firewalls are compiled into the kernel (or the modules get properly loaded), then each VNET instance, like the base system, will be able to use one or more (any combination) of the provided firewalls. One can then use pf in one VNET, ipfilter in another, ipfw in a third, and all three together in a fourth if one desires (as one could for a plain system if it was just a GENERIC kernel).

Compiled the kernel with ipfilter compiled in.

test 1. Have the ipfilter statements in the host rc.conf commented out so the host is not running any firewall at all. Have ipfw statements in the vnet/vimage jail's rc.conf, and when the jail starts get the same messages as posted before except the nd6_dad_timer message does not happen.
kldload: can't load ipfw: Operation not permitted
/etc/rc.d/ipfw: WARNING: Unable to load kernel module ipfw

test 2. Have ipfilter statements in the host rc.conf so the host is running the ipfilter firewall. Have ipfw statements in the vnet/vimage jail's rc.conf, and when the jail starts get the same messages as posted before except the nd6_dad_timer message does not happen.

kldload: can't load ipfw: Operation not permitted
/etc/rc.d/ipfw: WARNING: Unable to load kernel module ipfw

Compiling ipfilter in the kernel changed nothing.

Compiled ipfirewall and vimage together.

options VIMAGE
options IPFIREWALL
options IPFIREWALL_NAT
options IPDIVERT
options LIBALIAS

My network is like this: gateway host connected to the public internet with a LAN behind it. On the LAN is the ALPHA6 box being used for testing vnet/vimage.

TEST #1
This ALPHA6 test box is running the generic kernel with ipfw statements in the host's rc.conf. Only have 2 rules in ipfw.
ipfw add 010 allow all from any to any via lo0
ipfw add 010 allow log all from any to any via rl0
At boot time get msg ipfw rules loaded & ipfw logging enabled. The ipfw show command shows those 2 rules. Issuing ping 8.8.8.8 returns results, meaning the box has a network connection to the public internet. The ipfw log shows the logged packets from the ping command. This verifies that host generic and ipfw are working.

TEST #2
Everything is the same except this time booted the vimage kernel, and the vnet jail has ipfw statements in its rc.conf. When the vnet jail starts get msg that ipfw rules loaded & logging enabled. Vnet jail console log has this msg: Protect: Procctl: operation not permitted. Logging into the started vnet jail and issuing ping 8.8.8.8 returns 0 packets received. The vnet jail ipfw log is empty. Host ipfw log shows ICMP packets out via epair1b, which is the vnet jail.

TEST #3
Everything is the same except this time rebooted the ALPHA6 test box running a kernel with vimage/ipfw compiled in. Host boot messages show msg that ipfw rules loaded & logging enabled.
Host ping works and host ipfw log shows ICMP packets. When the vnet jail starts get msg that ipfw rules loaded & logging enabled. Vnet jail console log has this msg: Protect: Procctl: operation not permitted. Logging into the started vnet jail and issuing ping 8.8.8.8 returns 0 packets received. The vnet jail ipfw log is empty. Host ipfw log shows ICMP packets out via epair1b, which is the vnet jail.

I have no idea what "Protect: Procctl: operation not permitted." means or if it may have any bearing on what is happening here.

Now the host ipfw log is very interesting. I would think that when the vnet jail ping command is issued the host ipfw firewall should receive a packet via "in", but we see via "out" instead.

On another subject: I see BETA1 is out. Have you made changes to vimage or any of the 3 firewalls that are missing from ALPHA6? Do I need to install BETA1 on my test box to test new vimage changes?

The "Protect: Procctl: operation not permitted." is unrelated. I assume you start sshd inside the vnet jail. See man 1 protect and man 2 procctl.

With regard to ipfw, it is unclear to me how you connect the vnet to the outside. Is the epair bridged to rl0? How does it get its address? What happens if you try to ping a host system IP address from the vnet? Does that work? Can the vnet ping its default gateway? Can you use tcpdump on the various host interfaces to follow all incoming/outgoing packets related to the vnet? Start with the epair connected to the vnet and then try the physical interface (probably limit tcpdump to icmp to not log ssh traffic in case you are logged in remotely via the same interface). Also, what happens if you start the base system the same way, and then start the vnet just without the ipfw firewall? Do things work then? Just trying to narrow down where the problem in your setup comes from.

I do not log in to the host system remotely. I have the host console in front of me. The same is true for logging into the vnet jail.
I enter the jexec command on the host console to log into the vnet jail. So the problem with "Protect: Procctl: operation not permitted." is still open.

Yes, the epair is bridged to rl0. As stated in test #1 of my previous post, bridge/epair works because I can ping the public internet. The conclusion is that vnet/vimage works ok as long as the vnet jail does not try to start any one of the 3 firewalls.

If you're doing testing using the "service" script or the jail rc.d scripts, this may be why you are getting different results. The jail rc.d scripts have known problems with vnet jail(8) jails. I do not use that old script system. I only use the jail(8) command for starting/stopping my jails, with jail definitions in jail.conf format.

All the questions in your comment #7 have already been answered by my previous post #6. I am the qjail maintainer and used qjail to perform all the tests posted in #6. I have users who use qjail for vnet jails, so it has proved that its function is valid. The single common item among all the qjail vnet users is that none of them can get a firewall to run in a vnet jail. This is the same thing I am seeing with the updated vimage available in ALPHA6.

Maybe you need another pair of eyes to review your setup for testing vnet/vimage jails. I would be open to doing so.
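The `kldload: can't load ipfw: Operation not permitted` message reported earlier in this thread is consistent with a jail trying to load a kernel module itself, which a jail is not permitted to do. A minimal sketch of a pre-start check, run on the host, follows. It assumes `kldstat -q -m NAME` succeeds when the named module is known to the kernel, and that the module name matches the firewall keyword (true for `ipfw` and `pf`, not verified here for ipfilter). The helper name `require_fw_module` and the `KLDSTAT` override variable are hypothetical; the override exists only so the check can be exercised without a FreeBSD kernel.

```sh
#!/bin/sh
# Hypothetical host-side check: is the firewall module present before the
# vnet jail is started? KLDSTAT can be overridden for testing.
KLDSTAT=${KLDSTAT:-kldstat}

# require_fw_module NAME
# Returns 0 if NAME is "none" or the module is known to the kernel,
# otherwise prints a hint to stderr and returns 1.
require_fw_module() {
    fw=$1
    [ "${fw}" = "none" ] && return 0
    if ${KLDSTAT} -q -m "${fw}" 2>/dev/null; then
        return 0
    fi
    echo "ERROR: ${fw} is not loaded; run 'kldload ${fw}' on the host first" >&2
    return 1
}
```

A start script could call `require_fw_module ipfw || exit 2` on the host before starting the firewall service inside the jail.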
root@:/ # sysctl -a | grep jailed security.jail.jailed: 1 root@:/ # ping 192.168.5.1 PING 192.168.5.1 (192.168.5.1): 56 data bytes 64 bytes from 192.168.5.1: icmp_seq=0 ttl=64 time=0.408 ms 64 bytes from 192.168.5.1: icmp_seq=1 ttl=64 time=0.302 ms 64 bytes from 192.168.5.1: icmp_seq=2 ttl=64 time=0.312 ms ^C --- 192.168.5.1 ping statistics --- 3 packets transmitted, 3 packets received, 0.0% packet loss round-trip min/avg/max/stddev = 0.302/0.341/0.408/0.048 ms root@:/ # ipfw show 00010 0 0 allow ip from any to any via lo0 00010 0 0 allow log ip from any to any via rl0 00010 0 0 allow log ip from any to any via epair0b 00010 7 600 allow log ip from any to any via epair0a 65535 90 8802 allow ip from any to any Clearly my pings to my default gateway work when I do this. Then also added your default rules to the base system (which had ipfw running with a default allow for quite a while already): root@rabbit4:~ # ipfw show 00010 672 143534 allow log ip from any to any via igb0 00010 0 0 allow log ip from any to any via lo0 65535 5242947 1487787707 allow ip from any to any bridged the epair to the physical interface in the base system. 
root@rabbit4:~ # ifconfig bridge0 bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500 ether 02:41:30:09:bd:00 nd6 options=9<PERFORMNUD,IFDISABLED> groups: bridge id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15 maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200 root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0 member: igb0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP> ifmaxaddr 0 port 1 priority 128 path cost 20000 member: epair0b flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP> ifmaxaddr 0 port 7 priority 128 path cost 2000 root@rabbit4:~ # ifconfig epair0b epair0b: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500 options=8<VLAN_MTU> ether 02:ff:c0:00:07:0b nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL> media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>) status: active groups: epair Could you please go ahead and try to tcpdump all interfaces on the base system and inside the vnet (rl0, bridge0, both end of the epair) and show where you can see each packet and a possible reply. It'll be essential to figure out where we lose it in your setup. 
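The capture requested above can be sketched as a small helper that only prints the tcpdump commands, one per capture point, so nothing is executed by accident. The default host interface rl0 and bridge0 come from this thread; the helper name `print_capture_cmds` and its arguments are illustrative assumptions. Whether tcpdump can actually run inside the jail depends on the jail's devfs rules, which this sketch does not address.

```sh
#!/bin/sh
# print_capture_cmds JAIL EPAIR_UNIT [HOST_IF]
# Prints (does not run) one tcpdump command per capture point.
# -n: no name resolution; -i: interface; filter limited to ICMP and ARP.
print_capture_cmds() {
    jail=$1
    unit=$2
    hostif=${3:-rl0}
    for ifc in "${hostif}" bridge0 "epair${unit}a"; do
        echo "tcpdump -n -i ${ifc} icmp or arp"
    done
    # The b side of the epair lives inside the vnet jail.
    echo "jexec ${jail} tcpdump -n -i epair${unit}b icmp or arp"
}
```

Running each printed command in its own terminal while pinging from the jail shows the last interface on which a request, or its reply, is still visible.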
Commands issued from the host /root >ipfw show 00010 0 0 allow ip from any to any via lo0 00011 0 0 deny ip from 10.0.10.4 to any in via rl0 00012 0 0 allow log ip from any to any via rl0 00013 0 0 deny log ip from any to any 65535 276 27346 deny ip from any to any /root >cat /var/log/security host ipfw log file is empty /root >ls /usr/jails archive sharedfs v10 v30 v50 flavors template v20 v40 /root >qjail list STA JID NIC IP Jailname --- ---- --- --------------- -------------------------------------------------- DS N/A rl0 vnet|be|ipf v10 DS N/A rl0 vnet|be|ipfw v20 DS N/A rl0 vnet|be|pf v30 DS N/A rl0 vnet|ng|none v40 DS N/A rl0 vnet|be|none v50 /root >cat /usr/local/etc/qjail.config/v20 v20 { host.hostname = "v20"; path = "/usr/jails/v20"; mount.fstab = "/usr/local/etc/qjail.fstab/v20"; exec.start = "/bin/sh /etc/rc"; exec.stop = "/bin/sh /etc/rc.shutdown"; exec.consolelog = "/var/log/qjail.v20.console.log"; mount.devfs; devfs_ruleset = "4"; vnet; exec.poststart="/usr/local/bin/qjail.vnet.be start v20 rl0 ipfw"; exec.prestop="/usr/local/bin/qjail.vnet.be stop v20 rl0 ipfw"; } /root >cat /usr/local/etc/qjail.fstab/v20 /usr/jails/sharedfs /usr/jails/v20/sharedfs nullfs ro 0 0 /root >cat /usr/local/bin/qjail.vnet.be #!/bin/sh function=$1 jailname=$2 nicname=$3 firewall=$4 jaildir="/usr/jails" start () { jid=`jls -j ${jailname} jid` #if [ "${jid}" -gt "100" ]; then # echo " " # echo "WARNING: The JID value is greater then 100." # echo "This may indicate many cycles of starting/stopping vnet jails" # echo "which results in lost memory pages. To recover the lost memory," # echo "shutdown the host and reboot. This will zero out the JID counter" # echo "and make all the memory available again." # echo " " #fi if [ "${jid}" -gt "250" ]; then echo " " echo "ERROR: No more vnet jail epair ip addresses can be created." echo "You MUST shutdown the host and reboot before vnet jails are" echo "startable again." echo " " exit 2 fi # Check the hosts network for existing bridge. 
# If no bridge yet then create the bridge. # Add real interface device name to one side of bridge. # bridge=`ifconfig | grep -m 1 bridge | cut -f 1 -d :` if [ -z ${bridge} ]; then ifconfig bridge0 create ifconfig bridge0 addm ${nicname} ifconfig bridge0 up # vnet jails will not work unless ip forwarding is enabled. sysctl net.inet.ip.forwarding=1 fi # Do this logic for all vnet jails. # Assign alias IP number to bridge using jid to make it unique per vnet jail. # The alias IP number is the vnet jails default route ip address. # Create epair assigning "a" to bridge and "b" to the vnet jail # ifconfig bridge0 alias 10.${jid}.0.1 ifconfig epair${jid} create ifconfig bridge0 addm epair${jid}a ifconfig epair${jid}a up ifconfig epair${jid}b vnet ${jid} # Assign ip address to epair "b" inside of the vnet jail. # jexec ${jailname} ifconfig epair${jid}b 10.${jid}.0.2 jexec ${jailname} route add default 10.${jid}.0.1 jexec ${jailname} ifconfig lo0 127.0.0.1 if [ ${firewall} = "none" ]; then # If no firewall was selected in config -v # Start services inside of jail needed for network. # Note: using service command because it's not nojail keyword aware. # jexec ${jailname} service netif start jexec ${jailname} service routing start exit 0 fi if [ ${firewall} = "ipfw" ]; then # Chech to see if selected firewall kernel modules have been loaded. #if ! kldstat -v | grep -qw ${firewall}; then # echo "Error: ${firewall} was not compiled into the kernel." # exit 2 #fi # If ipfw firewall was selected in config -v # Get the epairXb interface name of the vnet jail and # write the vaule to a file so the epairXb interface name can be # passed to the ipfw.rules file, then start ipfw. # Start services inside of jail needed by ipfw firewall. # Note: using service command because it's not nojail keyword aware. 
# jexec ${jailname} service netif start jexec ${jailname} service routing start ipfw_epair="${jaildir}/${jailname}/etc/epair" jexec ${jailname} ifconfig | grep -m 1 epair | cut -f 1 -d : > ${ipfw_epair} echo "ipfw_epair = ${ipfw_epair}" jexec ${jailname} service ipfw restart exit 0 fi if [ ${firewall} = "pf" ]; then # Chech to see if selected firewall kernel modules have been loaded. #if ! kldstat -v | grep -qw ${firewall}; then # echo "Error: ${firewall} was not compiled into the kernel." # exit 2 #fi # If pf firewall was selected in config -v # Get the epairXb interface name of the vnet jail and # write the vaule to a file so the epairXb interface name can be # passed to the pf.rules file, then start pf. # Start services inside of jail needed by pf firewall. # Note: using service command because it's not nojail keyword aware. # #jexec ${jailname} service netif start > /dev/null 2> /dev/null #jexec ${jailname} service routing start > /dev/null 2> /dev/null pf_epair="${jaildir}/${jailname}/etc/epair" jexec ${jailname} ifconfig | grep -m 1 epair | cut -f 1 -d : > ${pf_epair} # jexec ${jailname} service pf start > /dev/null 2> /dev/null jexec ${jailname} service pf start # jexec ${jailname} pfctl -F all; pfctl -f /etc/pf.rules fi if [ ${firewall} = "ipf" ]; then ####### This stub is not used. Coded for when ipfilter becomes vnet aware. # If ipfilter firewall was selected in config -v # Get the epairXb interface name of the vnet jail and # write the vaule to a file so the epairXb interface name can be # passed to the ipf.rules file, then start ipf. # Start services inside of jail needed by ipfilter firewall. # Note: using service command because it's not nojail keyword aware. 
# jexec ${jailname} service netif start > /dev/null 2> /dev/null jexec ${jailname} service routing start > /dev/null 2> /dev/null ipf_epair="${jaildir}/${jailname}/etc/epair" jexec ${jailname} ifconfig | grep -m 1 epair | cut -f 1 -d : > ${ipf_epair} # jexec ${jailname} service ipfilter start > /dev/null 2> /dev/null jexec ${jailname} service ipfilter start fi } stop () { # Disable vnet jails network configuration. # jid=`jls -j ${jailname} jid` ifconfig epair${jid}b -vnet ${jid} ifconfig bridge0 -alias 10.${jid}.0.1 ifconfig epair${jid}a destroy # If host has no more vnet jails then disable bridge. # epair=`ifconfig | grep -m 1 epair | cut -f 1 -d :` if [ -z ${epair} ]; then ifconfig bridge0 destroy # sysctl net.inet.ip.forwarding=0 > /dev/null 2> /dev/null fi if [ ${firewall} = "ipfw" ]; then # If ipfw was started, now disable it. # jexec ${jailname} service ipfw stop > /dev/null 2> /dev/null jexec ${jailname} service routing stop > /dev/null 2> /dev/null jexec ${jailname} service netif stop > /dev/null 2> /dev/null sleep 2 fi if [ ${firewall} = "pf" ]; then # If pf was started, now disable it. # jexec ${jailname} service pf stop > /dev/null 2> /dev/null jexec ${jailname} service routing stop > /dev/null 2> /dev/null jexec ${jailname} service netif stop > /dev/null 2> /dev/null sleep 2 fi #if [ ${firewall} = "ipf" ]; then # ######### This stub is not used right now. # # If ipfilter was started, now disable it. # # # jexec ${jailname} service ipfilter stop > /dev/null 2> /dev/null # jexec ${jailname} service routing stop > /dev/null 2> /dev/null # jexec ${jailname} service netif stop > /dev/null 2> /dev/null # sleep 2 #fi #if [ ${firewall} = "none" ]; then # If no firewall was started, disable network. 
# #jexec ${jailname} service routing stop > /dev/null 2> /dev/null #jexec ${jailname} service netif stop > /dev/null 2> /dev/null # exit 0 #fi } [ "${function}" = "start" ] && start $* && exit 0 [ "${function}" = "stop" ] && stop $* && exit 0 /root >qjail start v20 net.inet.ip.forwarding: 1 -> 1 epair4a add net default: gateway 10.4.0.1 Starting Network: lo0. lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384 options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6> inet6 ::1 prefixlen 128 inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1 inet 127.0.0.1 netmask 0xff000000 nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL> groups: lo add host 127.0.0.1: gateway lo0 fib 0: route already in table Additional inet routing options: gateway=YES. add host ::1: gateway lo0 fib 0: route already in table add net fe80::: gateway ::1 fib 0: route already in table add net ff02::: gateway ::1 fib 0: route already in table add net ::ffff:0.0.0.0: gateway ::1 fib 0: route already in table add net ::0.0.0.0: gateway ::1 fib 0: route already in table ipfw_epair = /usr/jails/v20/etc/epair net.inet.ip.fw.enable: 1 -> 0 net.inet6.ip6.fw.enable: 1 -> 0 jailed /etc/epair = epair4b Firewall rules loaded. Firewall logging enabled. 
Jail successfully started v20 /root >ifconfig -a rl0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500 options=2008<VLAN_MTU,WOL_MAGIC> ether 00:0c:6e:09:8b:74 inet 10.0.10.9 netmask 0xfffffff0 broadcast 10.0.10.15 nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL> media: Ethernet autoselect (100baseTX <full-duplex>) status: active lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384 options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6> inet6 ::1 prefixlen 128 inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2 inet 127.0.0.1 netmask 0xff000000 nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL> groups: lo bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500 ether 02:46:d0:31:46:00 inet 10.4.0.1 netmask 0xff000000 broadcast 10.255.255.255 nd6 options=9<PERFORMNUD,IFDISABLED> groups: bridge id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15 maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200 root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0 member: epair4a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP> ifmaxaddr 0 port 4 priority 128 path cost 2000 member: rl0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP> ifmaxaddr 0 port 1 priority 128 path cost 200000 epair4a: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500 options=8<VLAN_MTU> ether 02:c0:00:00:04:0a inet6 fe80::c0:ff:fe00:40a%epair4a prefixlen 64 scopeid 0x4 nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL> media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>) status: active groups: epair Start the jail and issue commands to jail from jail console /root >qjail console v20 Last login: Sat Jul 23 07:25:09 on pts/0 FreeBSD 11.0-ALPHA6 (ipfwVimage) #0: Sun Jul 10 09:10:17 EDT 2016 Welcome to your FreeBSD jail. 
v20 /root > v20 /root >ipfw show 00010 0 0 allow ip from any to any via lo0 00011 0 0 allow log ip from any to any via epair4b 00012 0 0 deny log ip from any to any 65535 0 0 deny ip from any to any v20 /root >ifconfig -a lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384 options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6> inet6 ::1 prefixlen 128 inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1 inet 127.0.0.1 netmask 0xff000000 nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL> groups: lo epair4b: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500 options=8<VLAN_MTU> ether 02:c0:00:00:05:0b inet 10.4.0.2 netmask 0xff000000 broadcast 10.255.255.255 inet6 fe80::c0:ff:fe00:50b%epair4b prefixlen 64 scopeid 0x2 nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL> media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>) status: active groups: epair v20 /root >ping 8.8.8.8 PING 8.8.8.8 (8.8.8.8): 56 data bytes ^C --- 8.8.8.8 ping statistics --- 6 packets transmitted, 0 packets received, 100.0% packet loss v20 /root >ipfw show 00010 0 0 allow ip from any to any via lo0 00011 6 504 allow log ip from any to any via epair4b 00012 0 0 deny log ip from any to any 65535 0 0 deny ip from any to any v20 /root >cat /var/log/security Jul 2 20:36:27 v20 newsyslog[3010]: logfile first created log is empty v20 /root >exit logout Back to issueing commands to the host /root >cat tcpdump.epair4a 07:29:14.031691 ARP, Request who-has 10.4.0.1 tell 10.4.0.2, length 28 07:29:14.031803 ARP, Reply 10.4.0.1 is-at 02:46:d0:31:46:00 (oui Unknown), lengt h 28 07:29:14.031829 IP 10.4.0.2 > 8.8.8.8: ICMP echo request, id 5909, seq 0, length 64 07:29:15.033222 IP 10.4.0.2 > 8.8.8.8: ICMP echo request, id 5909, seq 1, length 64 07:29:16.034410 IP 10.4.0.2 > 8.8.8.8: ICMP echo request, id 5909, seq 2, length 64 07:29:17.035091 IP 10.4.0.2 > 8.8.8.8: ICMP echo request, id 5909, seq 3, length 64 07:29:18.036279 IP 10.4.0.2 > 8.8.8.8: ICMP echo request, id 5909, seq 4, length 64 07:29:19.037469 IP 
10.4.0.2 > 8.8.8.8: ICMP echo request, id 5909, seq 5, length 64 07:34:44.631297 ARP, Request who-has 10.0.10.2 tell 10.0.10.3, length 46 /root >cat tcpdump.bridge0 07:29:14.031725 ARP, Request who-has 10.4.0.1 tell 10.4.0.2, length 28 07:29:14.031780 ARP, Reply 10.4.0.1 is-at 02:46:d0:31:46:00 (oui Unknown), lengt h 28 07:29:14.031833 IP 10.4.0.2 > 8.8.8.8: ICMP echo request, id 5909, seq 0, length 64 07:29:15.033231 IP 10.4.0.2 > 8.8.8.8: ICMP echo request, id 5909, seq 1, length 64 07:29:16.034418 IP 10.4.0.2 > 8.8.8.8: ICMP echo request, id 5909, seq 2, length 64 07:29:17.035099 IP 10.4.0.2 > 8.8.8.8: ICMP echo request, id 5909, seq 3, length 64 07:29:18.036286 IP 10.4.0.2 > 8.8.8.8: ICMP echo request, id 5909, seq 4, length 64 07:29:19.037477 IP 10.4.0.2 > 8.8.8.8: ICMP echo request, id 5909, seq 5, length 64 07:34:44.631308 ARP, Request who-has 10.0.10.2 tell 10.0.10.3, length 46 /root >cat tcpdump.rl0 07:29:14.031717 ARP, Request who-has 10.4.0.1 tell 10.4.0.2, length 46 07:34:44.631277 ARP, Request who-has 10.0.10.2 tell 10.0.10.3, length 46 /root/bin >cat /var/log/security Jul 23 07:29:14 fbsdjones kernel: ipfw: 11 Accept ICMP:8.0 10.4.0.2 8.8.8.8 out via epair4b Jul 23 07:29:14 fbsdjones kernel: ipfw: 13 Deny ICMP:8.0 10.4.0.2 8.8.8.8 in via bridge0 Jul 23 07:29:15 fbsdjones kernel: ipfw: 11 Accept ICMP:8.0 10.4.0.2 8.8.8.8 out via epair4b Jul 23 07:29:15 fbsdjones kernel: ipfw: 13 Deny ICMP:8.0 10.4.0.2 8.8.8.8 in via bridge0 Jul 23 07:29:16 fbsdjones kernel: ipfw: 11 Accept ICMP:8.0 10.4.0.2 8.8.8.8 out via epair4b Jul 23 07:29:16 fbsdjones kernel: ipfw: 13 Deny ICMP:8.0 10.4.0.2 8.8.8.8 in via bridge0 Jul 23 07:29:17 fbsdjones kernel: ipfw: 11 Accept ICMP:8.0 10.4.0.2 8.8.8.8 out via epair4b Jul 23 07:29:17 fbsdjones kernel: ipfw: 13 Deny ICMP:8.0 10.4.0.2 8.8.8.8 in via bridge0 Jul 23 07:29:18 fbsdjones kernel: ipfw: 11 Accept ICMP:8.0 10.4.0.2 8.8.8.8 out via epair4b Jul 23 07:29:18 fbsdjones kernel: ipfw: 13 Deny ICMP:8.0 10.4.0.2 8.8.8.8 in via 
bridge0
Jul 23 07:29:19 fbsdjones kernel: ipfw: 11 Accept ICMP:8.0 10.4.0.2 8.8.8.8 out via epair4b
Jul 23 07:29:19 fbsdjones kernel: ipfw: 13 Deny ICMP:8.0 10.4.0.2 8.8.8.8 in via bridge0
/root/bin >ipfw show
00010 0 0 allow ip from any to any via lo0
00011 0 0 deny ip from 10.0.10.4 to any in via rl0
00012 0 0 allow log ip from any to any via rl0
00013 41 4004 deny log ip from any to any
65535 276 27346 deny ip from any to any

Conclusion: From the results it looks like the vnet jail's ipfw log is writing to the host's ipfw log, which is /var/log/security. The "ipfw show" output for the host shows that no packets have been passed to the rl0 interface.

1. There is a security problem with the vnet jailed ipfw firewall having write access to the host's /var/log/security file. A jail, no matter what kind it is, non-vnet or vnet, is by design not supposed to have access to anything on the host. Here is hard evidence that it is happening.

2. The output to the host's ipfw log is missing in action. The host's ipfw firewall rules log all denied packets, but they are not in the host's security log interspersed with the vnet jail's log records.

3. External evidence indicates that packets from the vnet jail stack are NOT being handed off correctly to the host's stack.

4. In general everything points to ipfw not yet being totally integrated into vimage and in turn into the host at the kernel level.

Installed BETA2 and same results. Firewalls do not work with vimage. BETA3 is available, but there is no use testing with it because no changes have been applied yet. Looks like 11.0 is on track to be published with only basic vimage working: NO firewall of any kind being able to run on the host and/or in a vimage jail with vimage compiled into the host kernel.

I have reviewed your comment #9 in detail. You did not provide enough details to prove anything is working. Pinging the host's bridge from within a vnet jail is a long way from pinging the public internet.
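To see at a glance which rule and interface each logged packet hit, the ipfw entries in a security log can be tallied. The following is a minimal sketch, assuming the log line layout shown in the excerpt above (rule number, action, protocol, source, destination, direction, `via`, interface); the function name `summarize_ipfw_log` is hypothetical.

```sh
#!/bin/sh
# summarize_ipfw_log [FILE]
# Reads syslog-format ipfw lines (from FILE or stdin) and prints
# "count rule action direction interface", sorted.
summarize_ipfw_log() {
    awk '$6 == "ipfw:" {
             # $7 = rule number, $8 = action,
             # third-from-last field = direction, last field = interface
             key = $7 " " $8 " " $(NF-2) " " $NF
             count[key]++
         }
         END { for (k in count) print count[k], k }' "$@" | sort
}
```

Applied to the host log quoted in this report, such a tally would pair rule 11 accepting packets out via epair4b with rule 13 denying them in via bridge0.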
Posting your test jail.conf file contents, your epair/bridge commands, and the commands you use to start/stop your vnet jail, including the login banner showing which FreeBSD version is running on the host, would be very helpful for me to reproduce your test environment. Maybe another pair of eyes will help. Sometimes a developer is too close to the forest to see the trees, i.e., too involved with the project to see the problem staring them in the face. Have been there myself.

(In reply to Joe Barbish from comment #10)

To your #1, what is logged to /var/log/security is the kernel log that syslog gets. It's not anything in a jail writing to that file, it's your syslogd on the all-seeing base system. That's essentially "dmesg". That's an unfortunate historic thing of the ipfw implementation; using tcpdump on the ipfw0 interface, like on pflog0 for pf, will avoid you seeing the vnet-jail logging on the base system as well.

To your #2, that might be the case, as a bit later your syslog might log a line that the last message was repeated another n times. Hard to say from just the output.

To your #3 and your actual problem:

(a) if you are bridging you do not need ip forwarding, especially not inside the vnet-jail. The fact that you are bridging and trying to forward is a weird setup in the first place.

(b) your current topology looks like:

(gateway system) --- |physical wire| --- (rl0) --- (bridge0) --- epairNa --- |jail| --- epairNb

All these interfaces are in the same L2 broadcast domain (hence no need for ip forwarding ideally). However, you set up your L3 so that bridge0 is the default gw for the jail, so your host system suddenly has to "forward" these packets. You can have IP aliases (or different subnets) on your gateway machine and then just point the vnet-jail at that (as in, move the IP address from the bridge0 to the gateway), or you can remove the bridge0 and have your base system be a router forwarding the packets.
In that case you put the IP address on epairNa instead of the bridge. In the latter case, however, your gateway machine needs to route that subnet to the IP of the rl0 interface of the base system, as otherwise return packets never make it.

(c) the base system firewall does what it's told to do and drops the packet on the bridge0 interface on the base system, as your log shows:

Jul 23 07:29:14 fbsdjones kernel: ipfw: 13 Deny ICMP:8.0 10.4.0.2 8.8.8.8 in via bridge0

So everything does work as expected, but your base system rules do not allow the packet to pass.

To your #4, I think it's mostly a problem of not enough documentation and not enough samples yet.

From Barbish’s comment #10

#1. There is a security problem with the vnet jailed ipfw firewall having write access to the hosts /var/log/security file. A jail no matter what kind it is, non-vnet or vnet, is by design, not suppose to have access to anything on the host. Here is hard evidence that it is happening.

From Zeeb’s Comment #12

To your #1, what is logged to /var/log/security is the kernel log that syslog gets. It's not anything in a jail writing to that file, it's your syslogd on the all-seeing base system. That's essentially "dmesg". That's an unfortunate historic thing of the ipfw implementation; using tcpdump on the ipfw0 interface like on pflog0 for pf will avoid you seeing the vnet-jail logging on the base system as well.

************* reply *******

You did a fine job of describing the logging problem as it currently exists. It cannot be left this way. The base system ipfw /var/log/security file should only contain records logged from the base system ipfw firewall. Log records from the ipfw firewall in the vnet jail must be posted to the /var/log/security file in the vnet jail's directory tree. The ipfw logging sub-system needs to be made vnet aware to accomplish logging to the correct vnet jail, as there may be more than a single vnet jail running on the base host system.
This implies that each vnet jail would have its own ipfw log records written to the /var/log/security file in its own directory tree. This is what the vnet jail user community expects. Ipfw may be writing syslog format records, but it's tagging all its records with the “security” facility, and the base system /etc/syslog.conf file has that facility being written to the /var/log/security file, which is different from “dmesg” records, which go to the /var/log/messages file.

When it comes to your recommendation “using tcpdump on the ipfw0 interface like on pflog0 for pf will avoid you seeing the vnet-jail logging on the base system as well”: on one hand, this doesn't seem to be doable, and on the other hand you can't really think you are going to force vnet users to jump through hoops to set this up for each vnet jail. That's crazy. This has to be fixed centrally. It's my understanding that the ipfw0 interface is only enabled with “firewall_logging=yes” in the rc.conf of the host or the vnet jail. This currently activates logging to the host's /var/log/security file. So the baseline security file will still be polluted with logging from each running vnet jail's ipfw firewall, while the vnet jail admin grows his own userland task to tcpdump the ipfw0 raw data and write it to the correct vnet jail in real time. You need to rethink this approach.

From Zeeb’s Comment #12

To your #3 and your actual problem: (a) if you are bridging you do not need ip forwarding, especially not inside the vnet-jail. The fact that you are bridging and trying to forward is a weird setup in first place.

************* reply *******

It's my understanding that non-vnet jails need gateway_enable=yes in the host's (base system) rc.conf or “sysctl net.inet.ip.forwarding=1” to work. If the goal is to have a base system where both non-vnet jails and vnet jails can run at the same time, then my bridging setup is “normal”, or more the norm than just having a base system that can only run vnet jails.
Although a vnet-jail-only base system is also a normal setup. In this light my setup is not a weird setup at all, but a more flexible one. The jail(8) man page does not state any restrictions on vnet jails and non-vnet jails not being allowed to run on the same base system at the same time.

From Zeeb’s Comment #12

(b) Your current topology looks like: Snip…….

There is nothing wrong with the bridge/epair method I have employed. Without using a dynamically loaded firewall on the host base system and not having ipfw compiled into the kernel with vimage, using a kernel with just vimage, I can start a vnet jail and ping the public internet, getting replies without any problems. It's when I enable ipfw on the host that problems arise. Even when compiling a kernel with vimage and ipfw, problems also occur.

Here is another trace of events documenting the ipfw problem on an ALPHA6 and BETA2 OS.

Script started on Mon Aug 1 12:19:58 2016
# Entering commands on the host console
#host is running vimage kernel
#with ipfw statements in the hosts rc.conf
/root >cat /var/log/security
# host ipfw security log is empty
/root >ipfw show
00050 0 0 check-state
00060 0 0 allow ip from any to any via lo0
00070 0 0 deny ip from 10.0.10.4 to any
00080 0 0 allow log ip from any to any via rl0 keep-state
00090 0 0 allow log ip from any to any keep-state
65535 0 0 deny ip from any to any
#no activity on host ipfw firewall yet
/root >qjail start v50
Jail successfully started v50
# Lets see what host network looks like
/root >ifconfig -a
rl0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=2008<VLAN_MTU,WOL_MAGIC>
        ether 00:0c:6e:09:8b:74
        inet 10.0.10.9 netmask 0xfffffff0 broadcast 10.0.10.15
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
        inet 127.0.0.1 netmask 0xff000000
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        groups: lo
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 02:8f:94:84:0c:00
        inet 10.5.0.1 netmask 0xff000000 broadcast 10.255.255.255
        nd6 options=9<PERFORMNUD,IFDISABLED>
        groups: bridge
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: epair5a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 4 priority 128 path cost 2000
        member: rl0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 1 priority 128 path cost 200000
epair5a: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:c0:00:00:04:0a
        inet6 fe80::c0:ff:fe00:40a%epair5a prefixlen 64 scopeid 0x4
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        groups: epair
/root >jexec v50 login -f root
Last login: Mon Aug 1 12:06:21 on ttyv0
FreeBSD 11.0-BETA2 (Vimage) #0: Tue Jul 26 07:48:38 EDT 2016
Welcome to your FreeBSD jail.
v50 /root >ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
ping: sendto: Permission denied
ping: sendto: Permission denied
ping: sendto: Permission denied
ping: sendto: Permission denied
ping: sendto: Permission denied
ping: sendto: Permission denied
^C
--- 8.8.8.8 ping statistics ---
6 packets transmitted, 0 packets received, 100.0% packet loss
v50 /root >exit
logout
# Lets go see what the hosts ipfw log has captured
/root >cat /var/log/security
Aug 1 12:23:52 fbsdjones kernel: ipfw: 90 Accept ICMPv6:143.0 [::] [ff02::16] out via epair5a
Aug 1 12:23:53 fbsdjones kernel: ipfw: 90 Accept ICMPv6:135.0 [::] [ff02::1:ff00:40a] out via epair5a
Aug 1 12:23:53 fbsdjones kernel: ipfw: 80 Accept UDP 10.0.10.9:51100 209.18.47.61:53 out via rl0
Aug 1 12:23:53 fbsdjones kernel: ipfw: 80 Accept UDP 209.18.47.61:53 10.0.10.9:51100 in via rl0
Aug 1 12:23:53 fbsdjones kernel: ipfw: 80 Accept UDP 10.0.10.9:59748 209.18.47.61:53 out via rl0
Aug 1 12:23:53 fbsdjones kernel: ipfw: 80 Accept UDP 209.18.47.61:53 10.0.10.9:59748 in via rl0
Aug 1 12:23:53 fbsdjones kernel: ipfw: 80 Accept UDP 10.0.10.9:18948 209.18.47.61:53 out via rl0
Aug 1 12:23:53 fbsdjones kernel: ipfw: 80 Accept UDP 209.18.47.61:53 10.0.10.9:18948 in via rl0
Aug 1 12:23:53 fbsdjones kernel: ipfw: 80 Accept UDP 10.0.10.9:16357 209.18.47.61:53 out via rl0
Aug 1 12:23:53 fbsdjones kernel: ipfw: 80 Accept UDP 209.18.47.61:53 10.0.10.9:16357 in via rl0
Aug 1 12:23:53 fbsdjones kernel: ipfw: 80 Accept UDP 10.0.10.9:48864 209.18.47.61:53 out via rl0
Aug 1 12:23:53 fbsdjones kernel: ipfw: 80 Accept UDP 209.18.47.61:53 10.0.10.9:48864 in via rl0
Aug 1 12:23:54 fbsdjones kernel: ipfw: 90 Accept ICMPv6:143.0 [fe80::c0:ff:fe00:40a] [ff02::16] out via epair5a
Aug 1 12:23:54 fbsdjones kernel: ipfw: 80 Accept UDP 10.0.10.9:49985 209.18.47.61:53 out via rl0
Aug 1 12:23:54 fbsdjones kernel: ipfw: 80 Accept UDP 209.18.47.61:53 10.0.10.9:49985 in via rl0
Aug 1 12:23:54 fbsdjones kernel: ipfw: 80 Accept UDP 10.0.10.9:35004 209.18.47.61:53 out via rl0
Aug 1 12:23:54 fbsdjones kernel: ipfw: 80 Accept UDP 209.18.47.61:53 10.0.10.9:35004 in via rl0
Aug 1 12:23:54 fbsdjones kernel: ipfw: 80 Accept UDP 10.0.10.9:53364 209.18.47.61:53 out via rl0
Aug 1 12:23:55 fbsdjones kernel: ipfw: 80 Accept UDP 209.18.47.61:53 10.0.10.9:53364 in via rl0
# the vnet jail is trying to do dns lookup for its domain name
/root >ipfw show
00050 0 0 check-state
00060 0 0 allow ip from any to any via lo0
00070 0 0 deny ip from 10.0.10.4 to any
00080 16 1859 allow log ip from any to any via rl0 keep-state
00090 3 304 allow log ip from any to any keep-state
65535 0 0 deny ip from any to any
# check-state & keep-state is the standard method of only needing a rule
# to let stuff pass the firewall without needing a rule to allow it back in.
# A in core keep-state rule is created to auto allow the conversation back in.
Lets look at some tcpdump files to see what is really moving around
/root >/root >tcpdump -c50 -i epair5a > tcpdump.epair5a
/root >tcpdump: verbose output suppressed, use -v or -vv for full protocol
/root >listening on epair5a, link-type EN10MB (Ethernet), capture size
/root >26214 bytes
/root >^C
/root >0 packets captured
/root >0 packets received by filter
/root >0 packets dropped by kernel
/root >tcpdump: verbose output suppressed, use -v or -vv for full protocol
/root >listening on bridge0, link-type EN10MB (Ethernet), capture size
/root >26214 bytes
/root >^C
/root >0 packets captured
/root >0 packets received by filter
/root >0 packets dropped by kernel
/root >/root >tcpdump -c50 -i rl0 > tcpdump.rl0
/root >tcpdump: verbose output suppressed, use -v or -vv for full protocol
/root >listening on rl0, link-type EN10MB (Ethernet), capture size
/root >26214 bytes
/root >^C
/root >23 packets captured
/root >23 packets received by filter
/root >0 packets dropped by kernel
# no dump data captured for epair5a & bridge0
# lets see what dump data captured from
/root >cat tcpdump.rl0
12:23:52.855962 ARP, Request who-has 10.5.0.1 tell 10.5.0.1, length 46
12:23:52.948717 ARP, Request who-has 10.5.0.2 tell 10.5.0.2, length 46
12:23:52.962511 IP6 :: > ff02::16: HBH ICMP6, multicast listener report v2, 3 group record(s), length 68
12:23:53.328262 IP6 :: > ff02::1:ff00:40a: ICMP6, neighbor solicitation, who has fe80::c0:ff:fe00:40a, length 32
12:23:53.593534 ARP, Request who-has 10.0.10.2 tell 10.0.10.9, length 46
12:23:53.619012 ARP, Reply 10.0.10.2 is-at 00:10:b5:7b:1d:6f (oui Unknown), length 46
12:23:53.619063 IP 10.0.10.9.51100 > dns-cac-lb-01.rr.com.domain: 53852+ PTR? 1.0.5.10.in-addr.arpa. (39)
12:23:53.637185 IP dns-cac-lb-01.rr.com.domain > 10.0.10.9.51100: 53852 NXDomain* 0/1/0 (98)
12:23:53.638286 IP 10.0.10.9.59748 > dns-cac-lb-01.rr.com.domain: 12862+ PTR? 2.0.5.10.in-addr.arpa. (39)
12:23:53.656456 IP dns-cac-lb-01.rr.com.domain > 10.0.10.9.59748: 12862 NXDomain* 0/1/0 (98)
12:23:53.657428 IP 10.0.10.9.18948 > dns-cac-lb-01.rr.com.domain: 8843+ PTR? 0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa. (90)
12:23:53.676235 IP dns-cac-lb-01.rr.com.domain > 10.0.10.9.18948: 8843 NXDomain* 0/1/0 (149)
12:23:53.677175 IP 10.0.10.9.16357 > dns-cac-lb-01.rr.com.domain: 40286+ PTR? 6.1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.2.0.f.f.ip6.arpa. (90)
12:23:53.694977 IP dns-cac-lb-01.rr.com.domain > 10.0.10.9.16357: 40286 NXDomain 0/1/0 (160)
12:23:53.695975 IP 10.0.10.9.48864 > dns-cac-lb-01.rr.com.domain: 22515+ PTR? a.0.4.0.0.0.f.f.1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.2.0.f.f.ip6.arpa. (90)
12:23:53.923131 IP dns-cac-lb-01.rr.com.domain > 10.0.10.9.48864: 22515 NXDomain 0/1/0 (160)
12:23:54.562106 IP6 fe80::c0:ff:fe00:40a > ff02::16: HBH ICMP6, multicast listener report v2, 3 group record(s), length 68
12:23:54.946338 IP 10.0.10.9.49985 > dns-cac-lb-01.rr.com.domain: 47412+ PTR? 2.10.0.10.in-addr.arpa. (40)
12:23:54.964724 IP dns-cac-lb-01.rr.com.domain > 10.0.10.9.49985: 47412 NXDomain* 0/1/0 (99)
12:23:54.965525 IP 10.0.10.9.35004 > dns-cac-lb-01.rr.com.domain: 48228+ PTR? 9.10.0.10.in-addr.arpa. (40)
12:23:54.983620 IP dns-cac-lb-01.rr.com.domain > 10.0.10.9.35004: 48228 NXDomain* 0/1/0 (99)
12:23:54.984610 IP 10.0.10.9.53364 > dns-cac-lb-01.rr.com.domain: 24431+ PTR? 61.47.18.209.in-addr.arpa. (43)
12:23:55.002963 IP dns-cac-lb-01.rr.com.domain > 10.0.10.9.53364: 24431 1/0/0 PTR dns-cac-lb-01.rr.com. (77)
/root >exit
exit
Script done on Mon Aug 1 12:40:15 2016

Let's try to interpret what the evidence is telling us. When the vnet jail is started it tries to do a DNS lookup for the vnet jail hostname, which of course is bogus. But we see the traffic in the dump and also in the host ipfw log. What is also shown is the ICMPv6 packets trying to do the ping command issued from within the running vnet jail. They also have keep-state, so they should be allowed back in. These ICMPv6 packets are on the epair5a interface, and yet we see no traffic in the tcpdump for the epair5a interface. What is really strange is that there are no IPv4 ping packets in the epair5a, bridge0, or rl0 tcpdumps. So what is unique about the stuff that did get out? The vnet jail's hostname lookup is an automatic function of starting the jail. The ICMPv6 packets are also part of some automatic function of the ping command. In my opinion this strange behavior is caused by the ipfw firewall running on the host system not being correctly integrated into vimage.

(In reply to Joe Barbish from comment #13)

The log you get in /var/log/security is done by ipfw using a log(9) statement, essentially a printf in the kernel. There is one kernel running. Apart from the network stack nothing has been virtualised using VIMAGE. If you want to virtualise the kernel message buffer, patches will be welcome. The base system is always able to see everything so this is not a security issue.
What I might consider an issue is that a jail seems to be able to call dmesg, but that's a different issue also relevant to non-vnet-jails; I just opened a PR for this to track it.

Your assumption on gateway_enable=YES in the base system is not correct; it might depend on the setup, but you can run (non-vnet) jails perfectly fine without enabling ip forwarding in the base system, and I have done so for years:

$ sysctl -a | grep forwarding
net.inet.ip.forwarding: 0
net.inet.ip.fastforwarding: 0
net.inet6.ip6.forwarding: 0
$ jls -av | wc -l
49
$

I think what I am saying about your topology is that if you are routing in your base system (turn forwarding on) you will not need the bridge interface. If you use the bridge interface there's no need for forwarding. You are doing both at the same time by treating the bridge interface as a gateway interface as well, and that's just not how L2 and L3 are done normally.

(In reply to Joe Barbish from comment #14)

All the evidence you show in this trace tells me there is not a single packet coming out from your vnet jail yet; sorry. All other conclusions are not backed by the data you show. Also your conclusions about ping 8.8.8.8 triggering IPv6 packets are wrong. What you see is the epair in the base system doing DAD and joining a MC group, which gets bridged to your real interface. Whatever DNS lookups are happening, e.g. the reverse lookup for the jail IP, are triggered by something in the base system, it seems (also see the source address of those packets, which is not 10.5.0.x but 10.0.10.9).

At this point I think asking on a mailing list for help to get your setup sorted might be more productive than discussing this in a PR. It'll also help you to get another pair of eyes. You can always easily point people here and if I missed something someone will point it out, I am sure. |
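The source-address argument in the reply above can be checked mechanically against the pasted log. The following is only a minimal sketch in Python, not part of any FreeBSD tooling: the helper names (`parse_line`, `host_of`, `from_subnet`) are hypothetical, the regular expression is fitted to the log lines quoted in this PR rather than to every format ipfw can emit, and the /24 jail subnets passed in are assumptions for illustration (the bridge in the trace is actually configured with a /8 netmask).

```python
import ipaddress
import re

# Fitted to the kernel log lines quoted in this PR, e.g.
# "... kernel: ipfw: 80 Accept UDP 10.0.10.9:51100 209.18.47.61:53 out via rl0"
LOG_RE = re.compile(
    r"ipfw: (?P<rule>\d+) (?P<action>\w+) (?P<proto>\S+) "
    r"(?P<src>\S+) (?P<dst>\S+) (?P<dir>in|out) via (?P<iface>\S+)"
)

# Three lines copied from the traces in this PR.
SAMPLE = [
    "Aug 1 12:23:53 fbsdjones kernel: ipfw: 80 Accept UDP 10.0.10.9:51100 209.18.47.61:53 out via rl0",
    "Aug 1 12:23:52 fbsdjones kernel: ipfw: 90 Accept ICMPv6:143.0 [::] [ff02::16] out via epair5a",
    "Jul 23 07:29:14 fbsdjones kernel: ipfw: 13 Deny ICMP:8.0 10.4.0.2 8.8.8.8 in via bridge0",
]


def parse_line(line):
    """Return the fields of one ipfw log line as a dict, or None if it does not match."""
    match = LOG_RE.search(line)
    return match.groupdict() if match else None


def host_of(endpoint):
    """Strip brackets and an optional port from a logged endpoint."""
    if endpoint.startswith("["):          # IPv6 is logged as [addr] or [addr]:port
        return endpoint[1:endpoint.index("]")]
    if ":" in endpoint:                   # IPv4 with port, addr:port
        return endpoint.rsplit(":", 1)[0]
    return endpoint                       # bare IPv4 address (e.g. ICMP)


def from_subnet(lines, subnet):
    """Return the parsed records whose source address lies inside subnet."""
    net = ipaddress.ip_network(subnet)
    hits = []
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue
        addr = ipaddress.ip_address(host_of(rec["src"]))
        if addr.version == net.version and addr in net:
            hits.append(rec)
    return hits


if __name__ == "__main__":
    # Assumed jail subnets; adjust to the addresses actually given to the jail.
    for subnet in ("10.5.0.0/24", "10.4.0.0/24"):
        print(subnet, len(from_subnet(SAMPLE, subnet)))
```

Run against a full copy of /var/log/security, a count of zero for the jail's subnet would be consistent with the observation that the logged DNS traffic originates from the base system address rather than from the jail.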