Bug 240818 - igb(4) vlanhwfilter feature generate link UP/DOWN for each new vlan created
Summary: igb(4) vlanhwfilter feature generate link UP/DOWN for each new vlan created
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Many People
Assignee: Eric Joyner
Depends on:
Reported: 2019-09-25 14:58 UTC by Olivier Cochard
Modified: 2020-07-07 14:04 UTC (History)
12 users (show)

See Also:
koobs: mfc-stable12+


Note You need to log in before you can comment on or make changes to this bug.
Description Olivier Cochard freebsd_committer 2019-09-25 14:58:24 UTC
On 12-STALBE and -head (I didn't try with older version), creating a vlan generate interface DOWN + UP.

Problem reproduced with I354 and I211.

How to reproduce the problem:

# ifconfig igb3.1000 create vlan 1000 vlandev igb3
igb3: link state changed to DOWN
igb3.1000: link state changed to DOWN
igb3: link state changed to UP
igb3.1000: link state changed to UP

So let's disable vlanhwfilter (this will trigger another down/up):

# ifconfig igb3 -vlanhwfilter
igb3: link state changed to DOWN
igb3.1000: link state changed to DOWN
igb3: link state changed to UP
igb3.1000: link state changed to UP

# ifconfig igb3.1100 create vlan 1100 vlandev igb3

=> No more down/up with vlanhwfilter disabled
Comment 1 Piotr Pietruszewski 2019-09-26 08:52:01 UTC
Currently, iflib based drivers issue an init after registering or unregistering vlans with IFCAP_VLAN_HWFILTER enabled ( https://svnweb.freebsd.org/base/stable/12/sys/net/iflib.c?revision=351276&view=markup#l4265 ). In case of e1000 driver, init is needed as vlan hw support setup is part of driver's initialization routine.
Comment 2 Olivier Cochard freebsd_committer 2019-09-26 11:41:55 UTC
Here is a user use case:

FreeBSD as a router serving multiple customer: one customer per vlan.
Each time we add a new customer, we add a VLAN… and the link re-init, so all the customers are impacted.
It's a big limitation for this use case.
Comment 3 Andrey V. Elsukov freebsd_committer 2019-09-26 12:35:08 UTC
It was reported by several people a year ago:

I also think that this behavior makes Intel cards with iflib drivers unusable for setups many vlans.
Comment 4 Kurt Jaeger freebsd_committer 2019-10-31 05:16:38 UTC
Wouldn't it be sufficient to init only on the first vlan ? For all further VLANs, the hardware would already be available ?

I agree with the statement that that DOWN/UP events for every VLAN
would make that type of interface useless for VLANs.
Comment 5 Aleksandr Fedorov 2020-05-06 15:13:13 UTC
Need help testing this patch: https://reviews.freebsd.org/D24659
Comment 6 Lev A. Serebryakov freebsd_committer 2020-05-06 21:29:01 UTC
D24659 helps me on em, igb and ix
Comment 7 commit-hook freebsd_committer 2020-05-11 17:42:15 UTC
A commit references this bug:

Author: erj
Date: Mon May 11 17:42:05 UTC 2020
New revision: 360902
URL: https://svnweb.freebsd.org/changeset/base/360902

  em/ix/ixv/ixl/iavf: Implement ifdi_needs_restart iflib method

  Pursuant to r360398, implement driver-specific versions of the
  ifdi_needs_restart iflib device method.

  Some (if not most?) Intel network cards don't need reinitializing when a
  VLAN is added or removed from the device hardware, so these implement
  ifdi_needs_restart in a way that tell iflib not to bring the interface
  up or down when a VLAN is added or removed, regardless of whether the
  VLAN_HWFILTER interface capability flag is set or not.

  This could potentially solve several PRs relating to link flaps that
  occur when VLANs are added/removed to devices.

  Signed-off-by: Eric Joyner <erj@freebsd.org>

  PR:		240818, 241785
  Reviewed by:	gallatin@, olivier@
  MFC after:	3 days
  MFC with:	r360398
  Sponsored by:	Intel Corporation
  Differential Revision:	https://reviews.freebsd.org/D24659

Comment 8 commit-hook freebsd_committer 2020-05-14 19:57:43 UTC
A commit references this bug:

Author: erj
Date: Thu May 14 19:56:56 UTC 2020
New revision: 361053
URL: https://svnweb.freebsd.org/changeset/base/361053

  MFC r360398 and r360902

  These commits introduce a new iflib device-dependent method and
  implements that method in the Intel ethernet network drivers;
  this method tells iflib if the network interface needs to be
  restarted when certain events happen.

  This fixes several issues that occur when VLANs are registered
  or unregistered with the network interface.

  PR:		240818, 241785
  Sponsored by:	Intel Corporation

_U  stable/12/
Comment 9 Kubilay Kocak freebsd_committer freebsd_triage 2020-05-19 02:37:36 UTC

 - Assign to committer resolving (via base r360902)
 - Pending close of bug 240818
Comment 10 Marek Zarychta 2020-05-19 14:40:35 UTC
Beware I217 users. The r361053 has broken a bit startup process for em0 and its VLAN children in 12-STABLE. After bootup, the NIC is up and capable of sniffing traffic, the same for children VLAN subinterfaces. The only cons are that neither of them is able to send traffic. Bringing the NIC up and down solves the issue, unplugging and plugging media again doesn't change anything.

em0@pci0:0:25:0:	class=0x020000 card=0x1905103c chip=0x153a8086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection I217-LM'
    class      = network
    subclass   = ethernet
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 13[e0] = PCI Advanced Features: FLR TP

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	ether 2c:44:fd:37:be:14
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
	supported media:
		media autoselect
		media 1000baseT
		media 1000baseT mediaopt full-duplex
		media 100baseTX mediaopt full-duplex
		media 100baseTX
		media 10baseT/UTP mediaopt full-duplex
		media 10baseT/UTP
vlan7: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
	ether 2c:44:fd:37:be:14
	inet 10.0.x.y netmask 0xfffff800 broadcast 10.0.y.z
	groups: vlan
	vlan: 798 vlanpcp: 0 parent interface: em0
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
	supported media:
		media autoselect
# sysctl dev.em.0
dev.em.0.wake: 1
dev.em.0.interrupts.rx_overrun: 0
dev.em.0.interrupts.rx_desc_min_thresh: 0
dev.em.0.interrupts.tx_queue_min_thresh: 0
dev.em.0.interrupts.tx_queue_empty: 0
dev.em.0.interrupts.tx_abs_timer: 0
dev.em.0.interrupts.tx_pkt_timer: 0
dev.em.0.interrupts.rx_abs_timer: 0
dev.em.0.interrupts.rx_pkt_timer: 0
dev.em.0.interrupts.asserts: 916405
dev.em.0.mac_stats.tso_ctx_fail: 0
dev.em.0.mac_stats.tso_txd: 0
dev.em.0.mac_stats.tx_frames_1024_1522: 0
dev.em.0.mac_stats.tx_frames_512_1023: 0
dev.em.0.mac_stats.tx_frames_256_511: 0
dev.em.0.mac_stats.tx_frames_128_255: 0
dev.em.0.mac_stats.tx_frames_65_127: 0
dev.em.0.mac_stats.tx_frames_64: 0
dev.em.0.mac_stats.mcast_pkts_txd: 293
dev.em.0.mac_stats.bcast_pkts_txd: 38
dev.em.0.mac_stats.good_pkts_txd: 17046
dev.em.0.mac_stats.total_pkts_txd: 17046
dev.em.0.mac_stats.good_octets_txd: 5112315
dev.em.0.mac_stats.good_octets_recvd: 119853581
dev.em.0.mac_stats.rx_frames_1024_1522: 0
dev.em.0.mac_stats.rx_frames_512_1023: 0
dev.em.0.mac_stats.rx_frames_256_511: 0
dev.em.0.mac_stats.rx_frames_128_255: 0
dev.em.0.mac_stats.rx_frames_65_127: 0
dev.em.0.mac_stats.rx_frames_64: 0
dev.em.0.mac_stats.mcast_pkts_recvd: 323244
dev.em.0.mac_stats.bcast_pkts_recvd: 402143
dev.em.0.mac_stats.good_pkts_recvd: 995757
dev.em.0.mac_stats.total_pkts_recvd: 995757
dev.em.0.mac_stats.xoff_txd: 0
dev.em.0.mac_stats.xoff_recvd: 0
dev.em.0.mac_stats.xon_txd: 0
dev.em.0.mac_stats.xon_recvd: 0
dev.em.0.mac_stats.coll_ext_errs: 0
dev.em.0.mac_stats.alignment_errs: 0
dev.em.0.mac_stats.crc_errs: 0
dev.em.0.mac_stats.recv_errs: 0
dev.em.0.mac_stats.recv_jabber: 0
dev.em.0.mac_stats.recv_oversize: 0
dev.em.0.mac_stats.recv_fragmented: 0
dev.em.0.mac_stats.recv_undersize: 0
dev.em.0.mac_stats.recv_no_buff: 0
dev.em.0.mac_stats.missed_packets: 0
dev.em.0.mac_stats.defer_count: 0
dev.em.0.mac_stats.sequence_errors: 0
dev.em.0.mac_stats.symbol_errors: 0
dev.em.0.mac_stats.collision_count: 0
dev.em.0.mac_stats.late_coll: 0
dev.em.0.mac_stats.multiple_coll: 0
dev.em.0.mac_stats.single_coll: 0
dev.em.0.mac_stats.excess_coll: 0
dev.em.0.queue_rx_0.rx_irq: 0
dev.em.0.queue_rx_0.rxd_tail: 122
dev.em.0.queue_rx_0.rxd_head: 124
dev.em.0.queue_tx_0.tx_irq: 0
dev.em.0.queue_tx_0.txd_tail: 61
dev.em.0.queue_tx_0.txd_head: 63
dev.em.0.fc_low_water: 20552
dev.em.0.fc_high_water: 23584
dev.em.0.rx_control: 100892734
dev.em.0.device_control: 1075315264
dev.em.0.watchdog_timeouts: 0
dev.em.0.rx_overruns: 0
dev.em.0.link_irq: 0
dev.em.0.dropped: 0
dev.em.0.eee_control: 1
dev.em.0.itr: 488
dev.em.0.tx_abs_int_delay: 66
dev.em.0.rx_abs_int_delay: 66
dev.em.0.tx_int_delay: 66
dev.em.0.rx_int_delay: 0
dev.em.0.rs_dump: 0
dev.em.0.reg_dump: General Registers
	CTRL	 40180240
	STATUS	 00080083
	CTRL_EXIT	 015a1006

Interrupt Registers
	ICR	 00000000

RX Registers
	RCTL	 0603803e
	RDLEN	 00010000
	RDH	 0000007c
	RDT	 0000007a
	RXDCTL	 00010000
	RDBAL	 035c2000
	RDBAH	 00000001

TX Registers
	TCTL	 3103f0fa
	TDBAL	 0359a000
	TDBAH	 00000001
	TDLEN	 00010000
	TDH	 00000050
	TDT	 00000050
	TXDCTL	 0341011f
	TDFH	 00000c46
	TDFT	 00000c46
	TDFHS	 00000c46
	TDFPC	 00000000

dev.em.0.fc: 3
dev.em.0.debug: -1
dev.em.0.nvm: -1
dev.em.0.iflib.rxq0.rxq_fl0.credits: 4095
dev.em.0.iflib.rxq0.rxq_fl0.cidx: 124
dev.em.0.iflib.rxq0.rxq_fl0.pidx: 123
dev.em.0.iflib.txq0.r_abdications: 0
dev.em.0.iflib.txq0.r_restarts: 0
dev.em.0.iflib.txq0.r_stalls: 0
dev.em.0.iflib.txq0.r_starts: 17094
dev.em.0.iflib.txq0.r_drops: 0
dev.em.0.iflib.txq0.r_enqueues: 17102
dev.em.0.iflib.txq0.ring_state: pidx_head: 0739 pidx_tail: 0739 cidx: 0739 state: IDLE
dev.em.0.iflib.txq0.txq_cleaned: 24629
dev.em.0.iflib.txq0.txq_processed: 24671
dev.em.0.iflib.txq0.txq_in_use: 42
dev.em.0.iflib.txq0.txq_cidx_processed: 97
dev.em.0.iflib.txq0.txq_cidx: 59
dev.em.0.iflib.txq0.txq_pidx: 101
dev.em.0.iflib.txq0.no_tx_dma_setup: 0
dev.em.0.iflib.txq0.txd_encap_efbig: 0
dev.em.0.iflib.txq0.tx_map_failed: 0
dev.em.0.iflib.txq0.no_desc_avail: 0
dev.em.0.iflib.txq0.mbuf_defrag_failed: 0
dev.em.0.iflib.txq0.m_pullups: 5
dev.em.0.iflib.txq0.mbuf_defrag: 0
dev.em.0.iflib.override_nrxds: 4096
dev.em.0.iflib.override_ntxds: 4096
dev.em.0.iflib.separate_txrx: 0
dev.em.0.iflib.core_offset: 0
dev.em.0.iflib.tx_abdicate: 0
dev.em.0.iflib.rx_budget: 0
dev.em.0.iflib.disable_msix: 1
dev.em.0.iflib.override_qs_enable: 0
dev.em.0.iflib.override_nrxqs: 0
dev.em.0.iflib.override_ntxqs: 0
dev.em.0.iflib.driver_version: 7.6.1-k
dev.em.0.%parent: pci0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x153a subvendor=0x103c subdevice=0x1905 class=0x020000
dev.em.0.%location: slot=25 function=0 dbsf=pci0:0:25:0 handle=\_SB_.PCI0.GLAN
dev.em.0.%driver: em
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection
Comment 11 Marek Zarychta 2020-05-19 15:25:03 UTC
I should have written "Bringing the NIC down and up solves the issue"; to clarify: I have not tested whether the NIC is able to transmit untagged frames.
Comment 12 Eric Joyner freebsd_committer 2020-05-21 19:36:09 UTC
Is anyone else seeing the problem that Marek is seeing?

I don't have any 1G devices, so I can't try to reproduce this.
Comment 13 Marek Zarychta 2020-05-21 20:18:11 UTC
(In reply to Eric Joyner from comment #12)
This could be I217-LM (em) specific or different flaw which came out after upgrade. Other 1G NICs like I219-LM (em) and I350 (igb) are initialising fine on 12-STABLE after this MFC.
Comment 14 martin.mato 2020-05-27 15:36:23 UTC
Thanks to Mr Marek Zarychta  here who has pointed me to this report, i can share 
that i have similar problems, related to r360902

I have a intel 82574L based network card who  act as a link to my ISP,  who is gaving me dhcp based authentication through a vlan

after r360902 , the connectivity to my ISP can't be restored  after a reboot or a power-on until, as Mr Marek Zarychta pointed i "Bring the NIC down and up " by the ifconfig(8) command and NOT by unplugging /replugging the ethernet cable back.

I'll open a proprer bug report if  needed to; meanwhile, i'll test the patch provided, in case of some interesting results 

Comment 15 martin.mato 2020-05-27 17:18:44 UTC
(In reply to martin.mato from comment #14)
Well, i compiled successfully a new kernel, and it is working perfectly

because the working generated if_em.c file has the same content than the revision pre-r360902, right?
Comment 16 Raúl 2020-05-29 07:06:44 UTC
(In reply to martin.mato from comment #14)
Same situation here. After replacing isp router, no success obviously, the box ended relocated in a hurry on a data center that, just by chance, doesn't need vlan tagging to operate.

I sadly didn't try down/up, but unplug/plug cable didn't help.

em0@pci0:0:25:0:        class=0x020000 card=0x20448086 chip=0x15038086 rev=0x04 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82579V Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 13[e0] = PCI Advanced Features: FLR TP
Comment 17 martin.mato 2020-05-29 08:36:07 UTC
(In reply to Raúl from comment #16)
No, unfortunately:
Unplugging/repluging the cable doesn't work, like i said.
The only way is to manually reset by ifconfig(8) down and up the parent interface em0 , not the vlan interface(s).

In the @current list ; Ian Freislich  provided a patch who worked for him

i tested succesfully as well on my box

here it is

Index: sys/dev/e1000/if_em.c
--- sys/dev/e1000/if_em.c       (revision 361538)
+++ sys/dev/e1000/if_em.c       (working copy)
@@ -4054,7 +4054,7 @@
        switch (event) {
-               return (false);
+               return (true);
                return (true);
Comment 18 Raúl 2020-05-29 10:38:30 UTC
(In reply to martin.mato from comment #17)

I'll look how to test it on site before returning it back home.
Given covid-19 circumstances logistics are ... sort of worrisome.

Thanks for the info.
Comment 19 Eric Joyner freebsd_committer 2020-06-11 15:55:59 UTC
I'll change em(4) so that it always does the interface down/up on VLAN config/unconfig, but leave igb(4) as-is. It seems like em(4) needs more investigation on what needs to be done to prevent a link flap.
Comment 20 commit-hook freebsd_committer 2020-06-11 16:00:41 UTC
A commit references this bug:

Author: erj
Date: Thu Jun 11 15:59:49 UTC 2020
New revision: 362063
URL: https://svnweb.freebsd.org/changeset/base/362063

  em(4): Always reinit interface when adding/removing VLAN

  This partially reverts r361053 since there have been reports
  by users that this breaks some functionality for em(4)
  devices; it seems at first glance that some sort of interface
  restart is required for those cards.

  This isn't a proper fix; this unbreaks those users until a proper
  fix is found for their issues.

  PR:		240818
  Reported by:	Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
  MFC after:	3 days

Comment 21 martin.mato 2020-06-14 17:37:16 UTC
(In reply to commit-hook from comment #20)

Compiled a kernel and tested  immediately upon receipt.

So far, it's working very well, at least for me.