There are two servers, each with four network cards. Two of the cards on each server are aggregated into a lagg interface. After a few days the lagg breaks down:

Server 1:

lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:1b:21:3b:4d:4d
        inet 1.1.1.1 netmask 0xffffffc0 broadcast 1.1.1.255
        media: Ethernet autoselect
        status: active
        laggproto lacp
        laggport: em3 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: em2 flags=4<ACTIVE>

ifconfig em2
em2: flags=9c43<UP,BROADCAST,RUNNING,OACTIVE,SIMPLEX,LINK0,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:1b:21:3b:4d:4d
        media: Ethernet autoselect (1000baseTX <full-duplex>)
        status: active
        lagg: laggdev lagg0

# less /var/run/dmesg.boot | grep em2
em2: <Intel(R) PRO/1000 Network Connection 6.9.6.Yandex[$Revision: 1.36.2.17 $]> port 0x3000-0x301f mem 0xd3180000-0xd319ffff,0xd3100000-0xd317ffff,0xd31a0000-0xd31a3fff irq 16 at device 0.0 on pci2
em2: Using MSIX interrupts
em2: Using TXD_LOW instead of TXDW
em2: [FILTER]
em2: [FILTER]
em2: [FILTER]
em2: Ethernet address: 00:1b:21:3b:4d:4d

em2@pci0:2:0:0: class=0x020000 card=0xa01f8086 chip=0x10d38086 rev=0x00 hdr=0x00
        vendor = 'Intel Corporation'
        class = network
        subclass = ethernet
em3@pci0:4:0:0: class=0x020000 card=0xa01f8086 chip=0x10d38086 rev=0x00 hdr=0x00
        vendor = 'Intel Corporation'
        class = network
        subclass = ethernet

Server 2:

lagg1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:1b:21:1b:19:5d
        media: Ethernet autoselect
        status: active
        laggproto lacp
        laggport: em4 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: em1 flags=18<COLLECTING,DISTRIBUTING>

em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:1b:21:1b:19:5d
        media: Ethernet autoselect (1000baseTX <full-duplex>)
        status: active
        lagg: laggdev lagg1

# less /var/run/dmesg.boot | grep em1
em1: <Intel(R) PRO/1000 Network Connection 6.9.6.Yandex[$Revision: 1.36.2.17 $]> port 0x4000-0x401f mem 0xd0320000-0xd033ffff,0xd0300000-0xd031ffff irq 16 at device 0.0 on pci3
em1: Using MSI interrupt
em1: Using TXD_LOW instead of TXDW
em1: [FILTER]
em1: Ethernet address: 00:1b:21:1b:19:5d

em1@pci0:3:0:0: class=0x020000 card=0x10838086 chip=0x10b98086 rev=0x06 hdr=0x00
        vendor = 'Intel Corporation'
        device = '82572EI PRO/1000 PT Desktop Adapter (Copper)'
        class = network
        subclass = ethernet
em4@pci0:5:0:0: class=0x020000 card=0xa01f8086 chip=0x10d38086 rev=0x00 hdr=0x00
        vendor = 'Intel Corporation'
        class = network
        subclass = ethernet

Error log:

Apr 16 00:27:31 2 kernel: em4: link state changed to UP
Apr 16 00:27:34 2 kernel: em4: watchdog timeout -- resetting
Apr 16 00:27:34 2 kernel: em4: Excessive collisions = 0
Apr 16 00:27:34 2 kernel: em4: Sequence errors = 0
Apr 16 00:27:34 2 kernel: em4: Defer count = 0
Apr 16 00:27:34 2 kernel: em4: Missed Packets = 1217754
Apr 16 00:27:34 2 kernel: em4: Receive No Buffers = 0
Apr 16 00:27:34 2 kernel: em4: Receive Length Errors = 0
Apr 16 00:27:34 2 kernel: em4: Receive errors = 0
Apr 16 00:27:34 2 kernel: em4: Crc errors = 0
Apr 16 00:27:34 2 kernel: em4: Alignment errors = 0
Apr 16 00:27:34 2 kernel: em4: Collision/Carrier extension errors = 0
Apr 16 00:27:34 2 kernel: em4: RX overruns = 0
Apr 16 00:27:34 2 kernel: em4: watchdog timeouts = 143
Apr 16 00:27:34 2 kernel: em4: RX MSIX IRQ = 1654280804 TX MSIX IRQ = 1491971579 LINK MSIX IRQ = 1214367
Apr 16 00:27:34 2 kernel: em4: XON Rcvd = 203508246
Apr 16 00:27:34 2 kernel: em4: XON Xmtd = 3183073363
Apr 16 00:27:34 2 kernel: em4: XOFF Rcvd = 202792650
Apr 16 00:27:34 2 kernel: em4: XOFF Xmtd = 3170508497
Apr 16 00:27:34 2 kernel: em4: Good Packets Rcvd = 108209172443
Apr 16 00:27:34 2 kernel: em4: Good Packets Xmtd = 113645818564
Apr 16 00:27:34 2 kernel: em4: TSO Contexts Xmtd = 0
Apr 16 00:27:34 2 kernel: em4: TSO Contexts Failed = 0
Apr 16 00:27:34 2 kernel: em4: Adapter hardware address = 0xc52a0218
Apr 16 00:27:34 2 kernel: em4: CTRL = 0x58100248 RCTL = 0x801a
Apr 16 00:27:34 2 kernel: em4: Packet buffer = Tx=20k Rx=20k
Apr 16 00:27:34 2 kernel: em4: Flow control watermarks high = 18432 low = 16932
Apr 16 00:27:34 2 kernel: em4: tx_int_delay = 0, tx_abs_int_delay = 64
Apr 16 00:27:34 2 kernel: em4: rx_int_delay = 0, rx_abs_int_delay = 66
Apr 16 00:27:34 2 kernel: em4: fifo workaround = 0, fifo_reset_count = 0
Apr 16 00:27:34 2 kernel: em4: hw tdh = 0, hw tdt = 1
Apr 16 00:27:34 2 kernel: em4: hw rdh = 0, hw rdt = 4095, next_rx_desc_to_check = 0
Apr 16 00:27:34 2 kernel: em4: Num Tx descriptors avail = 4095
Apr 16 00:27:34 2 kernel: em4: Tx Descriptors not avail1 = 12063
Apr 16 00:27:34 2 kernel: em4: Tx Descriptors not avail2 = 0
Apr 16 00:27:34 2 kernel: em4: Std mbuf failed = 0
Apr 16 00:27:34 2 kernel: em4: Std mbuf cluster failed = 6
Apr 16 00:27:34 2 kernel: em4: Driver dropped packets = 0
Apr 16 00:27:34 2 kernel: em4: Driver tx dma failure in encap = 0
Apr 16 00:27:34 2 kernel: em4: Packets pended due to reorder = 0
Apr 16 00:27:34 2 kernel: em4: RX interrupts has been masked = 77251713
Apr 16 00:27:34 2 kernel: em4: TX interrupts has been generated = 0
Apr 16 00:27:34 2 kernel: em4: link state changed to DOWN

tcpdump -i em4
00:47:06.511867 LACPv1, length: 110
00:47:36.997247 LACPv1, length: 110

After a reboot everything works normally for some time.

Fix: So far, only a reboot helps :(

How-To-Repeat: Connect two servers directly to each other through lagg.
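
For anyone comparing several of these dumps, the single-value counter lines in the error log can be pulled into a dictionary so that nonzero counters (here Missed Packets and watchdog timeouts) stand out. This is only a rough sketch: the function name parse_em_stats and the regular expression are my own, it assumes the syslog line layout shown above, and it deliberately skips lines that carry several values or hex values (for example the MSIX IRQ and CTRL lines).

```python
import re

# Matches single-counter lines such as:
#   Apr 16 00:27:34 2 kernel: em4: Missed Packets = 1217754
# Lines with several "name = value" pairs or hex values do not match.
STAT_RE = re.compile(
    r"kernel: (?P<ifname>\w+): (?P<name>[^=]+?) = (?P<value>\d+)\s*$"
)


def parse_em_stats(lines):
    """Return {ifname: {counter_name: int}} from driver debug-dump log lines."""
    stats = {}
    for line in lines:
        m = STAT_RE.search(line)
        if m:
            counters = stats.setdefault(m.group("ifname"), {})
            counters[m.group("name")] = int(m.group("value"))
    return stats


# A few lines copied from the log in this report.
SAMPLE = """\
Apr 16 00:27:34 2 kernel: em4: watchdog timeout -- resetting
Apr 16 00:27:34 2 kernel: em4: Missed Packets = 1217754
Apr 16 00:27:34 2 kernel: em4: Crc errors = 0
Apr 16 00:27:34 2 kernel: em4: watchdog timeouts = 143
Apr 16 00:27:34 2 kernel: em4: CTRL = 0x58100248 RCTL = 0x801a
""".splitlines()

if __name__ == "__main__":
    # Print only the counters that are nonzero.
    for ifname, counters in parse_em_stats(SAMPLE).items():
        for name, value in counters.items():
            if value:
                print(f"{ifname}: {name} = {value}")
```

Feeding it the full log instead of SAMPLE should work the same way, as long as the lines keep the "kernel: emN: name = value" shape.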
Responsible Changed
From-To: freebsd-i386->freebsd-net

Over to maintainer(s).
Three days ago I updated one of the servers to 8.0-STABLE (*default date=2010.04.05.00.00.00), and the situation is now slightly different. The watchdog timeouts are gone, but the lagg interface is in the following state:

lagg1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
        ether 00:1b:21:1b:19:5d
        media: Ethernet autoselect
        status: active
        laggproto lacp
        laggport: em4 flags=18<COLLECTING,DISTRIBUTING>
        laggport: em1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>

em4: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
        ether 00:1b:21:1b:19:5d
        media: Ethernet 1000baseT (1000baseT <full-duplex>)
        status: active

I tried running ifconfig lagg1 -laggport em4 and then ifconfig lagg1 laggport em4, but that did not help.
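
To spot the degraded state without reading the ifconfig output by eye, the laggport flags can be decoded and compared against the full ACTIVE, COLLECTING, DISTRIBUTING set. A minimal sketch follows; the bit values are inferred from the outputs quoted in this report (4 = ACTIVE, 18 = COLLECTING plus DISTRIBUTING, 1c = all three), and the function name degraded_ports is hypothetical.

```python
import re

# Bit values inferred from the ifconfig output in this report:
#   flags=4  -> ACTIVE
#   flags=18 -> COLLECTING, DISTRIBUTING
#   flags=1c -> ACTIVE, COLLECTING, DISTRIBUTING
LAGG_PORT_BITS = {0x04: "ACTIVE", 0x08: "COLLECTING", 0x10: "DISTRIBUTING"}

# Captures the port name and the hex flags from "laggport: em4 flags=18<...>".
PORT_RE = re.compile(r"laggport: (\w+) flags=([0-9a-fA-F]+)<")


def degraded_ports(ifconfig_text):
    """Return {port: [missing flag names]} for ports lacking any of the three flags."""
    result = {}
    for port, hexflags in PORT_RE.findall(ifconfig_text):
        flags = int(hexflags, 16)
        missing = [name for bit, name in LAGG_PORT_BITS.items() if not flags & bit]
        if missing:
            result[port] = missing
    return result


# The two laggport lines from the lagg1 output after the upgrade.
SAMPLE = """\
        laggport: em4 flags=18<COLLECTING,DISTRIBUTING>
        laggport: em1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
"""

if __name__ == "__main__":
    for port, missing in degraded_ports(SAMPLE).items():
        print(f"{port}: missing {', '.join(missing)}")
```

In practice the text would come from running ifconfig on the lagg interface, for example via subprocess; the sample string is used here only to keep the sketch self-contained.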
For bugs matching the following criteria:

Status: In Progress
Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped