Bug 260629

Summary: cc(4): Chelsio T6225-CR takes 252 seconds to change state to UP
Product: Base System Reporter: Jason Mader <jasonmader>
Component: kernAssignee: Navdeep Parhar <np>
Status: Closed FIXED    
Severity: Affects Only Me CC: jasonmader, np
Priority: Normal Keywords: needs-qa
Version: 13.0-RELEASEFlags: koobs: maintainer-feedback? (np)
koobs: maintainer-feedback? (jasonmader)
Hardware: amd64   
OS: Any   
Attachments:
Description Flags
the system message buffer
none
pciconf -lv
none
ifconfig -a when interface is UP none

Description Jason Mader 2021-12-23 02:23:29 UTC
I have a Chelsio T6225-CR connected with 25 Gbps DAC to a Mellanox switch, and noticed that after system boot, there's this 4 minute wait for the interface to come up.

Dec 23 02:07:23 core-0 kernel: t6nex0: <Chelsio T6225-CR> mem 0xdb200000-0xdb27ffff,0xda000000-0xdaffffff,0xdb684000-0xdb685fff irq 120 at device 0.4 on pci12

Dec 23 02:07:23 core-0 kernel: cc0: 16 txq, 8 rxq (NIC); 8 txq (TOE), 2 rxq (TOE)
Dec 23 02:07:23 core-0 kernel: cc0: permanently promiscuous mode enabled
Dec 23 02:07:23 core-0 kernel: cc0: link state changed to DOWN
Dec 23 02:11:35 core-0 kernel: cc0: link state changed to UP
Comment 1 Kubilay Kocak freebsd_committer freebsd_triage 2021-12-23 20:55:38 UTC
Thank you for your report Jason. Could you please include:

- uname -a output
- /var/run/dmesg.boot output (as an attachment)
- pciconf -lv output (as an attachment)
- minimum interface configuration (rc.conf) necessary to reproduce
- ifconfig -a output when device is DOWN (before UP)... (as an attachment)
- ifconfig -a output after device is UP (as an attachment)

Also:

- Is the time the interface takes to change state (DOWN->UP) consistent or variable?
- How does the interface respond/behave once UP, to manual ifconfig down/up commands?
Comment 2 Navdeep Parhar freebsd_committer freebsd_triage 2021-12-23 21:12:45 UTC
Jason,

Can you please provide this information in addition to what koobs@ asked for in comment #1?

T6 firmware version:
# sysctl dev.t6nex.0.firmware_version

Output of this command after the link is UP.  I'm assuming the the link does not bounce or misbehave once it's UP.
# sysctl -n dev.t6nex.0.misc.devlog
Comment 3 Jason Mader 2021-12-27 15:48:03 UTC
Created attachment 230456 [details]
the system message buffer
Comment 4 Jason Mader 2021-12-27 15:50:15 UTC
Created attachment 230458 [details]
pciconf -lv
Comment 5 Jason Mader 2021-12-27 15:52:00 UTC
Created attachment 230460 [details]
ifconfig -a when interface is UP
Comment 6 Jason Mader 2021-12-27 16:08:29 UTC
(In reply to Kubilay Kocak from comment #1)
- uname -a output
FreeBSD core-0-ccsa 13.0-RELEASE-p5 FreeBSD 13.0-RELEASE-p5 #1: Wed Dec 22 17:43:48 UTC 2021     root@core-0-ccsa:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64
- rc.conf
# /etc/start_if.cc0
ifconfig_cc0="inet 129.174.130.140 netmask 0xffffffe0 broadcast 129.174.130.159"
ifconfig_cc0_ipv6="inet6 2620:10e:6024:f004::140 prefixlen 64"
- /etc/start_if.cc0
ifconfig cc0 link 00:25:91:c4:a5:fc promisc mtu 9000 description 'CCSA-MSN2010 Eth1/9' -vlanhwtag -vlanhwtso -vlanmtu -vlanhwcsum -vxlanhwcsum -vxlanhwtso -tso4 -tso6
- Firmware
dev.t6nex.0.firmware_version: 1.25.0.40

DOWN->UP is inconsistent:

Dec 21 02:02:29 core-0-ccsa kernel: lo0: link state changed to UP
Dec 21 02:02:29 core-0-ccsa kernel: cc0: permanently promiscuous mode enabled
Dec 21 02:02:29 core-0-ccsa kernel: cc0: link state changed to DOWN
Dec 21 02:03:38 core-0-ccsa kernel: cc0: link state changed to UP (1 minute 9 seconds)

Dec 23 02:07:23 core-0-ccsa kernel: lo0: link state changed to UP
Dec 23 02:07:23 core-0-ccsa kernel: cc0: permanently promiscuous mode enabled
Dec 23 02:07:23 core-0-ccsa kernel: cc0: link state changed to DOWN
Dec 23 02:11:35 core-0-ccsa kernel: cc0: link state changed to UP (4 minutes 12 seconds)

Once it is up it has never bounced.

# sysctl -n dev.t6nex.0.misc.devlog
      Seq#           Tstamp     Level  Facility  Message
       220        141139601      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x800000
       221        141139856      INFO      PORT  port[0] negotiated speed 0x8, lanes 0x1:0x1, fec 0x800000
       222        141139857      INFO      PORT  aec_fsm[0] : state START (sigdet 0x1)
       223        141139895      INFO      PORT  aec_fsm[0] : transitioning to TRAINING
       224        141523542      INFO      PORT  aec_fsm[0] : TRAINING_COMPLETE
       225        141523542      INFO      PORT  aec_fsm[0] : COEFFICIENT TAP OVERRIDE 1:2:3 :: 0x7e:0x1b:0x75
       226        141524072      INFO      PORT  aec_fsm[0] : Remote fault while waiting for link status 0x29
       227        142088600      INFO      PORT  aec_fsm[0] timed out training, Link Status 0x29
       228        142188600      INFO      PORT  bean_fsm[0] : state START (count = 1)
       229        142198326      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x4, lanes 0x1, fec 0x800000
       230        142198605      INFO      PORT  bean_fsm[0] : entering state BASEP_HANDLE
       231        152198636    NOTICE      PORT  bean_fsm[0] TIMEOUT; state 5 eth_status 0x7 bean_status 0x8c PCS_DEBUG0_RO 0x20c10002, AN_STATE 0x4, hss sigdet 0x1 retry_cnt 1
       232        152199698      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x800000
       233        154298642      INFO      PORT  bean_try_other_fec[0]: try fec 0x1000000
       234        154299643      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x1000000
       235        156308650      INFO      PORT  bean_try_other_fec[0]: try fec 0x0
       236        156309651      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x0
       237        158318656      INFO      PORT  bean_timedout[0] reset to port fec
       238        158318657      INFO      PORT  port[0] : reset fec to default
       239        158319658      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x800000
       240        158319852      INFO      PORT  bean_fsm[0] : state START (count = 2)
       241        158329538      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x4, lanes 0x1, fec 0x800000
       242        158329816      INFO      PORT  bean_fsm[0] : entering state BASEP_HANDLE
       243        168338693    NOTICE      PORT  bean_fsm[0] TIMEOUT; state 5 eth_status 0x7 bean_status 0x8c PCS_DEBUG0_RO 0x20c10002, AN_STATE 0x4, hss sigdet 0x1 retry_cnt 2
       244        168339755      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x800000
       245        171438702      INFO      PORT  bean_try_other_fec[0]: try fec 0x1000000
       246        171439703      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x1000000
       247        174448712      INFO      PORT  bean_try_other_fec[0]: try fec 0x0
       248        174449713      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x0
       249        177458723      INFO      PORT  bean_timedout[0] reset to port fec
       250        177458724      INFO      PORT  port[0] : reset fec to default
       251        177459725      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x800000
       252        177459918      INFO      PORT  bean_fsm[0] : state START (count = 3)
       253        177469604      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x4, lanes 0x1, fec 0x800000
       254        177469882      INFO      PORT  bean_fsm[0] : entering state BASEP_HANDLE
       255        187478759    NOTICE      PORT  bean_fsm[0] TIMEOUT; state 5 eth_status 0x7 bean_status 0x8c PCS_DEBUG0_RO 0x20c10002, AN_STATE 0x4, hss sigdet 0x1 retry_cnt 3
       256        187479821      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x800000
       257        191578772      INFO      PORT  bean_try_other_fec[0]: try fec 0x1000000
       258        191579773      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x1000000
       259        195588786      INFO      PORT  bean_try_other_fec[0]: try fec 0x0
       260        195589787      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x0
       261        199598800      INFO      PORT  bean_timedout[0] reset to port fec
       262        199598801      INFO      PORT  port[0] : reset fec to default
       263        199599802      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x800000
       264        199599996      INFO      PORT  bean_fsm[0] : state START (count = 4)
       265        199609782      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x4, lanes 0x1, fec 0x800000
       266        199610060      INFO      PORT  bean_fsm[0] : entering state BASEP_HANDLE
       267        203918966      INFO      PORT  bean_fsm[0] : entering state BASEP_HANDLE
       268        203928817      INFO      PORT  bean_fsm[0] : entering state NXP_HANDLE
       269        203938818      INFO      PORT  bean_fsm[0] : entering state EXT_NXP_HANDLE
       270        203948817      INFO      PORT  consortium_fec[0]: local 0x7, remote 0x3, negotiated 0x800000
       271        203948817      INFO      PORT  bean_fsm[0] : entering state WAIT_FOR_NULL_PAGE
       272        203978816      INFO      PORT  bean_fsm[0] : entering state WAIT_COMPLETE
       273        203978818      INFO      PORT  bean_fsm[0] : tech ability local 0x600, remote 0x600 cr-s 0
       274        203978819      INFO      PORT  bean_fsm[0] : IEEE speed 0x8, FEC remote 0x4, negotiated 0x800000
       275        203978819      INFO      PORT  bean_fsm[0] : state DONE
       276        203978820      INFO      PORT  bean_fsm[0] : FEC local 0x1, negotiated 0x800000
       277        203979821      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x800000
       278        203980075      INFO      PORT  port[0] negotiated speed 0x8, lanes 0x1:0x1, fec 0x800000
       279        203980076      INFO      PORT  aec_fsm[0] : state START (sigdet 0x1)
       280        203980115      INFO      PORT  aec_fsm[0] : transitioning to TRAINING
       281        204349704      INFO      PORT  aec_fsm[0] : TRAINING_COMPLETE
       282        204349705      INFO      PORT  aec_fsm[0] : COEFFICIENT TAP OVERRIDE 1:2:3 :: 0x7e:0x1b:0x75
       283        204350249      INFO      PORT  aec_fsm[0] : Remote fault while waiting for link status 0x29
       284        204928820      INFO      PORT  aec_fsm[0] timed out training, Link Status 0x29
       285        205028819      INFO      PORT  bean_fsm[0] : state START (count = 1)
       286        205038526      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x4, lanes 0x1, fec 0x800000
       287        205038804      INFO      PORT  bean_fsm[0] : entering state BASEP_HANDLE
       288        215038856    NOTICE      PORT  bean_fsm[0] TIMEOUT; state 5 eth_status 0x7 bean_status 0x8c PCS_DEBUG0_RO 0x20c10002, AN_STATE 0x4, hss sigdet 0x1 retry_cnt 1
       289        215039918      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x800000
       290        217138862      INFO      PORT  bean_try_other_fec[0]: try fec 0x1000000
       291        217139863      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x1000000
       292        219148868      INFO      PORT  bean_try_other_fec[0]: try fec 0x0
       293        219149869      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x0
       294        221158876      INFO      PORT  bean_timedout[0] reset to port fec
       295        221158876      INFO      PORT  port[0] : reset fec to default
       296        221159877      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x800000
       297        221160071      INFO      PORT  bean_fsm[0] : state START (count = 2)
       298        221169757      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x4, lanes 0x1, fec 0x800000
       299        221170035      INFO      PORT  bean_fsm[0] : entering state BASEP_HANDLE
       300        231178912    NOTICE      PORT  bean_fsm[0] TIMEOUT; state 5 eth_status 0x7 bean_status 0x8c PCS_DEBUG0_RO 0x20c10002, AN_STATE 0x4, hss sigdet 0x1 retry_cnt 2
       301        231179974      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x800000
       302        234278921      INFO      PORT  bean_try_other_fec[0]: try fec 0x1000000
       303        234279922      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x1000000
       304        237288932      INFO      PORT  bean_try_other_fec[0]: try fec 0x0
       305        237289933      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x0
       306        240298942      INFO      PORT  bean_timedout[0] reset to port fec
       307        240298943      INFO      PORT  port[0] : reset fec to default
       308        240299944      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x800000
       309        240300138      INFO      PORT  bean_fsm[0] : state START (count = 3)
       310        240309844      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x4, lanes 0x1, fec 0x800000
       311        240310122      INFO      PORT  bean_fsm[0] : entering state BASEP_HANDLE
       312        250318978    NOTICE      PORT  bean_fsm[0] TIMEOUT; state 5 eth_status 0x7 bean_status 0x8c PCS_DEBUG0_RO 0x20c10002, AN_STATE 0x4, hss sigdet 0x1 retry_cnt 3
       313        250320041      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x800000
       314        254418991      INFO      PORT  bean_try_other_fec[0]: try fec 0x1000000
       315        254419992      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x1000000
       316        258429005      INFO      PORT  bean_try_other_fec[0]: try fec 0x0
       317        258430006      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x0
       318        262439019      INFO      PORT  bean_timedout[0] reset to port fec
       319        262439020      INFO      PORT  port[0] : reset fec to default
       320        262440021      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x800000
       321        262440215      INFO      PORT  bean_fsm[0] : state START (count = 4)
       322        262449881      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x4, lanes 0x1, fec 0x800000
       323        262450159      INFO      PORT  bean_fsm[0] : entering state BASEP_HANDLE
       324        266779186      INFO      PORT  bean_fsm[0] : entering state BASEP_HANDLE
       325        266789036      INFO      PORT  bean_fsm[0] : entering state NXP_HANDLE
       326        266799036      INFO      PORT  bean_fsm[0] : entering state EXT_NXP_HANDLE
       327        266809036      INFO      PORT  consortium_fec[0]: local 0x7, remote 0x3, negotiated 0x800000
       328        266809036      INFO      PORT  bean_fsm[0] : entering state WAIT_FOR_NULL_PAGE
       329        266839036      INFO      PORT  bean_fsm[0] : entering state WAIT_COMPLETE
       330        266839037      INFO      PORT  bean_fsm[0] : tech ability local 0x600, remote 0x600 cr-s 0
       331        266839038      INFO      PORT  bean_fsm[0] : IEEE speed 0x8, FEC remote 0x4, negotiated 0x800000
       332        266839039      INFO      PORT  bean_fsm[0] : state DONE
       333        266839039      INFO      PORT  bean_fsm[0] : FEC local 0x1, negotiated 0x800000
       334        266840040      INFO      PORT  hw_mac_init_port[0], ptype 0x14, speed 0x8, lanes 0x1, fec 0x800000
       335        266840294      INFO      PORT  port[0] negotiated speed 0x8, lanes 0x1:0x1, fec 0x800000
       336        266840296      INFO      PORT  aec_fsm[0] : state START (sigdet 0x1)
       337        266840334      INFO      PORT  aec_fsm[0] : Transitioning to TRAINING_LOCAL
       338        266840594      INFO      PORT  aec_fsm[0] : Transitioning to COEF_STATUS_UPDATE
       339        266952212      INFO      PORT  aec_fsm[0] : transitioning to TRAINING
       340        266952263      INFO      PORT  aec_fsm[0] : TRAINING_COMPLETE
       341        266952264      INFO      PORT  aec_fsm[0] : COEFFICIENT TAP OVERRIDE 1:2:3 :: 0x7e:0x1b:0x75
       342        266952798      INFO      PORT  aec_fsm[0] : Remote fault while waiting for link status 0x29
       343        272651956      INFO      PORT  aec_fsm[0] : DONE
       344        272651956      INFO      PORT  bean/aec complete (retry: 4)
       345        272749058      INFO      PORT  port[0] link up (1) (speed 0x8 acaps 0x20970008 lpcaps 0x100008)
       346        272749059      INFO      PORT  port[0] set PAUSE PARAMS: pppen 0 txpe 0 rxpe 0
       347        272749064      INFO      PORT  port[0] update (flowcid 20652 rc 0)

There's a couple more details that I'll provide as soon as I can DOWN/UP the interface.
Comment 7 Navdeep Parhar freebsd_committer freebsd_triage 2021-12-31 04:11:16 UTC
There will be a new T6 firmware available within a couple of weeks and it
is expected to link up a lot sooner in situations like this.  I will update
cxgbe(4) as soon as the firmware is available.
Comment 8 Navdeep Parhar freebsd_committer freebsd_triage 2022-01-05 22:48:22 UTC
This should be fixed with the latest firmware (1.26.6.0) that was checked into main in this commit:

https://cgit.freebsd.org/src/commit/?id=3b76242433e5651bce0c89ce98af450c0cd02ad1

FreeBSD 13.0 is recent enough that the driver there should just work with this firmware.  To use this firmware manually on 13.0 grab the t6fw-1.26.6.0.bin file from the commit above and then run these commands:

# sysctl dev.t6nex.0.dflags=2
# cxgbetool t6nex0 loadfw .../t6fw-1.26.6.0.bin

Reboot the system and you'll have the new firmware running on the card.
Comment 9 Jason Mader 2022-01-05 23:43:45 UTC
That fixed it for me. Thank you.

Jan  5 23:30:24 core-0 kernel: t6nex0: <Chelsio T6225-CR> mem 0xdb200000-0xdb27ffff,0xda000000-0xdaffffff,0xdb684000-0xdb685fff irq 120 at device 0.4 on pci12

Jan  5 23:30:24 core-0 kernel: cc0: 16 txq, 8 rxq (NIC); 8 txq (TOE), 2 rxq (TOE)
Jan  5 23:30:24 core-0 kernel: cc0: permanently promiscuous mode enabled
Jan  5 23:30:24 core-0 kernel: cc0: link state changed to DOWN
Jan  5 23:30:25 core-0 kernel: cc0: link state changed to UP