Bug 141413

Summary: [hang] Tyan 2881 m3289 SMDC freeze
Product: Base System
Reporter: Byron Young <bkyoung74q9>
Component: amd64
Assignee: freebsd-amd64 (Nobody) <amd64>
Status: Closed Feedback Timeout
Severity: Affects Only Me
CC: jhb
Priority: Normal
Flags: bugmeister: mfc-stable10?
       bugmeister: mfc-stable9?
       bugmeister: mfc-stable8?
Version: 8.0-RELEASE
Hardware: Any
OS: Any

Description Byron Young 2009-12-13 00:40:05 UTC
After a default install of 8.0-RELEASE on the SMDC machine, Tyan TSO Manager 1.6.1 running on a remote XP machine cannot connect to the SMDC card. However, when the SMDC machine runs DOS 6.22, the remote TSO Manager connects out-of-band to its SMDC card.

Fix: 

Escape to the boot loader prompt.
At this point, the remote TSO Manager connects to the SMDC card.
At the boot loader prompt, set the bge variable hw.bge.allow_asf=1, then boot directly into single-user mode. In single-user mode, verify that hw.bge.allow_asf is set by running sysctl hw.bge.allow_asf.
At this point, the remote TSO Manager connects to the SMDC card.
Attempt to configure bge0 using ifconfig:
ifconfig bge0 inet 192.168.4.4 netmask 255.255.255.0

Result:
ifconfig and the kernel freeze, forcing a power cycle.

How-To-Repeat: Install 8.0-RELEASE on the SMDC machine, then boot into 8.0-RELEASE. Attempt to connect to the SMDC card in the SMDC machine using the TSO Manager/console from the remote XP machine.
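
The hw.bge.allow_asf setting described here was made by hand at the loader prompt. A minimal sketch of making such a tunable persistent follows, assuming it is placed in a loader.conf-style file; the helper name set_loader_tunable is made up for illustration, and the file path is a parameter so it can be tried on a scratch copy first. Given the freeze reported here on 8.0-RELEASE, this only makes sense on a release where configuring bge0 with ASF enabled no longer hangs.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): add name="value" to a
# loader.conf-style file unless a line for that name is already present.
set_loader_tunable() {
    file=$1 name=$2 value=$3
    # index(...) == 1 means the line starts with the literal text 'name='.
    if awk -v n="$name" 'index($0, n "=") == 1 { found = 1 } END { exit !found }' \
        "$file" 2>/dev/null; then
        return 0    # already set; leave the file untouched
    fi
    printf '%s="%s"\n' "$name" "$value" >> "$file"
}
```

Usage would be along the lines of set_loader_tunable /boot/loader.conf hw.bge.allow_asf 1, followed after a reboot by sysctl hw.bge.allow_asf to confirm the value.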
Comment 1 Byron Young 2009-12-13 00:48:40 UTC
Some URLs for the hardware and software mentioned:

Tyan Thunder K8SR S2881
http://www.tyan.com/product_board_detail.aspx?pid=115

Tyan m3289 SMDC
http://www.tyan.com/product_accessories_spec.aspx?pid=7

Tyan TSO 1.61
ftp://ftp.tyan.com/Software/tso/m3289/tso/TSOCD1.61-091505.ISO
Comment 2 Byron Young 2009-12-14 01:35:29 UTC
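A debugging session puts the hang at line 581 of /usr/src/sbin/ifconfig/ifconfig.c, the ioctl() call that adds the interface address (see Breakpoint 3 below). Interestingly, some other *nix distros don't seem to have this hang.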

# sysctl hw.bge.allow_asf
hw.bge.allow_asf: 1
# cat if.txt
b main
set args -v bge0 inet 192.168.4.8 netmask 255.255.255.0
r

# gdb -x if.txt ifconfig
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Breakpoint 1 at 0x402b48: file /usr/src/sbin/ifconfig/ifconfig.c, line 143.

Breakpoint 1, main (argc=7, argv=0x7fffffffecd0)
    at /usr/src/sbin/ifconfig/ifconfig.c:143
143	{
(gdb) b ifconfig
Breakpoint 2 at 0x403736: file /usr/src/sbin/ifconfig/ifconfig.c, line 465.
(gdb) c
Continuing.

Breakpoint 2, ifconfig (argc=3, argv=0x7fffffffecf0, iscreate=0, 
    uafp=0x521f60) at /usr/src/sbin/ifconfig/ifconfig.c:465
465		strncpy(ifr.ifr_name, name, sizeof ifr.ifr_name);
(gdb) b 581
Breakpoint 3 at 0x403bff: file /usr/src/sbin/ifconfig/ifconfig.c, line 581.
(gdb) c
Continuing.

Breakpoint 3, ifconfig (argc=0, argv=0x7fffffffed08, iscreate=0, 
    uafp=0x521f60) at /usr/src/sbin/ifconfig/ifconfig.c:581
581			if (ioctl(s, afp->af_aifaddr, afp->af_addreq) < 0)
(gdb) quit
Comment 3 Andriy Gapon freebsd_committer freebsd_triage 2010-12-05 14:08:44 UTC
Is this still an issue?
The problem looks related to network code.  It would be useful to discuss this
on net@.
It would be useful if you could get a stack trace of the hang.

-- 
Andriy Gapon
Comment 4 robert 2011-02-03 00:43:10 UTC
On 12/5/2010 8:10 AM, Andriy Gapon wrote:
>   Is this still an issue?
>   The problem looks related to network code.  It would be useful to discuss this
>   on net@.
>   It would be useful if you could get a stack trace of the hang.
>
>   --
>   Andriy Gapon
I've yet to see the problem subside, so I imagine this is still an
issue. I follow this PR because I still have this card and FreeBSD
in production and would love to see a resolution. I believe I have
a dev box with this setup and could probably get it operational to
test this.

--
Robert Clemens
Comment 5 robert 2011-02-03 04:42:42 UTC
I apologize for the length of this followup, but I wanted to document
this in as much detail as possible for future readers and for what I
believe will be the closing of PR 141413, now that it appears to be
resolved. With the documentation I have provided, I feel this is easily
reproduced.

I pulled out the old trusty dev box (exact specs listed for this PR).
    Tyan s2881 motherboard with m3289 SMDC card.

FreeBSD 8.2-RC2 works great with remote ipmi management while the power
is off, during boot, and during normal multiuser operation.

I last tried this with FreeBSD 8.1-RELEASE. I can't say exactly when
this started working, but it was after 8.1-REL and sometime during
8.2-RCx.

One thing I did notice is that I no longer see the ipmi0 device or any
ipmi information in dmesg as I used to. I'm not sure whether the
disappearance of ipmi0 is intended.
As a result, ipmitool can no longer connect locally from the machine in
question, as was once possible; in fact, before 8.2-RCx that was the
only way to use the ipmi functionality. That may still warrant an open
issue, but as far as I'm concerned, I'm quite ecstatic to see a working
console login via com2 over LAN.

Now for the setup needed to reproduce this (I would love someone to
verify this on another similar system):

Make sure you set up the SMDC card with username/password/ip/etc.
Make sure you enable remote access in the BIOS. There are a few settings.
You will need to find them all and bind them to NIC48 (bge0).

FreeBSD tyan.solidsolutions.net 8.2-RC2 FreeBSD 8.2-RC2 #0: Wed Jan 12 17:02:35 UTC 2011     root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64

// nothing in dmesg..
[root@tyan ~]# dmesg |grep ipmi
[root@tyan ~]#

// i do have ipmi module loaded in loader.conf
[root@tyan ~]# kldstat
Id Refs Address            Size     Name
  1   23 0xffffffff80100000 da04a0   kernel
  2    1 0xffffffff80ea1000 21068    geom_mirror.ko
  3    1 0xffffffff80ec3000 4d0a0    pf.ko
  4    1 0xffffffff80f11000 15e0     accf_http.ko
  5    1 0xffffffff80f13000 fba8     ipmi.ko
  6    4 0xffffffff80f23000 24c0     smbus.ko
  7    1 0xffffffff80f26000 2d48     smb.ko
  8    1 0xffffffff80f29000 3e00     amdsmb.ko
  9    1 0xffffffff80f2d000 ba60     if_lagg.ko
[root@tyan ~]#

// i added a line to /etc/ttys for console redirection (com2 19200 baud vt100 emulation)
ttyu1 "/usr/libexec/getty std.19200" vt100 on secure
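
A quick way to double-check that the serial console entry is actually switched on is sketched below. It assumes the usual ttys(5) layout (name, getty command, terminal type, status flags); the function name tty_enabled is made up for illustration, and the file path is a parameter so a scratch copy can be checked instead of the live /etc/ttys.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named terminal has an entry whose
# status flags include "on" in a ttys(5)-style file.
tty_enabled() {
    file=$1 tty=$2
    awk -v t="$tty" '
        # Only look at the line whose first field is the terminal name.
        $1 == t { for (i = 2; i <= NF; i++) if ($i == "on") ok = 1 }
        END { exit !ok }
    ' "$file"
}
```

For example, tty_enabled /etc/ttys ttyu1 && echo "ttyu1 is on" would report whether the line above took effect in the file; init still has to reread /etc/ttys before a getty is started.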

// i also needed to bind the ip for the smdc to my network interface.
// i used 192.168.1.199 on the smdc firmware. i added this as an alias to my network interface.
// notice i am using lagg0 but you would likely just be using bge0
// the only thing below of concern is that you can indeed see that 192.168.1.199 is active on my (pseudo-)NIC.
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
         options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
         ether 00:e0:81:2d:b1:5c
         inet 192.168.1.131 netmask 0xffffff00 broadcast 192.168.1.255
         inet 192.168.1.90 netmask 0xffffffff broadcast 192.168.1.90
         inet 192.168.1.91 netmask 0xffffffff broadcast 192.168.1.91
         inet 192.168.1.92 netmask 0xffffffff broadcast 192.168.1.92
         inet 192.168.1.93 netmask 0xffffffff broadcast 192.168.1.93
         inet 192.168.1.94 netmask 0xffffffff broadcast 192.168.1.94
         inet 192.168.1.95 netmask 0xffffffff broadcast 192.168.1.95
         inet 192.168.1.96 netmask 0xffffffff broadcast 192.168.1.96
         inet 192.168.1.97 netmask 0xffffffff broadcast 192.168.1.97
         inet 192.168.1.98 netmask 0xffffffff broadcast 192.168.1.98
         inet 192.168.1.199 netmask 0xffffffff broadcast 192.168.1.199
         media: Ethernet autoselect
         status: active
         laggproto failover
         laggport: bge1 flags=0<>
         laggport: bge0 flags=5<MASTER,ACTIVE>

// i used ipmitool several times over and over again with the below command
// to make sure i have active response. if your information is correct you should
// get a clear sign that it is working
ipmitool -I lan -H 192.168.1.199 -UAdministrator -Ppassword chassis status
System Power         : on
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : always-off
Last Power Event     : command
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false
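
The chassis status check above can be scripted, for example to poll the BMC while the host boots. A minimal sketch, assuming the field layout shown in the output above; the function name chassis_power_state is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: read `ipmitool ... chassis status` output on stdin
# and print the value of the "System Power" field (e.g. "on").
chassis_power_state() {
    awk -F ':' '
        $1 ~ /^System Power[[:space:]]*$/ {
            # Trim the padding around the value before printing it.
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", $2)
            print $2
            exit
        }'
}
```

It would be fed from the same command as above: ipmitool -I lan -H 192.168.1.199 -UAdministrator -Ppassword chassis status | chassis_power_state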

//The big surprise was that I could do this even after the system had finished booting into FreeBSD!!!!!

// and lastly I used ipmitool from another linux box on the network to connect
ipmitool -I lan -H 192.168.1.199 -UAdministrator -Ppassword tsol
[Starting SOL with receiving address 192.168.1.100:6230]
[SOL Session operational.  Use ~? for help]


FreeBSD/amd64 (tyan.solidsolutions.net) (ttyu1)

login:

//A login prompt via ttyu1 (com2) !!!!!!
//And after logging in with my credentials you can see below that I am root on ttyu1!
[root@tyan ~]# w
  4:33PM  up 22 mins, 2 users, load averages: 0.00, 0.00, 0.00
USER             TTY      FROM              LOGIN@  IDLE WHAT
root             u1       -                 4:18PM     - -tcsh (tcsh)
robert           pts/0    mystique          4:12PM     - w
[root@tyan ~]#


Let me know if I missed something or need to clarify. It's hard to have 
amazing formatting in an email so it is a little sloppy.

--
Robert Clemens
Comment 6 Mark Linimon freebsd_committer freebsd_triage 2011-02-03 07:11:36 UTC
State Changed
From-To: open->feedback

To submitter: based on the most recent feedback, do you agree that this 
case can be closed?
Comment 7 Byron Young 2011-02-03 15:33:45 UTC
How about changing it to the "patched" state? Then, if there is no activity after 8.2-RELEASE, close it.
Comment 8 robert 2011-02-03 15:43:00 UTC
On Thu, Feb 3, 2011 at 6:42 AM, John Baldwin <jhb@freebsd.org> wrote:

> On Wednesday, February 02, 2011 11:50:12 pm Robert Clemens wrote:
> >  One thing I did notice is I no longer see ipmi0 dev or ipmi information
> >  from dmesg as I used to. I'm not exactly sure the intended functionality
> >  of the ipmi0 disappearance.
> >  This results in the inability to use ipmitool to connect locally from
> >  the machine in question as was once possible -- actually this was the
> >  only way previous to use the ipmi
> >  functionality before 8.2-RCx. That may still result in an open issue but
> >  as far as I'm concerned, I'm quite ecstatic to see a working console
> >  login via com2 over lan.
>
> Can you get the ipmi lines from an older dmesg when it worked?  The output of
> dmidecode may also be useful.
>

This is from another server I have running.
FreeBSD abyss.solidsolutions.net 8.0-RELEASE FreeBSD 8.0-RELEASE #0: Sat Nov 21 15:02:08 UTC 2009 root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64

[root@abyss /var/run]# cat dmesg.boot |grep ipmi
ipmi0: <IPMI System Interface> on isa0
ipmi0: KCS mode found at io 0xca2 alignment 0x1 on isa
ipmi0: IPMI device rev. 1, firmware rev. 1.81, version 1.5
ipmi0: Number of channels 1
ipmi0: Attached watchdog
[root@abyss /var/run]#

Handle 0x003B, DMI type 38, 16 bytes
IPMI Device Information
        Interface Type: KCS (Keyboard Control Style)
        Specification Version: 1.5
        I2C Slave Address: 0x10
        NV Storage Device: Not Present
        Base Address: 0x0000000000000CA2 (I/O)


> >  // i also needed to bind the ip for the smdc to my network interface.
> >  // i used 192.168.1.199 on the smdc firmware. i added this as an alias
> >  to my network interface.
> >  // notice i am using lagg0 but you would likely just be using bge0
> >  // the only thing below of concern is that you can indeed see that
> >  192.168.1.199 is active on my (pseudo-)NIC.
>
> That is very odd.  In general with a BMC, the packets never make it to the OS,
> so you shouldn't need to do this.  Perhaps the BMC is not responding to ARP so
> by putting the IP in the host OS you cause the host OS to respond to ARP
> requests but the BMC then sniffs the IP traffic?  Can you verify that this
> step is required for you, and if so can you run a tcpdump of ARP packets on
> bge0 while doing a remote ipmitool command to see if you see ARP requests for
> the BMC IP in the host OS?
>


I'll verify the host OS IP binding when I get a chance and respond to the
PR.
I do believe this has been a bge(4) issue all along, and that the situation
has improved in steps as changes have been made to bge(4).

I also previously neglected to mention that I set sysctl hw.bge.allow_asf=1.

The IPMI card shares the bge0 interface with the host and does not have an
interface of its own.


> >  Let me know if I missed something or need to clarify. It's hard to have
> >  amazing formatting in an email so it is a little sloppy.
>
> The general issue from the PR sounds very much like a problem with bge(4) and
> not specific to the IPMI or amd64 support.  We use IPMI with igb(4) parts at
> work without any issues, and we do not add the BMC IP as an alias on our igb
> interfaces.
>

Agreed. I'll provide more details when I get time tonight to test around on
my dev server.
Appreciate the follow-up.


>
> --
> John Baldwin
>
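
For the ARP check requested above, one possible approach is to capture ARP traffic on bge0 and count the requests for the BMC address. This is a minimal sketch, assuming tcpdump's usual one-line text format for ARP ("ARP, Request who-has <ip> tell <ip>"); the function name count_arp_requests is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count ARP requests for one IPv4 address in
# tcpdump's text output read from stdin.
count_arp_requests() {
    ip=$1
    # -F matches the text literally, so the dots in the address are not
    # treated as wildcards; the trailing " tell" keeps 192.168.1.19 from
    # matching 192.168.1.199.
    grep -c -F "Request who-has $ip tell"
}
```

On the host this could be run as tcpdump -n -l -c 50 -i bge0 arp | count_arp_requests 192.168.1.199 while a remote ipmitool command is issued; a non-zero count would mean the host OS does see ARP requests for the BMC address.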
Comment 9 John Baldwin freebsd_committer freebsd_triage 2011-02-03 16:57:34 UTC
On Thursday, February 03, 2011 10:43:00 am Robert Clemens wrote:
> On Thu, Feb 3, 2011 at 6:42 AM, John Baldwin <jhb@freebsd.org> wrote:
> > Can you get the ipmi lines from an older dmesg when it worked?  The output
> > of
> > dmidecode may also be useful.
> >
> 
> This is from another server I have running.
> FreeBSD abyss.solidsolutions.net 8.0-RELEASE FreeBSD 8.0-RELEASE #0: Sat Nov
> 21 15:02:08 UTC 2009
> root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC
>  amd64
> 
> [root@abyss /var/run]# cat dmesg.boot |grep ipmi
> ipmi0: <IPMI System Interface> on isa0
> ipmi0: KCS mode found at io 0xca2 alignment 0x1 on isa
> ipmi0: IPMI device rev. 1, firmware rev. 1.81, version 1.5
> ipmi0: Number of channels 1
> ipmi0: Attached watchdog
> [root@abyss /var/run]#
> 
> Handle 0x003B, DMI type 38, 16 bytes
> IPMI Device Information
>         Interface Type: KCS (Keyboard Control Style)
>         Specification Version: 1.5
>         I2C Slave Address: 0x10
>         NV Storage Device: Not Present
>         Base Address: 0x0000000000000CA2 (I/O)

Does the server running 8.2 have identical dmidecode output?

-- 
John Baldwin
Comment 10 robert 2011-02-03 19:35:03 UTC
On 2/3/2011 10:57 AM, John Baldwin wrote:
> Does the server running 8.2 have identical dmidecode output?
>
This is from the 8.2-RC2 server.
So yes it is identical.

Handle 0x003B, DMI type 38, 16 bytes
IPMI Device Information
         Interface Type: KCS (Keyboard Control Style)
         Specification Version: 1.5
         I2C Slave Address: 0x10
         NV Storage Device: Not Present
         Base Address: 0x0000000000000CA2 (I/O)
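
To compare the two machines without eyeballing the output, the base address can be pulled out of the DMI type 38 record and checked against the "io 0xca2" that ipmi0 printed on the 8.0-RELEASE box. A minimal sketch, assuming the dmidecode layout shown above; the function name ipmi_base_address is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: read dmidecode output on stdin and print the
# IPMI "Base Address" value without the trailing "(I/O)" annotation.
ipmi_base_address() {
    awk -F ': ' '
        /^[[:space:]]*Base Address:/ {
            split($2, parts, " ")   # parts[1] is the hex address
            print parts[1]
            exit
        }'
}
```

It could be run as dmidecode -t 38 | ipmi_base_address on each server.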
Comment 11 John Baldwin freebsd_committer freebsd_triage 2011-02-03 20:34:08 UTC
On Thursday, February 03, 2011 2:35:03 pm Robert Clemens wrote:
> This is from the 8.2-RC2 server.
> So yes it is identical.
> 
> Handle 0x003B, DMI type 38, 16 bytes
> IPMI Device Information
>          Interface Type: KCS (Keyboard Control Style)
>          Specification Version: 1.5
>          I2C Slave Address: 0x10
>          NV Storage Device: Not Present
>          Base Address: 0x0000000000000CA2 (I/O)

Hmmm, if you want to debug this, sys/dev/ipmi/ipmi_smbios.c is probably the
place to start to see if ipmi_smbios_identify() finds the IPMI table entry or
not.

-- 
John Baldwin
Comment 12 robert 2011-02-03 20:46:18 UTC
I have also just tested ipmitool remotely after removing the IP alias
from the NIC. It worked just fine. I probably ran into an issue
previously because I was concurrently testing some lagg(4)
lacp/failover scenarios and could have had routing issues, although
that seems to be grasping at straws.

Anyway, the last real question I have is about local /dev/ipmi0
access. It is currently not possible to use IPMI locally on the machine.

Is there further information I could pull from either my 8.0-RELEASE or
8.2-RC2 servers to gain some ground on that?

I've actually been really happy with 8.2 thus far and this just "iced my
cake", as I'd love to be able to go to single-user mode on my servers
across the country that use this.
Comment 13 Jaakko Heinonen freebsd_committer freebsd_triage 2011-12-11 09:47:23 UTC
State Changed
From-To: feedback->patched

Change the state according to submitter suggestion.
Comment 14 John Baldwin freebsd_committer freebsd_triage 2015-10-26 17:47:10 UTC
This was never really "patched" (no patch from this bug was committed; instead, bge(4) was fixed by someone else between 8.1 and 8.2).  However, if the submitter has info on why the smbios code in the ipmi driver is no longer attaching, we can reopen this and investigate that.