Bug 218579 - bge(4): Wake on Lan (WoL) does not work
Summary: bge(4): Wake on Lan (WoL) does not work
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.2-STABLE
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: Cy Schubert
URL: https://www.nas4free.org/forums/viewt...
Keywords: needs-qa
Duplicates: 171744 177184 184718
Depends on:
Blocks:
 
Reported: 2017-04-12 05:58 UTC by pozzugno
Modified: 2021-09-10 14:05 UTC
CC List: 13 users

See Also:
koobs: mfc-stable13?
koobs: mfc-stable12?
koobs: mfc-stable11?


Attachments
if_bge.c patched with the bge-WOL patch from https://bitbucket.org/w4w/bge-wol-freebsd-10.1-patch/overview (200.63 KB, text/x-csrc)
2017-10-03 01:55 UTC, thfrdue
WOL patch for bge. (4.97 KB, patch)
2018-03-02 17:59 UTC, Cy Schubert
WOL patch for stable/11 (5.80 KB, patch)
2018-10-02 01:02 UTC, Cy Schubert
Latest WOL patch for bge (5.88 KB, patch)
2020-03-13 20:14 UTC, Cy Schubert

Description pozzugno 2017-04-12 05:58:56 UTC
I use Nas4Free on an HP ProLiant MicroServer. Nas4Free is based on FreeBSD, and I and some other people have a problem related to the bge NIC driver included in the latest FreeBSD kernel.

We can't use the Wake on LAN feature, even though it is enabled in the BIOS. It appears to be an issue with FreeBSD, because WOL works with other operating systems (such as Linux) on the same hardware.

Someone noticed that WOL works if the NIC is connected to a Gigabit Ethernet link, but not if the link is 100 Mbps. Unfortunately I can't test Gigabit on my system.

Others report that older Nas4Free versions worked, for example the one based on FreeBSD 9.2.

Many more details are in a dedicated thread on the Nas4Free forum: https://www.nas4free.org/forums/viewtopic.php?f=58&t=4756&start=30
Comment 1 thfrdue 2017-09-27 15:10:36 UTC
The same applies to FreeNAS, which is also based on FreeBSD. On my HP ProLiant MicroServer N54L I have the same issue: WOL is not available, even with Gigabit Ethernet connected.
Patches seem to exist (e.g., https://bitbucket.org/w4w/bge-wol-freebsd-10.1-patch).
Comment 2 thfrdue 2017-10-03 01:55:19 UTC
Created attachment 186870
if_bge.c patched with the bge-WOL patch from https://bitbucket.org/w4w/bge-wol-freebsd-10.1-patch/overview
Comment 3 Cy Schubert freebsd_committer 2018-03-02 17:59:36 UTC
Created attachment 191147
WOL patch for bge.

This patch was posted to PR 171744. The caveat is listed there, though the problem may simply be my laptop.

I'll consider closing either this or 171744 as a dup of the other.
Comment 4 Rodney W. Grimes freebsd_committer 2018-03-02 20:59:17 UTC
Please try not to remove the group assignment when "Taking" bugs.
Comment 5 Koen Martens 2018-03-19 19:31:30 UTC
Not sure what the fix is, and how 171744 is relevant (it's about the wake command, if I understand correctly).

Just wanted to chime in. I have a machine with a bge-driven card for which WOL worked perfectly with 10.3-RELEASE. I activated it with 'ifconfig bge0 wol_magic' and then used the wakeonlan utility on a Linux (Ubuntu 16.04) machine to wake up the FreeBSD machine.
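
For readers unfamiliar with what a tool like wakeonlan sends: a Wake-on-LAN magic packet carries six 0xff bytes followed by sixteen copies of the target's MAC address. The following is only a minimal sketch of building that payload as a hex string. `wol_payload_hex` is a hypothetical helper name, not part of any existing tool, and actually transmitting the bytes (usually as a broadcast) is left to a utility such as wakeonlan.

```sh
#!/bin/sh
# Build the hex form of a Wake-on-LAN magic packet payload:
# 6 bytes of 0xff followed by 16 copies of the 6-byte target MAC.
# wol_payload_hex is a hypothetical helper name used for illustration.
wol_payload_hex() {
    # Strip ':' or '-' separators and normalise to lower case.
    mac_hex=$(printf '%s' "$1" | tr -d ':-' | tr 'A-F' 'a-f')

    # Reject anything that is not exactly 12 hex digits.
    case "$mac_hex" in
        *[!0-9a-f]*) echo "invalid MAC: $1" >&2; return 1 ;;
    esac
    [ "${#mac_hex}" -eq 12 ] || { echo "invalid MAC: $1" >&2; return 1; }

    payload="ffffffffffff"
    i=0
    while [ "$i" -lt 16 ]; do
        payload="${payload}${mac_hex}"
        i=$((i + 1))
    done
    printf '%s\n' "$payload"
}
```

Calling `wol_payload_hex 00:1e:c9:5b:75:11` prints a 204-character hex string (102 bytes), which can be compared against a packet capture when checking whether the sleeping NIC ever receives a well-formed magic packet.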

Yesterday I upgraded to 11.1-RELEASE (something I have put off for ages because I was afraid something would break, but with the EOL coming up I decided to take the risk), and WOL just stopped working.

It isn't even advertised anymore as a capability in ifconfig:

# ifconfig -m bge0
bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
	capabilities=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
	ether 00:1e:c9:5b:75:11
	hwaddr 00:1e:c9:5b:75:11
	inet 10.1.3.8 netmask 0xffffff00 broadcast 10.1.3.255 
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
	supported media:
		media autoselect mediaopt flowcontrol
		media autoselect
		media 1000baseT mediaopt full-duplex,master
		media 1000baseT mediaopt full-duplex
		media 1000baseT mediaopt master
		media 1000baseT
		media 100baseTX mediaopt full-duplex
		media 100baseTX
		media 10baseT/UTP mediaopt full-duplex
		media 10baseT/UTP

Needless to say, 'ifconfig bge0 wol_magic' doesn't enable it anymore either.
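
When reporting results on this bug, it may help to separate "the driver does not advertise WOL" from "WOL is advertised but not enabled". The sketch below assumes, as in the output above, that the `capabilities=` line lists what the driver supports and the `options=` line lists what is currently enabled. `iface_has_flag` is a hypothetical helper, not an ifconfig feature.

```sh
#!/bin/sh
# Read `ifconfig -m <if>` output on stdin and succeed if the flag list on the
# given line ("capabilities" or "options") contains the given flag.
# iface_has_flag is a hypothetical helper name used for illustration.
iface_has_flag() {
    field=$1   # "capabilities" = supported, "options" = currently enabled
    flag=$2    # e.g. WOL_MAGIC

    # Take the text between < and > on the line that starts with the field
    # name (so "nd6 options=" is ignored), split it on commas, and look for
    # an exact match.
    sed -n "s/^[[:space:]]*${field}=[0-9a-fA-F]*<\(.*\)>.*/\1/p" |
        tr ',' '\n' | grep -qxF "$flag"
}
```

A possible use is `ifconfig -m bge0 | iface_has_flag capabilities WOL_MAGIC || echo "driver does not advertise WOL_MAGIC"`. On an unpatched bge(4), as in the output above, the check would be expected to fail.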
Comment 6 Cy Schubert freebsd_committer 2018-03-19 19:43:23 UTC
I don't understand why it worked on 10.3. The WOL code was not in bge at the time. The patch is for 12-CURRENT. I'll rework it for 11-STABLE.
Comment 7 Cy Schubert freebsd_committer 2018-03-20 00:23:25 UTC
The patch attached to this PR also works with stable/11 and releng/11.1.
Comment 8 Koen Martens 2018-04-19 15:34:42 UTC
(In reply to Cy Schubert from comment #6)
Hi, thanks for the reply (and the reworked patch). I guess the patch won't go upstream because of the reasons mentioned (i.e. the potential to blow up boards that can't supply enough power when the system is powered down)?

I'm also surprised it did work for me. I actually woke up the machine with wake-on-lan to ssh into it and upgrade it to 11.1-RELEASE.
Comment 9 Cy Schubert freebsd_committer 2018-04-20 00:56:27 UTC
(In reply to Koen Martens from comment #8)
That bug has been fixed in the latest patch I posted here. No worries about drawing too much current when powered off any more.

I've been using it and previous versions of this patch on my laptop (only machine I have with bge) for about a year.

I suppose I should submit the patch in phabricator for review prior to commit.
Comment 10 NK 2018-08-10 13:08:21 UTC
There is still an issue with the WOL patch for the bge driver on FreeBSD 11.2.

There seem to be boot issues; see here:
https://www.xigmanas.com/forums/viewtopic.php?f=78&t=13807#p85480
Comment 11 Cy Schubert freebsd_committer 2018-10-01 04:04:53 UTC
(In reply to NK from comment #10)
Sketchy details. There's nothing to go on.

The WOL patch was developed for 12-CURRENT only. The only issue is that a system will not power off with halt -p after being woken by WOL, but will after a reboot.

I'll try to port it back to 11-STABLE if anyone is interested.
Comment 12 thfrdue 2018-10-01 18:51:04 UTC
(In reply to Cy Schubert from comment #11)
I would very much appreciate it. Thanks a lot for your efforts!
Comment 13 Cy Schubert freebsd_committer 2018-10-02 01:02:48 UTC
Created attachment 197700
WOL patch for stable/11

Same patch, for stable/11. I haven't tried to build or use this particular patch on stable/11. However, a previous version was developed for current/11 at the time. This patch has the same problem as the -CURRENT patch: the system will reboot on halt -p the first time around. I'm not sure of the solution for this yet.
Comment 14 George B 2018-10-25 11:52:56 UTC
I had 11.1 patched with https://github.com/NamTaf/if_bge_wol and it worked.
Can I apply the same patch on 11.2 p4?
I only need this for WOL.
Comment 15 Yiannis 2020-02-23 15:48:06 UTC
Is there any hope of making this work, please?
I have an old but fully functional HP Microserver and I can't upgrade because of this issue.
A fix would be really appreciated.
Comment 16 Cy Schubert freebsd_committer 2020-03-13 20:14:51 UTC
Created attachment 212388
Latest WOL patch for bge

The patch generally works; I currently use it. However, halt -p after a WOL will not power off the machine but reboot it. Only after the next reboot will it be able to power off. I don't know if this is isolated to my laptop or if it is a general problem. Please try the patch, check whether it shows the same behavior as on my laptop, and report your result here or to me directly.

Ping me if the patch posted here doesn't apply. I'll backport the one from 13-CURRENT again.
Comment 17 Cy Schubert freebsd_committer 2020-03-13 20:17:22 UTC
*** Bug 171744 has been marked as a duplicate of this bug. ***
Comment 18 tony 2021-06-07 09:35:07 UTC
Hi guys,

Can anyone explain to me how to patch the driver? I cannot find the file if_bge.c anywhere...
Is there a way to get it into one of the next versions? This would be really appreciated!

Best

Tony
Comment 19 Chris Hutchinson 2021-06-07 15:39:45 UTC
(In reply to tony from comment #18)
> Can anyone explain to me how to patch the driver?
Copy your choice of one of the 2 attached diff files
to /usr/src.
# cd /usr/src
# patch < your-chosen-diff-file
Done.
See man diff and man patch for greater detail. :-)

> I cannot find the file if_bge.c anywhere...
# cd /usr/src
# find . -type f -name if_bge.c

> Is there a way to get it into one of the next versions?
> This would be really appreciated!

HTH

--Chris
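
The patch steps above, together with the kernel rebuild that a patched in-kernel driver needs before it takes effect, can be sketched roughly as follows. This rests on several assumptions: the source tree lives in /usr/src, the diff applies from the top of that tree, the kernel configuration is GENERIC, and patch(1) accepts `-C` as a check-only mode (as documented for FreeBSD's patch; GNU patch spells it `--dry-run`). The names `run` and `wol_patch_steps` and the diff path are hypothetical. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```sh
#!/bin/sh
# Print (DRY_RUN=1, the default) or execute (DRY_RUN=0) a single command.
# run and wol_patch_steps are hypothetical helper names.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: wol_patch_steps /path/to/chosen.diff [KERNCONF]
wol_patch_steps() {
    diff_file=$1              # absolute path to the diff you chose
    kernconf=${2:-GENERIC}    # assumption: stock GENERIC kernel config

    run cd /usr/src || return 1
    # Check that the diff applies cleanly before touching any file.
    run patch -C -i "$diff_file" || return 1
    run patch -i "$diff_file" || return 1
    # Rebuild and install the kernel so the patched bge(4) is used.
    run make buildkernel KERNCONF="$kernconf" || return 1
    run make installkernel KERNCONF="$kernconf" || return 1
    # Boot into the new kernel.
    run shutdown -r now
}
```

Running `wol_patch_steps /root/bge-wol.diff` with the default settings lists the six commands without changing anything, which makes it easy to review the plan before running it for real as root.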
Comment 20 Cy Schubert freebsd_committer 2021-06-07 16:02:21 UTC
(In reply to Chris Hutchinson from comment #19)
The latest patch is for 14-CURRENT. It works here except for a bug which I cannot track down. The bug is that, after the machine has been woken by WOL, the next halt -p reboots it instead of powering it off, unless the machine has been rebooted with reboot in the meantime. It appears that some status bit is set in the NIC so that, the first time the machine powers itself off after being woken, it powers itself back on again. This happens only once, and only if there have been no reboots before that.

Otherwise this patch would have been committed long ago.

Please apply the patch and let me know if you experience this or if it is simply my laptop that has this problem. (My laptop doesn't experience this when booted from Windows 10 or Fedora, only with the patch I cobbled up for FreeBSD.)
Comment 21 HIROKI MORI 2021-06-15 22:50:02 UTC
I use an Acer 3820 notebook. This machine has an alc interface, and alc supports WOL.
I can wake this machine with WOL on releng/13, but poweroff reboots it.

I think the WOL poweroff problem is not a driver issue. I suspect ACPI or other code.
Comment 22 Cy Schubert freebsd_committer 2021-06-15 23:10:58 UTC
Mine is an Acer 4752, same problem. Windows and Fedora don't have the problem. This suggests I've missed something in my patch.

But poweroff does work after that initial reboot following WOL. I suspect this is some bit in a NIC register that fails to be reset after WOL power up.
Comment 23 Peter Libassi 2021-08-25 04:38:02 UTC
With "Latest WOL patch for bge" on FreeBSD 13/HPE ProLiant N36L, multiple shutdown -p/wake cycles work fine.
Comment 24 Kubilay Kocak freebsd_committer freebsd_triage 2021-08-25 04:44:06 UTC
(In reply to Peter Libassi from comment #23)

Thank you for the confirmation and feedback, Peter.
Comment 25 Michel Marcon 2021-08-27 20:03:38 UTC
(In reply to Cy Schubert from comment #20)

I successfully use the patch bge-wol-13-current-200313.diff on FreeBSD-13.
I use it on an HP ML110 G6 server. WOL is usable after shutdown -p. I didn't test the halt command, though.
However, I had to "buildkernel" and "installkernel" after a freebsd-update fetch install to get to FreeBSD-13 p3. Hassle 8-)

Why don't you submit your patch to the FreeBSD project so they can "integrate" it in the kernel?
Comment 26 Cy Schubert freebsd_committer 2021-08-28 03:03:42 UTC
(In reply to Michel Marcon from comment #25)

I am a committer. I will commit it when one last bug in the patch has been fixed. After using WOL to wake a bge-equipped system, the next power off is in fact a power cycle (poweroff/poweron, similar to a reboot, but you hear the system power off for a second or less). Did you experience this bug after applying the patch?

The bug manifests itself on Acer laptops equipped with bge0. My Acer exhibits the behavior under FreeBSD but not under Windows. I must conclude that the patch still needs to set or clear a bge register.

Let me know if your system experiences the poweroff/poweron bug if you poweroff or halt -p anytime following a wake from WOL.

The bug does not manifest itself if the machine is rebooted before poweroff or halt -p: i.e. WOL the system, reboot, then power off does not exhibit the bug. Whereas WOL, then power off manifests the bug.
Comment 27 Cy Schubert freebsd_committer 2021-08-28 03:15:31 UTC
(In reply to HIROKI MORI from comment #21)

My Acer 4752 with bge exhibits the bug. My Acer 3620 with rl does not exhibit the bug. Noting that your Acer 3820 with alc also has the bug might mean that there is an ACPI or other issue. (My 4752 exhibits other ACPI bugs that the 3620 did not under FreeBSD.)

If other people can confirm that this is not a bug on their systems (and this bug only manifests itself on Acer laptops), I will commit the patch and MFC it. I do want people to confirm that it works fully on their computers before committing this patch.
Comment 28 Kubilay Kocak freebsd_committer freebsd_triage 2021-09-03 00:25:19 UTC
*** Bug 184718 has been marked as a duplicate of this bug. ***
Comment 29 Kubilay Kocak freebsd_committer freebsd_triage 2021-09-03 00:27:40 UTC
*** Bug 177184 has been marked as a duplicate of this bug. ***
Comment 30 Cy Schubert freebsd_committer 2021-09-03 00:40:59 UTC
(In reply to Michel Marcon from comment #25)

Please test halt as well.
Comment 31 Peter Libassi 2021-09-04 11:28:30 UTC
(In reply to Cy Schubert from comment #30)

On my FreeBSD 13/HPE ProLiant N36L with the latest WOL patch I did the following test:

1. wake

2. shutdown -h now, went down to halt state and stayed there

3. cycled power

4. shutdown -p now

5. wake, successful

Let me know if you want me to do the test differently.
Comment 32 Cy Schubert freebsd_committer 2021-09-04 13:20:30 UTC
(In reply to Peter Libassi from comment #31)

This is perfect. The problem isn't the patch but Acer ACPI. I will commit this patch then.
Comment 33 Cy Schubert freebsd_committer 2021-09-04 13:38:15 UTC
PR/226763 is not related to this. Different NIC (re).
Comment 34 Cy Schubert freebsd_committer 2021-09-04 13:52:42 UTC
Phabricator review is at https://reviews.freebsd.org/D31834.
Comment 35 Michel Marcon 2021-09-09 09:25:10 UTC
(In reply to Peter Libassi from comment #31)

On my HP ProLiant ML110 G6 server with FreeBSD 13, I've done exactly the same procedure (shutdown -h, cycle power, shutdown -p, and then wake via WOL) and it works.

But then, after a simple "shutdown -p now", WOL doesn't work.
Comment 36 Cy Schubert freebsd_committer 2021-09-09 13:57:52 UTC
(In reply to Michel Marcon from comment #35)

What does ifconfig of the interface show? Are any of the WOL flags still on?
Comment 37 Michel Marcon 2021-09-09 14:36:52 UTC
(In reply to Cy Schubert from comment #36)
Yep. The flag WOL_MAGIC is on:

zombie/usr/home/cmic >ifconfig bge0
bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=c219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO,LINKSTATE>
        ether 2c:41:38:87:3a:6f
        inet 192.168.1.102 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
Comment 38 Cy Schubert freebsd_committer 2021-09-09 14:52:50 UTC
(In reply to Michel Marcon from comment #37)

To make sure I understand this correctly, you followed the test procedure of:

1. Enable WOL.
2. halt -p or shutdown -p
3. WOL the machine (it works)
4. shutdown -p and the machine fails to halt.

Is this correct?
Comment 39 Cy Schubert freebsd_committer 2021-09-09 14:55:15 UTC
Removed regression flag because this is not a regression. bge(4) never supported WOL. The patch adds WOL support to bge(4).
Comment 40 Cy Schubert freebsd_committer 2021-09-09 18:16:45 UTC
(In reply to Michel Marcon from comment #37)

In the absence of a reply, I will assume this is the same problem experienced on Acer laptops. If so, the problem is not caused by Acer ACPI; rather, the patch is incomplete.
Comment 41 Michel Marcon 2021-09-10 12:52:10 UTC
(In reply to Cy Schubert from comment #40)
Sorry to be so late.

After lots of "shutdown -p now" and restarts I can only say that the behaviour is erratic: sometimes WOL works, sometimes not. (bang my head on the wall ?)

But stay tuned: I'll keep searching for reproducible behaviour.

By the way, are you interested in the many ACPI lines in /var/log/messages?
Comment 42 Cy Schubert freebsd_committer 2021-09-10 14:05:59 UTC
(In reply to Michel Marcon from comment #41)

Actually, it is not erratic at all. But your tests have confirmed why I haven't committed it yet. There is something lacking and I haven't discovered it yet.

Here are two scenarios. The first is your test:

1. halt -p.
2. Machine powers off.
3. WOL wakes the machine.
4. halt -p.
5. Machine appears to reboot.

It doesn't reboot, actually. It powers off but a second later it powers back on again. You can tell by the sound of the CPU fan.

The second scenario:

1. halt -p.
2. Machine powers off.
3. WOL wakes the machine.
4. Reboot (like after an installkernel or other reason).
5. halt -p.
6. WOL wakes the machine.
7. halt -p.
8. Machine powers off.
9. WOL will work.

The difference is the reboot in the second scenario. This suggests that the patch is missing something that resets a bit in a bge(4) hardware register so that the machine is not woken a second time. This could also be an ACPI issue. (HP and Acer ACPI rely on WMI.)

This is the reason the patch hasn't been committed yet. If you remember to do a scheduled reboot sometime after a WOL it will appear fine.

I will commit this patch when this last bug has been resolved.