Bug 203166

Summary: [wbwd][patch] Stop watchdog timer in attach routine if it is already started
Product: Base System Reporter: t_uemura
Component: kernAssignee: freebsd-bugs (Nobody) <bugs>
Status: New ---    
Severity: Affects Some People CC: delphij, emaste
Priority: --- Keywords: patch
Version: 10.2-STABLE   
Hardware: amd64   
OS: Any   
Attachments:
Description Flags
Patch to fix the issue.
none
Make the change a loader option. none

Description t_uemura 2015-09-17 01:44:34 UTC
Created attachment 161137 [details]
Patch to fix the issue.

According to comments in wbwd.c, the wbwd watchdog timer may be started before the wbwd driver is attaching.

If so, watchdogd(8) must be started to prevent unnecessary reboot before the watchdog fires, even if the machine is booting directly to single user mode.

Attached patch works around this. When the timer is already started, then attach routine stops it.

Please someone fix the issue. Thanks.
Comment 1 Ed Maste freebsd_committer 2015-09-21 20:29:29 UTC
We definitely need something like this change.

I'm a little worried that this introduces a window in normal operation though where the watchdog is ineffective, between wbwd attach and watchdogd starting up. For example we may have a driver which intermittently hangs in attach, and we'd want the watchdog to reset the system in that case.

A few ideas from IRC discussion:
- we could pat the watchdog once at attach time so that we don't reboot in cases where multiuser system startup takes a while after wbwd attaching
- watchdog disabling could happen when the system enters single user mode
- we ought to have a tunable or other configuration to control this, as some system vendors etc. may not want the watchdog disabled

In any case I think it's probably better to have this change than not.
Comment 2 Xin LI freebsd_committer 2015-09-21 20:40:40 UTC
Please make this one optional via a loader tunable or compile time setting (I have no strong option with the default and if it helps people, let's enable it by default).

The problem with doing resets unconditionally in kernel is that if e.g. the user is using nextboot(8) and the kernel stuck at a later time, there is nothing that resets the system to recover it to a previously known good state, and that would defeat a good fraction of legitimate usage of watchdogs.
Comment 3 t_uemura 2015-09-25 05:19:35 UTC
Created attachment 161368 [details]
Make the change a loader option.