Bug 139718 - [reboot] all mounted fs don't get synced during reboot/shutdown with >= 1 mounted inaccessible device
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 9.0-CURRENT
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-bugs (Nobody)
Reported: 2009-10-18 13:40 UTC by Alexander Best
Modified: 2025-03-03 02:55 UTC

Description Alexander Best 2009-10-18 13:40:01 UTC
when the system is being shut down or rebooted and a mounted device is no longer accessible, all other mounted devices fail to sync correctly and are thus marked dirty. this also happens if the inaccessible device was mounted read-only.

the reboot/shutdown sequence hangs after the message "All buffers synced.". after a reset all previously mounted storage devices need to be fsck'ed.

see this thread for further info: http://lists.freebsd.org/pipermail/freebsd-current/2009-October/012679.html

Matthias Andree described the problem like this:

"1. If the device for one file system is gone, why would I mark *other* file
systems dirty? There is no reason to do so.

2. If a file system was mounted read-only, and its device is removed, there are
by definition ZERO dirty buffers that we need to synch on shutdown, so why does
the premature unplug-readonly-before-unmount spoil the shutdown?"

How-To-Repeat:
1. mount a removable device (e.g. a usb stick); better use -r to prevent data loss
2. unplug the device (without unmounting it)
3. `shutdown -r now`
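
The steps above can be sketched as a dry-run plan. This is only an illustration: the device node `/dev/da0s1` and the mount point `/mnt` are assumptions that do not come from this report, and depending on the filesystem a `-t` type option may also be needed for mount. The script only prints the commands, it does not run them, and step 2 stays a manual action.

```python
import shlex


def repro_plan(device="/dev/da0s1", mountpoint="/mnt"):
    """Return the reproduction steps as (kind, text) pairs.

    device and mountpoint are hypothetical placeholders, not values
    taken from the report; adjust them for the system under test.
    """
    return [
        # Step 1: mount read-only (-r) so unplugging cannot lose data.
        ("command", shlex.join(["mount", "-r", device, mountpoint])),
        # Step 2: a physical action, nothing to execute.
        ("manual", "unplug the device without unmounting it"),
        # Step 3: reboot; the report describes the hang during this step.
        ("command", shlex.join(["shutdown", "-r", "now"])),
    ]


if __name__ == "__main__":
    for kind, text in repro_plan():
        print(f"[{kind}] {text}")
```

Keeping the plan as data rather than executing it makes it safe to review on a machine that is not meant to be rebooted.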
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2009-10-18 17:34:44 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
Comment 2 Edward Tomasz Napierala freebsd_committer freebsd_triage 2009-10-18 19:42:21 UTC
Responsible Changed
From-To: freebsd-fs->trasz

I'll take it.
Comment 3 Alexander Best 2009-11-07 17:54:57 UTC
this problem is currently being worked on by Edward Tomasz Napierala. running
r199016 i'm not able to reproduce the problem any longer. the issue seems to
have been (partially, maybe entirely) resolved by commits r19887[3-7].

i don't think trasz@ has finished committing all the necessary changes to HEAD
yet, but it seems this pr will get fixed entirely (in HEAD) in the next few
days.

changes will be mfc'ed to 8-stable, 7-stable and maybe 8.0-release (if re@
approves the changes). don't know how hard it'll be to merge them into
6-stable.

so anyone running current please test the recent changes committed by trasz@.
everybody else running <= 8-stable stay tuned for the fixes to get committed
to those branches.

might be a good idea to set this pr into analyzed or even patched state.

cheers.
alex
Comment 4 Alexander Best 2009-11-09 20:45:05 UTC
issue has been partly solved. detaching a usb device in a clean state and
unmounting it afterwards works.

however neither `umount` nor `umount -f` work in this case:

1) when a read/write to a usb device fails like in this example:
g_vfs_done():da0[READ(offset=311132160, length=65536)]error = 5
the device becomes completely inaccessible.

2) calling umount also fails with a similar error:
g_vfs_done():da0[READ(offset=16384, length=4096)]error = 5
since the device is still present, umount tries to write metadata to it, but
fails. removing the device and then trying to issue umount fails too.

3) in this situation the problem described in this pr still occurs, leaving all
mounted devices tagged dirty after a reboot.

alex
Comment 5 Mark Linimon freebsd_committer freebsd_triage 2009-11-10 08:15:00 UTC
State Changed
From-To: open->analyzed

The issue is partially solved (see Audit-Trail).
Comment 6 Edward Tomasz Napierala freebsd_committer freebsd_triage 2009-11-16 10:29:52 UTC
Message written by Alexander Best on 2009-11-09 at 21:50:
> The following reply was made to PR kern/139718; it has been noted by GNATS.
>
> From: Alexander Best <alexbestms@wwu.de>
> To: <bug-followup@FreeBSD.org>
> Cc:
> Subject: Re: kern/139718: [reboot] all mounted fs don't get synced during
> reboot/shutdown with >= 1 mounted inaccessible device
> Date: Mon, 09 Nov 2009 21:45:05 +0100 (CET)
>
> issue has been partly solved. detaching a usb device in a clean state and
> unmounting it afterwards works.

I was not able to reproduce this.  I don't think anything has changed in this
regard for the last six months - removing the device and unmounting it
afterwards just worked in my tests.

> however neither `umount` nor `umount -f` work in this case:
>
> 1) when a read/write to a usb device fail like in this example:
> g_vfs_done():da0[READ(offset=311132160, length=65536)]error = 5
> the device becomes completely inaccessible.
>
> 2) when calling umount this also fails with a similar error:
> g_vfs_done():da0[READ(offset=16384, length=4096)]error = 5 since the device is
> still present umount tries to write metadata to it, but fails. removing the
> device and then trying to issue umount fails too.

Now, this changes everything.  The device is supposed to fail with error 6,
which is ENXIO, "Device not configured".  This, however, fails with EIO,
"Input/output error".  If the device never returns ENXIO, then GEOM didn't
detach it - in other words, from the system point of view, it still exists.

What do you do to make it fail this way?
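
The EIO versus ENXIO distinction described above can be checked mechanically against kernel messages. The following is a minimal sketch, assuming Python and only the `g_vfs_done()` message format quoted in this PR; the function name `classify` and the verdict strings are made up for illustration and are not part of any FreeBSD tool.

```python
import errno
import re

# Matches messages of the form quoted in this PR, for example:
# g_vfs_done():da0[READ(offset=16384, length=4096)]error = 5
G_VFS_DONE = re.compile(
    r"g_vfs_done\(\):(?P<provider>[^\[]+)\[(?P<op>[A-Z]+)"
    r"\(offset=(?P<offset>\d+),\s*length=(?P<length>\d+)\)\]"
    r"\s*error\s*=\s*(?P<error>\d+)"
)


def classify(line):
    """Return (provider, op, errno name, verdict), or None if no match."""
    m = G_VFS_DONE.search(line)
    if m is None:
        return None
    code = int(m.group("error"))
    name = errno.errorcode.get(code, "unknown")
    if code == errno.ENXIO:
        # Error 6: what a detached device is expected to return.
        verdict = "device reported as gone"
    elif code == errno.EIO:
        # Error 5: I/O failed, but the device was not detached.
        verdict = "I/O failed, device still attached"
    else:
        verdict = "other error"
    return m.group("provider"), m.group("op"), name, verdict


if __name__ == "__main__":
    sample = "g_vfs_done():da0[READ(offset=16384, length=4096)]error = 5"
    print(classify(sample))
```

The numeric values 5 and 6 are looked up through the `errno` module rather than hard-coded, so the names printed match the platform the script runs on.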

> 3) in this situation the problem described in this pr still occurs leaving all
> mounted devices tagged dirty after a reboot.

I think it's a separate problem - it seems that, for some reason, we don't
unmount filesystems properly if one of them fails.

--
If you cut off my head, what would I say?  Me and my head, or me and my body?
Comment 7 Alexander Best 2009-11-16 12:52:58 UTC
Edward Tomasz Napierała wrote on 2009-11-16:
> Message written by Alexander Best on 2009-11-09 at 21:50:
> > The following reply was made to PR kern/139718; it has been noted
> > by GNATS.

> > From: Alexander Best <alexbestms@wwu.de>
> > To: <bug-followup@FreeBSD.org>
> > Cc:
> > Subject: Re: kern/139718: [reboot] all mounted fs don't get synced during
> > reboot/shutdown with >= 1 mounted inaccessible device
> > Date: Mon, 09 Nov 2009 21:45:05 +0100 (CET)

> > issue has been partly solved. detaching a usb device in a clean
> > state and
> > unmounting it afterwards works.

> I was not able to reproduce this.  I don't think anything has changed
> in this regard
> for the last six months - removing the device and unmounting it
> afterwards just worked
> in my tests.

oh i see. to be honest i haven't tried this procedure for a long time. i just
remember that it wasn't possible to unmount inaccessible devices, but that
might have been a long time ago. thought this might be related to the changes i
mentioned, but apparently i was wrong. ;)

> > however neither `umount` nor `umount -f` work in this case:

> > 1) when a read/write to a usb device fail like in this example:
> > g_vfs_done():da0[READ(offset=311132160, length=65536)]error = 5
> > the device becomes completely inaccessible.

> > 2) when calling umount this also fails with a similar error:
> > g_vfs_done():da0[READ(offset=16384, length=4096)]error = 5 since
> > the device is
> > still present umount tries to write metadata to it, but fails.
> > removing the
> > device and then trying to issue umount fails too.

> Now, this changes everything.  The device is supposed to fail with
> error 6, which
> is ENXIO, "Device not configured".  This, however, fails with EIO,
> "Input/output error".
> If the device never returns ENXIO, then GEOM didn't detach it - in
> other words,
> from the system point of view, it still exists.

> What do you do to make it fail this way?

i'm not sure this is reproducible in an easy manner. the problem appears with
only a single umass device. writing large amounts of data to it triggers this
problem. the problem however doesn't seem to be related to the device itself,
but the usb2 stack. i ran several health scans under windows and they reported
no problem with the device.

maybe you could find the exact function returning EIO and replace
return(errno) with return(EIO) so the problem gets triggered with every umass
device. just a thought though.

> > 3) in this situation the problem described in this pr still occurs
> >    leaving all
> > mounted devices tagged dirty after a reboot.

> I think it's a separate problem - it seems that, for some reason, we
> don't unmount
> filesystems properly if one of them fails.

there was a discussion at some point to introduce a -F switch to umount which
should "really" umount devices forcefully. the -f switch doesn't seem to have
any effect in regard to this issue. i don't even know what its purpose is.

> --
> If you cut off my head, what would I say?  Me and my head, or me and
> my body?
Comment 8 Alexander Best 2009-11-25 21:06:30 UTC
have there been any recent developments concerning this problem? because after
some i/o errors:

g_vfs_done():label/usb[WRITE(offset=26853376, length=16384)]error = 5
g_vfs_done():label/usb[READ(offset=435863552, length=36864)]error = 5
vnode_pager_getpages: I/O read error
vm_fault: pager read error, pid 2171 (cp)
g_vfs_done():label/usb[WRITE(offset=26853376, length=16384)]error = 5

i was able to unmount the device without any problems.

alex
Comment 9 Alexander Best 2010-03-01 02:49:21 UTC
i believe this pr can be closed. i'm no longer able to produce a system hang
during reboot.

if the device becomes inaccessible i still get these errors:

Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=671744, length=4096)]error = 5
Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=1667072, length=4096)]error = 5
Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=1671168, length=4096)]error = 5
Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=2603204608, length=16384)]error = 5
Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=2677309440, length=49152)]error = 5
Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=2684502016, length=65536)]error = 5
Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=2694135808, length=32768)]error = 5
Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=512, length=512)]error = 5
Mar  1 03:16:38 otaku kernel: g_vfs_done():label/usb[WRITE(offset=667648, length=4096)]error = 5
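
Bursts of errors like the ones above are easier to compare across test runs when tallied. This is a sketch, assuming Python and only the `g_vfs_done()` message format quoted in this PR; the function name `summarize` is made up for illustration. The pattern allows whitespace, including a line break, between `offset=...,` and `length=...`, so it also works on messages that were wrapped in mail.

```python
import re
from collections import Counter

# Same message shape as quoted in this PR; \s* also spans a line wrap.
ENTRY = re.compile(
    r"g_vfs_done\(\):(?P<provider>[^\[]+)\[(?P<op>[A-Z]+)"
    r"\(offset=(?P<offset>\d+),\s*length=(?P<length>\d+)\)\]"
    r"\s*error\s*=\s*(?P<error>\d+)"
)


def summarize(log_text):
    """Count failed requests and failed bytes per (provider, op, error)."""
    counts = Counter()
    failed_bytes = Counter()
    for m in ENTRY.finditer(log_text):
        key = (m.group("provider"), m.group("op"), int(m.group("error")))
        counts[key] += 1
        failed_bytes[key] += int(m.group("length"))
    return counts, failed_bytes


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py < /var/log/messages
    counts, failed_bytes = summarize(sys.stdin.read())
    for key in sorted(counts):
        print(key, counts[key], "requests,", failed_bytes[key], "bytes")
```

Grouping by provider, operation, and error code keeps READ and WRITE failures apart, which matters here because the thread reports both.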

if i try to do a umount it fails with EAGAIN (which is odd, because
unmount(2) doesn't mention EAGAIN).

when removing the device i get these warnings:

Mar  1 03:16:42 otaku kernel: Device usb went missing before all of the data could be written to it; expect data loss.
Mar  1 03:16:58 otaku kernel: deget(): pcbmap returned 6
Mar  1 03:16:58 otaku last message repeated 2 times

but i'm now able to umount the device properly.

i tried rebooting after seeing the WRITE errors with the device

1) removed and unmounted
2) removed but still mounted and
3) with it being still attached

in all cases the kernel manages to sync all buffers and reboot properly.

alex
Comment 10 Alexander Best freebsd_committer freebsd_triage 2010-09-23 00:12:04 UTC
State Changed
From-To: analyzed->open

It seems this issue still exists. Quite regularly during reboot/shutdown FreeBSD 
is unable to sync all vnodes and times out. Syncing buffers results in a messy 
iterating output which never reaches zero. No timeout gets hit and only a 
physical reset will bring the system back up with all mounted devices being 
marked dirty.
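
One hypothetical way a sync loop can print counts forever without hitting a timeout is a give-up counter that is reset whenever the count drops. The sketch below is purely an illustrative model in Python, not the kernel's shutdown code; the function name `sync_outcome`, the `max_stalled` parameter, and the sample sequences are all assumptions made for this example.

```python
def sync_outcome(busy_counts, max_stalled=20):
    """Walk a sequence of busy-buffer counts and report what happens.

    Returns "synced" once a count of zero is seen, "gave up" once the
    count has failed to decrease for max_stalled consecutive passes, and
    "undecided" if the samples run out before either happens.
    """
    stalled = 0
    previous = None
    for nbusy in busy_counts:
        if nbusy == 0:
            return "synced"
        if previous is not None and nbusy < previous:
            # Progress was made, so the patience counter starts over.
            stalled = 0
        else:
            stalled += 1
        if stalled >= max_stalled:
            return "gave up"
        previous = nbusy
    return "undecided"


if __name__ == "__main__":
    print(sync_outcome([5, 3, 1, 0]))   # count drains normally
    print(sync_outcome([4] * 25))       # count is stuck
    print(sync_outcome([4, 3] * 50))    # count oscillates
```

In this model a stuck count eventually trips the give-up limit, while an oscillating count keeps resetting it, so the loop neither reaches zero nor times out. Whether anything like this happens in the reported case is not established by this PR.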
Comment 11 Edward Tomasz Napierala freebsd_committer freebsd_triage 2014-06-18 12:43:44 UTC
Reset the owner; I'm not working on this in any way.
Comment 12 Eitan Adler freebsd_committer freebsd_triage 2018-05-20 23:57:17 UTC
For bugs matching the following conditions:
- Status == In Progress
- Assignee == "bugs@FreeBSD.org"
- Last Modified Year <= 2017

Do
- Set Status to "Open"
Comment 13 Mark Linimon freebsd_committer freebsd_triage 2025-03-03 02:55:51 UTC
^To submitter: is this still a problem on recent versions of FreeBSD?