Bug 289779 - FreeBSD 15.0 NVMe disk hot add&remove needs reboot to validate
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: standards
Version: 15.0-CURRENT
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Warner Losh
Reported: 2025-09-23 03:36 UTC by Yanhui He
Modified: 2025-12-03 07:35 UTC

Attachments
nvme crash on resume from suspend. (914.87 KB, image/jpeg)
2025-10-16 06:58 UTC, Fabio Comolli

Description Yanhui He 2025-09-23 03:36:15 UTC
I tested an installation from the FreeBSD 15.0 64-bit ALPHA3 ISO image on vSphere and found that the VM needs a reboot before it recognizes an NVMe disk hot-added to an existing NVMe controller, or the hot removal of an NVMe disk from an NVMe controller.

On FreeBSD 14.x, such as FreeBSD 14.3, no reboot is needed for the newly added NVMe disk to be recognized, as shown by the lsblk command:

# lsblk
DEVICE         MAJ:MIN SIZE TYPE                                    LABEL MOUNT
da0              0:86   20G GPT                                         - -
  da0p1          0:87  512K freebsd-boot                     gpt/gptboot0 -
  <FREE>         -:-   492K -                                           - -
  da0p2          0:88  2.0G freebsd-swap                 gpt/freebsd-swap SWAP
  da0p3          0:89   18G freebsd-zfs                   gpt/freebsd-zfs <ZFS>
  <FREE>         -:-   1.0M -                                           - -
nda0             0:83  1.0G -                                           - -
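
One way to check whether a hot add or hot remove was picked up without a reboot is to save an lsblk listing before and after the change and compare the whole-disk entries. The following is a minimal sketch only, assuming the column layout shown above; the helper names are hypothetical, and the BEFORE listing is an assumed pre-hot-add state that is not taken from this report.

```python
# Hypothetical helper, not part of FreeBSD or lsblk: compare two saved
# lsblk listings and report whole-disk entries that appeared or vanished.

def top_level_devices(lsblk_output):
    """Return the set of un-indented (whole-disk) names in an lsblk listing."""
    devices = set()
    for line in lsblk_output.splitlines():
        # Skip blank lines, the header row, and indented partition/<FREE> rows.
        if not line.strip() or line[0].isspace() or line.startswith("DEVICE"):
            continue
        devices.add(line.split()[0])
    return devices


def diff_devices(before, after):
    """Return (added, removed) whole-disk names between two listings."""
    old, new = top_level_devices(before), top_level_devices(after)
    return new - old, old - new


# AFTER is abridged from the report; BEFORE is an assumed pre-hot-add listing.
BEFORE = """\
DEVICE         MAJ:MIN SIZE TYPE                                    LABEL MOUNT
da0              0:86   20G GPT                                         - -
  da0p1          0:87  512K freebsd-boot                     gpt/gptboot0 -
"""
AFTER = BEFORE + """\
nda0             0:83  1.0G -                                           - -
"""

added, removed = diff_devices(BEFORE, AFTER)
print("added:", sorted(added), "removed:", sorted(removed))
```

On an affected 15.0 system the two listings would presumably be identical until the VM is rebooted, so both sets would be empty.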

Would you please take a look at this issue on FreeBSD 15.0?

Thanks!
Yanhui
Comment 1 Fabio Comolli 2025-10-16 06:58:50 UTC
Created attachment 264613
nvme crash on resume from suspend.
Comment 2 Fabio Comolli 2025-10-16 07:00:31 UTC
(In reply to Fabio Comolli from comment #1)
Sorry, the picture was meant to be attached to a different bug.
Comment 3 Yanhui He 2025-11-11 06:31:04 UTC
Hi,

Would you please take a look?

It still happens on FreeBSD 15.0 BETA5.

Thanks!
Yanhui
Comment 4 Yanhui He 2025-11-18 02:42:52 UTC
Still happens on FreeBSD 15.0 RC1.
Comment 5 Warner Losh freebsd_committer freebsd_triage 2025-11-18 20:27:34 UTC
It's not clear to me this is an nvme related issue.
The nvme controller isn't responding at all after the resume. The root cause for that is needed.
If there's a hot remove of root, that's never going to work.

So how do I recreate this with software I might have access to? I don't have vsphere.

So I'm a bit confused. The description on the attachment says something about suspend/resume, but the other talks about a hot add. I am aware that changing the size of an nvme namespace won't work in 15.0, but will shortly in -current. But that's not adding a namespace to an existing controller. I do see a change that might cause that to fail in 15 but not 14.

So I think I need a few more details from this shorthand to really know what's going on.
Comment 6 Fabio Comolli 2025-11-18 21:44:02 UTC
If you're talking about my attachment, please note that it's not related to this bug; it was a mistake on my part. I don't seem to be able to delete it.
Comment 7 Fabio Comolli 2025-11-18 21:51:40 UTC
(In reply to Fabio Comolli from comment #6)
The attachment is related to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=290265
Comment 8 Warner Losh freebsd_committer freebsd_triage 2025-11-18 22:49:22 UTC
I think there's a small chance that

commit 27481c268916b0790c7ad16202a5b012625ce1a8
Author: Warner Losh <imp@FreeBSD.org>
Date:   Tue Nov 18 13:07:11 2025 -0700

    nvme: Fix backwards sense of error condition

    b21e67875bf0c tested for the good condition, not the error condition, so
    we'd never do anything else in this function. This was causing certain
    logging not to happen, and also prevented forthcoming namespace size
    change code from working as well.

    Fixes: b21e67875bf0c
    Sponsored by: Netflix

will fix this specific issue. But it may be too late to get into 15.0. I also think that the rest of the patches from there through


commit 4640f5008922c5b189d2f7b63edf73300277e6df (HEAD -> main, freebsd/main, freebsd/HEAD)
Author: Wanpeng Qian <wanpengqian@gmail.com>
Date:   Tue Nov 18 10:24:13 2025 -0700

    nvme_sim: signal namespace depature

    Signal when the namespace is gone so we can tear down the disk when a
    nvme drive is removed.

    Reviewed by:            imp
    Differential Revision:  https://reviews.freebsd.org/D33032


have an excellent chance of making life a lot better for you in general.
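
For readers unfamiliar with the class of bug described in the first commit message above, the following is a generic illustration of a guard whose sense is backwards. It is a sketch only and is not the nvme driver code; the function names and the two follow-up steps are hypothetical stand-ins for the logging and namespace handling the commit message mentions.

```python
# Illustrative only: a generic early-return guard, NOT the nvme(4) driver code.
# All names here are hypothetical.

def handle_buggy(completion_failed, done):
    # Inverted guard: bails out when the command succeeded, so on the
    # normal success path none of the follow-up work below ever runs.
    if not completion_failed:
        return
    done.append("log event")
    done.append("process namespace change")


def handle_fixed(completion_failed, done):
    # Correct guard: bail out only when the command actually failed.
    if completion_failed:
        return
    done.append("log event")
    done.append("process namespace change")


buggy_work, fixed_work = [], []
handle_buggy(False, buggy_work)   # successful completion
handle_fixed(False, fixed_work)   # successful completion
print(buggy_work, fixed_work)
```

With a successful completion, the inverted version records no work at all, while the corrected version runs both follow-up steps.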
Comment 9 Yanhui He 2025-12-03 06:33:44 UTC
(In reply to Warner Losh from comment #8)
Thanks, Warner! We have finished testing FreeBSD 15.0 and will note this as a known issue for the next release, FreeBSD 15.1.