Bug 276734 - bhyve gets stuck during VM boot
Summary: bhyve gets stuck during VM boot
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: bhyve
Version: 14.0-RELEASE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-virtualization (Nobody)
Reported: 2024-01-30 14:34 UTC by Antranig Vartanian
Modified: 2024-02-04 20:27 UTC

Description Antranig Vartanian 2024-01-30 14:34:27 UTC
Roughly one boot in ten, bhyve gets stuck and the VM never boots. It's not clear why.

Here's all the information I can provide. We're happy to reboot the VM to reproduce the hang if more information is needed.

# pgrep -lf bhyve
67937 bhyve -c 200,sockets=2,cores=50,threads=2 -m 1536G -AHP -p 0:28 -p 100:156 -p 1:29 -p 101:157 -p 2:30 -p 102:158 -p 3:31 -p 103:159 -p 4:32 -p 104:160 -p 5:33 -p 105:161 -p 6:34 -p 106:162 -p 7:35 -p 107:163 -p 8:36 -p 108:164 -p 9:37 -p 109:165 -p 10:38 -p 110:166 -p 11:39 -p 111:167 -p 12:40 -p 112:168 -p 13:41 -p 113:169 -p 14:42 -p 114:170 -p 15:43 -p 115:171 -p 16:44 -p 116:172 -p 17:45 -p 117:173 -p 18:46 -p 118:174 -p 19:47 -p 119:175 -p 20:48 -p 120:176 -p 21:49 -p 121:177 -p 22:50 -p 122:178 -p 23:51 -p 123:179 -p 24:52 -p 124:180 -p 25:53 -p 125:181 -p 26:54 -p 126:182 -p 27:55 -p 127:183 -p 28:56 -p 128:184 -p 29:57 -p 129:185 -p 30:58 -p 130:186 -p 31:59 -p 131:187 -p 32:60 -p 132:188 -p 33:61 -p 133:189 -p 34:62 -p 134:190 -p 35:63 -p 135:191 -p 36:64 -p 136:192 -p 37:65 -p 137:193 -p 38:66 -p 138:194 -p 39:67 -p 139:195 -p 40:68 -p 140:196 -p 41:69 -p 141:197 -p 42:70 -p 142:198 -p 43:71 -p 143:199 -p 44:72 -p 144:200 -p 45:73 -p 145:201 -p 46:74 -p 146:202 -p 47:75 -p 147:203 -p 48:76 -p 148:204 -p 49:77 -p 149:205 -p 50:78 -p 150:206 -p 51:79 -p 151:207 -p 52:80 -p 152:208 -p 53:81 -p 153:209 -p 54:82 -p 154:210 -p 55:83 -p 155:211 -p 56:84 -p 156:212 -p 57:85 -p 157:213 -p 58:86 -p 158:214 -p 59:87 -p 159:215 -p 60:88 -p 160:216 -p 61:89 -p 161:217 -p 62:90 -p 162:218 -p 63:91 -p 163:219 -p 64:92 -p 164:220 -p 65:93 -p 165:221 -p 66:94 -p 166:222 -p 67:95 -p 167:223 -p 68:96 -p 168:224 -p 69:97 -p 169:225 -p 70:98 -p 170:226 -p 71:99 -p 171:227 -p 72:100 -p 172:228 -p 73:101 -p 173:229 -p 74:102 -p 174:230 -p 75:103 -p 175:231 -p 76:104 -p 176:232 -p 77:105 -p 177:233 -p 78:106 -p 178:234 -p 79:107 -p 179:235 -p 80:108 -p 180:236 -p 81:109 -p 181:237 -p 82:110 -p 182:238 -p 83:111 -p 183:239 -p 84:112 -p 184:240 -p 85:113 -p 185:241 -p 86:114 -p 186:242 -p 87:115 -p 187:243 -p 88:116 -p 188:244 -p 89:117 -p 189:245 -p 90:118 -p 190:246 -p 91:119 -p 191:247 -p 92:120 -p 192:248 -p 93:121 -p 193:249 -p 94:122 -p 194:250 -p 95:123 -p 195:251 -p 
96:124 -p 196:252 -p 97:125 -p 197:253 -p 98:126 -p 198:254 -p 99:127 -p 199:255 -U 2893e12c-637d-11ee-af94-7cc255269cb8 -u -S -s 0,amd_hostbridge -s 31,lpc -s 4:0,virtio-blk,/usr/local/vm/comp0/disk0.img -s 4:1,virtio-9p,proj=/mnt/proj -s 4:2,virtio-9p,user=/mnt/user -s 4:3,nvme,/dev/zvol/zscratch/scratch -s 5:0,virtio-net,tap0,mac=58:9c:fc:0e:4b:00 -s 6:0,passthru,68/0/1 -l com1,/dev/nmdm-comp0.1A comp0
69027 bhyve: system.grp
69850 bhyve: system.pwd
68574 bhyve: system.pwd
70386 bhyve: system.grp


# procstat -k 67937
  PID    TID COMM                TDNAME              KSTACK                       
67937 117360 bhyve               -                   mi_switch _sleep vm_wait_doms vm_wait_domain vm_page_alloc_noobj_domain uma_small_alloc keg_alloc_slab zone_import zone_alloc_item malloc amdvi_update_mapping iommu_create_mapping vm_iommu_modify vm_assign_pptdev vmmdev_ioctl devfs_ioctl vn_ioctl devfs_ioctl_f 
67937 117420 bhyve               blk-4:0-0           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117421 bhyve               blk-4:0-1           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117422 bhyve               blk-4:0-2           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117423 bhyve               blk-4:0-3           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117424 bhyve               blk-4:0-4           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117425 bhyve               blk-4:0-5           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117426 bhyve               blk-4:0-6           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117427 bhyve               blk-4:0-7           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117430 bhyve               9p-responder        mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117431 bhyve               9p-worker:0         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117432 bhyve               9p-worker:1         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117433 bhyve               9p-worker:2         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117434 bhyve               9p-worker:3         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117435 bhyve               9p-worker:4         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117436 bhyve               9p-worker:5         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117437 bhyve               9p-worker:6         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117438 bhyve               9p-worker:7         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117442 bhyve               9p-responder        mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117443 bhyve               9p-worker:0         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117444 bhyve               9p-worker:1         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117445 bhyve               9p-worker:2         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117446 bhyve               9p-worker:3         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117447 bhyve               9p-worker:4         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117448 bhyve               9p-worker:5         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117449 bhyve               9p-worker:6         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117450 bhyve               9p-worker:7         mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117451 bhyve               blk-4:3-0           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117452 bhyve               blk-4:3-1           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117453 bhyve               blk-4:3-2           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117454 bhyve               blk-4:3-3           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117455 bhyve               blk-4:3-4           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117456 bhyve               blk-4:3-5           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117457 bhyve               blk-4:3-6           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117458 bhyve               blk-4:3-7           mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117459 bhyve               nvme-aen-4:3        mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 
67937 117460 bhyve               vtnet-5:0 tx        mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall fast_syscall_common 


# procstat fds -C 67937
  PID COMM                FD T FLAGS    CAPABILITIES                                                                                                                                                                                                                                PRO NAME        
67937 bhyve             text v r------- -                                                                                                                                                                                                                                           -   /usr/sbin/bhyve   
67937 bhyve             ctty v rw------ -                                                                                                                                                                                                                                           -   /dev/pts/4        
67937 bhyve              cwd v r------- -                                                                                                                                                                                                                                           -   /                 
67937 bhyve             root v r------- -                                                                                                                                                                                                                                           -   /                 
67937 bhyve                0 v r------- rd,wr,se,mm,cr,fe,fy,ft,cd,cf,cm,cn,fc,fl,fp,fk,fs,sf,fu,ls,lt,md,mf,mn,rs,rt,sl,un,lo,eg,es,ed,el,ag,as,ad,ac,at,bd,co,pn,sn,gs,ln,pf,ss,sh,mg,ms,sg,sp,sw,ev,ke,kc,io,ty,pg,pw,pk,ba,ca,prd,pwr,mmr,mmw,mmx,mrw,mrx,mwx,mma,re,sd,scl,ssr -   /dev/null         
67937 bhyve                1 v -w------ rd,wr,se,mm,cr,fe,fy,ft,cd,cf,cm,cn,fc,fl,fp,fk,fs,sf,fu,ls,lt,md,mf,mn,rs,rt,sl,un,lo,eg,es,ed,el,ag,as,ad,ac,at,bd,co,pn,sn,gs,ln,pf,ss,sh,mg,ms,sg,sp,sw,ev,ke,kc,io,ty,pg,pw,pk,ba,ca,prd,pwr,mmr,mmw,mmx,mrw,mrx,mwx,mma,re,sd,scl,ssr -   /dev/null         
67937 bhyve                2 v -w------ rd,wr,se,mm,cr,fe,fy,ft,cd,cf,cm,cn,fc,fl,fp,fk,fs,sf,fu,ls,lt,md,mf,mn,rs,rt,sl,un,lo,eg,es,ed,el,ag,as,ad,ac,at,bd,co,pn,sn,gs,ln,pf,ss,sh,mg,ms,sg,sp,sw,ev,ke,kc,io,ty,pg,pw,pk,ba,ca,prd,pwr,mmr,mmw,mmx,mrw,mrx,mwx,mma,re,sd,scl,ssr -   /usr/local/vm/comp0/bhyve.log
67937 bhyve                3 v rw------                                                                                                                                                                                                    rd,wr,se,mm,io,prd,pwr,mmr,mmw,mrw,re,sd -   /dev/vmm/comp0    
67937 bhyve                4 k rw------                                                                                                                                                                                                                                       ke,kc -   -                 
67937 bhyve                5 v rw------                                                                                                                                                                                                       rd,wr,se,fy,fp,fs,ev,io,prd,pwr,re,sd -   /usr/local/vm/comp0/disk0.img
67937 bhyve                6 v r-------                                                                                                                                        rd,wr,se,cr,fy,ft,cm,cn,fp,fs,sf,fu,ls,lt,md,mn,rs,rt,sl,un,lo,eg,es,ed,el,ag,as,ad,ac,prd,pwr,re,sd -   /mnt/proj         
67937 bhyve                7 v r-------                                                                                                                                        rd,wr,se,cr,fy,ft,cm,cn,fp,fs,sf,fu,ls,lt,md,mn,rs,rt,sl,un,lo,eg,es,ed,el,ag,as,ad,ac,prd,pwr,re,sd -   /mnt/user         
67937 bhyve                8 s rw---n-- rd,wr,se,mm,cr,fe,fy,ft,cd,cf,cm,cn,fc,fl,fp,fk,fs,sf,fu,ls,lt,md,mf,mn,rs,rt,sl,un,lo,eg,es,ed,el,ag,as,ad,ac,at,bd,co,pn,sn,gs,ln,pf,ss,sh,mg,ms,sg,sp,sw,ev,ke,kc,io,ty,pg,pw,pk,ba,ca,prd,pwr,mmr,mmw,mmx,mrw,mrx,mwx,mma,re,sd,scl,ssr UDS 0 0 -
67937 bhyve                9 v rw------                                                                                                                                                                                                       rd,wr,se,fy,fp,fs,ev,io,prd,pwr,re,sd -   /dev/zvol/zscratch/scratch
67937 bhyve               10 s rw---n-- rd,wr,se,mm,cr,fe,fy,ft,cd,cf,cm,cn,fc,fl,fp,fk,fs,sf,fu,ls,lt,md,mf,mn,rs,rt,sl,un,lo,eg,es,ed,el,ag,as,ad,ac,at,bd,co,pn,sn,gs,ln,pf,ss,sh,mg,ms,sg,sp,sw,ev,ke,kc,io,ty,pg,pw,pk,ba,ca,prd,pwr,mmr,mmw,mmx,mrw,mrx,mwx,mma,re,sd,scl,ssr UDS 0 0 -
67937 bhyve               11 s rw---n-- rd,wr,se,mm,cr,fe,fy,ft,cd,cf,cm,cn,fc,fl,fp,fk,fs,sf,fu,ls,lt,md,mf,mn,rs,rt,sl,un,lo,eg,es,ed,el,ag,as,ad,ac,at,bd,co,pn,sn,gs,ln,pf,ss,sh,mg,ms,sg,sp,sw,ev,ke,kc,io,ty,pg,pw,pk,ba,ca,prd,pwr,mmr,mmw,mmx,mrw,mrx,mwx,mma,re,sd,scl,ssr UDS 0 0 -
67937 bhyve               12 v rw---n--                                                                                                                                                                                                                              rd,wr,ev,re,sd -   /dev/tap0         
67937 bhyve               13 s rw---n-- rd,wr,se,mm,cr,fe,fy,ft,cd,cf,cm,cn,fc,fl,fp,fk,fs,sf,fu,ls,lt,md,mf,mn,rs,rt,sl,un,lo,eg,es,ed,el,ag,as,ad,ac,at,bd,co,pn,sn,gs,ln,pf,ss,sh,mg,ms,sg,sp,sw,ev,ke,kc,io,ty,pg,pw,pk,ba,ca,prd,pwr,mmr,mmw,mmx,mrw,mrx,mwx,mma,re,sd,scl,ssr UDS 0 0 -
67937 bhyve               14 v rw------                                                                                                                                                                                                                              rd,wr,io,re,sd -   /dev/pci          

# ps aux | grep bhyve
root      68574     0.0  0.0 1610652128 399900  -  Is   13:16       0:00.00 bhyve: system.pwd (bhyve)
root      69027     0.0  0.0 1610652128 399900  -  Is   13:16       0:00.00 bhyve: system.grp (bhyve)
root      69850     0.0  0.0 1610653424 400128  -  Is   13:16       0:00.00 bhyve: system.pwd (bhyve)
root      70386     0.0  0.0 1610653424 400124  -  Is   13:16       0:00.00 bhyve: system.grp (bhyve)
root      67937     0.0  0.0 1610693008 401856  4  D    13:16       0:16.77 bhyve -c 200,sockets=2,cores=50,threads=2 -m 1536G -AHP -p 0:28 -p 100:156 -p 1:29 -p 101:157 -p 2:30 -p 102:158 -p 3:31 -p 103:159 -p 4:32 -p 


Any tips to debug would be appreciated.

Cheers!
Comment 1 Antranig Vartanian 2024-01-30 14:38:28 UTC
Just noticed: I can't kill the VM instance, nor can I run bhyvectl --destroy.

# bhyvectl --destroy --vm=comp0
load: 0.56  cmd: bhyvectl 48937 [devdrn] 5.61r 0.00u 0.00s 0% 2184k

It has been stuck like this for a while, and memory usage hasn't changed.
Comment 2 Mark Johnston freebsd_committer freebsd_triage 2024-02-04 20:27:41 UTC
67937 117360 bhyve               -                   mi_switch _sleep vm_wait_doms vm_wait_domain vm_page_alloc_noobj_domain uma_small_alloc keg_alloc_slab zone_import zone_alloc_item malloc amdvi_update_mapping iommu_create_mapping vm_iommu_modify vm_assign_pptdev vmmdev_ioctl devfs_ioctl vn_ioctl devfs_ioctl_f

This means that one of the bhyve threads is stuck in the kernel waiting for memory.  Is this running on a NUMA system?
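
For anyone triaging a similar hang, here is a minimal sketch of how the one unusual thread can be picked out of a long `procstat -k` listing automatically. It is not part of bhyve or procstat; the names `ROW`, `parse_kstacks`, `unusual_threads`, and `SAMPLE` are hypothetical, and the parser assumes the column layout shown in the description (thread names padded with at least two spaces before the stack).

```python
import re
from collections import Counter

# Assumed layout of `procstat -k` rows: PID, TID, COMM, TDNAME, KSTACK, with
# TDNAME followed by at least two spaces before the stack begins.
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\S+)\s+(.*?)\s{2,}(\S.*)$")


def parse_kstacks(text):
    """Return a list of (tid, tdname, stack_frames) for each thread row."""
    threads = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:  # header, shell prompt, or blank line
            continue
        _pid, tid, _comm, tdname, kstack = m.groups()
        threads.append((int(tid), tdname, tuple(kstack.split())))
    return threads


def unusual_threads(threads):
    """Return the threads whose stack differs from the most common stack."""
    if not threads:
        return []
    common, _ = Counter(stack for _, _, stack in threads).most_common(1)[0]
    return [t for t in threads if t[2] != common]


# Three rows copied from the report: the stuck thread plus two idle workers.
UMTX = ("mi_switch sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep "
        "do_wait __umtx_op_wait_uint_private sys__umtx_op amd64_syscall "
        "fast_syscall_common")
SAMPLE = "\n".join([
    "  PID    TID COMM                TDNAME              KSTACK",
    "67937 117360 bhyve               -                   mi_switch _sleep "
    "vm_wait_doms vm_wait_domain vm_page_alloc_noobj_domain uma_small_alloc "
    "keg_alloc_slab zone_import zone_alloc_item malloc amdvi_update_mapping "
    "iommu_create_mapping vm_iommu_modify vm_assign_pptdev vmmdev_ioctl "
    "devfs_ioctl vn_ioctl devfs_ioctl_f",
    "67937 117420 bhyve               blk-4:0-0           " + UMTX,
    "67937 117460 bhyve               vtnet-5:0 tx        " + UMTX,
])

for tid, tdname, stack in unusual_threads(parse_kstacks(SAMPLE)):
    # Frames are listed innermost first, as procstat prints them.
    print(tid, tdname, " ".join(stack))
```

On a live system the same functions could be fed the text captured from `procstat -k <pid>`. Treating the most common stack as "idle" is only a heuristic; it works here because every worker thread is parked in the same umtx sleep.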