|Summary:||panic when running as PVHVM under Xen with 4 cores and 4+ network interfaces|
|Product:||Base System||Reporter:||John <john>|
|Component:||kern||Assignee:||freebsd-xen mailing list <xen>|
|Severity:||Affects Many People||CC:||royger|
Description John 2017-08-04 13:16:47 UTC
Created attachment 185015: serial console log
When running FreeBSD 11.0 in a Xen virtual machine with PVHVM support enabled, 4 CPU cores and 4 or more VIF network interfaces, the system always kernel panics. With fewer CPU cores or fewer interfaces it never panics ('always' and 'never' in the context of this bug). Attached is a boot log with 5 virtual interfaces and 4 CPU cores. At the db> prompt I typed bt, as that's about as much as I can do regarding FreeBSD kernel debugging.
Comment 1 John 2017-08-04 14:36:00 UTC
Created attachment 185017: textdump.tar from /var/crash
Comment 5 Roger Pau Monné 2017-08-07 11:50:23 UTC
I've been trying to reproduce this, but so far I'm unable to do so. Have you checked whether removing bios=ovmf solves the problem? Also, do you know anything about the underlying host? Number of physical CPUs and number of vCPUs assigned to Dom0?
Comment 6 Roger Pau Monné 2017-08-07 11:52:44 UTC
Also, if you have access to Dom0 can you paste the output of `xl dmesg`? (would be good if this was done on a hypervisor built with debug=y)
Comment 7 John 2017-08-07 12:14:12 UTC
The VM is booting in EFI mode so removing ovmf wouldn't work. I haven't tried this issue in BIOS mode, so it may be EFI-only. I can access the dom0, but it's not built with debug=y. I do have serial console, so I can access the Xen debug menu. I have attached the xl dmesg. Regarding the underlying hardware: it's a Dell R730xd with 2x E5-2630 v4 (total of 40 threads IIRC). It is a test setup with nothing else running on it, only the dom0 and FreeBSD 11.0 (pfSense 2.4). There is no CPU or vCPU pinning. The reason I'm using OVMF is to have the EFI console available over the virtual serial console (as well as the bootloader and of course the first tty). Underlying disk is a ZVOL for the domU, there is 192GB RAM, so it shouldn't be resource constrained.
Comment 9 Roger Pau Monné 2017-08-07 13:00:51 UTC
Is it a FreeBSD Dom0 or a Linux Dom0? Can you paste the output of `xenstore-ls -fp` when the DomU panics? Thanks, Roger.
Comment 10 John 2017-08-07 13:22:06 UTC
(In reply to Roger Pau Monné from comment #9) It is a Linux Dom0 (Debian 9). I'll have it panic and get the xenstore.
Comment 12 Roger Pau Monné 2017-08-07 14:20:44 UTC
Can you reproduce the same issue using one of the vanilla FreeBSD images? I've tried to reproduce this with both the upstream FreeBSD images and the pfSense install iso, and so far I'm unable to reproduce it. Can you also paste the output of `xl info` in Dom0? Thanks, Roger.
Comment 13 John 2017-08-07 15:06:38 UTC
I can't boot pfSense 2.3 in UEFI mode, that's why I'm using their 2.4 beta.
xl info:
host : xen-1-prod
release : 4.9.0-3-amd64
version : #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26)
machine : x86_64
nr_cpus : 40
max_cpu_id : 191
nr_nodes : 2
cores_per_socket : 10
threads_per_core : 2
cpu_mhz : 2200
hw_caps : b7ebfbff:77fef3ff:2c100800:00000121:00000001:001cbfbb:00000000:00000100
virt_caps : hvm hvm_directio
total_memory : 130976
free_memory : 2045
sharing_freed_memory : 0
sharing_used_memory : 0
outstanding_claims : 0
free_cpus : 0
xen_major : 4
xen_minor : 8
xen_extra : .1
xen_version : 4.8.1
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset :
xen_commandline : console=vga,com1 com1=115200 loglvl=all
cc_compiler : gcc (Debian 6.3.0-18) 6.3.0 20170516
cc_compile_by : ian.jackson
cc_compile_domain : eu.citrix.com
cc_compile_date : Sat Jul 29 13:47:03 CEST 2017
build_id : db76baddf8aca42d05cff58718878d37
xend_config_format : 4
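The topology fields in the `xl info` output above can be cross-checked against the "40 threads" figure mentioned earlier in the thread. The following is a minimal sketch, not part of the original report: the helper name `parse_xl_info` is hypothetical, the sample text is a subset of the fields John pasted, and the calculation assumes one CPU socket per NUMA node, which `xl info` itself does not state.

```python
# Subset of the `xl info` fields pasted in this comment.
XL_INFO_SAMPLE = """\
nr_cpus : 40
nr_nodes : 2
cores_per_socket : 10
threads_per_core : 2
xen_version : 4.8.1
hw_caps : b7ebfbff:77fef3ff:2c100800:00000121:00000001:001cbfbb:00000000:00000100
"""


def parse_xl_info(text):
    """Parse 'key : value' lines into a dict (hypothetical helper).

    Only the first colon splits key from value, so values that
    contain colons themselves (such as hw_caps) are kept intact.
    """
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


info = parse_xl_info(XL_INFO_SAMPLE)

# Assumption: one socket per NUMA node, so nr_nodes stands in for sockets.
expected_threads = (
    int(info["nr_nodes"])
    * int(info["cores_per_socket"])
    * int(info["threads_per_core"])
)

print("reported nr_cpus:", info["nr_cpus"])
print("computed threads:", expected_threads)
```

Under that assumption the computed thread count agrees with the reported `nr_cpus`, so a 4-vCPU guest is far from oversubscribing the host.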
Comment 14 Roger Pau Monné 2017-08-07 15:38:33 UTC
Can you check whether the same happens with a plain vanilla FreeBSD 11.0 image? You can get them from: ftp://ftp.nl.freebsd.org/pub/FreeBSD/releases/VM-IMAGES/11.0-RELEASE/amd64/Latest/
Comment 15 Roger Pau Monné 2017-08-08 08:12:26 UTC
I've installed pfSense-CE-2.4.0-BETA-amd64 on an OVMF VM with 2GB of RAM, 4 vCPUs and 5 network interfaces, and I'm still unable to reproduce it. Can you get a hypervisor with debug enabled? You can pick the sources of the Debian package and rebuild only the hypervisor with debug=y. Without that I'm afraid it's going to be quite difficult to figure out what's going on. Thanks, Roger.
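For anyone else trying to reproduce this, the guest layout described in this comment (OVMF, 2GB of RAM, 4 vCPUs, 5 network interfaces) can be written down as an xl domain configuration. The sketch below generates such a file as text. It is an illustration only, not the configuration either party actually used: the function name `render_xl_config`, the guest name `freebsd-test` and the bridge name `xenbr0` are assumptions, and the disk line is left out because the thread does not give the disk path.

```python
def render_xl_config(name, vcpus, memory_mb, num_vifs, bridge="xenbr0"):
    """Render a minimal xl.cfg for an HVM guest booted through OVMF.

    The guest name and bridge name are placeholders, not values
    taken from the bug report.
    """
    # One VIF entry per requested network interface.
    vifs = ", ".join(f"'bridge={bridge}'" for _ in range(num_vifs))
    lines = [
        f'name = "{name}"',
        'builder = "hvm"',   # HVM guest; FreeBSD then attaches its PV drivers
        'bios = "ovmf"',     # UEFI firmware, as used by the reporter
        f"memory = {memory_mb}",
        f"vcpus = {vcpus}",
        f"vif = [ {vifs} ]",
        'serial = "pty"',    # serial console, used to capture the panic log
        # disk = [ ... ] omitted on purpose: the path is not given in the thread
    ]
    return "\n".join(lines) + "\n"


# 4 vCPUs and 5 interfaces, matching the setup tried in this comment.
config = render_xl_config("freebsd-test", vcpus=4, memory_mb=2048, num_vifs=5)
print(config)
```

Changing `vcpus` or `num_vifs` makes it easy to step through the combinations the reporter says do and do not panic.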
Comment 16 Roger Pau Monné 2017-08-15 10:47:19 UTC
Is there any news on the issue?
Comment 17 John 2017-08-28 12:02:26 UTC
Sorry for the delay, the issue still exists. I reproduced it with https://download.freebsd.org/ftp/releases/VM-IMAGES/11.1-RELEASE/amd64/Latest/FreeBSD-11.1-RELEASE-amd64.raw.xz but I still have no idea how to debug this.
Comment 18 John 2017-08-28 12:18:04 UTC
Oh no, strike that. I booted that one without OVMF and the problem was a different one (it was the Xen-pf bug where all network access is dropped when checksum offloading is on). I jumped to conclusions because it wasn't reachable after booting, but later on I connected to the VNC console of the VM and it was up. I'll try a clean install from the installer disk to a virtual drive. I'll see if I can get a Xen debug build as well.