Bug 277196 - Azure reports high CPU usage on ARM64 14.0 Base Install
Summary: Azure reports high CPU usage on ARM64 14.0 Base Install
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: arm
Version: 14.0-RELEASE
Hardware: arm64 Any
Importance: --- Affects Many People
Assignee: freebsd-arm (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-02-20 16:33 UTC by Josh
Modified: 2024-02-26 17:12 UTC
CC List: 4 users

See Also:


Attachments
Azure CPU Usage Standard B2pls v2 (2 vcpus, 4 GiB memory) (28.76 KB, image/png)
2024-02-23 13:57 UTC, Josh
Azure CPU Usage Standard D2pls v5 (2 vcpus, 4 GiB memory) (17.48 KB, image/png)
2024-02-23 13:59 UTC, Josh

Description Josh 2024-02-20 16:33:01 UTC
Overview:
High CPU usage reported using Azure Marketplace image.

Steps to Reproduce:
Build the following VM in Azure
Marketplace Image: thefreebsdfoundation:freebsd-14_0:14_0-release-arm64-gen2-ufs:14.0.0
VM Size: Standard B2pls v2 (2 vcpus, 4 GiB memory)
VM architecture: Arm64 

# Cloud Shell command line (Will need to modify --resource-group)
az vm create --name testarm64 --resource-group rg-test-arm64 --nsg '' --image thefreebsdfoundation:freebsd-14_0:14_0-release-arm64-gen2-ufs:14.0.0 --admin-username abcdroot --admin-password "abcd12341234!" --accept-term --size Standard_B2pls_v2 --public-ip-address ""

A base install should reproduce the issue; no steps are needed other than running the system.

Actual Results:
Azure CPU Metrics reports a CPU Percentage average of 43%.
Running "top" shows the following, with each metric deviating by less than 2%:
CPU 0:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle

Expected Results:
Azure CPU Metrics should be in line with the metrics reported by the OS.
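
To put the guest-side numbers next to the Azure figure, the per-CPU lines from "top" can be reduced to a single busy percentage per CPU. The following is a minimal sketch, not part of the original report: top_busy is a hypothetical helper name, and it assumes the line layout shown under Actual Results, where the idle value directly precedes the word "idle".

```shell
#!/bin/sh
# top_busy (hypothetical helper): read saved "top -P" output on stdin and
# print "<cpu> <busy%>" for each per-CPU line, where busy = 100 - idle.
top_busy() {
    awk '/^CPU [0-9]+:/ {
        cpu = $2
        sub(":", "", cpu)               # "0:" -> "0"
        idle = ""
        for (i = 3; i <= NF; i++) {
            if ($i ~ /^idle/) {         # the value sits just before "idle"
                idle = $(i - 1)
                sub("%", "", idle)
            }
        }
        if (idle != "") printf "%s %.1f\n", cpu, 100 - idle
    }'
}

# Usage, assuming top-output.txt (a hypothetical file) holds saved output:
#   top_busy < top-output.txt
```

With the two lines from the report, this yields a busy value of 0.0 for both CPUs, against the 43% average shown by Azure.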

Build:
FreeBSD testarm64 14.0-RELEASE-p5 FreeBSD 14.0-RELEASE-p5 #0: Tue Feb 13 23:49:05 UTC 2024

Additional Info:
I have tested creating multiple VMs in multiple subscriptions in the EastUS2 and North Central US regions with identical results. 

I have also reached out to Azure and reported the issue.  They state they do not support third party products in the Azure Marketplace and that I should reach out to the vendor for support.
Comment 1 Josh 2024-02-22 18:42:02 UTC
Additional Info:
Tested with a non-burstable VM, "Standard D2pls v5 (2 vcpus, 4 GiB memory)", with the same results.
Comment 2 Wei Hu 2024-02-23 06:43:49 UTC
Josh, could you attach a screenshot of the Azure CPU metrics showing the higher CPU usage?
Comment 3 Josh 2024-02-23 13:57:41 UTC
Created attachment 248695
Azure CPU Usage Standard B2pls v2 (2 vcpus, 4 GiB memory)

The screenshot also shows CPU credits, since this is a burstable VM type. Once the VM runs out of CPU credits, Azure throttles it to around 30% of the total CPU capacity.
Comment 4 Josh 2024-02-23 13:59:14 UTC
Created attachment 248696
Azure CPU Usage Standard D2pls v5 (2 vcpus, 4 GiB memory)

Non-burstable Azure VM
Comment 5 Li-Wen Hsu freebsd_committer freebsd_triage 2024-02-23 21:47:18 UTC
(In reply to Josh from comment #0)
> I have also reached out to Azure and reported the issue.  They state they do not support third party products in the Azure Marketplace and that I should reach out to the vendor for support.

Also, we're interested in how you reached out to Azure: through email or via the issue tracking system? The FreeBSD images in Azure Marketplace are indeed built and uploaded by the FreeBSD project with support from the FreeBSD Foundation, but many technical issues specific to Hyper-V and Azure are solved by the MS engineers. Letting them know there are more FreeBSD users in Azure can help MS allocate more resources to FreeBSD.
Comment 6 Josh 2024-02-23 22:01:27 UTC
I used the issue tracking system.  I still have this ticket open with Azure and have reported additional information to them, specifically, the D2pls VM, as well as the screenshots attached here.  The "Microsoft Azure IaaS Support Engineer" and I have done a remote support session where they were able to witness the issue in real time.  Below is the ticket number and response I received from Microsoft (redacted support names and confidential information):
------------------------------

Subject: Metrics don't match actual usage - TrackingID#2402190040008536

Hello Josh,
 
Thank you for contacting Microsoft Support. I am the Microsoft Azure IaaS Support Engineer who will be working with you on this Service Request.
 
As you have mentioned that you saw and checked that the VM CPU spiked to 43%, therefore I checked the VM ------ details and found that the VM size is Standard_B2pls_v2 which is a 2 core 4GB RAM burstable VM which operates on CPU credit system, and which has a Standard HDD S4 disk attached to it which has a IOPS of 500 and bandwidth of 60 MB/s.

Bpsv2 Series (preview) - Azure Virtual Machines | Microsoft Learn
https://learn.microsoft.com/en-us/azure/virtual-machines/bpsv2-arm

The VM also host the image freebsd which is Third party image and as mentioned in the below article it is not endorsed by Azure therefore, we have a limited scope of support for it on the Guest OS level.
Linux distributions endorsed on Azure - Azure Virtual Machines | Microsoft Learn
https://learn.microsoft.com/en-us/azure/virtual-machines/linux/endorsed-distros
Comment 7 Li-Wen Hsu freebsd_committer freebsd_triage 2024-02-23 22:18:27 UTC
(In reply to Josh from comment #6)
In the same document they do mention FreeBSD https://learn.microsoft.com/en-us/azure/virtual-machines/linux/freebsd-intro-on-azure :)

In fact, the FreeBSD image in Azure Marketplace was built and published by Microsoft in the beginning and for a while afterwards. Then we started a collaboration on technical topics and found that it is better to integrate the release building and publishing process into the official FreeBSD release engineering process, so that the final images become available sooner. So FreeBSD now looks like a third-party product, but in fact the way we work together is similar to that of the endorsed Linux distributions. The MS engineers still work on most things for FreeBSD on the Azure and Hyper-V platforms. The details of this collaboration are published in many issues of the FreeBSD quarterly reports.

That said, it's not wrong to report here (and we do have MS engineers invited here to check this issue), but it's always good to have the issue tracked at MS as well.
Comment 8 Josh 2024-02-23 22:23:52 UTC
(In reply to Li-Wen Hsu from comment #7)
Good to know. Thank you, Li-Wen. I will continue to monitor both, keep testing and researching, and report any relevant information I find.
Comment 9 schakrabarti@microsoft.com 2024-02-26 07:50:34 UTC
(In reply to Josh from comment #4)
Hi Josh, is this happening only on ARM VMs, or on AMD64 VMs as well?
Comment 10 schakrabarti@microsoft.com 2024-02-26 08:00:50 UTC
(In reply to Josh from comment #0)
Also, can you please check the top -P and systat -vm output and share it here, so that we can compare?
Comment 11 Josh 2024-02-26 13:57:49 UTC
top -P

last pid:   952;  load averages:  0.19,  0.32,  0.27    up 0+00:09:06  13:53:54
24 processes:  1 running, 23 sleeping
CPU 0:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 68M Active, 23M Inact, 316M Wired, 264K Buf, 3558M Free
ARC: 67M Total, 37M MFU, 27M MRU, 402K Header, 2536K Other
     52M Compressed, 63M Uncompressed, 1.19:1 Ratio

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
  951 lccaroot      1  20    0    14M  3340K CPU1     1   0:00   0.02% top
  482 root          1  20    0    12M  2252K select   0   0:00   0.02% hv_kvp_daemon
  947 lccaroot      1  20    0    22M    11M select   0   0:00   0.00% sshd
  783 ntpd          1  20    0    22M  7424K select   1   0:00   0.00% ntpd
  845 root          1  20    0    50M    36M select   0   0:00   0.00% python3.9
  899 root          5  20    0    67M    38M select   1   0:01   0.00% python3.9
  943 root          1  20    0    22M    10M select   0   0:00   0.00% sshd
  948 lccaroot      1  20    0    13M  3096K wait     0   0:00   0.00% sh
  729 root          1  20    0    13M  2604K select   0   0:00   0.00% syslogd
  865 root          1  36    0    12M  2200K ttyin    1   0:00   0.00% getty
  807 root          1  20    0    13M  2384K nanslp   1   0:00   0.00% cron
  823 root          1  20    0    22M  9448K select   1   0:00   0.00% sshd
  858 root          1  32    0    12M  2192K ttyin    1   0:00   0.00% getty
  860 root          1  34    0    12M  2200K ttyin    0   0:00   0.00% getty
  863 root          1  34    0    12M  2204K ttyin    1   0:00   0.00% getty
  864 root          1  34    0    12M  2200K ttyin    1   0:00   0.00% getty
  861 root          1  34    0    12M  2192K ttyin    1   0:00   0.00% getty
  857 root          1  32    0    12M  2204K ttyin    1   0:00   0.00% getty
  862 root          1  34    0    12M  2204K ttyin    1   0:00   0.00% getty
  859 root          1  32    0    12M  2200K ttyin    1   0:00   0.00% getty
  289 root          1  68    0    13M  2472K select   0   0:00   0.00% dhclient
  292 root          1   4    0    13M  2620K select   0   0:00   0.00% dhclient
  524 root          1  20    0    14M  3784K select   1   0:00   0.00% devd
  358 _dhcp         1  39    0    13M  2668K select   0   0:00   0.00% dhclient
Comment 12 Josh 2024-02-26 13:58:05 UTC
systat -vm
	
	1 users    Load  0.04  0.22  0.23                  Feb 26 13:56:15
   Mem usage:  10%Phy  6%Kmem                           VN PAGER   SWAP PAGER
Mem:      REAL           VIRTUAL                        in   out     in   out
       Tot   Share     Tot    Share     Free   count
Act 91156K  16668K    511M   20828K    3552M   pages
All 93104K  18512K    543M   45668K                       ioflt  Interrupts
Proc:                                                  57 cow     604 total
  r   p   d    s   w   Csw  Trp  Sys  Int  Sof  Flt    47 zfod      5 gic0,p2: v
              27       526    3   76  360       160       ozfod   241 gic0,p4:-r
                                                         %ozfod   244 gic0,s1: u
 0.0%Sys   0.0%Intr  0.0%User  0.0%Nice  100%Idle         daefr    72 gic0,s33:-
|    |    |    |    |    |    |    |    |    |    |    69 prcfr     1 gic0,s35:-
                                                      148 totfr       gic0,s36:-
                                           dtbuf          react       cpu1:ast
Namei     Name-cache   Dir-cache    146666 maxvn          pdwak    24 cpu0:preem
   Calls    hits   %    hits   %      1523 numvn       30 pdpgs    17 cpu1:preem
      83      83 100                  1008 frevn          intrn       cpu0:rende
                                                     317M wire        cpu1:rende
Disks   da0 pass0                                     72M act         cpu0:hardc
KB/t   9.54  0.00                                     23M inact
tps       5     0                                       0 laund
MB/s   0.05  0.00                                   3552M free
%busy    24     0                                    265K buf
Comment 13 schakrabarti@microsoft.com 2024-02-26 15:33:42 UTC
(In reply to Josh from comment #11)
The top output shows no high CPU usage, and the systat output does not show a high interrupt rate either. Is the high usage showing only in the Azure tool?
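
One way to cross-check the guest-side view independently of top and systat is to sample the kernel's cumulative CPU tick counters twice and compute the busy share of the interval. This is a minimal sketch under stated assumptions: it assumes each kern.cp_time sample holds five counters in the order user, nice, system, interrupt, idle, and cp_time_busy is a hypothetical helper name.

```shell
#!/bin/sh
# cp_time_busy (hypothetical helper): take two kern.cp_time samples, each a
# string of five tick counters "user nice system interrupt idle", and print
# the busy percentage over the interval between them.
cp_time_busy() {
    echo "$1 $2" | awk '{
        total = 0
        for (i = 1; i <= 5; i++) {
            d[i] = $(i + 5) - $i        # ticks spent in state i
            total += d[i]
        }
        if (total > 0)
            printf "%.1f\n", 100 * (total - d[5]) / total
        else
            print "0.0"                 # no ticks elapsed between samples
    }'
}

# Usage on the FreeBSD guest (10-second window):
#   a=$(sysctl -n kern.cp_time); sleep 10; b=$(sysctl -n kern.cp_time)
#   cp_time_busy "$a" "$b"
```

If this also stays near zero while the Azure metric stays around 43%, the discrepancy would point at time that the guest accounts as idle.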
Comment 14 Josh 2024-02-26 17:12:53 UTC
That is correct.  If you look at the attached Azure screenshot "B2pls", you can see that the VM consumes all of its CPU credits, after which Azure clamps down on the processor.

Azure support says that we shouldn't use these types of VMs.  However, the "D2pls" screenshot exhibits the same issue.  I have not attempted to stress the CPU to see whether Azure also clamps down on the D2pls VM.
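
For a side-by-side comparison with guest-side numbers, the host-side figure can be pulled as numbers instead of being read off a screenshot. The following is a sketch using "az monitor metrics list", assuming the platform metric is named "Percentage CPU" and that the VM's full resource ID is passed in; azure_cpu_avg is a hypothetical wrapper name.

```shell
#!/bin/sh
# azure_cpu_avg (hypothetical wrapper): print the per-minute average of the
# "Percentage CPU" platform metric for the VM whose resource ID is in $1.
azure_cpu_avg() {
    az monitor metrics list \
        --resource "$1" \
        --metric "Percentage CPU" \
        --aggregation Average \
        --interval PT1M \
        --output table
}

# Usage from Cloud Shell, with the VM and resource group names from the
# reproduction steps:
#   vm_id=$(az vm show --name testarm64 --resource-group rg-test-arm64 \
#       --query id --output tsv)
#   azure_cpu_avg "$vm_id"
```

Running this for both the B2pls and D2pls VMs would show whether the reported average is the same on the burstable and non-burstable sizes.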