Bug 256781

Summary: EC2 Nitro: TSC-low timecounter lags
Product: Base System Reporter: martin.stoyanov
Component: kernAssignee: freebsd-virtualization (Nobody) <virtualization>
Status: Closed Overcome By Events    
Severity: Affects Some People CC: cperciva, dgmorris, dwmw2, koobs, martin.stoyanov, royger
Priority: --- Keywords: needs-patch, needs-qa
Version: 13.0-RELEASEFlags: koobs: maintainer-feedback? (cperciva)
koobs: mfc-stable13?
koobs: mfc-stable12?
Hardware: amd64   
OS: Any   
Attachments:
Description Flags
dmesg.boot
none
cpu utilization and credit balance graphs
none
terraform testcase
none
dmesg.boot - 14.0-CURRENT none

Description martin.stoyanov 2021-06-22 19:04:34 UTC
I am trying to use the latest FreeBSD 13 Release AMIs on EC2 and by default they start up with:
kern.timecounter.hardware: TSC-low
which results in a very slow clock. I did these rudimentary tests, hopefully demonstrating the problem. (all date commands were executed one minute apart after switching the timecounter.

root@buildbot_master: # sysctl kern.timecounter.hardware=i8254
kern.timecounter.hardware: TSC-low -> i8254
root@buildbot_master: # date
Tue Jun 22 18:22:45 UTC 2021
root@buildbot_master: # date
Tue Jun 22 18:23:46 UTC 2021

root@buildbot_master: # sysctl kern.timecounter.hardware=ACPI-fast
kern.timecounter.hardware: i8254 -> ACPI-fast
root@buildbot_master: # date
Tue Jun 22 18:24:46 UTC 2021
root@buildbot_master: # date
Tue Jun 22 18:25:45 UTC 2021

root@buildbot_master: # sysctl kern.timecounter.hardware=HPET
kern.timecounter.hardware: ACPI-fast -> HPET
root@buildbot_master: # date
Tue Jun 22 18:26:45 UTC 2021
root@buildbot_master: # date
Tue Jun 22 18:27:45 UTC 2021

root@buildbot_master:/var/service-dir # sysctl kern.timecounter.hardware=TSC-low
kern.timecounter.hardware: HPET -> TSC-low
root@buildbot_master: # date
Tue Jun 22 18:28:16 UTC 2021
root@buildbot_master: # date
Tue Jun 22 18:28:43 UTC 2021

This is the uname of the machine and if you need any additional info I'd be happy to provide it:
root@buildbot_master: # uname -a
FreeBSD buildbot_master 13.0-RELEASE-p1 FreeBSD 13.0-RELEASE-p1 #0: Wed May 26 22:15:09 UTC 2021     root@amd64-builder.daemonology.net:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2021-06-22 20:32:33 UTC
^Triage: affects AWS instances, so assign.
Comment 2 Kubilay Kocak freebsd_committer freebsd_triage 2021-06-26 02:11:00 UTC
@Reporter Could you include /var/run/dmesg.boot as an attachment please
Comment 3 Colin Percival freebsd_committer freebsd_triage 2021-06-26 02:33:44 UTC
Exactly which instance type is this?
Comment 4 martin.stoyanov 2021-06-27 00:54:51 UTC
Created attachment 226069 [details]
dmesg.boot
Comment 5 martin.stoyanov 2021-06-27 00:57:51 UTC
(In reply to Colin Percival from comment #3)
I get this behavior with t3 and t3a instances.
Comment 6 Colin Percival freebsd_committer freebsd_triage 2021-06-27 01:55:14 UTC
It's possible that this is a problem with "stolen" CPU time.

Martin: Is the EC2 "CPU credit balance" positive on that instance?
Comment 7 martin.stoyanov 2021-06-27 03:12:45 UTC
Created attachment 226072 [details]
cpu utilization and credit balance graphs
Comment 8 martin.stoyanov 2021-06-27 03:32:52 UTC
(In reply to Colin Percival from comment #6)

Yes, it is positive, the instance isn't very loaded, and also the same thing happens even if I just let it sit after it starts up. By the time I am able to ssh in when the instance starts the clock is already too far behind to fetch SSM parameters for example... On the other hand if I use configinit to change the timecounter to any of the other options, the time is accurate enough.
Comment 9 Colin Percival freebsd_committer freebsd_triage 2021-06-27 03:38:48 UTC
Hmm, very strange.  I can't reproduce this.

Which EC2 region are you in?  Have you tried other regions?  It's possible that AWS is rolling out a change which breaks things.
Comment 10 martin.stoyanov 2021-06-27 03:47:57 UTC
I am using the us-east-1 region.
Comment 11 Colin Percival freebsd_committer freebsd_triage 2021-06-27 03:49:24 UTC
Ok, that's where I was testing too... have you tried all the different AZs?
Comment 12 martin.stoyanov 2021-06-27 03:56:29 UTC
I think all my tests were in us-east-1a, I don't think I have tested other AZs
Comment 13 Colin Percival freebsd_committer freebsd_triage 2021-06-27 04:43:45 UTC
Can you try other AZs, and also tell me if running something CPU intensive (say, 'dd if=/dev/random | sha256') changes how fast the clock runs?
Comment 14 Colin Percival freebsd_committer freebsd_triage 2021-06-27 04:50:11 UTC
Also, what does `sysctl kern.timecounter` say?
Comment 15 martin.stoyanov 2021-06-28 19:56:58 UTC
Created attachment 226092 [details]
terraform testcase
Comment 16 martin.stoyanov 2021-06-28 20:17:44 UTC
(In reply to Colin Percival from comment #13)

I created a terraform file that launches an instance with this behavior in any region (provided you update the region and the AMI). It seems that if you choose standard cpu_credits mode when launching a t3/t3a instance, TSC-low is 50% slower, while if you choose unlimited - everything works fine (at least in my tests...).
Comment 17 martin.stoyanov 2021-06-28 20:18:18 UTC
(In reply to Colin Percival from comment #14)

ec2-user@freebsd:~ $ sysctl kern.timecounter
kern.timecounter.tsc_shift: 1
kern.timecounter.smp_tsc_adjust: 0
kern.timecounter.smp_tsc: 1
kern.timecounter.invariant_tsc: 1
kern.timecounter.fast_gettime: 1
kern.timecounter.tick: 1
kern.timecounter.choice: i8254(0) ACPI-fast(900) HPET(950) TSC-low(1000) dummy(-1000000)
kern.timecounter.hardware: TSC-low
kern.timecounter.alloweddeviation: 5
kern.timecounter.timehands_count: 2
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.counter: 5840
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.ACPI-fast.quality: 900
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.counter: 15311745
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.HPET.quality: 950
kern.timecounter.tc.HPET.frequency: 62500000
kern.timecounter.tc.HPET.counter: 1237447075
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.TSC-low.quality: 1000
kern.timecounter.tc.TSC-low.frequency: 1426718697
kern.timecounter.tc.TSC-low.counter: 3076039010
kern.timecounter.tc.TSC-low.mask: 4294967295
Comment 18 Colin Percival freebsd_committer freebsd_triage 2021-06-30 03:57:01 UTC
Thanks for all the details.  After some investigation, it looks like there's a specific issue with TSC on T3/T3A instances which affects the clock calibration.

In the long term we should switch over to using the KVM pvclock (which won't have this problem) but that code isn't in the tree yet; for 13.0 I don't think there's any good solution.
Comment 19 Roger Pau Monné freebsd_committer freebsd_triage 2021-06-30 17:38:56 UTC
Can you try to set kern.timecounter.smp_tsc_adjust=1 when using TSC-low and see if that makes a difference?

So this system reports invariant TSC, but the TSCs are not synchronized across CPUs?
Comment 20 Colin Percival freebsd_committer freebsd_triage 2021-06-30 17:42:34 UTC
(In reply to Roger Pau Monné from comment #19)

> So this system reports invariant TSC, but the TSCs are not synchronized across CPUs?

My understanding is that it's even worse than that: The TSC is not synchronized across a *single* CPU, since we're a VM guest and what we see as a single CPU might move between host CPUs which aren't synchronized.
Comment 21 Roger Pau Monné freebsd_committer freebsd_triage 2021-06-30 17:50:16 UTC
(In reply to Colin Percival from comment #20)
That sounds like a bug in their implementation? As the hypervisor should keep the TSC of the virtual CPU synchronised across physical CPU movements. Have you tried to report to AWS?
Comment 22 Colin Percival freebsd_committer freebsd_triage 2021-06-30 17:52:50 UTC
They're the people who told me that we should use pvclock instead. ;-)
Comment 23 Roger Pau Monné freebsd_committer freebsd_triage 2021-06-30 17:59:45 UTC
(In reply to Colin Percival from comment #22)
Oh OK. Maybe they don't do the vCPU TSC adjustments in order to avoid some of the overhead of vCPU scheduling. Not sure exactly what are the requirement/expectations of OSes running on Nitro instances.
Comment 24 martin.stoyanov 2021-06-30 19:07:51 UTC
(In reply to Roger Pau Monné from comment #19)

That seems to help, although one has to edit /boot/loader.conf and reboot the instance to apply it... Also my only test is just running date one minute apart, so if you have any ideas how to test more, I'd be happy to...

root@freebsd:~ # sysctl -a | grep kern.timecounter.smp_tsc_adjust
kern.timecounter.smp_tsc_adjust: 1
root@freebsd:~ # date
Wed Jun 30 19:00:59 UTC 2021
root@freebsd:~ # date
Wed Jun 30 19:01:59 UTC 2021
Comment 25 Colin Percival freebsd_committer freebsd_triage 2021-09-10 16:47:24 UTC
Martin: Can you try the latest 14-CURRENT AMI, and confirm that setting kern.timecounter.hardware=kvmclock fixes it?
Comment 26 martin.stoyanov 2021-09-10 16:57:24 UTC
(In reply to Colin Percival from comment #25)
You mean ami-0c69868f578f6cc8a with this description - "FreeBSD/amd64 main@58a7bf124cc"?(In reply to Colin Percival from comment #25)
Comment 27 Colin Percival freebsd_committer freebsd_triage 2021-09-10 17:48:11 UTC
Yep, that's the one.
Comment 28 martin.stoyanov 2021-09-10 18:15:37 UTC
Created attachment 227814 [details]
dmesg.boot - 14.0-CURRENT
Comment 29 martin.stoyanov 2021-09-10 18:17:58 UTC
(In reply to Colin Percival from comment #27)

Setting kvmclock seems to work as expected:

root@freebsd:~ # sysctl kern.timecounter.hardware=kvmclock
kern.timecounter.hardware: TSC-low -> kvmclock
root@freebsd:~ # date
Fri Sep 10 18:07:12 UTC 2021
root@freebsd:~ # sysctl kern.timecounter
kern.timecounter.tsc_shift: 1
kern.timecounter.smp_tsc_adjust: 0
kern.timecounter.smp_tsc: 1
kern.timecounter.invariant_tsc: 1
kern.timecounter.fast_gettime: 1
kern.timecounter.tick: 1
kern.timecounter.choice: i8254(0) ACPI-fast(900) HPET(950) kvmclock(975) TSC-low(1000) dummy(-1000000)
kern.timecounter.hardware: kvmclock
kern.timecounter.alloweddeviation: 5
kern.timecounter.timehands_count: 2
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.counter: 25411
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.ACPI-fast.quality: 900
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.counter: 5485062
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.HPET.quality: 950
kern.timecounter.tc.HPET.frequency: 62500000
kern.timecounter.tc.HPET.counter: 3218406757
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.kvmclock.quality: 975
kern.timecounter.tc.kvmclock.frequency: 1000000000
kern.timecounter.tc.kvmclock.counter: 2954620152
kern.timecounter.tc.kvmclock.mask: 4294967295
kern.timecounter.tc.TSC-low.quality: 1000
kern.timecounter.tc.TSC-low.frequency: 1100021373
kern.timecounter.tc.TSC-low.counter: 2772787131
kern.timecounter.tc.TSC-low.mask: 4294967295
root@freebsd:~ # date
Fri Sep 10 18:08:06 UTC 2021

However I am not able to reproduce the original issue with this instance - TSC-low seems to be working fine as well...

Please let me know if you need anything else or if I can run more tests...
Comment 30 Colin Percival freebsd_committer freebsd_triage 2021-09-10 19:52:50 UTC
(In reply to martin.stoyanov from comment #29)
Thanks for testing -- indeed, reproducing the original issue seems difficult (I was never able to reproduce it myself).  But my contacts at Amazon say that we should definitely use kvmclock where it's available, so I'm inclined to trust them in the absence of evidence to the contrary.

Do you have any existing instances where you're seeing the issue which you might be able to upgrade to HEAD to test this?
Comment 31 martin.stoyanov 2021-09-13 15:20:28 UTC
(In reply to Colin Percival from comment #30)

With the terraform test case I attached, I am able to reproduce this fairly reliably, but only with the 13.0-RELEASE AMIs. As a test, I started one 13.0-RELEASE instance that had the problem and updated it to 14.0-CURRENT and now it works fine with both TSC-low and kvmclock.
Comment 32 martin.stoyanov 2021-09-13 16:11:54 UTC
(In reply to Colin Percival from comment #30)
I guess I spoke too soon - after a reboot of this instance, this is what I get (with TSC-low):

root@freebsd:~ # uname -a
FreeBSD freebsd 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-n249319-3a0976dea38: Sun Sep 12 14:01:34 UTC 2021     root@freebsd:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64
root@freebsd:~ # sysctl kern.timecounter
kern.timecounter.tsc_shift: 1
kern.timecounter.smp_tsc_adjust: 0
kern.timecounter.smp_tsc: 1
kern.timecounter.invariant_tsc: 1
kern.timecounter.fast_gettime: 1
kern.timecounter.tick: 1
kern.timecounter.choice: i8254(0) ACPI-fast(900) HPET(950) kvmclock(975) TSC-low(1000) dummy(-1000000)
kern.timecounter.hardware: TSC-low
kern.timecounter.alloweddeviation: 5
kern.timecounter.timehands_count: 2
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.counter: 24651
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.ACPI-fast.quality: 900
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.counter: 3169760
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.HPET.quality: 950
kern.timecounter.tc.HPET.frequency: 62500000
kern.timecounter.tc.HPET.counter: 163922095
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.kvmclock.quality: 975
kern.timecounter.tc.kvmclock.frequency: 1000000000
kern.timecounter.tc.kvmclock.counter: 236796286
kern.timecounter.tc.kvmclock.mask: 4294967295
kern.timecounter.tc.TSC-low.quality: 1000
kern.timecounter.tc.TSC-low.frequency: 1244790063
kern.timecounter.tc.TSC-low.counter: 1383859311
kern.timecounter.tc.TSC-low.mask: 4294967295
root@freebsd:~ # date ; sntp pool.ntp.org ; sleep 60 ; date ; sntp pool.ntp.org
Mon Sep 13 15:45:25 UTC 2021
ntpd 4.2.8p15-a (1)
kod_init_kod_db(): Cannot open KoD db file /var/db/ntp-kod: No such file or directory
2021-09-13 15:45:25.093618 (+0000) +194.309232 +/- 129.554879 pool.ntp.org 23.131.160.7 s2 no-leap
Mon Sep 13 15:46:25 UTC 2021
ntpd 4.2.8p15-a (1)
kod_init_kod_db(): Cannot open KoD db file /var/db/ntp-kod: No such file or directory
2021-09-13 15:46:25.245602 (+0000) +270.24435 +/- 180.186988 pool.ntp.org 83.68.137.76 s2 no-leap

and with KVMCLOCK:

root@freebsd:~ # sysctl kern.timecounter
kern.timecounter.tsc_shift: 1
kern.timecounter.smp_tsc_adjust: 0
kern.timecounter.smp_tsc: 1
kern.timecounter.invariant_tsc: 1
kern.timecounter.fast_gettime: 1
kern.timecounter.tick: 1
kern.timecounter.choice: i8254(0) ACPI-fast(900) HPET(950) kvmclock(975) TSC-low(1000) dummy(-1000000)
kern.timecounter.hardware: kvmclock
kern.timecounter.alloweddeviation: 5
kern.timecounter.timehands_count: 2
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.counter: 33596
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.ACPI-fast.quality: 900
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.counter: 10497046
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.HPET.quality: 950
kern.timecounter.tc.HPET.frequency: 62500000
kern.timecounter.tc.HPET.counter: 2364450388
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.kvmclock.quality: 975
kern.timecounter.tc.kvmclock.frequency: 1000000000
kern.timecounter.tc.kvmclock.counter: 1123025369
kern.timecounter.tc.kvmclock.mask: 4294967295
kern.timecounter.tc.TSC-low.quality: 1000
kern.timecounter.tc.TSC-low.frequency: 1244790063
kern.timecounter.tc.TSC-low.counter: 3561554254
kern.timecounter.tc.TSC-low.mask: 4294967295
root@freebsd:~ # date ; sntp pool.ntp.org ; sleep 60 ; date ; sntp pool.ntp.org
Mon Sep 13 16:06:29 UTC 2021
ntpd 4.2.8p15-a (1)
kod_init_kod_db(): Cannot open KoD db file /var/db/ntp-kod: No such file or directory
2021-09-13 16:06:29.878712 (+0000) -0.420875 +/- 0.322316 pool.ntp.org 72.30.35.88 s2 no-leap
Mon Sep 13 16:07:29 UTC 2021
ntpd 4.2.8p15-a (1)
kod_init_kod_db(): Cannot open KoD db file /var/db/ntp-kod: No such file or directory
2021-09-13 16:07:29.984987 (+0000) -0.422455 +/- 0.334305 pool.ntp.org 168.235.86.33 s2 no-leap


I can keep that instance up and give you access if you want...
Comment 33 Colin Percival freebsd_committer freebsd_triage 2021-09-13 21:07:24 UTC
(In reply to martin.stoyanov from comment #32)
Just to confirm: It looks like that instance is having the problem by default, but the problem goes away when you tell it to use kvmclock?
Comment 34 martin.stoyanov 2021-09-13 21:11:00 UTC
(In reply to Colin Percival from comment #33)
Yep, that's exactly right.
Comment 35 Colin Percival freebsd_committer freebsd_triage 2021-09-13 21:49:22 UTC
(In reply to martin.stoyanov from comment #34)
Ok, great.  In that case no need to keep the instance around; we'll get the default timecounter changed and the issue should be gone next release.
Comment 36 martin.stoyanov 2021-09-13 21:50:27 UTC
(In reply to Colin Percival from comment #35)

Works for me; thanks a lot for your time and effort!
Comment 37 dwmw2 2021-11-19 07:39:57 UTC
The TSC shouldn't vary; I suspect what we're seeing here is an issue with the i8253 PIT making the calibration go wrong.

Do you have a current way to switch back from kvmclock to TSC and test, that I can experiment with and then include in regression tests please?
Comment 38 Roger Pau Monné freebsd_committer freebsd_triage 2021-11-19 07:53:25 UTC
(In reply to dwmw2 from comment #37)
You should be able to switch back to TSC at runtime with:
# sysctl kern.timecounter.hardware=TSC
You can also set it in loader.conf to select at boot time. I think older versions of FreeBSD name this TSC-low, so check kern.timecounter.choice for the option name.
Comment 39 Colin Percival freebsd_committer freebsd_triage 2022-03-28 18:49:23 UTC
Sorry for the long radio silence on this issue.  We think this was due to a bug on Amazon's side which has since been fixed, so I'm going to close the bug.

If anyone can still reproduce this, please let me know!