Bug 295043 - pf: per-packet microuptime() calls from 0493260115db cause major throughput regression with expensive timecounters
Summary: pf: per-packet microuptime() calls from 0493260115db cause major throughput r...
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 15.0-RELEASE
Hardware: amd64 Any
: --- Affects Some People
Assignee: freebsd-pf (Nobody)
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2026-05-06 01:34 UTC by Justin Worrell
Modified: 2026-05-06 14:59 UTC (History)
3 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Justin Worrell 2026-05-06 01:34:48 UTC
Commit 0493260115db ("pf: store state creation/expiration timestamps
with millisecond precision") replaced time_uptime reads with
microuptime() calls via pf_get_uptime() in pf's per-packet state expire
update path. On systems using HPET (common in KVM guests without ITSC),
this is extremely expensive.
 
https://github.com/freebsd/freebsd-src/commit/0493260115db10164246762d6d4923880575e529
 
Measurements (FreeBSD/KVM guest, iperf3, pf enabled with stateful rules):
 
  vtnet,  HPET:     ~2.3 Gbit/s
  vtnet,  TSC-low:  ~9   Gbit/s
  ixl passthrough + VLANs, HPET:    <1   Gbit/s
  ixl passthrough + VLANs, TSC-low: ~9   Gbit/s
  pf disabled, any timecounter:     ~9   Gbit/s
 
  kern.timecounter.choice: TSC-low(-100) i8254(0) ACPI-fast(900) HPET(950) dummy(-1000000)
 
TSC-low is distrusted because the hypervisor doesn't expose ITSC to the
guest, which is common in many KVM configurations.
 
How to reproduce:
 
In a VM with pf enabled and stateful rules:
 
  sysctl kern.timecounter.hardware=HPET
  iperf3 -c <target>    # observe reduced throughput
  sysctl kern.timecounter.hardware=TSC-low
  iperf3 -c <target>    # throughput restored
 
Possible approaches:
 
The millisecond precision is primarily needed by pflow(4) at export
time, not per packet. Defaulting pf_get_uptime() to
(uint64_t)time_uptime * 1000 would preserve the data model changes
while eliminating the per-packet timecounter read. The precise
microuptime() path could be gated behind a sysctl for pflow users who
need it.
 
Alternatively, the patch could use getmicrouptime() instead of
microuptime(). The former returns a cached value without querying the
hardware timecounter, which is what the man page recommends when
execution time matters more than precision.