283116 – ntpd doesn't sync with any NTP servers on IPv6-Only host

Summary: ntpd doesn't sync with any NTP servers on IPv6-Only host

Status:	Closed FIXED

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	bin
Version:	14.2-RELEASE
Hardware:	arm64 Any

Importance:	--- Affects Many People
Assignee:	Cy Schubert

URL:
Keywords:	pkgbase

Depends on:
Blocks:

Reported:	2024-12-04 07:38 UTC by Dmitrij
Modified:	2025-02-24 15:26 UTC
CC List:	8 users

See Also:

Attachments
Correctly setuid to ntpd. (1.91 KB, patch) 2024-12-12 21:39 UTC, Cy Schubert

Description Dmitrij 2024-12-04 07:38:14 UTC

After the OS upgrade from 14.1-RELEASE to 14.2-RELEASE, ntpd doesn't sync with any NTP servers on an IPv6-only host (arm64). Dual-stack IPv4+IPv6 hosts (amd64) sync successfully with both IPv6 and IPv4 servers.

Observed behavior via tcpdump:
The host sends AAAA DNS queries and receives answers with the addresses of available IPv6 NTP servers.
IPv6 NTP traffic flows in both directions while ntpd is running (no other NTP software is active).

But ntpd stays in the unsynced state (observed for more than 1 hour):
# ntpq -nc peers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 0.freebsd.pool. .POOL.          16 p    -   64    0    0.000   +0.000   0.000
 2.freebsd.pool. .POOL.          16 p    -   64    0    0.000   +0.000   0.000

# ntptime
ntp_gettime() returns code 5 (ERROR)
  time eafa82ab.8361f000  Wed, Dec  4 2024  9:26:35.513, (.100513213),
  maximum error 16409000 us, estimated error 16000000 us, TAI offset 0
ntp_adjtime() returns code 5 (ERROR)
  modes 0x0 (),
  offset 0.000 us, frequency 6.607 ppm, interval 4 s,
  maximum error 16409000 us, estimated error 16000000 us,
  status 0x41 (PLL,UNSYNC),
  time constant 3, precision 0.000 us, tolerance 496 ppm,
  pps frequency 6.607 ppm, stability 0.000 ppm, jitter 0.000 us,
  intervals 0, jitter exceeded 0, stability exceeded 0, errors 0.

No errors or warnings are reported by ntpd via syslog (such as "error resolving pool").

Sample NTP traffic:
# tcpdump -n -p -v port ntp
tcpdump: listening on vtnet0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
09:27:03.286586 IP6 (class 0xb8, hlim 64, next-header UDP (17) payload length: 56) <Host IPv6>.123 > 2a01:4f9:c012:46b2::123.123: [bad udp cksum 0xa92a -> 0x228a!] NTPv4, Client, length 48
	Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 6 (64s), precision -23
	Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
	  Reference Timestamp:  0.000000000
	  Originator Timestamp: 0.000000000
	  Receive Timestamp:    0.000000000
	  Transmit Timestamp:   3942286023.286500161 (2024-12-04T07:27:03Z)
	    Originator - Receive Timestamp:  0.000000000
	    Originator - Transmit Timestamp: 3942286023.286500161 (2024-12-04T07:27:03Z)
09:27:03.287801 IP6 (hlim 56, next-header UDP (17) payload length: 56) 2a01:4f9:c012:46b2::123.123 > <Host IPv6>.123: [udp sum ok] NTPv4, Server, length 48
	Leap indicator:  (0), Stratum 3 (secondary reference), poll 6 (64s), precision -24
	Root Delay: 0.002014, Root dispersion: 0.000717, Reference-ID: 0xc2643197
	  Reference Timestamp:  3942285568.890700930 (2024-12-04T07:19:28Z)
	  Originator Timestamp: 3942286023.286500161 (2024-12-04T07:27:03Z)
	  Receive Timestamp:    3942286022.776204334 (2024-12-04T07:27:02Z)
	  Transmit Timestamp:   3942286022.776319710 (2024-12-04T07:27:02Z)
	    Originator - Receive Timestamp:  -0.510295826
	    Originator - Transmit Timestamp: -0.510180450
09:27:04.231791 IP6 (class 0xb8, hlim 64, next-header UDP (17) payload length: 56) <Host IPv6>.123 > 2a01:4f9:3081:399c::4.123: [bad udp cksum 0x0b64 -> 0x66b9!] NTPv4, Client, length 48
	Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 6 (64s), precision -23
	Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
	  Reference Timestamp:  0.000000000
	  Originator Timestamp: 0.000000000
	  Receive Timestamp:    0.000000000
	  Transmit Timestamp:   3942286024.231696767 (2024-12-04T07:27:04Z)
	    Originator - Receive Timestamp:  0.000000000
	    Originator - Transmit Timestamp: 3942286024.231696767 (2024-12-04T07:27:04Z)
09:27:04.235382 IP6 (flowlabel 0x7f2fc, hlim 57, next-header UDP (17) payload length: 56) 2a01:4f9:3081:399c::4.123 > <Host IPv6>.123: [udp sum ok] NTPv4, Server, length 48
	Leap indicator:  (0), Stratum 3 (secondary reference), poll 6 (64s), precision -25
	Root Delay: 0.004394, Root dispersion: 0.000991, Reference-ID: 0xc8634fed
	  Reference Timestamp:  3942285499.520158031 (2024-12-04T07:18:19Z)
	  Originator Timestamp: 3942286024.231696767 (2024-12-04T07:27:04Z)
	  Receive Timestamp:    3942286023.724094255 (2024-12-04T07:27:03Z)
	  Transmit Timestamp:   3942286023.724137758 (2024-12-04T07:27:03Z)
	    Originator - Receive Timestamp:  -0.507602511
	    Originator - Transmit Timestamp: -0.507559008
09:27:05.230506 IP6 (class 0xb8, hlim 64, next-header UDP (17) payload length: 56) <Host IPv6>.123 > 2606:4700:f1::1.123: [bad udp cksum 0xe040 -> 0xf216!] NTPv4, Client, length 48
	Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 6 (64s), precision -23
	Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
	  Reference Timestamp:  0.000000000
	  Originator Timestamp: 0.000000000
	  Receive Timestamp:    0.000000000
	  Transmit Timestamp:   3942286025.230409312 (2024-12-04T07:27:05Z)
	    Originator - Receive Timestamp:  0.000000000
	    Originator - Transmit Timestamp: 3942286025.230409312 (2024-12-04T07:27:05Z)
09:27:05.232203 IP6 (flowlabel 0x7bc5b, hlim 57, next-header UDP (17) payload length: 56) 2606:4700:f1::1.123 > <Host IPv6>.123: [udp sum ok] NTPv4, Server, length 48
	Leap indicator:  (0), Stratum 3 (secondary reference), poll 6 (64s), precision -25
	Root Delay: 0.006256, Root dispersion: 0.000198, Reference-ID: 0x0a4f0920
	  Reference Timestamp:  3942285853.275892145 (2024-12-04T07:24:13Z)
	  Originator Timestamp: 3942286025.230409312 (2024-12-04T07:27:05Z)
	  Receive Timestamp:    3942286024.720292848 (2024-12-04T07:27:04Z)
	  Transmit Timestamp:   3942286024.720433025 (2024-12-04T07:27:04Z)
	    Originator - Receive Timestamp:  -0.510116463
	    Originator - Transmit Timestamp: -0.509976287
09:27:06.283868 IP6 (class 0xb8, hlim 64, next-header UDP (17) payload length: 56) <Host IPv6>.123 > 2001:67c:164:200::184:123.123: [bad udp cksum 0x9ed0 -> 0xe924!] NTPv4, Client, length 48
	Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 6 (64s), precision -23
	Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
	  Reference Timestamp:  0.000000000
	  Originator Timestamp: 0.000000000
	  Receive Timestamp:    0.000000000
	  Transmit Timestamp:   3942286026.283772916 (2024-12-04T07:27:06Z)
	    Originator - Receive Timestamp:  0.000000000
	    Originator - Transmit Timestamp: 3942286026.283772916 (2024-12-04T07:27:06Z)
09:27:06.285732 IP6 (flowlabel 0xa1bf5, hlim 56, next-header UDP (17) payload length: 56) 2001:67c:164:200::184:123.123 > <Host IPv6>.123: [udp sum ok] NTPv4, Server, length 48
	Leap indicator:  (0), Stratum 2 (secondary reference), poll 6 (64s), precision -25
	Root Delay: 0.001174, Root dispersion: 0.001037, Reference-ID: 0xc26402c2
	  Reference Timestamp:  3942285199.430308981 (2024-12-04T07:13:19Z)
	  Originator Timestamp: 3942286026.283772916 (2024-12-04T07:27:06Z)
	  Receive Timestamp:    3942286025.773833398 (2024-12-04T07:27:05Z)
	  Transmit Timestamp:   3942286025.773961652 (2024-12-04T07:27:05Z)
	    Originator - Receive Timestamp:  -0.509939518
	    Originator - Transmit Timestamp: -0.509811264
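
Editorial note: the raw timestamps in the tcpdump and ntptime output above are NTP timestamps, counted in seconds since 1900-01-01 rather than since the Unix epoch. The following is a minimal Python sketch for converting them when reading such traces. The function names are hypothetical helpers and not part of any NTP tool, and era 0 is assumed.

```python
from datetime import datetime, timedelta, timezone

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_DELTA = 2_208_988_800
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ntp_to_utc(ntp_seconds: float) -> datetime:
    """Convert an NTP era-0 timestamp (seconds since 1900) to an aware UTC datetime."""
    return UNIX_EPOCH + timedelta(seconds=ntp_seconds - NTP_UNIX_DELTA)


def ntptime_hex_to_utc(hex_stamp: str) -> datetime:
    """Convert ntptime's hexadecimal 'seconds.fraction' form, e.g. 'eafa82ab.8361f000'."""
    sec_hex, frac_hex = hex_stamp.split(".")
    # The fraction is a 32-bit binary fraction of one second.
    seconds = int(sec_hex, 16) + int(frac_hex, 16) / 2**32
    return ntp_to_utc(seconds)


if __name__ == "__main__":
    # Transmit timestamp of the first client packet in the capture.
    print(ntp_to_utc(3942286023).isoformat())
    # 'time' field from the ntptime output in the description.
    print(ntptime_hex_to_utc("eafa82ab.8361f000").isoformat())
```

The first value converts to 2024-12-04T07:27:03 UTC, which agrees with the rendering tcpdump prints next to the raw number.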

Comment 1 Dmitrij 2024-12-04 07:48:14 UTC

All mentioned hosts are running at kern.securelevel=2

Comment 2 Dmitrij 2024-12-05 07:27:16 UTC

Tried running ntpd with --ipv6 (for about an hour); the result is the same:
IPv6 NTP traffic is present, but ntpd remains unsynced.

Comment 3 Mark Johnston freebsd_committer

2024-12-06 17:27:12 UTC

It might be worth running ntpd under ktrace to see if there are some errors not getting logged.

Does the problem occur with a reduced securelevel as well?

Comment 4 Cy Schubert freebsd_committer

2024-12-06 17:31:26 UTC

grep ntpd /var/log/messages. There should be errors in it.

Also stop ntpd. Then,

ntpd -D9 -n -6

Post the output here.

Comment 5 Cy Schubert freebsd_committer

2024-12-06 17:43:46 UTC

I've been able to reproduce this problem on my sandbox.

I will reach out to my nwtime.org contact to see what he thinks.

Comment 6 Cy Schubert freebsd_committer

2024-12-06 17:48:04 UTC

The email I sent him states this:

I've been able to reproduce this here at home using ntpd -D9 -n -6

The messages I see are:

 6 Dec 09:45:18 ntpd[3102]: 2600:1f11:a50:1100::be00:5 local addr <null> -> fc00:1:1:1::7
 6 Dec 09:45:19 ntpd[3102]: Soliciting pool server 2001:678:8::123
 6 Dec 09:45:19 ntpd[3102]: 2001:678:8::123 local addr <null> -> fc00:1:1:1::7
 6 Dec 09:45:20 ntpd[3102]: Soliciting pool server 2607:5300:201:3100::5d2e
 6 Dec 09:45:20 ntpd[3102]: 2607:5300:201:3100::5d2e local addr <null> -> fc00:1:1:1::7
 6 Dec 09:45:21 ntpd[3102]: Soliciting pool server 2602:fbc7:2:0:beef:d00d:f00b:5555
 6 Dec 09:45:21 ntpd[3102]: 2602:fbc7:2:0:beef:d00d:f00b:5555 local addr <null> -> fc00:1:1:1::7
 6 Dec 09:45:22 ntpd[3102]: Soliciting pool server 2600:1f11:a50:1100::be00:5
 6 Dec 09:45:22 ntpd[3102]: 2600:1f11:a50:1100::be00:5 local addr <null> -> fc00:1:1:1::7
 6 Dec 09:45:23 ntpd[3102]: Soliciting pool server 2001:678:8::123
 6 Dec 09:45:23 ntpd[3102]: 2001:678:8::123 local addr <null> -> fc00:1:1:1::7
 6 Dec 09:45:24 ntpd[3102]: Soliciting pool server 2607:5300:201:3100::5d2e
 6 Dec 09:45:24 ntpd[3102]: 2607:5300:201:3100::5d2e local addr <null> -> fc00:1:1:1::7
 6 Dec 09:45:25 ntpd[3102]: Soliciting pool server 2602:fbc7:2:0:beef:d00d:f00b:5555
 6 Dec 09:45:25 ntpd[3102]: 2602:fbc7:2:0:beef:d00d:f00b:5555 local addr <null> -> fc00:1:1:1::7
 6 Dec 09:45:26 ntpd[3102]: Soliciting pool server 2600:1f11:a50:1100::be00:5
 6 Dec 09:45:26 ntpd[3102]: 2600:1f11:a50:1100::be00:5 local addr <null> -> fc00:1:1:1::7
^C 6 Dec 09:45:27 ntpd[3102]: ntpd exiting on signal 2 (Interrupt)


Whereas ntpd -D9 -n -4 works correctly:

 6 Dec 09:46:07 ntpd[3103]: Soliciting pool server 23.133.168.244
 6 Dec 09:46:08 ntpd[3103]: Soliciting pool server 216.232.132.102
 6 Dec 09:46:08 ntpd[3103]: 216.232.132.102 local addr <null> -> 10.1.1.7
 6 Dec 09:46:09 ntpd[3103]: Soliciting pool server 23.133.168.246
 6 Dec 09:46:09 ntpd[3103]: 23.133.168.246 local addr <null> -> 10.1.1.7
 6 Dec 09:46:10 ntpd[3103]: Soliciting pool server 23.162.240.10
 6 Dec 09:46:10 ntpd[3103]: 23.162.240.10 local addr <null> -> 10.1.1.7
 6 Dec 09:46:11 ntpd[3103]: Soliciting pool server 216.232.132.95
 6 Dec 09:46:11 ntpd[3103]: 216.232.132.95 local addr <null> -> 10.1.1.7

Comment 7 Cy Schubert freebsd_committer

2024-12-06 17:48:39 UTC

While he and his team are looking at it I will dig further.

Comment 8 Cy Schubert freebsd_committer

2024-12-06 19:00:02 UTC

I've been able to reproduce this problem under Fedora 41.

Comment 9 Cy Schubert freebsd_committer

2024-12-06 19:14:20 UTC

It appears on the 14.1-RELEASE VM I have here it doesn't even try to use IPv6, even with the -6 flag.

Comment 10 Cy Schubert freebsd_committer

2024-12-06 19:25:57 UTC

Same bug under FreeBSD as Fedora. Fedora lists interrupted system call in /var/log/messages. FreeBSD doesn't but truss lists interrupted system call from a select() call. This is due to a popped alarm terminating the call when it doesn't return.

Comment 11 Cy Schubert freebsd_committer

2024-12-06 21:06:08 UTC

This appears to be a DNS query issue. When ntpd issues getaddrinfo() against a name it uses the IPv4 address.

There are two ways to resolve this:

1. Replace the symbolic names in your ntp.conf with IPv6 addresses.

2. Add -6 to your server, peer, and pool statements.

slippy# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 0.freebsd.pool. .POOL.          16 p    -   64    0    0.000   +0.000   0.000
 2.freebsd.pool. .POOL.          16 p    -   64    0    0.000   +0.000   0.000
 2001:470:b2de:: 97.183.206.88    2 u    2   64    1   65.916   +3.024   1.700
 2602:fbc7:2:0:b 207.197.87.124   4 u    1   64    1   58.862   +9.446   2.449
 2600:1f11:a50:1 16.164.40.197    2 u    2   64    1   71.231   +2.347   0.179
 risa.xs4me.net  255.187.173.73   2 u    1   64    1   93.343  +12.228   0.076
slippy# 

You can use either or both in ntp.conf as below:

slippy# egrep '^pool|^server|^peer' /etc/ntp.conf
pool 0.freebsd.pool.ntp.org iburst
pool -6 0.freebsd.pool.ntp.org iburst
pool 2.freebsd.pool.ntp.org iburst
pool -6 2.freebsd.pool.ntp.org iburst
server cwfw iburst prefer
server -6 cwfw iburst prefer
peer cwsys iburst
peer -6 cwsys iburst
peer bob iburst
peer -6 bob iburst
slippy# 

It defaults to IPv4 without the -6. This results in ntpd querying an invalid IPv6 address that consists of a 32-bit IPv4 address stored in the IPv6 address variable; there is a 99.999% chance this is invalid. And if by fluke it happens to be a valid IPv6 address, it is almost definitely not the one you want.
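
Editorial note: the second workaround proposed in this comment (adding -6 to the pool, server, and peer statements) can be applied mechanically. The following is a rough Python sketch, assuming a plain ntp.conf with one statement per line; `force_ipv6` is a hypothetical helper, not an ntp utility, and lines that already carry a -4 or -6 qualifier are left alone.

```python
import re

# Matches pool/server/peer statements whose host name is not already
# preceded by a -4 or -6 qualifier. Comment lines are not matched.
_ASSOC = re.compile(
    r"^([ \t]*(?:pool|server|peer)[ \t]+)(?!-[46][ \t])(\S+)",
    re.MULTILINE,
)


def force_ipv6(conf_text: str) -> str:
    """Insert '-6' before the host name on pool, server, and peer lines."""
    return _ASSOC.sub(r"\1-6 \2", conf_text)


if __name__ == "__main__":
    sample = "pool 2.freebsd.pool.ntp.org iburst\nserver -4 cwfw iburst\n"
    print(force_ipv6(sample), end="")
```

Note that this rewrites every unqualified statement, whereas the configuration shown in this comment keeps both the unqualified and the -6 variants side by side.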

Comment 12 Cy Schubert freebsd_committer

2024-12-06 21:24:33 UTC

Here is what I wrote our upstream:

Here's the problem:

The ntp.conf man page states:

     Note that in contexts where a host name is expected, a -4 qualifier
     preceding the host name forces DNS resolution to the IPv4 namespace,
     while a -6 qualifier forces DNS resolution to the IPv6 namespace.  See
     IPv6 references for the equivalent classes for that address family.

This is not true. Without the -6 it uses the first IP that is returned, which almost always happens to be the IPv4 address. The workaround for IPv6-only users is to put a -6 into their peer, server, and pool statements.
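
Editorial note: the difference between an unqualified lookup and one restricted to a single address family can be observed with the resolver directly. The following is a minimal Python sketch using the standard `socket.getaddrinfo()` wrapper; it is not ntpd's code, and `resolve` is a hypothetical helper. Passing `AF_INET6` as the family plays the role the man page assigns to the -6 qualifier.

```python
import socket


def resolve(host: str, family: int = socket.AF_UNSPEC, port: int = 123):
    """Return (family, address) pairs for host, restricted to one address family."""
    infos = socket.getaddrinfo(host, port, family=family, type=socket.SOCK_DGRAM)
    return [(af, sockaddr[0]) for af, _type, _proto, _canon, sockaddr in infos]


if __name__ == "__main__":
    name = "2.freebsd.pool.ntp.org"  # pool name taken from this report
    for label, fam in (
        ("unspecified", socket.AF_UNSPEC),
        ("-4", socket.AF_INET),
        ("-6", socket.AF_INET6),
    ):
        try:
            print(label, resolve(name, fam))
        except socket.gaierror as exc:
            print(label, "lookup failed:", exc)
```

On an IPv6-only host, the order of the unspecified-family results is what decides whether the first address returned is usable.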

Comment 13 Dmitrij 2024-12-09 08:10:17 UTC

(In reply to Cy Schubert from comment #12)

Added -6 prior to server address in ntp.conf

Result:
Now only AAAA DNS requests are being sent to resolve NTP server addresses (vs. A and AAAA without the -6 flag).

# ntpd -D9 -n -6 (without -6 here result is the same)
 9 Dec 09:59:38 ntpd[15408]: Soliciting pool server 2a01:4f9:c012:46b2::123
 9 Dec 09:59:38 ntpd[15408]: 2a01:4f9:c012:46b2::123 local addr <null> -> [host ipv6]
 9 Dec 09:59:39 ntpd[15408]: Soliciting pool server 2a01:4f9:c011:a343:123:123:123:123
 9 Dec 09:59:39 ntpd[15408]: 2a01:4f9:c011:a343:123:123:123:123 local addr <null> -> [host ipv6]
 9 Dec 09:59:40 ntpd[15408]: Soliciting pool server 2606:4700:f1::123
 9 Dec 09:59:40 ntpd[15408]: 2606:4700:f1::123 local addr <null> -> [host ipv6]
 9 Dec 09:59:41 ntpd[15408]: Soliciting pool server 2606:4700:f1::1
 9 Dec 09:59:41 ntpd[15408]: 2606:4700:f1::1 local addr <null> -> [host ipv6]
 9 Dec 09:59:42 ntpd[15408]: Soliciting pool server 2a01:4f9:c012:46b2::123
 9 Dec 09:59:42 ntpd[15408]: 2a01:4f9:c012:46b2::123 local addr <null> -> [host ipv6]
 9 Dec 09:59:43 ntpd[15408]: Soliciting pool server 2a01:4f9:c011:a343:123:123:123:123
 9 Dec 09:59:43 ntpd[15408]: 2a01:4f9:c011:a343:123:123:123:123 local addr <null> -> [host ipv6]
 9 Dec 09:59:44 ntpd[15408]: Soliciting pool server 2606:4700:f1::123
 9 Dec 09:59:44 ntpd[15408]: 2606:4700:f1::123 local addr <null> -> [host ipv6]
 9 Dec 09:59:45 ntpd[15408]: Soliciting pool server 2606:4700:f1::1
 9 Dec 09:59:45 ntpd[15408]: 2606:4700:f1::1 local addr <null> -> [host ipv6]
 9 Dec 09:59:46 ntpd[15408]: Soliciting pool server 2a01:4f9:c012:46b2::123

The state is still unsynced.
$ ntpq -p -nw
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 2.freebsd.pool.ntp.org
                 .POOL.          16 p    -   64    0    0.000   +0.000   0.000
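
Editorial note: the `reach` column in the ntpq output is an octal rendering of an 8-bit shift register, with one bit per poll and the most recent poll in the lowest bit. The following is a small Python sketch for expanding it; `decode_reach` is a hypothetical helper. A value of 0, as here, means none of the last eight polls was answered as far as ntpd is concerned.

```python
def decode_reach(reach_octal: str) -> list:
    """Expand ntpq's octal 'reach' value into eight poll results, most recent first."""
    register = int(reach_octal, 8) & 0xFF
    # Bit 0 is the most recent poll, bit 7 the oldest one still remembered.
    return [bool((register >> bit) & 1) for bit in range(8)]


if __name__ == "__main__":
    for value in ("0", "7", "17", "377"):
        print(value, decode_reach(value))
```

The values 7 and 17 seen elsewhere in this report correspond to three and four consecutive answered polls shortly after startup.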

Comment 14 Dmitrij 2024-12-09 08:14:30 UTC

ntpdate works fine by the way:

# ntpdate 2.freebsd.pool.ntp.org
 9 Dec 10:13:18 ntpdate[14090]: adjust time server 2a01:4f9:c012:963::1 offset +0.007065 sec

Comment 15 Cy Schubert freebsd_committer

2024-12-09 22:45:09 UTC

I can reproduce the bug with the suggested configuration on FreeBSD 15-CURRENT and Fedora 41. Our upstream have requested I open a bug report.

Comment 16 Dmitrij 2024-12-10 12:02:42 UTC

Thank you for your efforts in investigating and reporting the bug upstream! 🎉 Wishing you a Merry Christmas filled with joy and success! 🎄✨

Comment 17 antonfb 2024-12-10 15:17:36 UTC

Hmm. I noticed my ntpd stopped working, and adding the -4 option made it work. The host recently changed to securelevel=2.

Comment 18 Cy Schubert freebsd_committer

2024-12-10 15:49:22 UTC

(In reply to Dmitrij from comment #16)

No worries, I'm still investigating. I (cy@nwtime.org) am working with my contact at nwtime.org. The ntp.org bug is at https://bugs.ntp.org/show_bug.cgi?id=3958.

ntp-4.2.17 exhibits the same bug on 15-CURRENT. Next step is to install FreeBSD 14.1 somewhere to verify that it did indeed work there.

Comment 19 Cy Schubert freebsd_committer

2024-12-10 15:50:21 UTC

(In reply to antonfb from comment #17)

Did it work prior to setting securelevel=2?

Comment 20 antonfb 2024-12-10 15:59:02 UTC

Host was/is 14.1-RELEASE kept patched, currently p6
Maybe a point of note:
host has net.inet6.ip6.v6only=0
Playing with it some.
I removed my -4 flag i.e.
ntpd_enable="YES"
#ntpd_flags="-4"
ntpd does start.
but ntpq -p hangs, ntpq -4 -p does work.
So something strange with ipv6 certainly.
Host is a vultr instance using their ntpds.
thalia.hesiod.org:root[18]: ntpq -p4
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 hydrogen.consta 129.6.15.28      2 u   61   64    7   61.976   -1.446   0.270
 helium.constant 129.6.15.27      2 u   62   64    7   61.912   -0.801   0.266
 lithium.constan 132.163.96.2     2 u   60   64    7   61.970   -1.348   0.248

Comment 21 antonfb 2024-12-10 16:00:37 UTC

Yes. It was working, I didn't notice when it stopped working so I'm not certain if it was securelevel=2 or v6only=0 which caused issues because that also changed recently.

Comment 22 Cy Schubert freebsd_committer

2024-12-10 17:32:55 UTC

Testing on my 14.1 VM, without securelevel=2, the bug is certainly there too. This is consistent with the bug exhibiting itself in Fedora (also).

Comment 23 Cy Schubert freebsd_committer

2024-12-12 00:44:57 UTC

I did some additional testing with the -6 flag on the command line and the -6 flag on the peer, server, and pool statements. My results are:

bob# ntpq
ntpq> hostnames no
ntpq> peers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 0.freebsd.pool. .POOL.          16 p    -   64    0    0.000   +0.000   0.002
 2.freebsd.pool. .POOL.          16 p    -   64    0    0.000   +0.000   0.002
*fc00:1:1:1::fff 206.108.0.131    2 u   22   64    3    0.311   -0.326   0.031
+fc00:1:1:1::1   10.1.1.254       3 s   24   64    3    0.350   -0.847   0.053
+fc00:1:1:1::5b  10.1.1.254       3 s   23   64    3    0.294   -0.317   0.060
ntpq> 

My previous tests were performed behind IPv6 NAT (which I suspect isn't working as well as I thought). With direct connection to the Internet the results are much better. I suspect that ntpd is using IPv4 addresses resulting in populating the first 32 bits of the 128-bit IPv6 address. Just a hypothesis ATM.
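
Editorial note: to make the hypothesis above concrete, the following Python sketch shows what a 128-bit IPv6 address looks like when its first 32 bits are filled with an IPv4 address and the rest is zero. This only illustrates the hypothesis as stated; it is not taken from ntpd, and the helper names are hypothetical. The standard IPv4-mapped form is included for contrast.

```python
import ipaddress


def ipv4_bytes_as_ipv6_prefix(v4: str) -> ipaddress.IPv6Address:
    """Place the 4 IPv4 bytes in the first 32 bits of an otherwise zero IPv6 address."""
    return ipaddress.IPv6Address(ipaddress.IPv4Address(v4).packed + bytes(12))


def ipv4_mapped(v4: str) -> ipaddress.IPv6Address:
    """Build the standard IPv4-mapped address (::ffff:a.b.c.d) for comparison."""
    return ipaddress.IPv6Address(
        bytes(10) + b"\xff\xff" + ipaddress.IPv4Address(v4).packed
    )


if __name__ == "__main__":
    print(ipv4_bytes_as_ipv6_prefix("10.1.1.254"))
    print(ipv4_mapped("10.1.1.254"))
```

For 10.1.1.254 the first function yields a01:1fe::, which is a syntactically valid IPv6 address but not one that any NTP server would be expected to answer on.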

Comment 24 Cy Schubert freebsd_committer

2024-12-12 05:51:54 UTC

Posted on bugs.ntp.org:

My hunch is correct. It is indeed attempting to connect to IPv4 addresses:

  5851 ntpd     CALL  connect(0x5,0x39f3c2b4f4f4,0x10)
  5851 ntpd     STRU  struct sockaddr { AF_INET, 10.1.1.254:123 }
  5851 ntpd     RET   connect 0
  5851 ntpd     CALL  getsockname(0x5,0x39f3c2b4f324,0x39f3c2b4f320)
  5851 ntpd     STRU  struct sockaddr { AF_INET, 10.1.1.7:11480 }
  5851 ntpd     RET   getsockname 0
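
Editorial note: the connect()/getsockname() pair in the trace above is a common way to ask the kernel which local address it would use to reach a destination. The following is a minimal Python sketch of that technique, assuming a UDP socket; `local_address_for` is a hypothetical helper and does not reproduce ntpd's implementation.

```python
import socket


def local_address_for(dest: str, port: int = 123, family: int = socket.AF_INET) -> str:
    """Ask the kernel which local address it would use to reach dest.

    connect() on a UDP socket only records the peer and selects a route;
    no packet is sent.
    """
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.connect((dest, port))
        return sock.getsockname()[0]


if __name__ == "__main__":
    # Loopback is used so the example runs without network access.
    print(local_address_for("127.0.0.1"))
```

Calling it with `family=socket.AF_INET6` and an IPv6 destination gives the IPv6 counterpart; in the trace, the socket address family is AF_INET, which is the point of the comment.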

Comment 25 Cy Schubert freebsd_committer

2024-12-12 21:39:59 UTC

Created attachment 255819
Correctly setuid to ntpd.

Can you try this patch, please?

Comment 26 Dmitrij 2024-12-15 10:04:21 UTC

(In reply to Cy Schubert from comment #25)

Applied your patch.

Result is the same: ntp ipv6 traffic appears, but ntp remains in unsynced state.

$ ntpq -p -nw
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 2.freebsd.pool.ntp.org
                 .POOL.          16 p    -   64    0    0.000   +0.000   0.000

$ ntptime
ntp_gettime() returns code 5 (ERROR)
  time eb0927bd.72752000  Sun, Dec 15 2024 12:02:37.447, (.254447100),
  maximum error 16300500 us, estimated error 16000000 us, TAI offset 0
ntp_adjtime() returns code 5 (ERROR)
  modes 0x0 (),
  offset 0.000 us, frequency 6.607 ppm, interval 4 s,
  maximum error 16300500 us, estimated error 16000000 us,
  status 0x41 (PLL,UNSYNC),
  time constant 3, precision 0.000 us, tolerance 496 ppm,
  pps frequency 6.607 ppm, stability 0.000 ppm, jitter 0.000 us,
  intervals 0, jitter exceeded 0, stability exceeded 0, errors 0.


No errors or warnings in syslog:
<101>1 2024-12-15T11:52:36.454427+02:00 <hostname> ntpd 67779 - - ntpd 4.2.8p18-a (1): Starting
<101>1 2024-12-15T11:52:36.454514+02:00 <hostname> ntpd 67779 - - Command line: /usr/sbin/ntpd -p /var/db/ntp/ntpd.pid -c /etc/ntp.conf -f /var/db/ntp/ntpd.drift -u ntpd:ntpd -g
<101>1 2024-12-15T11:52:36.454545+02:00 <hostname> ntpd 67779 - - ----------------------------------------------------
<101>1 2024-12-15T11:52:36.454575+02:00 <hostname> ntpd 67779 - - ntp-4 is maintained by Network Time Foundation,
<101>1 2024-12-15T11:52:36.454603+02:00 <hostname> ntpd 67779 - - Inc. (NTF), a non-profit 501(c)(3) public-benefit
<101>1 2024-12-15T11:52:36.454638+02:00 <hostname> ntpd 67779 - - corporation.  Support and training for ntp-4 are
<101>1 2024-12-15T11:52:36.454690+02:00 <hostname> ntpd 67779 - - available at https://www.nwtime.org/support
<101>1 2024-12-15T11:52:36.454745+02:00 <hostname> ntpd 67779 - - ----------------------------------------------------
<101>1 2024-12-15T11:52:36.457427+02:00 <hostname> ntpd 67909 - - leapsecond file ('/var/db/ntpd.leap-seconds.list'): good hash signature
<101>1 2024-12-15T11:52:36.457615+02:00 <hostname> ntpd 67909 - - leapsecond file ('/var/db/ntpd.leap-seconds.list'): loaded, expire=2025-06-28T00:00:00Z last=2017-01-01T00:00:00Z ofs=37

Comment 27 Cy Schubert freebsd_committer

2024-12-15 13:45:21 UTC

(In reply to Dmitrij from comment #26)

Use this, please.

/usr/sbin/ntpd -p /var/db/ntp/ntpd.pid -c /etc/ntp.conf -f /var/db/ntp/ntpd.drift -u ntpd:ntpd -g

Comment 28 Dmitrij 2024-12-15 14:08:43 UTC

(In reply to Cy Schubert from comment #27)

Reminder: the issues with ntpd are happening on the aarch64 platform (Hetzner arm64 VM).

In addition, I disabled the securelevel and rebooted the system.

Then:
# /usr/sbin/ntpd -p /var/db/ntp/ntpd.pid -c /etc/ntp.conf -f /var/db/ntp/ntpd.drift -u ntpd:ntpd -g
daemon control: got EOF
Exit 255

<99>1 2024-12-15T15:53:18.148866+02:00 <hostname> ntpd 42859 - - Need MAC 'ntpd' policy enabled to drop root privileges
<99>1 2024-12-15T15:53:18.150033+02:00 <hostname> ntpd 42390 - - daemon child exited with code 255

So I started and stopped ntpd via the service command to load the MAC policy.
Then started again from cmd line:
# /usr/sbin/ntpd -p /var/db/ntp/ntpd.pid -c /etc/ntp.conf -f /var/db/ntp/ntpd.drift -u ntpd:ntpd -g

ps:
ntpd    38171   0.0  0.2 23280  8040  -  Ss   15:55    0:00.17 /usr/sbin/ntpd -p /var/db/ntp/ntpd.pid -c /etc/ntp.conf -f /var/db/ntp/ntpd.drift -u ntpd:ntpd -g


Result is the same:
# ntpq -pnw
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 2.freebsd.pool.ntp.org
                 .POOL.          16 p    -   64    0    0.000   +0.000   0.000


Comment 29 Cy Schubert freebsd_committer

2024-12-16 07:51:50 UTC

(In reply to Dmitrij from comment #28)

> Then:
> # /usr/sbin/ntpd -p /var/db/ntp/ntpd.pid -c /etc/ntp.conf -f /var/db/ntp/ntpd.drift -u ntpd:ntpd -g
> daemon control: got EOF
> Exit 255

More than one interface has the same IP. Make sure that lo0 and your wired/wireless interface do not share the same IP.

Can you list the output of: ifconfig (with no arguments).

Comment 30 Dmitrij 2024-12-16 09:25:53 UTC

(In reply to Cy Schubert from comment #29)

There are no duplicate IPs on different interfaces, I don't want to provide ifconfig listing for security reasons, it is a prod server.

ntpd exited with 255 because mac_ntpd was not loaded; loading it is done by the ntpd startup script (kldload -qn mac_ntpd). Please check my previous message, where I explained this and provided the syslog output.

Comment 31 Cy Schubert freebsd_committer

2024-12-16 15:05:07 UTC

(In reply to Dmitrij from comment #30)

The issue is that if ntpd is invoked directly by the ntpd account, using su ntpd, it will not properly open its IPv6 sockets. However when invoked using -u ntpd:ntpd, to setuid(ntpd), the problem resolves itself.

Your output indicates this is a local problem. I am unable to reproduce your problem here with the patch.

Comment 32 Cy Schubert freebsd_committer

2024-12-17 00:46:09 UTC

Is your machine multi-homed, i.e. does it have more than one interface?

Comment 33 Dmitrij 2024-12-17 08:23:49 UTC

(In reply to Cy Schubert from comment #32)

There is only one interface (vtnet0) pointing to internet and one default route:
default                           fe80::1%vtnet0                UGS          vtnet0

There is also a wireguard vpn interface which is used to access isolated internal services via subnet from fc00::/7 (Private internets) space.

Comment 34 Cy Schubert freebsd_committer

2024-12-17 14:22:27 UTC

(In reply to Dmitrij from comment #33)

I did not ask that. How many interfaces in total?

Comment 35 Dmitrij 2024-12-17 19:00:30 UTC

(In reply to Cy Schubert from comment #34)

4 in total:
vtnet0 - regular
wg0 - wireguard
lo0
pflog0

Comment 36 Cy Schubert freebsd_committer

2024-12-17 23:11:33 UTC

I'm hypothesizing that IPv6 pools don't work with FreeBSD; I still have to verify whether this is also a bug in Fedora. Can you please replace the pool 2.freebsd.pool.ntp.org statement with:

server time.cloudflare.com iburst
server time.google.com iburst
server time.apple.com iburst

You should see output similar to this:

bob# ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 2.freebsd.pool. .POOL.          16 p    -   64    0    0.000   +0.000   0.002
*fc00:1:1:1::fff 108.160.18.12    2 u    2   64    7    0.304   +0.375   0.046
 2606:4700:f1::1 10.69.8.92       3 u    7   64    7   10.764   +1.694   1.514
 2001:4860:4806: .GOOG.           1 u    5   64    7   17.251   +1.081   3.491
 2620:149:a00:40 .SHM.            1 u    4   64    7   33.814   +1.218   1.972
 fc00:1:1:1::1   10.1.1.254       3 s    -   64    7    0.232   -0.075   0.045
 fc00:1:1:1::5b  10.1.1.254       3 s    1   64    7    0.250   -0.316   0.022
bob# 

Note that the pool is not working.

If this works for you as it does for me, this should send me down the correct path.

Let it run for a while and post the output.

Can you rebuild ntp with DEBUGGING please?

Apply the following patch:

diff --git a/usr.sbin/ntp/Makefile.inc b/usr.sbin/ntp/Makefile.inc
index 5801d91aac46..7a2896ba9f9a 100644
--- a/usr.sbin/ntp/Makefile.inc
+++ b/usr.sbin/ntp/Makefile.inc
@@ -1,6 +1,6 @@
 .include <src.opts.mk>
 
-DEFS_LOCAL= -DPARSE -DHAVE_CONFIG_H
+DEFS_LOCAL= -DPARSE -DHAVE_CONFIG_H -DDEBUG
 NTPDEFS=   -DSYS_FREEBSD
 # CLOCKDEFS=
 #      -DLOCAL_CLOCK -DPST -DWWVB -DAS2201 -DGOES -DGPSTM -DOMEGA \

Assuming you're running -RELEASE,
cd /usr/src/usr.sbin/ntp
make obj
make depend
make includes
make
make install

Add -D 9 -6 flags to ntpd_flags in rc.conf. Stop ntpd and run it manually within script(1).

/usr/sbin/ntpd -p /var/db/ntp/ntpd.pid -c /etc/ntp.conf -f /var/db/ntp/ntpd.drift -u ntpd:ntpd -g -D 9 -d -6 -n

Let it run for a minute or two, exit script, and upload the typescript file here (or you can send it to me directly).

Comment 37 Dmitrij 2024-12-18 10:20:59 UTC

(In reply to Cy Schubert from comment #36)

I've replaced pool 2.freebsd.pool.ntp.org with:
server time.cloudflare.com iburst
server time.google.com iburst
server time.apple.com iburst

And now it worked!
# ntpq -pnw
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*2606:4700:f1::123
                 10.79.9.32       3 u   17   64   17    0.891   -0.812   0.455
+2001:4860:4806:4::
                 .GOOG.           1 u   16   64   17    3.225   -0.788   0.475
+2a01:b740:a16:3000::1f2
                 .GPSs.           1 u   18   64   17   37.634   +0.144   0.523

# ntptime
ntp_gettime() returns code 0 (OK)
  time eb0d1ffe.4f55cd94  Wed, Dec 18 2024 12:18:38.309, (.309903697),
  maximum error 156547 us, estimated error 335 us, TAI offset 37
ntp_adjtime() returns code 0 (OK)
  modes 0x0 (),
  offset -117.189 us, frequency 6.606 ppm, interval 4 s,
  maximum error 156547 us, estimated error 335 us,
  status 0x2001 (PLL,NANO),
  time constant 6, precision 0.001 us, tolerance 496 ppm,
  pps frequency 6.607 ppm, stability 0.000 ppm, jitter 0.000 us,
  intervals 0, jitter exceeded 0, stability exceeded 0, errors 0.

Should I then rebuild ntpd with DEBUGGING and run it as you suggested?
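
Editorial note: the `status` word printed by ntptime is a bit mask. The following is a small Python sketch for decoding the two values that appear in this report (0x41 before the change, 0x2001 after it). The bit values are those of the STA_* flags in <sys/timex.h>; only the three flags seen in this thread are listed, the header defines more, and `decode_status` is a hypothetical helper.

```python
# STA_* bit values from <sys/timex.h>; only the flags that appear in this
# report are included here.
STA_FLAGS = {
    0x0001: "PLL",     # phase-locked loop updates enabled
    0x0040: "UNSYNC",  # clock is not synchronized
    0x2000: "NANO",    # nanosecond resolution
}


def decode_status(status: int) -> list:
    """Return the names of the known flags set in an ntptime status word."""
    return [name for bit, name in sorted(STA_FLAGS.items()) if status & bit]


if __name__ == "__main__":
    for word in (0x41, 0x2001):
        print(hex(word), decode_status(word))
```

The disappearance of UNSYNC between the two outputs is the kernel-side confirmation that ntpd started disciplining the clock once the pool statement was replaced by server statements.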

Comment 38 Dmitrij 2024-12-18 10:52:59 UTC

I've rebuilt ntpd with DEBUGGING and ran it under script for 2 minutes.
I then sent the typescript file to Cy directly.

Comment 39 Cy Schubert freebsd_committer

2024-12-18 15:30:18 UTC

This is a FreeBSD-only problem. It only affects IPv6 pools. Nothing else is affected. The sa_family appears to be AF_UNSPEC instead of AF_INET6.

Comment 40 Cy Schubert freebsd_committer

2024-12-18 15:57:26 UTC

(In reply to Cy Schubert from comment #39)

Thanks for sending the ntpd debug output. Can you please run it again with only the pool statement in ntp.conf. What you have sent me shows that it is correctly using time.cloudflare.com, time.google.com and time.apple.com. I'd like to see just the failures. We don't see failures, just a retry loop.

With -DDEBUG enabled we get even better messages as it displays each step in detail.

Comment 41 Dmitrij 2024-12-18 16:49:38 UTC

(In reply to Cy Schubert from comment #40)

Sent ntpd debug output with "pool -6 2.freebsd.pool.ntp.org iburst" only
to Cy directly.

Comment 42 Cy Schubert freebsd_committer

2024-12-23 22:37:27 UTC

The upstream (ntp.org) fix for upstream Bug 3851 caused this regression. I suspect they committed it to address some kind of Linux bug.

Comment 43 commit-hook freebsd_committer

2024-12-23 22:38:18 UTC

A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=98e34e8e255767e18dd8a6c348cff8bfc01b2662

commit 98e34e8e255767e18dd8a6c348cff8bfc01b2662
Author:     Cy Schubert <cy@FreeBSD.org>
AuthorDate: 2024-12-23 22:30:58 +0000
Commit:     Cy Schubert <cy@FreeBSD.org>
CommitDate: 2024-12-23 22:37:34 +0000

    ntp: Undo upstream (ntp.org) fix for upstream Bug 3851

    The patch for upstream (ntp.org) fix for upstream Bug 3851 may have
    fixed a Linux bug but it caused a regression when ntpd is run on
    FreeBSD.

    Suggested that so@ publish an errata and merge this to releng/14.2.

    PR:             283116
    MFH:            3 days

 contrib/ntp/ntpd/ntp_proto.c | 2 ++
 1 file changed, 2 insertions(+)

Comment 44 commit-hook freebsd_committer

2024-12-23 22:46:20 UTC

A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=43537eb9c3e5d588ec4add6973ae03c6053a863a

commit 43537eb9c3e5d588ec4add6973ae03c6053a863a
Author:     Cy Schubert <cy@FreeBSD.org>
AuthorDate: 2024-12-23 22:42:54 +0000
Commit:     Cy Schubert <cy@FreeBSD.org>
CommitDate: 2024-12-23 22:45:04 +0000

    net/ntp: Undo upstream (ntp.org) fix for upstream Bug 3851

    The patch for upstream (ntp.org) fix for upstream Bug 3851 may have
    fixed a Linux bug but it caused a regression when ntpd is run on
    FreeBSD.

    PR:             283116
    MFH:            2024Q4

 net/ntp/Makefile                           |  1 +
 net/ntp/files/patch-ntpd_ntp_proto.c (new) | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+)

Comment 45 commit-hook freebsd_committer

2024-12-23 22:48:22 UTC

A commit in branch 2024Q4 references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=088aa7d5609bcd9d11a32fd743dff59a402248da

commit 088aa7d5609bcd9d11a32fd743dff59a402248da
Author:     Cy Schubert <cy@FreeBSD.org>
AuthorDate: 2024-12-23 22:42:54 +0000
Commit:     Cy Schubert <cy@FreeBSD.org>
CommitDate: 2024-12-23 22:47:20 +0000

    net/ntp: Undo upstream (ntp.org) fix for upstream Bug 3851

    The patch for upstream (ntp.org) fix for upstream Bug 3851 may have
    fixed a Linux bug but it caused a regression when ntpd is run on
    FreeBSD.

    PR:             283116
    (cherry picked from commit 43537eb9c3e5d588ec4add6973ae03c6053a863a)

 net/ntp/Makefile                           |  1 +
 net/ntp/files/patch-ntpd_ntp_proto.c (new) | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+)

Comment 46 commit-hook freebsd_committer

2024-12-27 18:00:08 UTC

A commit in branch stable/14 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=a653e8317f5af006ab49a761ce35d3f525ba5abd

commit a653e8317f5af006ab49a761ce35d3f525ba5abd
Author:     Cy Schubert <cy@FreeBSD.org>
AuthorDate: 2024-12-23 22:30:58 +0000
Commit:     Cy Schubert <cy@FreeBSD.org>
CommitDate: 2024-12-27 17:59:03 +0000

    ntp: Undo upstream (ntp.org) fix for upstream Bug 3851

    The patch for upstream (ntp.org) fix for upstream Bug 3851 may have
    fixed a Linux bug but it caused a regression when ntpd is run on
    FreeBSD.

    Suggested that so@ publish an errata and merge this to releng/14.2.

    PR:             283116
    (cherry picked from commit 98e34e8e255767e18dd8a6c348cff8bfc01b2662)

 contrib/ntp/ntpd/ntp_proto.c | 2 ++
 1 file changed, 2 insertions(+)

Comment 47 commit-hook freebsd_committer

2024-12-27 18:00:10 UTC

A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f904025c5d20fb579a4b1607069ad9697be542fd

commit f904025c5d20fb579a4b1607069ad9697be542fd
Author:     Cy Schubert <cy@FreeBSD.org>
AuthorDate: 2024-12-23 22:30:58 +0000
Commit:     Cy Schubert <cy@FreeBSD.org>
CommitDate: 2024-12-27 17:59:20 +0000

    ntp: Undo upstream (ntp.org) fix for upstream Bug 3851

    The patch for upstream (ntp.org) fix for upstream Bug 3851 may have
    fixed a Linux bug but it caused a regression when ntpd is run on
    FreeBSD.

    Suggested that so@ publish an errata and merge this to releng/14.2.

    PR:             283116
    (cherry picked from commit 98e34e8e255767e18dd8a6c348cff8bfc01b2662)

 contrib/ntp/ntpd/ntp_proto.c | 2 ++
 1 file changed, 2 insertions(+)

Comment 48 Dave Hart 2025-01-31 23:01:23 UTC

Dmitrij, I am planning to address this a bit differently in the upstream ntpd.  It would be particularly helpful if you would test that fix.  It's available as a tarball with 4.2.8p18 plus the fix for the IPv6 ULA regression at:

https://davehart.net/ntp/test/4.2.8p18-3958.tar.gz

Alternatively, if you revert Cy's patch to ntp_proto.c (around line 475, where a block of code mentioning [Bug 3851] is #if 0'd out) by removing the #if 0 and the matching #endif, and then apply the patch to ntp_io.c from https://bugs.ntp.org/3958, which changes the code around line 1600 to:

		if (IN6_IS_ADDR_LINKLOCAL(p6addr)) {
			return TRUE;
		}

(removing the other test for IN6_IS_ADDR_SITELOCAL()) you'll help verify we can put this issue to bed.
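For readers unfamiliar with the two macros named above, the following is a minimal sketch (not the ntp_io.c code) of which prefixes they match. The helper name classify_in6() is hypothetical; the macros and inet_pton() are the standard ones from <netinet/in.h> and <arpa/inet.h>. The unique-local (ULA) check is written out by hand because no standard macro exists for fc00::/7.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper (not part of ntpd): classify an IPv6 address
 * given in text form. Returns "invalid" if the text does not parse.
 */
const char *
classify_in6(const char *text)
{
	struct in6_addr a;

	if (inet_pton(AF_INET6, text, &a) != 1)
		return "invalid";
	if (IN6_IS_ADDR_LINKLOCAL(&a))		/* fe80::/10 */
		return "link-local";
	if (IN6_IS_ADDR_SITELOCAL(&a))		/* fec0::/10, deprecated by RFC 3879 */
		return "site-local";
	if ((a.s6_addr[0] & 0xfe) == 0xfc)	/* fc00::/7, ULA per RFC 4193 */
		return "unique-local";
	return "other";
}
```

For example, classify_in6("fe80::1") returns "link-local" and classify_in6("fd00::1") returns "unique-local"; neither macro matches a ULA address.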

Thanks in advance.

Comment 49 Cy Schubert freebsd_committer

2025-01-31 23:41:20 UTC

(In reply to Dave Hart from comment #48)

I left a comment on https://bugs.ntp.org/3958 to say that the strlcpy() part of the patch doesn't apply on FreeBSD.

I will remove the test for IN6_IS_ADDR_SITELOCAL locally and test here.

Comment 50 Dave Hart 2025-02-01 01:55:16 UTC

(In reply to Cy Schubert from comment #49)

The patch was prepared against the 3851 branch.  When I merged it with the later 4.2.8p18 I noticed there was a conflicting fix to normal_dtoa().  Either version is fine for testing this, as you're likely not using 'make check' nor ntpq saveconfig.

FYI I tested using FreeBSD 14.2 amd64 on a he.net IPv6 tunnel, with ntpd configured with --enable-trustedbsd-mac so a very similar setup to yours.  That's why I'm particularly interested in Dmitrij's testing.

Comment 51 Dave Hart 2025-02-03 11:34:51 UTC

(In reply to Dave Hart from comment #50)

Never mind, Dmitrij, the patch I provided will not resolve the problem for IPv6-only hosts.  I was testing dual-stack.  I'm working on the IPv6-only case now.

Comment 52 Dave Hart 2025-02-24 15:20:10 UTC

(In reply to Dave Hart from comment #51)

Dmitrij, I apologize for failing to post a follow-up here on 3 February.  I had only posted on the ntp.org bug report:

https://bugs.ntp.org/show_bug.cgi?id=3958

I believe your problem will be resolved by this one-line patch to ntpd/ntp_proto.c:

https://bugs.ntp.org/attachment.cgi?id=1924&action=diff

I would appreciate confirmation that this resolves the regression I caused in ntpd 4.2.8p18.

Comment 53 Cy Schubert freebsd_committer

2025-02-24 15:26:08 UTC

Fixed by three patches brought in from upstream. Upstream is working to release a new ntp.