Bug 241421 - net/ntp segfaults with stack_gap!=0
Summary: net/ntp segfaults with stack_gap!=0
Status: Closed FIXED
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Cy Schubert
URL:
Keywords:
Duplicates: 241960
Depends on:
Blocks: 241960
Reported: 2019-10-22 20:39 UTC by dewayne
Modified: 2019-11-27 03:19 UTC

See Also:
bugzilla: maintainer-feedback? (cy)


Attachments
Trial baloon (446 bytes, patch)
2019-10-24 05:11 UTC, Cy Schubert
This should fix this PR. (1.05 KB, patch)
2019-10-24 20:35 UTC, Cy Schubert
This has been tested to circumvent this PR. (1.60 KB, patch)
2019-10-25 06:13 UTC, Cy Schubert

Description dewayne 2019-10-22 20:39:36 UTC
While trying to secure... time (net/ntp), I've noticed that it experiences segmentation faults (SIGSEGV).

Environment
FreeBSD 12.1-STABLE #0 r353429M: Sat Oct 12 19:02:59 AEDT 2019

kern.elf64.aslr.stack_gap=1
kern.elf64.aslr.honor_sbrk=1
kern.elf64.aslr.pie_enable=1
kern.elf64.aslr.enable=1
kern.elf64.pie_base=16912384
kern.elf64.nxstack=1

security.mac.ntpd.uid=123
security.mac.ntpd.enabled=1

From the /etc/make.conf
CFLAGS include -fPIE -fPIC -Wl,-z,relro -Wl,-z,now -Wl,-z,noexecstack
LDFLAGS include -pie -z relro -z now -z noexecstack 

# make -C /usr/ports/net/ntp -DUSE_K8 showconfig|grep =on
     IPV6=on: IPv6 protocol support
     LOCAL_CLOCK=on: Enable local clock reference
     SHM=on: Enable SHM clock attached thru shared memory
     SSL=on: SSL protocol support
     THREADS=on: Threading support

And we kick-off ntp with
su -m ntpd -c "/usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork"

Yes, this does require other files to be readable by ntpd, and the logs to be writable.

With --nofork, it takes multiple tries to get ntpd to start.  Over approximately 15 tests with stack_gap=1, the fewest attempts needed was 11 and the most was 41.  I use a process monitor (s6), which retries starting ntp approximately every 1.01 seconds until successful.

When kern.elf64.aslr.stack_gap=0, ntp starts on the first attempt.
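
The retry behaviour described above can be approximated without s6. This is only a sketch: the function name retry_until_ok, the attempt cap, and the whole-second delay are assumptions for illustration, not how s6 works internally. With --nofork, ntpd stays in the foreground, so the loop only advances when the process dies with a non-zero status (as it does on a segfault).

```sh
#!/bin/sh
# Hypothetical helper: rerun a foreground command until it exits 0.
# Usage: retry_until_ok MAX_TRIES DELAY_SECONDS command [args...]
# The number of attempts made is left in the global variable "attempts".
retry_until_ok() {
    max_tries=$1
    delay=$2
    shift 2
    attempts=1
    while :; do
        if "$@"; then
            return 0                      # command exited cleanly
        fi
        [ "$attempts" -ge "$max_tries" ] && return 1
        attempts=$((attempts + 1))
        sleep "$delay"                    # POSIX sleep: whole seconds only
    done
}
```

For example, "retry_until_ok 50 1 /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork" would give up after 50 failed starts; afterwards $attempts holds the number of starts that were tried.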

I'm sharing this because ntpd has a problem with ASLR, particularly when the stack gap is enabled via stack_gap. In additional tests I also tried stack_gap percentages of 1, 2, and 3.
Comment 1 dewayne 2019-10-22 20:49:44 UTC
(In reply to dewayne from comment #0)
ntpd Version        : 4.2.8p13_4
platform: amd64 (Xeon E3)
Comment 2 Cy Schubert freebsd_committer 2019-10-23 12:35:16 UTC
grep ntp /etc/rc.conf output, please.

uname -a output, please.

In the meantime, put rlimit memlock=0 in ntp.conf.
Comment 3 dewayne 2019-10-23 21:10:58 UTC
(In reply to Cy Schubert from comment #2)
Hi Cy,

# uname -aKU
FreeBSD hathor 12.1-STABLE FreeBSD 12.1-STABLE #0 r353429M: Sat Oct 12 19:02:59 AEDT 2019     root@hathor:/usr/src/amd64.amd64/sys/hqdev-amd64-smp-vga  amd64 1201500 1201500

Inserting into /etc/ntp.conf
rlimit memlock 0

results in
24 Oct 07:52:20 ntpd[96513]: ./../lib/isc/unix/ifiter_getifaddrs.c:99: unexpected error:
24 Oct 07:52:20 ntpd[96513]: getting interface addresses: getifaddrs: Cannot allocate memory
24 Oct 07:52:20 ntpd[96513]: getpwnam(ntpd) failed: Cannot allocate memory

The environment has 24G of memory, 11G inactive and 9G free at the time of testing and vm.loadavg: { 0.12 0.14 0.09 }


I also tried "rlimit memlock 16" with 
ntpd     56335   0.0  0.1  15332  15604  -  Ss    07:59        0:00.04 /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork

While removing the rlimit line 
ntpd     28679   0.0  0.0  15460   6888  -  Ss    08:02        0:00.03 /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
Comment 4 Cy Schubert freebsd_committer 2019-10-24 00:21:56 UTC
Sorry, that should have been rlimit memlock -1
Comment 5 dewayne 2019-10-24 00:35:21 UTC
(In reply to Cy Schubert from comment #4)
:) ok

with /etc/ntp.conf
rlimit memlock -1

# /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 11:28:40 ntpd[30692]: ntpd 4.2.8p13@1.3847-o Tue Oct 15 05:48:05 UTC 2019 (1): Starting
24 Oct 11:28:40 ntpd[30692]: Command line: /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
Segmentation fault
/var/log# /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 11:28:41 ntpd[31208]: ntpd 4.2.8p13@1.3847-o Tue Oct 15 05:48:05 UTC 2019 (1): Starting
24 Oct 11:28:41 ntpd[31208]: Command line: /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
Segmentation fault
/var/log# /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 11:28:41 ntpd[31341]: ntpd 4.2.8p13@1.3847-o Tue Oct 15 05:48:05 UTC 2019 (1): Starting
24 Oct 11:28:41 ntpd[31341]: Command line: /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 11:28:41 ntpd[31341]: proto: precision = 0.168 usec (-22)
24 Oct 11:28:41 ntpd[31341]: basedate set to 2019-10-03
... 
and operates correctly.

ntpd     31341   0.0  0.0  15520   6928  4  S+    11:28        0:00.02 /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork

killing with SIGTERM then a restart
ntpd     88280   0.0  0.0  21664   6932  4  S+    11:33        0:00.01 /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
Comment 6 Cy Schubert freebsd_committer 2019-10-24 00:40:31 UTC
Good.

Does the base ntp otherwise work without disabling memlock? i.e. is it only the port that fails?
Comment 7 dewayne 2019-10-24 00:53:15 UTC
(In reply to Cy Schubert from comment #6)
Yes, both fail with this in /etc/ntp.conf:
rlimit memlock -1

# /usr/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 11:46:17 ntpd[13978]: ntpd 4.2.8p12-a (1): Starting
24 Oct 11:46:17 ntpd[13978]: Command line: /usr/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
Segmentation fault
/var/log# /usr/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 11:46:58 ntpd[16849]: ntpd 4.2.8p12-a (1): Starting
24 Oct 11:46:58 ntpd[16849]: Command line: /usr/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 11:46:58 ntpd[16849]: proto: precision = 0.168 usec (-22)

I also tried base ntpd with 
# rlimit memlock -1

# /usr/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 11:49:50 ntpd[39113]: ntpd 4.2.8p12-a (1): Starting
24 Oct 11:49:50 ntpd[39113]: Command line: /usr/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
Segmentation fault
Comment 8 Cy Schubert freebsd_committer 2019-10-24 02:22:54 UTC
Trying your command line, I am unable to reproduce it on 13-CURRENT.

slippy# /usr/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
23 Oct 19:16:35 ntpd[8005]: ntpd 4.2.8p12-a (1): Starting
23 Oct 19:16:35 ntpd[8005]: Command line: /usr/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
23 Oct 19:16:35 ntpd[8005]: proto: precision = 0.089 usec (-23)
23 Oct 19:16:35 ntpd[8005]: basedate set to 2018-08-07
23 Oct 19:16:35 ntpd[8005]: gps base set to 2018-08-12 (week 2014)
23 Oct 19:16:35 ntpd[8005]: leapsecond file ('/var/db/ntpd.leap-seconds.list'): good hash signature
23 Oct 19:16:35 ntpd[8005]: leapsecond file ('/var/db/ntpd.leap-seconds.list'): loaded, expire=2020-06-28T00:00:00Z last=2017-01-01T00:00:00Z ofs=37
23 Oct 19:16:35 ntpd[8005]: unable to bind to wildcard address :: - another process may be running - EXITING
slippy# service ntpd stop
Stopping ntpd.
Waiting for PIDS: 2296.
slippy# /usr/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
23 Oct 19:16:47 ntpd[8014]: ntpd 4.2.8p12-a (1): Starting
23 Oct 19:16:47 ntpd[8014]: Command line: /usr/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
23 Oct 19:16:47 ntpd[8014]: proto: precision = 0.090 usec (-23)
23 Oct 19:16:47 ntpd[8014]: basedate set to 2018-08-07
23 Oct 19:16:47 ntpd[8014]: gps base set to 2018-08-12 (week 2014)
23 Oct 19:16:47 ntpd[8014]: leapsecond file ('/var/db/ntpd.leap-seconds.list'): good hash signature
23 Oct 19:16:47 ntpd[8014]: leapsecond file ('/var/db/ntpd.leap-seconds.list'): loaded, expire=2020-06-28T00:00:00Z last=2017-01-01T00:00:00Z ofs=37
23 Oct 19:16:47 ntpd[8014]: Listen and drop on 0 v6wildcard [::]:123
23 Oct 19:16:47 ntpd[8014]: Listen and drop on 1 v4wildcard 0.0.0.0:123
23 Oct 19:16:47 ntpd[8014]: Listen normally on 2 lo0 127.0.0.1:123
23 Oct 19:16:47 ntpd[8014]: Listen normally on 3 lo0 [::1]:123
23 Oct 19:16:47 ntpd[8014]: Listen normally on 4 lo0 [fe80::1%2]:123
23 Oct 19:16:47 ntpd[8014]: Listen normally on 5 lo0 10.1.1.91:123
23 Oct 19:16:47 ntpd[8014]: Listen normally on 6 lo0 10.1.2.91:123
23 Oct 19:16:47 ntpd[8014]: Listen normally on 7 lagg0 [fe80::226a:8aff:fe72:317%4]:123
23 Oct 19:16:47 ntpd[8014]: Listen normally on 8 lagg0 10.168.100.185:123
23 Oct 19:16:47 ntpd[8014]: Listen normally on 9 lagg0 [fc00:1:1:1::5b]:123
23 Oct 19:16:47 ntpd[8014]: Listen normally on 10 tun3 [fe80::226a:8aff:fe72:317%6]:123
23 Oct 19:16:47 ntpd[8014]: Listen normally on 11 tun3 10.2.2.6:123
23 Oct 19:16:47 ntpd[8014]: Listening on routing socket on fd #32 for interface updates
23 Oct 19:16:47 ntpd[8014]: kernel reports TIME_ERROR: 0x2041: Clock Unsynchronized
23 Oct 19:16:47 ntpd[8014]: kernel reports TIME_ERROR: 0x2041: Clock Unsynchronized
23 Oct 19:16:48 ntpd[8014]: Soliciting pool server 69.89.207.199
23 Oct 19:16:49 ntpd[8014]: Soliciting pool server 198.211.103.209
23 Oct 19:16:50 ntpd[8014]: Soliciting pool server 208.115.126.70
23 Oct 19:16:51 ntpd[8014]: Soliciting pool server 45.79.36.123
23 Oct 19:16:52 ntpd[8014]: Soliciting pool server 45.33.84.208
23 Oct 19:16:53 ntpd[8014]: Soliciting pool server 74.207.240.206
23 Oct 19:16:54 ntpd[8014]: Soliciting pool server 47.190.36.230
23 Oct 19:16:54 ntpd[8014]: Doing intital time step
23 Oct 19:16:54 ntpd[8014]: receive: Unexpected origin timestamp 0xe15b8816.abbb5f66 does not match aorg 0000000000.00000000 from server@74.207.240.206 xmt 0xe15b8816.ae73b003
23 Oct 19:16:55 ntpd[8014]: Soliciting pool server 184.60.28.49


slippy# /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
23 Oct 19:18:19 ntpd[8229]: ntpd 4.2.8p13@1.3847-o Fri Sep 20 19:58:04 UTC 2019 (1): Starting
23 Oct 19:18:19 ntpd[8229]: Command line: /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
23 Oct 19:18:19 ntpd[8229]: proto: precision = 0.090 usec (-23)
23 Oct 19:18:19 ntpd[8229]: basedate set to 2019-09-08
23 Oct 19:18:19 ntpd[8229]: gps base set to 2019-09-08 (week 2070)
23 Oct 19:18:19 ntpd[8229]: leapsecond file ('/var/db/ntpd.leap-seconds.list'): good hash signature
23 Oct 19:18:19 ntpd[8229]: leapsecond file ('/var/db/ntpd.leap-seconds.list'): loaded, expire=2020-06-28T00:00:00Z last=2017-01-01T00:00:00Z ofs=37
23 Oct 19:18:19 ntpd[8229]: Listen and drop on 0 v6wildcard [::]:123
23 Oct 19:18:19 ntpd[8229]: Listen and drop on 1 v4wildcard 0.0.0.0:123
23 Oct 19:18:19 ntpd[8229]: Listen normally on 2 lo0 127.0.0.1:123
23 Oct 19:18:19 ntpd[8229]: Listen normally on 3 lo0 [::1]:123
23 Oct 19:18:19 ntpd[8229]: Listen normally on 4 lo0 [fe80::1%2]:123
23 Oct 19:18:19 ntpd[8229]: Listen normally on 5 lo0 10.1.1.91:123
23 Oct 19:18:19 ntpd[8229]: Listen normally on 6 lo0 10.1.2.91:123
23 Oct 19:18:19 ntpd[8229]: Listen normally on 7 lagg0 [fe80::226a:8aff:fe72:317%4]:123
23 Oct 19:18:19 ntpd[8229]: Listen normally on 8 lagg0 10.168.100.185:123
23 Oct 19:18:19 ntpd[8229]: Listen normally on 9 lagg0 [fc00:1:1:1::5b]:123
23 Oct 19:18:19 ntpd[8229]: Listen normally on 10 tun3 [fe80::226a:8aff:fe72:317%6]:123
23 Oct 19:18:19 ntpd[8229]: Listen normally on 11 tun3 10.2.2.6:123
23 Oct 19:18:19 ntpd[8229]: Listening on routing socket on fd #32 for interface updates
23 Oct 19:18:19 ntpd[8229]: kernel reports TIME_ERROR: 0x2041: Clock Unsynchronized
23 Oct 19:18:19 ntpd[8229]: kernel reports TIME_ERROR: 0x2041: Clock Unsynchronized
23 Oct 19:18:20 ntpd[8229]: Soliciting pool server 45.33.84.208
23 Oct 19:18:21 ntpd[8229]: Soliciting pool server 74.207.240.206
23 Oct 19:18:22 ntpd[8229]: Soliciting pool server 47.190.36.230
23 Oct 19:18:23 ntpd[8229]: Soliciting pool server 184.60.28.49
23 Oct 19:18:26 ntpd[8229]: Doing intital time step
23 Oct 19:18:26 ntpd[8229]: receive: Unexpected origin timestamp 0xe15b8872.663ebbf1 does not match aorg 0000000000.00000000 from sym_passive@10.1.1.1 xmt 0xe15b8872.69553dbf

I'll update my 12-stable partition and try to reproduce it there.
Comment 9 dewayne 2019-10-24 03:16:17 UTC
(In reply to Cy Schubert from comment #8)
That's interesting.  Is this with:
kern.elf64.aslr.stack_gap=1
kern.elf64.aslr.honor_sbrk=1
kern.elf64.aslr.pie_enable=1
kern.elf64.aslr.enable=1
?
Comment 10 Cy Schubert freebsd_committer 2019-10-24 04:36:36 UTC
I've reproduced the problem. There appears to be some regression since this was last fixed.
Comment 11 Cy Schubert freebsd_committer 2019-10-24 05:11:27 UTC
Created attachment 208549 [details]
Trial baloon

Can you give this patch a spin? It's failing on a different setrlimit() this time. Try 200 (the OpenBSD value) if it doesn't work. 128 pages is the default any process gets, BTW.
Comment 12 dewayne 2019-10-24 05:47:46 UTC
(In reply to Cy Schubert from comment #11)
Patch applied, rebuilt and installed.  No rlimits enabled in ntp.conf.

/tmp/t# date -u
Thu Oct 24 05:35:48 UTC 2019
/tmp/t# /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 16:35:57 ntpd[30405]: ntpd 4.2.8p13@1.3847-o Thu Oct 24 05:32:43 UTC 2019 (1): Starting
24 Oct 16:35:57 ntpd[30405]: Command line: /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
Segmentation fault
/tmp/t# /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 16:35:58 ntpd[30566]: ntpd 4.2.8p13@1.3847-o Thu Oct 24 05:32:43 UTC 2019 (1): Starting
24 Oct 16:35:58 ntpd[30566]: Command line: /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 16:35:58 ntpd[30566]: proto: precision = 0.064 usec (-24)
24 Oct 16:35:58 ntpd[30566]: basedate set to 2019-10-12

----
Changed /etc/ntp.conf to:
rlimit memlock -1
rlimit stacksize 200

t# /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 16:41:29 ntpd[61202]: ntpd 4.2.8p13@1.3847-o Thu Oct 24 05:32:43 UTC 2019 (1): Starting
24 Oct 16:41:29 ntpd[61202]: Command line: /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
Segmentation fault

----
rlimit memlock -1
rlimit stacksize 128

# /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 16:42:27 ntpd[68829]: ntpd 4.2.8p13@1.3847-o Thu Oct 24 05:32:43 UTC 2019 (1): Starting
24 Oct 16:42:27 ntpd[68829]: Command line: /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
Segmentation fault

----

# rlimit memlock -1
rlimit stacksize 128

# /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 16:43:39 ntpd[70717]: ntpd 4.2.8p13@1.3847-o Thu Oct 24 05:32:43 UTC 2019 (1): Starting
24 Oct 16:43:39 ntpd[70717]: Command line: /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
Segmentation fault

----
# rlimit memlock -1
rlimit stacksize 200

# /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 16:44:36 ntpd[75947]: ntpd 4.2.8p13@1.3847-o Thu Oct 24 05:32:43 UTC 2019 (1): Starting
24 Oct 16:44:36 ntpd[75947]: Command line: /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
Segmentation fault

----
rlimit memlock 16
rlimit stacksize 200

# /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
24 Oct 16:44:36 ntpd[75947]: ntpd 4.2.8p13@1.3847-o Thu Oct 24 05:32:43 UTC 2019 (1): Starting
24 Oct 16:44:36 ntpd[75947]: Command line: /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
Segmentation fault

Sorry...
Comment 13 dewayne 2019-10-24 06:23:06 UTC
(In reply to dewayne from comment #12)
Looking a little deeper.

ntp.conf without memlock or stacksize, ie the defaults

# procstat -l 22998  # /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork

  PID COMM             RLIMIT                  SOFT             HARD
22998 ntpd             cputime             infinity         infinity
22998 ntpd             filesize            infinity         infinity
22998 ntpd             datasize               32768 MB         32768 MB
22998 ntpd             stacksize             524288 B         524288 KB
22998 ntpd             coredumpsize               0 B              0 B
22998 ntpd             memoryuse           infinity         infinity
22998 ntpd             memorylocked        infinity         infinity
22998 ntpd             maxprocesses            1025             1025
22998 ntpd             openfiles                 32               32
22998 ntpd             sbsize              infinity         infinity
22998 ntpd             vmemoryuse          infinity         infinity
22998 ntpd             pseudo-terminals    infinity         infinity
22998 ntpd             swapuse             infinity         infinity
22998 ntpd             kqueues             infinity         infinity
22998 ntpd             umtxp               infinity         infinity

---

rlimit memlock -1
rlimit stacksize 200

 procstat -l 11176
  PID COMM             RLIMIT                  SOFT             HARD
11176 ntpd             cputime             infinity         infinity
11176 ntpd             filesize            infinity         infinity
11176 ntpd             datasize               32768 MB         32768 MB
11176 ntpd             stacksize             819200 B         524288 KB
11176 ntpd             coredumpsize               0 B              0 B
11176 ntpd             memoryuse           infinity         infinity
11176 ntpd             memorylocked        infinity         infinity
11176 ntpd             maxprocesses            1025             1025
11176 ntpd             openfiles                 32               32
11176 ntpd             sbsize              infinity         infinity
11176 ntpd             vmemoryuse          infinity         infinity
11176 ntpd             pseudo-terminals    infinity         infinity
11176 ntpd             swapuse             infinity         infinity
11176 ntpd             kqueues             infinity         infinity
11176 ntpd             umtxp               infinity         infinity

Trying a (silly) number "rlimit stacksize 600"
72981 ntpd             stacksize            2457600 B         524288 KB
and still segfaulting.  Salvation lies on another path.
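
The procstat figures above are consistent with "rlimit stacksize" being counted in 4 KiB pages. A quick check of that arithmetic follows; the 4096-byte page size and the function name stacksize_bytes are assumptions made for this sketch.

```sh
#!/bin/sh
# Convert an ntp.conf "rlimit stacksize" value (assumed 4 KiB pages) to bytes.
PAGE_SIZE=4096

stacksize_bytes() {
    echo $(( $1 * PAGE_SIZE ))
}

# The three values that appear in this report.
for pages in 128 200 600; do
    printf '%s pages = %s bytes\n' "$pages" "$(stacksize_bytes "$pages")"
done
```

This gives 524288, 819200, and 2457600 bytes, matching the soft stacksize limits reported by procstat -l for the default, 200, and 600 settings.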

Aside: I only use 24 file descriptors; so 32 has a lot of headroom.
Comment 14 Cy Schubert freebsd_committer 2019-10-24 11:41:21 UTC
Did you apply the patch to the port and rebuild it?
Comment 15 Cy Schubert freebsd_committer 2019-10-24 20:35:45 UTC
Created attachment 208585 [details]
This should fix this PR.

Can you either apply this to base, or apply it to the port after make patch but before make configure?

I don't have a chance to test this right now. If you don't get around to it I'll test it before Monday.
Comment 16 dewayne 2019-10-24 21:37:36 UTC
(In reply to Cy Schubert from comment #14)
Yes, patched, rebuilt, reinstalled.

# date ; date -u ; lh /usr/ports/net/ntp/Makefile ; grep -B3 -A3 stack-lim /usr/ports/net/ntp/Makefile ; lh /usr/local/sbin/ntpd
Fri 25 Oct 2019 08:35:12 AEDT # Now
Thu 24 Oct 2019 21:35:12 UTC  # Now UTC
-rw-r-----  1 root  wheel   2.5K 24 Oct 16:29 /usr/ports/net/ntp/Makefile
GNU_CONFIGURE=  yes
CONFIGURE_ARGS= --enable-leap-smear --enable-trustedbsd-mac \
                --with-locfile=freebsd --with-memlock=-1 \
                --with-stack-limit=128

TEST_TARGET=    check

-r-xr-xr-x  1 root  wheel   531K 24 Oct 16:32 /usr/local/sbin/ntpd*

Previous reply was using patched version. :)
Comment 17 dewayne 2019-10-25 05:18:16 UTC
(In reply to Cy Schubert from comment #15)
Thanks Cy.  Unfortunately the patch was unsuccessful.  I've placed the patch used for net/ntp and (text file) kdumps for pre-patch, patched and patch with ntp.conf rlimit settings at
http://www.heuristicsystems.com/FreeBSD-ntpd/  (to be removed after 30 days)

I followed the convention of 
patch-DIRECTORY__FILENAME and discovered another patch, /usr/ports/net/ntp/patch-ntpd_ntp.c, that interferes with your patch because both modify ntpd.c.

Eventually the patch applied successfully, and the build resulted in:

ntpd.c:1009:12: warning: implicit declaration of function 'sysctlbyname' is invalid in C99 [-Wimplicit-function-declaration]
        if ((rc = sysctlbyname(aslr_var, aslr, &aslr_len, NULL, 0)) != 0))
                  ^
ntpd.c:1009:67: error: extraneous ')' after condition, expected a statement
        if ((rc = sysctlbyname(aslr_var, aslr, &aslr_len, NULL, 0)) != 0))
                                                                         ^
1 warning and 1 error generated.
*** Error code 1


I removed the last parenthesis, changing
 if ((rc = sysctlbyname(aslr_var, aslr, &aslr_len, NULL, 0)) != 0))
to
 if ((rc = sysctlbyname(aslr_var, aslr, &aslr_len, NULL, 0)) != 0)

I also modified the Makefile:
PORTREVISION=   5

I then applied the patches, rebuilt on amd64 and i386, and installed onto amd64 FreeBSD 12.1-STABLE. Using an ntp.conf with no setting for memlock or stacksize, the result was:

/usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
25 Oct 10:22:02 ntpd[45800]: ntpd 4.2.8p13@1.3847-o Thu Oct 24 23:19:32 UTC 2019 (1): Starting
25 Oct 10:22:02 ntpd[45800]: Command line: /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
Segmentation fault

---
with ntp.conf
rlimit memlock -1
rlimit stacksize 200

/usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
25 Oct 10:27:12 ntpd[67285]: ntpd 4.2.8p13@1.3847-o Thu Oct 24 23:19:32 UTC 2019 (1): Starting
25 Oct 10:27:12 ntpd[67285]: Command line: /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork
Segmentation fault


=== Before you ask, the above results are from the patched ntpd

# date ; date -u ; lh /usr/ports/net/ntp/files | tail -n 2 ; echo --- ; lh /usr/packages2/All/ntp* /usr/local/sbin/ntpd ; echo ===; pkg query "%n %v %t" ntp-4.2.8p13_5 ; date -r `pkg query "%t" ntp-4.2.8p13_5` ; date -ur `pkg query "%t" ntp-4.2.8p13_5`
Fri 25 Oct 2019 15:59:30 AEDT
Fri 25 Oct 2019 04:59:30 UTC
-rw-r-----  1 root  wheel   497B 25 Oct 10:08 patch-ntpd_ntp.c
-rw-r-----  1 root  wheel   998B 25 Oct 10:15 patch-ntpd_ntpd.c
---
-rw-r-----  1 root  wheel   6.3M 25 Oct 10:17 /usr/packages2/All/ntp-4.2.8p13_4.tar
-r-xr-xr-x  1 root  wheel   544K 25 Oct 10:19 /usr/local/sbin/ntpd*
-rw-r-----  1 root  wheel   6.3M 25 Oct 10:19 /usr/packages2/All/ntp-4.2.8p13_5.tar
===
ntp 4.2.8p13_5 1571959254
Fri 25 Oct 2019 10:20:54 AEDT
Thu 24 Oct 2019 23:20:54 UTC
Comment 18 Cy Schubert freebsd_committer 2019-10-25 06:13:50 UTC
Created attachment 208592 [details]
This has been tested to circumvent this PR.

Sorry about that. I didn't have time to test the patch, throwing it together over noon before heading back to $JOB.

This patch does circumvent the problem by avoiding setrlimit() of the stack. The next patch I will post here will be for base. I'll test it here before posting it.
Comment 19 Cy Schubert freebsd_committer 2019-10-25 15:02:27 UTC
Though the patch (removing any restriction on the stack) allows ntpd to start, it exposes another problem. IPv6 causes an assertion (signal 6).
Comment 20 Cy Schubert freebsd_committer 2019-10-27 18:44:48 UTC
All options going forward have undesirable effects; they don't work. Put the following in your rc.conf.

ntpd_prepend="/usr/bin/proccontrol -m aslr -s disable"

Disabling stack gap through proccontrol also doesn't work. Removing the stack setrlimit() causes an IPv6 assertion. This is the only solution.

If you're running ntpd in a script or by hand you will need to:

/usr/bin/proccontrol -m aslr -s disable /usr/sbin/ntpd ...
or
/usr/bin/proccontrol -m aslr -s disable /usr/local/sbin/ntpd ...
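
The two invocations above differ only in the ntpd path, so a script can wrap either one. The following is a sketch under stated assumptions: the function name run_without_aslr and the overridable PROCCONTROL variable are hypothetical, and the fallback to running the command unchanged is a choice made for this example rather than something recommended in this PR.

```sh
#!/bin/sh
# Hypothetical wrapper: run a command with ASLR force-disabled via
# proccontrol(1) when that tool is present, otherwise run it as-is.
# PROCCONTROL can be overridden for testing on systems without the tool.
PROCCONTROL=${PROCCONTROL:-/usr/bin/proccontrol}

run_without_aslr() {
    if [ -x "$PROCCONTROL" ]; then
        # Same flags as the commands quoted above.
        "$PROCCONTROL" -m aslr -s disable "$@"
    else
        echo "warning: $PROCCONTROL not found, ASLR left unchanged" >&2
        "$@"
    fi
}
```

For example: run_without_aslr /usr/local/sbin/ntpd -c /etc/ntp.conf -u ntpd -x -G --nofork. A stricter script might prefer to fail outright when proccontrol is missing instead of starting ntpd with ASLR still enabled.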

kib@ has documented why this is so at https://github.com/freebsd/freebsd-quarterly/blob/master/2019q3/stack_gap.md.

I've pointed our ntp upstream to this PR. The ntp folks are aware of this problem.

I will close this problem.
Comment 21 dewayne 2019-10-28 03:36:16 UTC
(In reply to Cy Schubert from comment #20)
Thank you for your efforts, and the pointer to proccontrol.  At least I don't need to disable ASLR altogether :)  Using proccontrol provides a much more reliable start time, and allows ASLR for the rest of the system.

Can I suggest updating https://wiki.freebsd.org/ASLR to reflect this situation?

As mentioned earlier, I have net/ntp under process control, i.e. outside rc; it will sometimes start the first time and always within 61 attempts when aslr and stack_gap are =1.  (Sometimes it's just as troublesome when something works only intermittently...)

Observations:
applying to the running ntpd process id 11652 
# proccontrol -m aslr -q -p 11652
not forced, active
# proccontrol -m aslr -s disable -p 11652
# proccontrol -m aslr -q -p 11652
force disabled, active
OK, but that's merely interesting.  Your usage example is the useful one:
# /usr/bin/proccontrol -m aslr -s disable /usr/local/sbin/ntpd ...
# proccontrol -m aslr -q -p 50900
force disabled, not active
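
A script that needs to act on the query result can match on the status line. This sketch only recognises the three strings observed above; the function name aslr_active is hypothetical, and proccontrol may print other strings that are not covered here.

```sh
#!/bin/sh
# Interpret the status line printed by "proccontrol -m aslr -q -p PID".
# Returns 0 if the line ends in "active", 1 if it ends in "not active",
# and 2 for anything else (unrecognised output).
aslr_active() {
    case "$1" in
        *"not active") return 1 ;;   # must be tested before *"active"
        *"active")     return 0 ;;
        *)             return 2 ;;
    esac
}
```

For example: if aslr_active "$(proccontrol -m aslr -q -p 50900)"; then echo "ASLR still in effect"; fi. Judging by the observations above, only the process started through proccontrol reports "not active"; disabling ASLR on an already running process changes the forced setting but not the active state.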

Kind regards.
Comment 23 Cy Schubert freebsd_committer 2019-11-14 02:50:55 UTC
*** Bug 241960 has been marked as a duplicate of this bug. ***
Comment 24 commit-hook freebsd_committer 2019-11-15 16:34:55 UTC
A commit references this bug:

Author: cy
Date: Fri Nov 15 16:34:35 UTC 2019
New revision: 354733
URL: https://svnweb.freebsd.org/changeset/base/354733

Log:
  Disable ntpd stack gap. When ASLR with STACK GAP != 0 ntpd suffers SIGSEGV.

  PR:		241421, 241960
  Reported by:	Vladimir Zakharov <zakharov.vv@gmail.com>,
  		dewayne@heuristicsystems.com.au
  Reviewed by:	kib, imp (previous version), ian (suggestion)
  MFC after:	3 days
  Differential Revision:	https://reviews.freebsd.org/D22358

Changes:
  head/contrib/ntp/ntpd/ntpd.c
Comment 25 commit-hook freebsd_committer 2019-11-15 16:35:00 UTC
A commit references this bug:

Author: cy
Date: Fri Nov 15 16:34:42 UTC 2019
New revision: 517694
URL: https://svnweb.freebsd.org/changeset/ports/517694

Log:
  Disable ntpd stack gap. When ASLR with STACK GAP != 0 ntpd suffers SIGSEGV.

  PR:		241421, 241960
  Reported by:	Vladimir Zakharov <zakharov.vv@gmail.com>,
  		dewayne@heuristicsystems.com.au
  Reviewed by:	kib, imp (previous version), ian (suggestion)
  MFH:		2019Q4
  Differential Revision:  https://reviews.freebsd.org/D22358

Changes:
  head/net/ntp/Makefile
  head/net/ntp/files/patch-ntpd_ntpd.c
  head/net/ntp-devel/Makefile
  head/net/ntp-devel/files/patch-ntpd_ntpd.c
Comment 26 commit-hook freebsd_committer 2019-11-18 13:34:21 UTC
A commit references this bug:

Author: cy
Date: Mon Nov 18 13:33:51 UTC 2019
New revision: 517868
URL: https://svnweb.freebsd.org/changeset/ports/517868

Log:
  MFH: r515926 r517694

  patch-ntpd_ntp.c should really be named patch-ntpd_ntpd.c as it patches
  ntpd/ntpd.c.

  Disable ntpd stack gap. When ASLR with STACK GAP != 0 ntpd suffers SIGSEGV.

  PR:		241421, 241960
  Reported by:	Vladimir Zakharov <zakharov.vv@gmail.com>,
  		dewayne@heuristicsystems.com.au
  Reviewed by:	kib, imp (previous version), ian (suggestion)
  Differential Revision:  https://reviews.freebsd.org/D22358

  Approved by:	portmgr (joneum)

Changes:
_U  branches/2019Q4/
  branches/2019Q4/net/ntp/Makefile
  branches/2019Q4/net/ntp/files/patch-ntpd_ntp.c
  branches/2019Q4/net/ntp/files/patch-ntpd_ntpd.c
  branches/2019Q4/net/ntp-devel/Makefile
  branches/2019Q4/net/ntp-devel/files/patch-ntpd_ntp.c
  branches/2019Q4/net/ntp-devel/files/patch-ntpd_ntpd.c
Comment 27 commit-hook freebsd_committer 2019-11-27 03:19:15 UTC
A commit references this bug:

Author: cy
Date: Wed Nov 27 03:18:35 UTC 2019
New revision: 355127
URL: https://svnweb.freebsd.org/changeset/base/355127

Log:
  MFC rr354733:
  Disable ntpd stack gap. When ASLR with STACK GAP != 0 ntpd suffers SIGSEGV.

  PR:		241421, 241960
  Reported by:	Vladimir Zakharov <zakharov.vv@gmail.com>,
  		dewayne@heuristicsystems.com.au
  Reviewed by:	kib, imp (previous version), ian (suggestion)
  Differential Revision:	https://reviews.freebsd.org/D22358

Changes:
_U  stable/12/
  stable/12/contrib/ntp/ntpd/ntpd.c