Bug 252709

Summary: exit status of ping utility floating error
Product: Base System
Reporter: sgm.sft
Component: bin
Assignee: Mariusz Zaborski <oshogbo>
Status: Open
Severity: Affects Only Me
CC: markj, oshogbo
Priority: ---
Keywords: regression
Version: 12.2-RELEASE   
Hardware: Any   
OS: Any   
Attachments:
Description                       Flags
ktrace output                     none
ktrace output and /sbin/ping      none

Description sgm.sft 2021-01-15 12:34:47 UTC
I use scripts that ping hosts every minute to detect when any of them are down.

=====================================================
if /sbin/ping -c 3 -W 3 some_host > /dev/null
then
    MSG="ping OK"
else
    MSG="ping failed"
fi
=====================================================

After switching to 12.2-Release, this script returns a false result once every few hours.

Copying ping from FreeBSD 11.4 to a 12.2 machine fixed the problem.
Comment 1 Mark Johnston freebsd_committer freebsd_triage 2021-01-15 16:20:49 UTC
False result in that it fails to receive a reply?  Or ping hits some error?  Can you show the output from a failed ping?
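The two cases asked about here can be told apart from the exit status. According to ping(8) on FreeBSD, 0 means at least one reply was heard, 2 means the packets were sent but no reply came back, and any other value indicates an error. A minimal sketch of that mapping follows; the function name `classify_ping_status` and the message strings are illustrative and do not come from any script in this report:

```sh
#!/bin/sh
# Map a ping(8) exit status to a short label.
# The labels are made up for this example.
classify_ping_status() {
    case "$1" in
        0) echo "reply received" ;;
        2) echo "no reply" ;;
        *) echo "ping error (status $1)" ;;
    esac
}

# Usage (some_host is a placeholder):
#   /sbin/ping -c 3 -W 3 some_host > /dev/null
#   MSG=$(classify_ping_status $?)
```

With this, a host that is really down ("no reply") is reported differently from a ping that could not run properly.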
Comment 2 sgm.sft 2021-01-16 17:43:18 UTC
1. The false result is the exit status of "if /sbin/ping -c 3 -W 3 some_host".

2. Scripts were changed to:
=======================================
if /sbin/ping -c 1 -W 1 hostaddr > /var/log/periodic/ping/testhostaddr.txt
then
    MSG="test hostaddr OK"
else
    MSG="test hostaddr failed"
    echo `date` >> /var/log/periodic/ping/testhostaddr.err
    cat /var/log/periodic/ping/testhostaddr.txt >> /var/log/periodic/ping/testhostaddr.err
fi
=======================================
where "hostaddr" varies between the scripts for different hosts.

Results:
- Last night the script ran for only one host. No errors appeared.
- After adding 6 scripts for different hosts to crontab, errors occurred about 3 times an hour.
  The "testhostaddr.err" files contain no error messages, only the times at which the errors occurred.
  It looks like /sbin/ping does not start when more than one script starts simultaneously.
Comment 3 sgm.sft 2021-01-17 10:56:06 UTC
When an error occurs while the script is run from crontab, I receive mail with this error diagnostic:
=====================================================================
ping: unable to limit access to system.dns service: Socket is not connected
=====================================================================

/sbin/ping copied from 11.4 works fine on the same machine.
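The .err files in Comment 2 presumably stayed empty because the scripts redirect only standard output, while the diagnostic above is written to standard error (which is why cron mails it). A sketch that keeps both streams in the log file; `run_and_log` is a hypothetical helper name, not something used in the original scripts:

```sh
#!/bin/sh
# Run a command with stdout and stderr both sent to the given log file.
# The function returns the exit status of the command itself.
run_and_log() {
    log=$1
    shift
    "$@" > "$log" 2>&1
}

# Usage (hostaddr and the log path follow the scripts in this report):
#   if run_and_log /var/log/periodic/ping/testhostaddr.txt \
#       /sbin/ping -c 1 -W 1 hostaddr
#   then
#       MSG="test hostaddr OK"
#   else
#       MSG="test hostaddr failed"
#   fi
```

With this change the "unable to limit access" message would end up in the .txt file and then be copied into the .err file by the existing else branch.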
Comment 4 Mariusz Zaborski freebsd_committer freebsd_triage 2021-01-17 14:21:43 UTC
I will try to help with that.

Would it be possible for you to run ping with ktrace -di?
Something like:
ktrace -di -f /tmp/${hostaddr}.out /sbin/ping -c 1 -W 1 ${hostaddr}

And send us the output from a run where the error occurred?
Comment 5 sgm.sft 2021-01-17 18:00:41 UTC
How can I catch error in /sbin/ping execution called from ktrace?
I can collect a lot of trace files and pick out the ones created at the time of an error, but that is inconvenient.
Comment 6 Mariusz Zaborski freebsd_committer freebsd_triage 2021-01-17 18:13:13 UTC
ktrace should return the same exit code as the executed process.
```
$ ktrace false
$ echo $?
1
$ ktrace true 
$ echo $?
0
```
Comment 7 sgm.sft 2021-01-17 21:09:33 UTC
I renamed /sbin/ping to ping-12.2 so that production scripts can use the /sbin/ping copied from 11.4.

Scripts were changed to:
============================================
tracename=/tmp/${hostnumber}-`date +%Y-%m-%d`.trc

if ktrace -di -f ${tracename} /sbin/ping-12.2 -c 5 -W 200 ${hostaddr} > /var/log/periodic/ping/test${hostname}.txt
then
    MSG="test ${hostname} OK"
    rm ${tracename}
else
    MSG="test ${hostname} failed"
    echo `date` >> /var/log/periodic/ping/test${hostname}.err
    cat /var/log/periodic/ping/test${hostname}.txt >> /var/log/periodic/ping/test${hostname}.err
fi
============================================

Where can I send .trc files?
One is ready now.
Comment 8 Mariusz Zaborski freebsd_committer freebsd_triage 2021-01-17 21:12:55 UTC
You can attach them here. Or send them directly to my email address.
Comment 9 sgm.sft 2021-01-17 22:13:54 UTC
Created attachment 221686 [details]
ktrace output
Comment 10 sgm.sft 2021-01-17 22:15:07 UTC
First output of ktrace
Comment 11 sgm.sft 2021-01-18 09:26:35 UTC
I have 28 more .trc files.
Are they of interest to you?
Comment 12 Mariusz Zaborski freebsd_committer freebsd_triage 2021-01-18 14:07:25 UTC
Yes, thank you. I will analyze them and get back to you ASAP.
Comment 13 Mariusz Zaborski freebsd_committer freebsd_triage 2021-01-18 14:52:54 UTC
Could you also attach both versions of ping?
Comment 14 sgm.sft 2021-01-18 17:31:45 UTC
Created attachment 221714 [details]
ktrace output and /sbin/ping
Comment 15 sgm.sft 2021-01-18 17:32:00 UTC
It turned out that I copied the ping from FreeBSD 9.3 and not 11.4.
The attachment contains a ping from version 9.3, with which there are no errors.
I have now replaced it with the one copied from 11.4.
I am monitoring it and will report the results.
Comment 16 Mariusz Zaborski freebsd_committer freebsd_triage 2021-01-18 17:38:34 UTC
I'm testing this on a fresh installation of FreeBSD 12.2 and I'm unable to reproduce the bug.

I see that this is exactly the same ping version that I have.
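Since the failures were reported only after several scripts were scheduled for the same minute, one possible way to attempt a reproduction is to start several copies of the command at once and count how many exit non-zero. This is only a sketch under that assumption; `run_parallel` is a hypothetical helper, and the address in the usage comment is a placeholder:

```sh
#!/bin/sh
# run_parallel COUNT CMD [ARGS...]
# Start COUNT copies of CMD in the background, wait for each one,
# and print how many of them exited with a non-zero status.
run_parallel() {
    count=$1
    shift
    pids=""
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" > /dev/null 2>&1 &
        pids="$pids $!"
        i=$((i + 1))
    done
    failures=0
    for pid in $pids; do
        wait "$pid" || failures=$((failures + 1))
    done
    echo "$failures"
}

# Usage (placeholder address):
#   run_parallel 6 /sbin/ping -c 1 -W 1 127.0.0.1
```

A non-zero count against a host that is known to answer would suggest the problem is tied to concurrent startup and not to the network.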
Comment 17 sgm.sft 2021-01-18 17:50:42 UTC
Using /sbin/ping from 11.4 results in the same error:

===========================================================================
ping: unable to limit access to system.dns service: Socket is not connected
===========================================================================
Comment 18 sgm.sft 2021-01-18 17:53:20 UTC
I have another FreeBSD 12.2 machine.
I'll try.
Comment 19 sgm.sft 2021-01-18 21:05:18 UTC
What can I do to help you reproduce the bug?

Additional information:
========================================
FreeBSD 12.2-RELEASE-p2 FreeBSD 12.2-RELEASE-p2 GENERIC  amd64
kernconf=CURRENT
========================================
 diff CURRENT GENERIC
----------------------------------------------

< #options      INET6                   # IPv6 communications protocols
< options               DUMMYNET
< options               HZ=1000                 # strongly recommended
< options               IPDIVERT
< options               IPFIREWALL
---
> options       INET6                   # IPv6 communications protocols
335,336d330
< # USB Serial devices
< device                u3g                     # USB-based 3G modems (Option, Huawei, Sierra)
========================================

resolv.conf:
nameserver 127.0.0.1

bind-tools-9.16.10             Command line tools from BIND: delv, dig, host, nslookup...
bind916-9.16.10                BIND DNS suite with updated DNSSEC and DNS64

A lot of other ports

========================================
 cat /etc/cron.d/ping
----------------------------------------------
SHELL=/bin/sh
PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin

# See crontab(5) for field format.
*       *       *       *       *       root    /usr/local/etc/periodic/ping/testhost1.sh
*       *       *       *       *       root    /usr/local/etc/periodic/ping/testhost2.sh
*       *       *       *       *       root    /usr/local/etc/periodic/ping/testhost3.sh
*       *       *       *       *       root    /usr/local/etc/periodic/ping/testhost4.sh
*       *       *       *       *       root    /usr/local/etc/periodic/ping/testhost5.sh
*       *       *       *       *       root    /usr/local/etc/periodic/ping/testhost6.sh
----------------------------------------------

========================================
cat /usr/local/etc/periodic/ping/testhost1.sh
----------------------------------------------
#!/bin/sh

tracename=/tmp/host1-`date +%Y-%m-%d-%H-%M`.trc

if ktrace -di -f ${tracename} /sbin/ping-12.2 -c 1 -W 1 NNN.NNN.NNN.NNN > /var/log/periodic/ping/testhost1.txt
then
    MSG="test host1 OK"
    rm ${tracename}
else
    MSG="test host1 failed"
    echo `date` >> /var/log/periodic/ping/testhost1.err
    cat /var/log/periodic/ping/testhost1.txt >> /var/log/periodic/ping/testhost1.err
fi
echo $MSG > /var/log/periodic/ping/testhost1.new
MSG=$MSG" on "`date`

if ! /usr/bin/diff /var/log/periodic/ping/testhost1.new /var/log/periodic/ping/testhost1.old > /dev/null
then
    echo $MSG >> /var/log/periodic/ping/testhost1.log
fi
mv /var/log/periodic/ping/testhost1.new /var/log/periodic/ping/testhost1.old
----------------------------------------------
Scripts for the other 5 hosts differ only in host names and addresses.
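Because the six scripts differ only in host name and address, they could be collapsed into one parameterized function. The following is a sketch and not the reporter's script: `check_host`, `PING_CMD` and `LOG_DIR` are hypothetical names, the ktrace wrapper is left out for brevity, and unlike the original it also captures standard error in the .txt file.

```sh
#!/bin/sh
# Hypothetical defaults; both can be overridden from the environment.
: "${PING_CMD:=/sbin/ping}"
: "${LOG_DIR:=/var/log/periodic/ping}"

# check_host NAME ADDR
# Ping ADDR once and append to the log only when the state changes.
check_host() {
    name=$1
    addr=$2
    base="$LOG_DIR/test$name"

    # Unlike the original script, stderr is captured as well.
    if "$PING_CMD" -c 1 -W 1 "$addr" > "$base.txt" 2>&1
    then
        msg="test $name OK"
    else
        msg="test $name failed"
        { date; cat "$base.txt"; } >> "$base.err"
    fi

    echo "$msg" > "$base.new"
    # A missing .old file counts as a state change.
    if ! cmp -s "$base.new" "$base.old" 2> /dev/null
    then
        echo "$msg on $(date)" >> "$base.log"
    fi
    mv "$base.new" "$base.old"
}

# Usage (NNN.NNN.NNN.NNN stands for the real address, as in the report):
#   check_host host1 NNN.NNN.NNN.NNN
```

Each crontab line could then call one script with the host name and address as arguments instead of pointing to a separate file per host.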
Comment 20 Mariusz Zaborski freebsd_committer freebsd_triage 2021-01-20 17:59:40 UTC
I still can't figure that out :(
I'm really sorry for troubling you more.

Could you create a d.d file with:
```
pid$target:::entry {}
pid$target:::return {printf("%d %d", arg0, arg1);}
```

And could you try running ping like that:
```
dtrace -s d.d -c "/sbin/ping $HOST" -o output.file
```