Bug 253166

Summary: net/dhcpcd: no interfaces have a carrier (during boot)
Product: Ports & Packages
Component: Individual Port(s)
Status: Closed FIXED
Severity: Affects Only Me
Priority: ---
Version: Latest
Hardware: Any
OS: Any
Reporter: Dries Michiels <driesm>
Assignee: Ed Maste <emaste>
CC: emaste, kbowling, philip, roy, woodsb02

Description Dries Michiels freebsd_committer freebsd_triage 2021-02-01 16:23:02 UTC
Jan 31 20:06:31 vados kernel: Starting dhcpcd.
Jan 31 20:06:31 vados kernel: dhcpcd-9.4.0 starting
Jan 31 20:06:31 vados kernel: dhcp6_openudp: Can't assign requested address
Jan 31 20:06:31 vados kernel: ps_inet_startcb: dhcp6_open: Can't assign requested address
Jan 31 20:06:31 vados kernel: DUID 00:04:00:00:00:00:00:00:00:00:00:00:4c:cc:6a:28:3e:a3
Jan 31 20:06:31 vados kernel: no interfaces have a carrier
Jan 31 20:06:31 vados kernel: forked to background, child pid 15254
Jan 31 20:06:31 vados kernel: Additional TCP/IP options: IPv6 CPE WANIF=em0.
Jan 31 20:06:31 vados kernel: Setting up harvesting: PURE_RDRAND,[UMA],[FS_ATIME],SWI,INTERRUPT,NET_NG,[NET_ETHER],NET_TUN,MOUSE,KEYBOARD,ATTACH,CACHED
Comment 1 Dries Michiels freebsd_committer freebsd_triage 2021-02-01 16:23:48 UTC
Pretty weird, as dhclient starts right after dhcpcd and that doesn't seem to fail.
Comment 2 Dries Michiels freebsd_committer freebsd_triage 2021-02-01 16:25:03 UTC
To fix this, I just have to manually restart dhcpcd after the system has booted up. It's annoying though, as I sometimes forget and my network doesn't have IPv6 in the meantime :-)
Comment 3 Dries Michiels freebsd_committer freebsd_triage 2021-02-01 16:26:35 UTC
I'm running 14-CURRENT. Here is the rcorder output.

[/home/dries]$ rcorder /etc/rc.d/* /usr/local/etc/rc.d/*
rcorder: file `/usr/local/etc/rc.d/tcsd' is before unknown provision `kerberos'
rcorder: file `/usr/local/etc/rc.d/tcsd' is before unknown provision `named'
/etc/rc.d/natd
/etc/rc.d/rctl
/usr/local/etc/rc.d/dhcpcd
/etc/rc.d/dhclient
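
To compare where two scripts land in the boot order, the rcorder list can be piped through a small helper. This is only a sketch: `rc_position` is a hypothetical name, not part of rcorder(8) or rc.subr, and it assumes one script path per line on stdin.

```sh
#!/bin/sh
# rc_position NAME: read an rcorder(8)-style list (one script path per line)
# on stdin and print the 1-based position of the first path whose basename
# is NAME. Prints nothing and returns 1 if NAME is not in the list.
# (Hypothetical helper for illustration only.)
rc_position() {
    awk -v name="$1" '
        { n = split($0, parts, "/") }
        parts[n] == name { print NR; found = 1; exit }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, `rcorder /etc/rc.d/* /usr/local/etc/rc.d/* 2>/dev/null | rc_position dhcpcd` prints the slot dhcpcd gets, which can then be compared with the slot for netif or dhclient.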
Comment 4 roy 2021-02-01 20:45:45 UTC
(In reply to Dries Michiels from comment #0)
Jan 31 20:06:31 vados kernel: dhcp6_openudp: Can't assign requested address

So you have something else hogging the DHCPv6 port wildcard address.

(In reply to Dries Michiels from comment #2)
> To fix this, I just have to manually restart dhcpcd after the system has booted

Sounds like during boot the routing socket overflowed and dhcpcd missed the carrier coming up event.
FreeBSD currently does not report route overflow so dhcpcd as you can see needs a manual restart.

See https://reviews.freebsd.org/D26652 for a kernel patch to fix this which has sadly stalled in review.
Comment 5 Dries Michiels freebsd_committer freebsd_triage 2021-02-02 13:15:17 UTC
Why is this now a problem? On some older versions of FreeBSD / dhcpcd I did not have this issue. (Not sure if it's FreeBSD that changed something, or dhcpcd :))
Comment 6 roy 2021-02-02 13:38:55 UTC
The route socket overflow has always been a problem, just fairly invisible unless you actually know what you're looking for.

The carrier issue has changed fairly recently since dhcpcd-9.3

In a nutshell, carrier is now *only* determined by ifnet->if_data->ifi_link_state.
It used to use media valid state, but this is problematic for some interface types that have a separation between valid media and carrier.

One good example of this is wireless monitor mode.
The interface media is valid, but there is no carrier.

In FreeBSD<13 the only way of accessing ifi_link_state is via routing messages (which can be lost) or getifaddrs(3), which is an expensive libc call. FreeBSD-13 has added SIOCGIFDATA, which is much more lightweight.
https://reviews.freebsd.org/D26538

I *could* poll this ioctl every second at the expense of CPU to detect carrier state changes or FreeBSD *could* actually commit something to detect overflow.
Currently it's the only major BSD that doesn't report this for the routing socket.
dhcpcd used to poll for carrier up only (via media state), but I got complaints that it used too much CPU or was too slow or just didn't work reliably so I removed it. I'm in a no-win situation right now :(
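
Until overflow reporting is available, the manual restart from comment #2 could be scripted from outside dhcpcd with a bounded poll. The following is a rough sketch, not anything dhcpcd itself does. `wait_until` and `has_carrier` are hypothetical helper names. `has_carrier` assumes ifconfig(8) prints a `status: active` line, which reflects media status and is only a stand-in for ifi_link_state.

```sh
#!/bin/sh
# wait_until TRIES DELAY CMD...: run CMD up to TRIES times, sleeping DELAY
# seconds between attempts. Returns 0 as soon as CMD succeeds, 1 otherwise.
# (Hypothetical helper; uses global variables because POSIX sh has no local.)
wait_until() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# has_carrier IFACE: assumption: ifconfig(8) reports "status: active" when
# the link is up. This is media status, not the kernel link state itself.
has_carrier() {
    ifconfig "$1" 2>/dev/null | grep -q 'status: active'
}
```

A late-boot hook could then run something like `wait_until 30 1 has_carrier em0 && service dhcpcd restart`. The bounded retry count keeps the CPU cost small, at the price of giving up after 30 seconds.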
Comment 7 Dries Michiels freebsd_committer freebsd_triage 2021-07-28 17:02:32 UTC
Roy, hope all is well!

Your preferred approach of the kernel adjustments landed today in https://cgit.freebsd.org/src/commit/?id=7045b1603bdf054145dd958a4acc17b410fb62a0

Is there anything needed from the dhcpcd side to make use of this?
Comment 8 roy 2021-07-28 17:26:31 UTC
(In reply to Dries Michiels from comment #7)

Hi Dries

I'm currently up and down like a yoyo sadly.

I saw that it landed which is great!
All that should be needed from the dhcpcd side is a recompile and boom it starts working.
Let me know how that works for you.

As a side note, I have started work on a new dhcpcd release - created a dhcpcd-9 branch as the master has code I don't have time to fully test or make stable right now. I just have one issue to solve and hopefully a dhcpcd-9.4.1 will get tagged in the coming weeks.

Roy
Comment 9 Kevin Bowling freebsd_committer freebsd_triage 2021-07-28 18:21:31 UTC
(In reply to roy from comment #8)
Hi Roy,

I've relanded this and intend to MFC it to at least stable-13.
Comment 10 Dries Michiels freebsd_committer freebsd_triage 2021-09-22 13:35:41 UTC
Hmm, so I have upgraded my STABLE and recompiled dhcpcd to make use of the new commit in FreeBSD. But during boot I still receive the same error message. What could be happening here?
Comment 11 Dries Michiels freebsd_committer freebsd_triage 2021-09-22 13:35:54 UTC
STABLE-13 (FYI)
Comment 12 roy 2021-10-18 14:50:24 UTC
I'll try and spin up a FreeBSD-13 VM.
Which error message still happens? There are two here:

1)
Jan 31 20:06:31 vados kernel: dhcp6_openudp: Can't assign requested address
Jan 31 20:06:31 vados kernel: ps_inet_startcb: dhcp6_open: Can't assign requested address

and 2)
Jan 31 20:06:31 vados kernel: no interfaces have a carrier
Jan 31 20:06:31 vados kernel: forked to background, child pid 15254

The only fix so far should be for 2), where interface carrier *will* be detected at some point and dhcpcd will then start working.

I don't have a fix for 1) as something is trying to open the DHCP6 ports.
That's something you'll have to analyse yourself, maybe using netstat and ps with grep in your init scripts to work out what process it is.
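
One way to do that analysis is to filter a socket listing for the DHCPv6 ports (546 for clients, 547 for servers). This is a sketch only: `dhcp6_sockets` is a hypothetical helper name, and it assumes the listing shows the port as the last part of an address field, after either `:` (sockstat style) or `.` (netstat style).

```sh
#!/bin/sh
# dhcp6_sockets: read a socket listing on stdin and print every line in
# which some whitespace-separated field ends in ":546", ":547", ".546"
# or ".547". (Hypothetical helper for illustration only.)
dhcp6_sockets() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /[:.]54[67]$/) { print; next }
        }
    }'
}
```

On FreeBSD this could be fed with `sockstat -6 -l | dhcp6_sockets` from an init script, just before dhcpcd starts. An empty result would suggest that no other process holds the port. Note also that a port conflict normally reports "Address already in use" (EADDRINUSE), not "Can't assign requested address" (EADDRNOTAVAIL).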
Comment 13 Dries Michiels freebsd_committer freebsd_triage 2021-10-18 19:46:37 UTC
I'm still seeing both messages on my most recent boot, on the 14th of October.
Comment 14 roy 2021-10-19 06:38:17 UTC
(In reply to Dries Michiels from comment #13)

I've been able to replicate this error by using the init script in https://reviews.freebsd.org/D22012
Jan 31 20:06:31 vados kernel: dhcp6_openudp: Can't assign requested address

I fixed it by adding this in /etc/rc.d/dhcpcd

start_precmd="dhcpcd_prestart"
dhcpcd_prestart() {
        # Hack to get rid of the error Can't assign requested address when
        # dhcpcd tries to open ANYADDR:DHCPPORT
        /etc/rc.d/netif start lo0
}

I did initially just try using `ifconfig lo0 up`, but that wasn't enough.
I'll try and narrow down just what needs to be up later.
Comment 15 Dries Michiels freebsd_committer freebsd_triage 2021-10-19 06:49:39 UTC
It's good that you can replicate it. I have installed dhcpcd through ports; the port uses this script: https://cgit.freebsd.org/ports/tree/net/dhcpcd/files/dhcpcd.in (just FYI).
Comment 16 roy 2021-10-19 07:43:56 UTC
(In reply to Dries Michiels from comment #15)

I couldn't replicate it using that script :)

If you put `/etc/rc.d/netif start lo0` at the end of the dhcpcd_precmd function in your script, does it fix the problem?
Comment 17 roy 2021-10-19 10:55:21 UTC
So I've narrowed it down to this:

start_precmd="dhcpcd_prestart"
dhcpcd_prestart() {
        # Hack to get rid of the error Can't assign requested address when
        # dhcpcd tries to open ANYADDR:DHCPPORT
        # XXX Even for INET6 only this seems to be needed?
        /sbin/ifconfig lo0 inet alias 127.0.0.1/8 up
}

Does that fix it for you?
Comment 18 roy 2021-10-19 12:00:54 UTC
So just as long as *any* address of each family exists on the loopback interface, dhcpcd works.
I can replace 127.0.0.1/8 with 1.2.3.4/32 in the above hack and it works.

The addition of any IPv4 address also has the side effect of bringing the interface up and adding IPv6 addresses.

What gets more interesting is I can remove the address I just added and it fails again, but only for that address family.
Even more down the rabbit hole, I can add the address to em0 (not lo0) and it works!

This smells of a kernel bug in FreeBSD somewhere which is checking to see if the host has an address assigned of the family before allowing the opening of the any address socket.
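
The precondition observed above (at least one address of the family configured somewhere on the host) can be checked from a script before starting the daemon. This is a sketch only: `has_family_addr` is a hypothetical helper name, and it assumes ifconfig(8)-style output in which address lines begin with `inet` or `inet6`.

```sh
#!/bin/sh
# has_family_addr FAMILY: read ifconfig(8)-style output on stdin and return
# 0 if at least one line has FAMILY ("inet" or "inet6") as its first word
# followed by an address, 1 otherwise.
# (Hypothetical helper for illustration only.)
has_family_addr() {
    awk -v fam="$1" '
        $1 == fam && NF >= 2 { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, `ifconfig -a | has_family_addr inet6 || echo "no IPv6 address yet"` would flag the state in which the wildcard bind failed on the affected kernels.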
Comment 19 Dries Michiels freebsd_committer freebsd_triage 2021-10-19 12:42:03 UTC
Delaying the start of the dhcpcd daemon might also just work then: start it after the lo0 interface has been initialized with ::1 and 127.0.0.1. A keyword on the `# REQUIRE:` line of the init script could do that, which seems a lot less hacky to me. Surely something changed in either FreeBSD or dhcpcd, because I did not have this issue before.
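
Whether an init script already declares such a dependency can be checked by reading its rcorder(8) header. This is a sketch only: `rc_requires` is a hypothetical helper name, and it assumes the dependency would be on the `netif` provision from /etc/rc.d/netif. Whether that is enough to have lo0 addressed in time is not confirmed in this report.

```sh
#!/bin/sh
# rc_requires FILE PROVISION: return 0 if FILE has a "# REQUIRE:" line that
# lists PROVISION as one of its whitespace-separated words, 1 otherwise.
# (Hypothetical helper for illustration only.)
rc_requires() {
    awk -v want="$2" '
        /^# REQUIRE:/ {
            for (i = 3; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

For example, `rc_requires /usr/local/etc/rc.d/dhcpcd netif || echo "dhcpcd does not wait for netif"`.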
Comment 20 roy 2021-10-19 16:53:18 UTC
What changed from earlier dhcpcd versions is that we open the unspecified address for the DHCP sockets much earlier in the process.
In fact, since privilege separation was introduced, one of the first steps is now to open these sockets, fork and drop privs.

Anyway, for a proper fix I've submitted this review with a kernel patch which also fixes this problem:
https://reviews.freebsd.org/D32563
Comment 21 commit-hook freebsd_committer freebsd_triage 2021-10-20 23:31:35 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=5c5340108e9c2e384ca646720e17d037c69acc4c

commit 5c5340108e9c2e384ca646720e17d037c69acc4c
Author:     Roy Marples <roy@marples.name>
AuthorDate: 2021-10-20 15:47:29 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2021-10-20 23:25:51 +0000

    net: Allow binding of unspecified address without address existance

    Previously in_pcbbind_setup returned EADDRNOTAVAIL for empty
    V_in_ifaddrhead (i.e., no IPv4 addresses configured) and in6_pcbbind
    did the same for empty V_in6_ifaddrhead (no IPv6 addresses).

    An equivalent test has existed since 4.4-Lite.  It was presumably done
    to avoid extra work (assuming the address isn't going to be found
    later).

    In normal system operation *_ifaddrhead will not be empty: they will
    at least have the loopback address(es).  In practice no work will be
    avoided.

    Further, this case caused net/dhcpd to fail when run early in boot
    before assignment of any addresses.  It should be possible to bind the
    unspecified address even if no addresses have been configured yet, so
    just remove the tests.

    The now-removed "XXX broken" comments were added in 59562606b9d3,
    which converted the ifaddr lists to TAILQs.  As far as I (emaste) can
    tell the brokenness is the issue described above, not some aspect of
    the TAILQ conversion.

    PR:             253166
    Reviewed by:    ae, bz, donner, emaste, glebius
    MFC after:      1 month
    Differential Revision:  https://reviews.freebsd.org/D32563

 sys/netinet/in_pcb.c   | 2 --
 sys/netinet6/in6_pcb.c | 2 --
 2 files changed, 4 deletions(-)
Comment 22 commit-hook freebsd_committer freebsd_triage 2021-11-19 01:53:21 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=ec5691aa2f96d27c8f000486a9e3297c2dce31b9

commit ec5691aa2f96d27c8f000486a9e3297c2dce31b9
Author:     Roy Marples <roy@marples.name>
AuthorDate: 2021-10-20 15:47:29 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2021-11-19 00:28:56 +0000

    net: Allow binding of unspecified address without address existance

    (cherry picked from commit 5c5340108e9c2e384ca646720e17d037c69acc4c)

 sys/netinet/in_pcb.c   | 2 --
 sys/netinet6/in6_pcb.c | 2 --
 2 files changed, 4 deletions(-)
Comment 23 commit-hook freebsd_committer freebsd_triage 2021-11-19 01:54:22 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=e20aa51503b5960c9514492d0746ce05e8eca8a6

commit e20aa51503b5960c9514492d0746ce05e8eca8a6
Author:     Roy Marples <roy@marples.name>
AuthorDate: 2021-10-20 15:47:29 +0000
Commit:     Ed Maste <emaste@FreeBSD.org>
CommitDate: 2021-11-19 00:29:31 +0000

    net: Allow binding of unspecified address without address existance

    (cherry picked from commit 5c5340108e9c2e384ca646720e17d037c69acc4c)

 sys/netinet/in_pcb.c   | 2 --
 sys/netinet6/in6_pcb.c | 2 --
 2 files changed, 4 deletions(-)