Commit 713264f6b8bc5f927dd52cf8ffcccfa397034fec (March 6, 2023) has potentially broken BOOTP/DHCP functionality for nfsroot diskless start-up. This commit adds a check to the end of in_pcbladdr() in netinet/in_pcb.c that rejects a source address equal to INADDR_ANY. Unfortunately, as part of the diskless BOOTP/DHCP process (nfs/bootp_subr.c et al), the interface's address is (effectively) set to INADDR_ANY, so the check rejects the request, and the DHCP search and the rest of the diskless NFS root process fail as well. Tested/discovered and analyzed on ARM64/RPI4, 14.0-RELEASE. Not yet verified for other platforms.
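To make the failure mode concrete, here is a minimal userland model of the kind of check described above. It is not the code from netinet/in_pcb.c: the function name model_check_laddr(), its signature, and the EHOSTUNREACH return value are all assumptions made for illustration.

```c
/*
 * Userland model only; NOT the actual in_pcbladdr() code.
 * The function name, signature, and errno value are assumptions.
 */
#include <errno.h>
#include <stdint.h>

#define	MODEL_INADDR_ANY	((uint32_t)0x00000000)

/*
 * Stand-in for the tail of source address selection.  'chosen' is the
 * address selected for the outgoing packet.  On success the address is
 * stored in *laddr and 0 is returned; an unspecified address is
 * rejected and *laddr is left untouched.
 */
int
model_check_laddr(uint32_t chosen, uint32_t *laddr)
{
	if (chosen == MODEL_INADDR_ANY)
		return (EHOSTUNREACH);	/* assumed error code */
	*laddr = chosen;
	return (0);
}
```

With a check of this shape, an interface that has (effectively) been given 0.0.0.0 as its address can never yield a usable source address, which would match the behaviour reported here.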
*** Bug 278044 has been marked as a duplicate of this bug. ***
Indeed, bootpc_call() does this strange thing to ensure that the src addr of DHCP requests has address 0.0.0.0:

	/* Set netmask to 0.0.0.0 */
	clear_sinaddr(sin);
	error = ifioctl(bootp_so, SIOCAIFADDR, (caddr_t)ifra,
	    td);
	if (error != 0)
		panic("%s: SIOCAIFADDR, error=%d", __func__,
		    error);

	error = sosend(bootp_so, (struct sockaddr *) &dst,
	    &auio, NULL, NULL, 0, td);
	if (error != 0)
		printf("%s: sosend: %d state %08x\n", __func__,
		    error, (int )bootp_so->so_state);

	/* Set netmask to 255.0.0.0 */
	sin->sin_addr.s_addr = htonl(0xff000000);
	error = ifioctl(bootp_so, SIOCAIFADDR, (caddr_t)ifra,
	    td);
	if (error != 0)
		panic("%s: SIOCAIFADDR, error=%d", __func__,
		    error);

The sosend() causes udp_send() to connect the socket, resulting in an error because the local address is 0.0.0.0. We don't permit that since INADDR_ANY is used as a sentinel value in the inpcb layer. dhclient doesn't have this problem since it uses BPF to write packets. Any opinions on how best to fix this?
(In reply to Mark Johnston from comment #2) After dwelling on this for a bit, I'm starting to conclude that adding a tunable for these source checks is the way to go. For bootp/nfs root systems, the checks could then be turned off from the loader.
I haven't yet looked deeply into the problem, but my first thought about it was the following. Why do we create a UDP socket from the kernel, while a userland dhclient would use a raw socket? In the kernel we are closer to the network stack, yet we use a higher-level primitive than dhclient does.
(In reply to Gleb Smirnoff from comment #4) I was wondering the same thing. I will try to reimplement the data path using a raw socket this afternoon.
The basic reason is that we cannot easily use raw sockets to receive UDP packets - they belong to udp_input(). dhclient actually uses the BPF interface to send/recv UDP packets for DHCP, but I think that's a bit too inconvenient in the kernel. We can perhaps use a separate UDP socket for receiving packets, but that's a bit inconvenient. Or, we can perhaps temporarily override the ipproto registration when bootp is running.

(In reply to Richard Wai from comment #3) We could do that, but I'd prefer not to if there's some other option. We use INADDR_ANY as a sentinel value in various places in the pcb layer, so life is better if we can always say that it's an invalid value.
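For context on why writing packets through BPF sidesteps the pcb layer: a sender on that path has to assemble the IPv4 and UDP headers itself, so a 0.0.0.0 source address is just bytes in a buffer and never passes through source address selection. The following is a rough userland sketch of that idea, not code taken from dhclient or the kernel; inet_cksum() and build_dhcp_headers() are hypothetical helper names, and the link-layer header and DHCP payload are left out.

```c
/*
 * Sketch only: hand-built IPv4 + UDP headers for a DHCP client request.
 * Helper names are hypothetical; no link-layer header or payload here.
 */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	IPV4_HDR_LEN	20
#define	UDP_HDR_LEN	8

/* RFC 1071 Internet checksum over 'len' bytes, returned in host order. */
uint16_t
inet_cksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)			/* odd trailing byte */
		sum += (uint32_t)buf[i] << 8;
	while (sum >> 16)		/* fold carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}

/* Store a 16-bit value in network byte order. */
static void
put16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

/*
 * Fill the first 28 bytes of 'pkt' with headers for a packet going from
 * 0.0.0.0:68 to 255.255.255.255:67.  The caller is assumed to place
 * 'payload_len' bytes of DHCP payload after the headers.  Returns the
 * total packet length.
 */
size_t
build_dhcp_headers(uint8_t *pkt, size_t payload_len)
{
	uint8_t *ip = pkt;
	uint8_t *udp = pkt + IPV4_HDR_LEN;

	memset(pkt, 0, IPV4_HDR_LEN + UDP_HDR_LEN);
	ip[0] = 0x45;			/* version 4, header length 5 words */
	put16(ip + 2, (uint16_t)(IPV4_HDR_LEN + UDP_HDR_LEN + payload_len));
	ip[8] = 64;			/* TTL */
	ip[9] = 17;			/* protocol: UDP */
	/* ip[12..15], the source address, stays 0.0.0.0. */
	memset(ip + 16, 0xff, 4);	/* destination 255.255.255.255 */
	put16(ip + 10, inet_cksum(ip, IPV4_HDR_LEN));

	put16(udp + 0, 68);		/* source port: DHCP client */
	put16(udp + 2, 67);		/* destination port: DHCP server */
	put16(udp + 4, (uint16_t)(UDP_HDR_LEN + payload_len));
	/* UDP checksum left as 0, meaning "not computed" for IPv4. */
	return (IPV4_HDR_LEN + UDP_HDR_LEN + payload_len);
}
```

Doing the equivalent inside the kernel would also mean handling the receive side outside of udp_input(), which is the inconvenient part mentioned above.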