Bug 254432 - ctld won't start correctly at boot
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 13.0-STABLE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2021-03-20 11:53 UTC by David BOYER
Modified: 2021-04-04 22:00 UTC

Attachments
ctl.conf file (285 bytes, text/plain)
2021-03-20 11:53 UTC, David BOYER

Description David BOYER 2021-03-20 11:53:07 UTC
Created attachment 223448
ctl.conf file

Hello,

ctld does not fully start when the machine boots on FreeBSD 13 (I tested through BETA to RC2).
However, it does start when it is launched manually.

Plus, the exact same configuration works perfectly on FreeBSD 12.

I tried on two different machines.

I stripped down the config to a basic one (some values are edited):

/etc/ctl.conf

lun pulse {
        path "/dev/zvol/exports/iscsi/pulse"
}

portal-group pg0 {
        discovery-auth-group no-authentication
        listen 0.0.0.0
}

target iqn.2020-02.net.domain.hostname:pulse {
        auth-group no-authentication
        portal-group pg0
        lun 0 "pulse"
}

-----------

After the boot process, ctld seems to have partially started, because:

1) there is a process (with no arguments)
root@hostname:~ # ps faux |grep ctl
root      9   0,0  0,0       0     256  -  DL   11:56    0:00,00 [ctl]

2) ctladm prints some information that is not printed when it is stopped

root@hostname:~ # ctladm port -l
Port Online Frontend Name     pp vp
0    YES    ioctl    ioctl    0  0
1    YES    tpc      tpc      0  0
2    NO     camsim   camsim   0  0  naa.5000000xxxxxxxx
3    YES    iscsi    iscsi    257 1 iqn.2020-02.net.domain.hostname:pulse,t,0x0101

-----------

root@hostname:~ # ctladm devlist
LUN Backend       Size (Blocks)   BS Serial Number    Device ID
  0 block             251658240  512 MYSERIAL0000     MYDEVID0000

-----------

root@hostname:~ # ctladm lunlist
(7:0:0/0): <FREEBSD CTLDISK 0001> Fixed Direct Access SPC-5 SCSI device

But the service does not work:
root@hostname:~ # service ctld status
ctld is not running.

The first attempt at starting the service manually returns

root@hostname:~ # service ctld start
Starting ctld.
ctld: error returned from port creation request: target "iqn.2020-02.net.domain.hostname:pulse" for portal group tag 257 already exists
ctld: failed to update port pg0-iqn.2020-02.net.domain.hostname:pulse

While the second attempt returns
root@hostname:~ # service ctld start
ctld already running?  (pid=2961).


Then, the service is OK
root@hostname:~ # service ctld status
ctld is running as pid 2961.

And everything works fine.
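
The half-started state described above (an iscsi port registered in the kernel while the service reports that ctld is not running) can be spotted with a small filter. This is only a sketch: `iscsi_ports` is a hypothetical helper name, and it assumes the column layout of `ctladm port -l` shown earlier in this report.

```sh
# Hypothetical helper (not part of FreeBSD): reads `ctladm port -l`
# output on stdin and prints the last column (the target name) of
# every port whose Frontend column is "iscsi".
# Assumes the column layout shown in the report above.
iscsi_ports() {
    awk '$3 == "iscsi" { print $NF }'
}

# Possible use on the affected host, run as root:
#   service ctld status || ctladm port -l | iscsi_ports
```

If the commented command prints a target name while the status check fails, the port was created at boot but the daemon did not stay up, which matches the "already exists" error above.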

I could not find any useful logs, but I can provide more information if needed.
Comment 1 Stéphane D'Alu 2021-03-20 22:14:08 UTC
Perhaps it is the same problem that I have, which is caused by ctld being started before network initialization.

$ service -r

...
/etc/rc.d/ctld
/etc/rc.d/autounmountd
/etc/rc.d/devmatch
/usr/local/etc/rc.d/uuidd
/etc/rc.d/kld
/etc/rc.d/addswap
/usr/local/etc/rc.d/virtual_oss
/etc/rc.d/netif
...
Comment 2 David BOYER 2021-03-21 13:00:41 UTC
(In reply to Stéphane D'Alu from comment #1)

That was my first hypothesis, because it seems logical, but I dismissed it because the same setup was working well on FreeBSD 12.

But service -r (I wasn't aware of this option so thank you ;-) ) shows that ctld is started AFTER NETWORKING on 12:

# grep -n -E 'iovctl|netif|NETWORKING|ctld' 12/service-r
38:/etc/rc.d/iovctl
39:/etc/rc.d/netif
60:/etc/rc.d/NETWORKING
90:/etc/rc.d/ctld

The question is what changed this behavior?
I am currently comparing rc files from both versions but did not find anything obvious so far.

NOTE: Adding netif as a dependency to ctld works, but it takes forever before I can ssh to my host. It seems to wait for my jails to be up and running first.
Comment 3 David BOYER 2021-03-21 16:38:21 UTC
(In reply to David BOYER from comment #2)
I replaced netif with NETWORKING and it is working fine.

To be clear, I modified the file /etc/rc.d/ctld:
# REQUIRE: FILESYSTEMS NETWORKING


I think this problem relates to bug #232397.
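
To check whether a given rc.d script already carries this dependency, something like the following could be used. It is only a sketch: `rc_requires` is a hypothetical helper name, and it assumes dependencies are listed space-separated on `# REQUIRE:` comment lines, as in the line quoted above.

```sh
# Hypothetical helper (not part of FreeBSD): succeeds if the rc.d
# script "$1" names "$2" on one of its "# REQUIRE:" lines.
# Assumes space-separated dependency names, as in the line quoted above.
rc_requires() {
    awk -v dep="$2" '
        /^# REQUIRE:/ {
            # fields 1 and 2 are "#" and "REQUIRE:"; the rest are names
            for (i = 3; i <= NF; i++) if ($i == dep) found = 1
        }
        END { exit !found }
    ' "$1"
}

# Possible use:
#   rc_requires /etc/rc.d/ctld NETWORKING || echo "NETWORKING not required"
```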
Comment 4 Edward Tomasz Napierala freebsd_committer 2021-04-04 22:00:17 UTC
https://reviews.freebsd.org/D29578