Bug 267144 - sysutils/nut does not start
Summary: sysutils/nut does not start
Status: New
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: Cy Schubert
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-10-17 08:31 UTC by ml
Modified: 2023-08-07 08:55 UTC

See Also:
cy: maintainer-feedback+
cy: merge-quarterly+


Attachments
Try this. (1.92 KB, patch)
2022-10-17 15:23 UTC, Cy Schubert
Truss log as root (19.55 KB, text/plain)
2022-12-05 10:05 UTC, ml
Truss log as uucp (14.48 KB, text/plain)
2022-12-05 10:06 UTC, ml
Patch for sysutils/nut/files for 2.8.0 (1.61 KB, patch)
2023-03-28 20:39 UTC, Rudolf Čejka
${WRKDIR}/include/config.h (as requested). (32.57 KB, text/plain)
2023-08-07 08:54 UTC, ml

Description ml 2022-10-17 08:31:17 UTC
For a few weeks I have had a problem with NUT not starting due to USB device permissions. I don't know whether a FreeBSD upgrade or a NUT upgrade broke this, but it is happening on several different systems.
I *think* it happened around the time I switched to the 2022Q3 branch for ports.



> # service nut restart
> nut not running? (check /var/db/nut/upsd.pid).
> Network UPS Tools - UPS driver controller 2.8.0
> Network UPS Tools - Generic HID driver 0.47 (2.8.0)
> USB communication driver (libusb 1.0) 0.43
> interrupt pipe disabled (add 'pollonly' flag to 'ups.conf' to get rid of this message)
> Can't claim USB device [051d:0003]@0/0: Other error
> Driver failed to start (exit status=1)
> /usr/local/etc/rc.d/nut: WARNING: failed precmd routine for nut

> # usbconfig
> ugen0.1: <0x8086 XHCI root HUB> at usbus0, cfg=0 md=HOST spd=SUPER (5.0Gbps) pwr=SAVE (0mA)
> ugen0.2: <Logitech USB Optical Mouse> at usbus0, cfg=0 md=HOST spd=LOW (1.5Mbps) pwr=ON (100mA)
> ugen0.3: <Logitech USB Keyboard> at usbus0, cfg=0 md=HOST spd=LOW (1.5Mbps) pwr=ON (90mA)
> ugen0.4: <American Power Conversion Smart-UPS1500 FW:UPS 03.5 / ID1015> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (10mA)

> # ls -l /dev/|grep ugen
> lrwxr-xr-x  1 root  wheel        9 Sep 24 10:59 ugen0.1 -> usb/0.1.0
> lrwxr-xr-x  1 root  wheel        9 Sep 24 10:59 ugen0.2 -> usb/0.2.0
> lrwxr-xr-x  1 root  wheel        9 Sep 24 10:59 ugen0.3 -> usb/0.3.0
> lrwxr-xr-x  1 root  wheel        9 Sep 24 10:59 ugen0.4 -> usb/0.4.0

> # ls -l /dev/usb/
> total 0
> crw-------  1 root  operator  0x2d Sep 24 10:59 0.1.0
> crw-------  1 root  operator  0x55 Sep 24 10:59 0.1.1
> crw-------  1 root  operator  0x74 Sep 24 10:59 0.2.0
> crw-------  1 root  operator  0x76 Sep 24 10:59 0.2.1
> crw-------  1 root  operator  0x77 Sep 24 10:59 0.3.0
> crw-------  1 root  operator  0x79 Sep 24 10:59 0.3.1
> crw-------  1 root  operator  0x7a Sep 24 10:59 0.3.2
> crw-rw----  1 root  uucp      0x7d Sep 24 08:25 0.4.0
> crw-------  1 root  operator  0x7f Sep 24 10:59 0.4.1
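
As a quick cross-check of the listing above, a small filter can pick out the nodes that the uucp group may read and write. This is only a sketch: uucp_rw_nodes is a hypothetical helper name, and it assumes the default ls -l column layout shown above (mode in column 1, group in column 4).

```shell
#!/bin/sh
# Hypothetical helper (not part of NUT or FreeBSD): reads `ls -l` output on
# stdin and prints the names of entries whose group is uucp and whose mode
# grants the group both read and write.
# Assumes the default `ls -l` layout: mode in field 1, group in field 4.
uucp_rw_nodes() {
    awk '$4 == "uucp" && substr($1, 5, 2) == "rw" { print $NF }'
}
```

Fed the raw listing above (for example via ls -l /dev/usb/ | uucp_rw_nodes), this prints only 0.4.0: the UPS control endpoint is already group-accessible to uucp, so the mode of the device node alone may not explain the failure.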

The only way I have been able to work around this is by issuing:
upsdrvctl -u root start
upsdrvctl stop
service nut start

However, I need to manually do this after each reboot.
Comment 1 Cy Schubert freebsd_committer freebsd_triage 2022-10-17 14:32:58 UTC
I have four questions below. Please answer ALL of them.

1. Is there an entry like this in /usr/local/etc/devd/nut-usb.conf?

#  various 5G models  - usbhid-ups
notify 100 {
        match "system"          "USB";
        match "subsystem"       "DEVICE";
        match "type"            "ATTACH";
        match "vendor"          "0x051d";
        match "product"         "0x0003";
        action "chgrp uucp /dev/$cdev; chmod g+rw /dev/$cdev";
};

2. After a reboot, what does ps auxww | grep usb say? **Not now, because you're running it under the root account.**

3. Also, do you know if devd is running when nut starts?

4. Does your UPS disconnect from and reconnect to the USB bus every five seconds? You will see something like this in dmesg until usbhid-ups finally connects:

ugen0.2: <CPS UPS RF1025> at usbus0 (disconnected)
ugen0.2: <CPS UPS RF1025> at usbus0
ugen0.2: <CPS UPS RF1025> at usbus0 (disconnected)
ugen0.2: <CPS UPS RF1025> at usbus0
ugen0.2: <CPS UPS RF1025> at usbus0 (disconnected)
ugen0.2: <CPS UPS RF1025> at usbus0
ugen0.2: <CPS UPS RF1025> at usbus0 (disconnected)
ugen0.2: <CPS UPS RF1025> at usbus0
ugen0.2: <CPS UPS RF1025> at usbus0 (disconnected)

Do you see anything like the above?
Comment 2 Cy Schubert freebsd_committer freebsd_triage 2022-10-17 14:33:31 UTC
Also, uname -a please.
Comment 3 ml 2022-10-17 14:45:36 UTC
(In reply to Cy Schubert from comment #1)

1. Is there an entry like this in /usr/local/etc/devd/nut-usb.conf?

Sure.
I use the stock file provided by the port, so this definition is there.



2. After reboot what does ps auxww | grep usb say. **Not now because you're running it under the root account.**

Sorry, I cannot reboot production servers now. Will keep an eye on this if needed.
However, I'm not running it as root! As I said I only need to start it once as root, then I stop it and run it normally.
So:
# ps auxxww|grep usb
root        13   0.0  0.0       0      80  -  DL   10:43      0:00.37 [usb]
uucp     60073   0.0  0.0   12236    2352  -  Ss   12:29      0:00.47 /usr/local/libexec/nut/usbhid-ups -a APC
root     62303   0.0  0.0   11308    2164  0  S+   16:39      0:00.00 grep usb

BTW, after a reboot there would be no such usbhid-ups entry since, as I said, it doesn't start.



3. Also, do you know if devd is running when nut starts?

Yes.



4. Does your UPS disconnect from and reconnect to the USB bus every five seconds?

No.
(Curiously, I always wanted to ask why mice behave like this :), but I've never seen this with UPSes).



5. Also, uname -a please.

Several...
Just two examples:
FreeBSD harry.netfence.it 12.3-RELEASE-p7 FreeBSD 12.3-RELEASE-p7 releng/netfence_12.3-n234236-f9fd7a7be39 HARRY  amd64
FreeBSD jack.netfence.it 13.1-RELEASE-p2 FreeBSD 13.1-RELEASE-p2 releng/netfence_13.1-n250164-abaf2815a20 JACK amd64



Thanks
Comment 4 Cy Schubert freebsd_committer freebsd_triage 2022-10-17 15:23:25 UTC
Created attachment 237404 [details]
Try this.

Please try this patch.
Comment 5 commit-hook freebsd_committer freebsd_triage 2022-10-17 18:20:59 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=9ef8c35f855969b21a880e942ff53803b5d81ce8

commit 9ef8c35f855969b21a880e942ff53803b5d81ce8
Author:     Cy Schubert <cy@FreeBSD.org>
AuthorDate: 2022-10-17 15:12:03 +0000
Commit:     Cy Schubert <cy@FreeBSD.org>
CommitDate: 2022-10-17 18:20:13 +0000

    sysutils/nut*: Require devd prior to start

    In some cases nut may start before devd causing it to fail because it
    lacks permissions to USB attached UPS devices. The nut supplied
    devd.conf ensures that nut has read/write access to the UPS.

    PR:             267144
    Reported by:    ml@netfence.it
    MFH             2022Q4

 sysutils/nut-devel/Makefile     | 2 +-
 sysutils/nut-devel/files/nut.in | 2 +-
 sysutils/nut/Makefile           | 2 +-
 sysutils/nut/files/nut.in       | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)
Comment 6 commit-hook freebsd_committer freebsd_triage 2022-10-25 04:11:18 UTC
A commit in branch 2022Q4 references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=ea31a795f4d83ed3f3f37577fdd119cf0eba651f

commit ea31a795f4d83ed3f3f37577fdd119cf0eba651f
Author:     Cy Schubert <cy@FreeBSD.org>
AuthorDate: 2022-10-17 15:12:03 +0000
Commit:     Cy Schubert <cy@FreeBSD.org>
CommitDate: 2022-10-25 04:09:39 +0000

    sysutils/nut*: Require devd prior to start

    In some cases nut may start before devd causing it to fail because it
    lacks permissions to USB attached UPS devices. The nut supplied
    devd.conf ensures that nut has read/write access to the UPS.

    PR:             267144
    Reported by:    ml@netfence.it
    MFH             2022Q4

    (cherry picked from commit 9ef8c35f855969b21a880e942ff53803b5d81ce8)

 sysutils/nut-devel/Makefile     | 2 +-
 sysutils/nut-devel/files/nut.in | 2 +-
 sysutils/nut/Makefile           | 2 +-
 sysutils/nut/files/nut.in       | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)
Comment 7 ml 2022-11-25 11:01:30 UTC
(In reply to Cy Schubert from comment #4)

Sorry for taking so long!

I've updated the ports on all my boxes, but I'm getting mixed results:
_ on two 13.1 machines it works properly;
_ on two 12.3 machines, I still experience the problem.

Not sure what the reason is.
Of course I'll (slowly) upgrade all of them, so unless I find a 13.1 which shows the problem, we may lower the importance of this bug.

Thanks.
Comment 8 Cy Schubert freebsd_committer freebsd_triage 2022-11-25 14:24:17 UTC
(In reply to ml from comment #7)
You will need to tell me why it doesn't work on the two 12.3 boxes. I don't have access to logs, dmesg or other outputs. It is likely something unique to how you set the boxes up.

As I said before my RF1025 connects and disconnects from the USB bus every five seconds. Does your UPS do the same?
Comment 9 ml 2022-11-25 16:36:26 UTC
(In reply to Cy Schubert from comment #8)

I already answered the last question: no, my UPSes don't disconnect and reconnect continuously.
On boot I see something like:
Nov 25 09:26:36 jack kernel: ugen0.3: <American Power Conversion Smart-UPS1500 FW:UPS 03.5 / ID1018> at usbus0
Then that's it, there's no more mention of ugen0.3 in the logs.
(On other machines, of course, it might be ugen0.4 or whatever, the UPS model changes, but the behaviour is the same: I've never seen any UPS disconnect and reconnect).



I understand you don't have info.

The only thing I see in retrospect from the logs is:

_ at boot ugen0.3 appears, along with uhid0:
Nov 25 09:26:36 jack kernel: ugen0.3: <American Power Conversion Smart-UPS1500 FW:UPS 03.5 / ID1018> at usbus0
Nov 25 09:26:36 jack kernel: Autoloading module: uhid
Nov 25 09:26:36 jack kernel: uhid0 on uhub0
Nov 25 09:26:36 jack kernel: uhid0: <American Power Conversion Smart-UPS1500 FW:UPS 03.5 / ID1018, class 0/0, rev 2.00/0.01, addr 2> on usbus0

_ nut (started from its script) cannot attach to the device;

_ when I start it as root I see:
Nov 25 09:31:23 jack devd[61831]: Processing event '!system=DEVFS subsystem=CDEV type=DESTROY cdev=uhid0'
Nov 25 09:31:23 jack kernel: uhid0: at uhub0, port 10, addr 2 (disconnected)
Nov 25 09:31:23 jack kernel: uhid0: detached
Nov 25 09:31:23 jack devd[61831]: Processing event '-uhid0 at   on uhub0'
Nov 25 09:31:23 jack usbhid-ups[69369]: Startup successful

_ after this (and killing it), I can start it normally.

Perhaps the "uhid0: detached" gives a hint?
I'd need to reboot a 13.1 machine, to see if it's different.



I'll try to debug this if I happen to reboot such a machine (12.3 with NUT) before I upgrade them (either to 12.4 or to 13.1).
Comment 10 ml 2022-12-05 10:05:26 UTC
I've found a machine where the problem persists after the upgrade to 13.1.
I'm now even more convinced that the problem is usbhid-ups not being able to detach uhid if it's not run as root.
This would explain why, after running once as root, NUT can then run as uucp without problems.

I'm attaching two truss logs, one as root (successful) and one as uucp (failing).
Comment 11 ml 2022-12-05 10:05:48 UTC
Created attachment 238533 [details]
Truss log as root
Comment 12 ml 2022-12-05 10:06:20 UTC
Created attachment 238534 [details]
Truss log as uucp
Comment 13 ml 2022-12-11 16:43:31 UTC
See also https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=126845

In fact NUT works properly on machines where the UPS idVendor/idProduct is listed in /usr/src/sys/dev/usb/usbdevs.
E.g. older APC models (idVendor = 0x051d, idProduct = 0x0002), but not newer models (idVendor = 0x051d, idProduct = 0x0003).
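
Whether a given ID pair appears in that file can be checked with a short lookup. The sketch below is an illustration under assumptions: usbdevs_has is a hypothetical helper, and it assumes the usual usbdevs line layout ("vendor NAME 0xVID description" and "product VENDOR NAME 0xPID description") with lowercase hex IDs and vendor lines placed before product lines.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the vendor/product ID pair is listed in a
# usbdevs-style file.
# Assumed line layout (lowercase hex IDs, vendor lines before product lines):
#   vendor  <VENDOR> <0xVID> <description>
#   product <VENDOR> <NAME> <0xPID> <description>
# Usage: usbdevs_has FILE 0xVID 0xPID
usbdevs_has() {
    awk -v vid="$2" -v pid="$3" '
        # Remember every vendor symbol that maps to the requested vendor ID.
        $1 == "vendor" && $3 == vid { sym[$2] = 1 }
        # Look for a product line of such a vendor with the requested ID.
        $1 == "product" && ($2 in sym) && $4 == pid { found = 1 }
        # Exit status 0 when a match was seen, 1 otherwise.
        END { exit !found }
    ' "$1"
}
```

With the IDs from this report, usbdevs_has /usr/src/sys/dev/usb/usbdevs 0x051d 0x0003 would be the check for the newer models; the result depends on the source tree at hand.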

That bug is of course from 2009. I'm not expert enough to tell, but things have possibly evolved, and I'm not sure adding entries to that list would be the right approach (even if it would solve a specific case).

What I find strange is that it used to work some months ago. However I cannot tell if at the time NUT was able to detach uhid even as non-root or if uhid did not even attach (as I said I cannot tell if a FreeBSD or NUT upgrade broke this).
Comment 14 Rudolf Čejka 2023-03-28 20:37:28 UTC
Hello,
  I have the same problem. After upgrading nut from 2.7.4 to 2.8.0 it does not work anymore. It is most definitely not the correct fix, but removing the libusb_set_auto_detach_kernel_driver() code from drivers/libusb1.c in 2.8.0 helps me (the older 2.7.4 used libusb0).
Comment 15 Rudolf Čejka 2023-03-28 20:39:24 UTC
Created attachment 241167 [details]
Patch for sysutils/nut/files for 2.8.0
Comment 16 Cy Schubert freebsd_committer freebsd_triage 2023-03-28 20:44:17 UTC
(In reply to Rudolf Čejka from comment #14)

Which ports tree are you using, HEAD or 2023Q1?

Is your ports tree up to date? head /usr/ports/sysutils/nut/Makefile.
Comment 18 Cy Schubert freebsd_committer freebsd_triage 2023-03-28 20:51:31 UTC
(In reply to Rudolf Čejka from comment #15)
This is not the correct patch for resolving a permissions problem.

Can you provide ls -l /dev/usb, please?

Also, list your /usr/local/etc/devd/nut-usb.conf.

Is it possible you might be tripping over PR/269729?
Comment 19 Rudolf Čejka 2023-03-28 21:36:51 UTC
Yes, it seems PR/269729 would solve my problem, but I cannot use it right now: I ran freebsd-update -r 12.4 upgrade and install plus pkg upgrade today, and freebsd-update fetch says there are no further updates. So I needed to find something else I could do. It seems that nut 2.7.4 did not detach HID and it worked, so 2.8.0 could too :o)
Comment 20 Rudolf Čejka 2023-03-28 22:23:55 UTC
# ls -l /dev/usb
total 0
...
crw-------  1 root  operator  0x60 Mar 28 22:29 2.1.1
crw-rw----  1 root  uucp      0x6a Mar 28 22:30 2.2.0
crw-------  1 root  operator  0x6c Mar 28 22:30 2.2.1
...

This PR is the same as PR/269729, see ioctl(6,USB_IFACE_DRIVER_DETACH,0x7fffffffc83c)	 ERR#1 'Operation not permitted'  in truss uucp and ioctl(6,USB_IFACE_DRIVER_DETACH,0x7fffffffc83c)	 = 0 (0x0) in truss root.
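
The difference between the two truss logs can be checked mechanically. A minimal sketch, where detach_denied is a hypothetical helper and the match relies on the ERR# marker that appears on the failed call quoted above:

```shell
#!/bin/sh
# Hypothetical helper: succeed if a truss log contains a failed
# USB_IFACE_DRIVER_DETACH ioctl. A failed call is recognised by the "ERR#"
# marker, as in the uucp log line quoted above.
# Usage: detach_denied TRUSS_LOG
detach_denied() {
    grep -q 'USB_IFACE_DRIVER_DETACH.*ERR#' "$1"
}
```

Run against the two attached logs, this should succeed for the uucp log and fail for the root log, which is the whole difference between the working and the failing start.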

So there was a problem with attaching HID to newer APCs, but if HID is attached, nut 2.8.0 tries to detach it and does not know how. Fortunately, detaching is not needed. I have

ugen2.2: <American Power Conversion Smart-UPS 3000 FW:UPS 06.5 / ID18> at usbus2, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (2mA)
ugen2.2.0: uhid0: <American Power Conversion Smart-UPS 3000 FW:UPS 06.5 / ID18, class 0/0, rev 2.00/1.06, addr 2>

and the patched /usr/local/libexec/nut/usbhid-ups works (but yes, there surely has to be a more correct fix).
Comment 21 Cy Schubert freebsd_committer freebsd_triage 2023-03-28 22:44:17 UTC
Can you run,

usbconfig list

and

usbconfig dump_device_desc


The problem is that your UPS is recognized by the kernel as a USBHID device (USBHID covers mice, keyboards, headphones, and other human interface devices). I will need to add a quirk to the kernel telling it the device is not a USBHID device.
Comment 22 Rudolf Čejka 2023-03-28 22:58:09 UTC
It seems that UPS3000 uses the same id 0x0003 as you added for UPS1000:

ugen2.2: <American Power Conversion Smart-UPS 3000 FW:UPS 06.5 / ID18> at usbus2, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (2mA)

  bLength = 0x0012
  bDescriptorType = 0x0001
  bcdUSB = 0x0200
  bDeviceClass = 0x0000  <Probed by interface class>
  bDeviceSubClass = 0x0000
  bDeviceProtocol = 0x0000
  bMaxPacketSize0 = 0x0040
  idVendor = 0x051d
  idProduct = 0x0003
  bcdDevice = 0x0106
  iManufacturer = 0x0001  <American Power Conversion>
  iProduct = 0x0002  <Smart-UPS 3000 FW:UPS 06.5 / ID=18>
  iSerialNumber = 0x0003  <...>
  bNumConfigurations = 0x0001
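
To compare the IDs in a dump like this against a devd rule or a quirk entry, the pair can be extracted with a short filter. A sketch only: usb_ids is a hypothetical helper, and it assumes the "name = value" layout shown above, with idVendor printed before idProduct.

```shell
#!/bin/sh
# Hypothetical helper: reads `usbconfig dump_device_desc` style output on
# stdin and prints one "vendor:product" pair per device descriptor.
# Assumes "name = value" lines with idVendor preceding idProduct.
usb_ids() {
    awk '
        $1 == "idVendor"  { v = $3 }
        $1 == "idProduct" { print v ":" $3 }
    '
}
```

For the descriptor above, usbconfig -d ugen2.2 dump_device_desc | usb_ids prints 0x051d:0x0003.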
Comment 23 Cy Schubert freebsd_committer freebsd_triage 2023-03-28 23:45:26 UTC
Can you upload dmesg and /var/log/messages please.

What is uname -a ?
Comment 24 Rudolf Čejka 2023-03-29 07:55:36 UTC
I think your patch is simply not in the binary updates yet, nothing more.

# uname -a
FreeBSD 12.4-RELEASE-p1 FreeBSD 12.4-RELEASE-p1 GENERIC  amd64
# freebsd-update fetch
src component not installed, skipped
Looking up aws.update.FreeBSD.org mirrors... 1 mirrors found.
Fetching metadata signature for 12.4-RELEASE from dualstack.aws.update.freebsd.org... done.
Fetching metadata index... done.
Inspecting system... done.
Preparing to download files... done.

No updates needed to update system to 12.4-RELEASE-p2.
Comment 25 Cy Schubert freebsd_committer freebsd_triage 2023-03-29 13:04:08 UTC
It will be on Saturday when 2023Q2 is cut.
Comment 26 Cy Schubert freebsd_committer freebsd_triage 2023-03-29 13:28:00 UTC
(In reply to Cy Schubert from comment #25)

Sorry. I mistook this to mean the port --> quarterly.

My patch will never be incorporated in binary updates. You will see it in 13.2. As I understand it there will be no 12.5 so the patch will never reach the 12.X binary updates.

It is in 12-STABLE, but you will need to run buildworld and buildkernel yourself.

so@ has published that 12-STABLE will reach EOL at the end of this year. It is recommended that you update to 13.2 or later by then.
Comment 27 Cy Schubert freebsd_committer freebsd_triage 2023-03-29 13:48:40 UTC
Looking at a copy of your /var/log/messages sent to me (grep nut would have been enough), we see,

Mar 28 20:17:19 pce003 root[776]: /etc/rc: WARNING: failed precmd routine for nut

Can you try service nut restart to see if it starts outside of boot? The information provided in this PR is sketchy. The best I can do is guess without more input.
Comment 28 Rudolf Čejka 2023-03-30 13:01:46 UTC
Hello, no, it doesn't. If the uhid driver is attached, it does not matter when the service is started or restarted.

Either a patched nut or root permissions are needed. I also considered running it with root permissions (diff below), but I would rather use the modified nut.

--- /usr/local/etc/rc.d/nut.orig     2023-03-21 14:05:24.000000000 +0100
+++ /usr/local/etc/rc.d/nut     2023-03-30 14:52:37.942716000 +0200
@@ -32,7 +32,7 @@
 stop_postcmd="nut_poststop"
 
 nut_prestart() {
-       ${nut_prefix}/sbin/upsdrvctl start
+       ${nut_prefix}/sbin/upsdrvctl -u root start
 }
 
 nut_poststop() {
Comment 29 Cy Schubert freebsd_committer freebsd_triage 2023-03-30 13:43:10 UTC
(In reply to Rudolf Čejka from comment #28)
This is a security risk.
Comment 30 Cy Schubert freebsd_committer freebsd_triage 2023-03-30 14:10:44 UTC
(In reply to Rudolf Čejka from comment #24)
Are you comfortable building and installing a new kernel? I can supply the patch if you are.

It will need to be reapplied after every 12.4-RELEASE update because it is not a security fix. The Security Officer only cherry-picks security fixes from stable/12 to releng/12.4, so it will never be included in 12.4. Also, I don't know if there will be a 12.5. If there is, the fix will be in it.
Comment 31 Rudolf Čejka 2023-03-31 08:56:16 UTC
Yes for you.
Comment 32 ml 2023-08-05 14:07:40 UTC
(In reply to Cy Schubert from comment #26)

I wonder why this patch did not make it into 13.2.
In any case, after cherry-picking it, I can confirm it solves my problem.
Comment 33 Cy Schubert freebsd_committer freebsd_triage 2023-08-05 22:12:42 UTC
(In reply to ml from comment #32)

The patch was not applied because simply undefining HAVE_LIBUSB_KERNEL_DRIVER_ACTIVE or HAVE_LIBUSB_SET_AUTO_DETACH_KERNEL_DRIVER does the same thing.

Can you show us your config.h in your $WRKDIR, please?
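
For reference, the two macros can be looked up in the generated header with a short filter. A minimal sketch, assuming the header is at ${WRKDIR}/include/config.h and uses plain "#define NAME value" lines; libusb_detach_macros is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the #define lines for the two libusb
# kernel-driver macros named above, if they are defined in the given header.
# Commented-out "#undef" lines and longer macro names are not matched.
# Usage: libusb_detach_macros PATH/TO/config.h
libusb_detach_macros() {
    grep -E '^#define[[:space:]]+(HAVE_LIBUSB_KERNEL_DRIVER_ACTIVE|HAVE_LIBUSB_SET_AUTO_DETACH_KERNEL_DRIVER)([[:space:]]|$)' "$1"
}
```

The helper exits non-zero when neither macro is defined, so it can also be used as a condition, e.g. libusb_detach_macros "${WRKDIR}/include/config.h" || echo "neither macro defined".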

Have you tried nut-devel? It fixes a lot of problems. It tracks the latest nut development.
Comment 34 ml 2023-08-07 08:54:50 UTC
Created attachment 243921 [details]
${WRKDIR}/include/config.h (as requested).
Comment 35 ml 2023-08-07 08:55:10 UTC
(In reply to Cy Schubert from comment #33)


> The patch was not applied

Yet it's in 13/stable...



> because simply undefining HAVE_LIBUSB_KERNEL_DRIVER_ACTIVE or HAVE_LIBUSB_SET_AUTO_DETACH_KERNEL_DRIVER does the same thing.

I see these are defined in config.h.
What should I do to avoid them?



> Can you show us your config.h in your $WRKDIR, please?

I've just attached it.



> Have you tried nut-devel?

Not yet.
Currently everything works simply by cherry-picking your commits ece231cf5c60d7ee7cb59ea5f21d4471c907d0d7, 594c963bd3ec5c30b346beccf1e0e83a4d5b8c94 and ef8c91f5f99116c93210fd30b871dd86c75aa464.
Unless this is going to cause trouble, I'm sticking with these.
I'll try nut-devel in the future, but I'm currently out of time. Sorry.