Bug 263995 - ssh: ssh-sk-helper hangs
Summary: ssh: ssh-sk-helper hangs
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: bin (show other bugs)
Version: 13.1-STABLE
Hardware: Any Any
: --- Affects Some People
Assignee: Ed Maste
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-05-15 14:14 UTC by Christian Weisgerber
Modified: 2023-08-29 22:48 UTC (History)
9 users (show)

See Also:


Attachments
ssh-sk-helper ktrace of successful authentication (60.82 KB, text/plain)
2022-05-16 14:16 UTC, Christian Weisgerber
no flags Details
ssh-sk-helper ktrace of hanging authentication attempt (46.99 KB, text/plain)
2022-05-16 14:23 UTC, Christian Weisgerber
no flags Details
Don't drop endpoint in xhci driver (605 bytes, patch)
2022-06-22 14:52 UTC, Michael Gmelin
no flags Details | Diff
Patch to try. (766 bytes, patch)
2022-06-22 16:08 UTC, Hans Petter Selasky
no flags Details | Diff
Output when applying "Patch to try" (118.59 KB, text/plain)
2022-06-23 10:33 UTC, Michael Gmelin
no flags Details
Patch to try #2. (862 bytes, patch)
2022-06-23 12:33 UTC, Hans Petter Selasky
no flags Details | Diff
Backtraces of Patch to try #2 (16.74 KB, text/plain)
2022-06-23 13:03 UTC, Michael Gmelin
no flags Details
Patch for libfido2 in FreeBSD to use hidraw<N> when available (1.01 KB, patch)
2022-06-23 13:38 UTC, Hans Petter Selasky
no flags Details | Diff
Long log requested in comment #35 (269.98 KB, text/plain)
2022-06-23 15:55 UTC, Michael Gmelin
no flags Details
Patch to try #3. (487 bytes, patch)
2022-06-23 16:43 UTC, Hans Petter Selasky
no flags Details | Diff
usbhid-14CURRENT.patch (25.95 KB, patch)
2022-12-24 23:41 UTC, Vladimir Kondratyev
no flags Details | Diff
usbhid.patch (31.83 KB, patch)
2022-12-27 01:47 UTC, Vladimir Kondratyev
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Christian Weisgerber freebsd_committer freebsd_triage 2022-05-15 14:14:15 UTC
FreeBSD 13.1-STABLE (03f6d8361af1869ee0ab3ad115a729e298527860) GENERIC amd64
uhid1: <Yubico YubiKey FIDO, class 0/0, rev 2.00/5.43, addr 14> on usbus0


I have started using FIDO-based ssh keys in earnest, with id_ed25519_sk loaded into ssh-agent. Every few authentications, ssh-agent stops responding and every command that queries the agent (e.g. "ssh host", "ssh-add -l") will hang.

ssh-agent is unresponsive because ssh-sk-helper hangs. From "ps lwx":

1000 42272 42268 1  20  0    18352    6288 sbwait S+    9    0:00.01 ssh-agent -d
1000 42443 42272 3  20  0    18484    6296 select S+    9    0:00.00 /usr/libexec/ssh-sk-helper

Running "ssh-agent -d" shows this:

debug1: new_socket: type = CONNECTION
debug1: xcount 1 -> 2
debug3: fd 4 is O_NONBLOCK
debug1: process_message: socket 1 (fd=4) type 11
debug2: process_request_identities: entering
debug1: process_message: socket 1 (fd=4) type 13
debug1: process_sign_request2: entering
Confirm user presence for key ED25519-SK SHA256:w+YEBmsQsODSx1FDLTKrIWSKZ8b9Kk1neKIwzc6EHSw
debug3: start_helper: started pid=42443
debug3: Fssh_ssh_msg_send: type 5
debug3: ssh_msg_recv entering
debug1: start_helper: starting /usr/libexec/ssh-sk-helper 
debug1: process_sign: ready to sign with key ED25519-SK, provider internal: msg len 218, compat 0x0
debug1: Fssh_sshsk_sign: provider "internal", key ED25519-SK, flags 0x01
debug1: sk_probe: 1 device(s) detected
debug1: sk_probe: selecting sk by cred

Strangely, when I try to "truss -p <pid>" the ssh-sk-helper process, it unblocks (although authentication fails):

debug1: sk_open: fido_dev_open /dev/uhid1 failed: FIDO_ERR_RX
debug1: sk_openv: sk_open failed
debug1: sk_select_by_cred: sk_openv failed
debug1: ssh_sk_sign: failed to find sk
debug1: Fssh_sshsk_sign: sk_sign failed with code -1
debug1: ssh-sk-helper: Signing failed: invalid format
debug1: main: reply len 8
debug3: Fssh_ssh_msg_send: type 5
debug1: Fssh_client_converse: helper returned error -4
debug3: reap_helper: pid=42443
process_sign_request2: sshkey_sign: invalid format
User presence confirmed
debug1: xcount 2 -> 1

I guess it's possible that the problem is in the underlying FIDO stack, but the fact that attaching to the process with ptrace(2) unblocks it is weird.

I have tried different USB ports as well as a second FIDO authenticator. Same behavior.
Comment 1 Christian Weisgerber freebsd_committer freebsd_triage 2022-05-16 14:16:35 UTC
Created attachment 233963 [details]
ssh-sk-helper ktrace of successful authentication
Comment 2 Christian Weisgerber freebsd_committer freebsd_triage 2022-05-16 14:23:09 UTC
Created attachment 233964 [details]
ssh-sk-helper ktrace of hanging authentication attempt

When ssh-sk-helper is hanging, the Yubikey does not flash to request user verification.

A ktrace of ssh-sk-helper shows that it waits in ppoll() after sending the very first message to the authenticator. Obviously it does not receive a response from the Yubikey.

Compare with the ktrace of ssh-sk-helper for a successful authentication which shows an exchange of messages with the Yubikey.
Comment 3 Ed Maste freebsd_committer freebsd_triage 2022-05-16 15:34:37 UTC
I assume in the failing case
> 71813 ssh-sk-helper RET   ppoll -1 errno 4 Interrupted system call
is you killing the stuck ssh-sk-helper?
Comment 4 Ed Maste freebsd_committer freebsd_triage 2022-05-16 15:53:46 UTC
Also, are you able to easily try -current (which has OpenSSH 9.0)?
Comment 5 Christian Weisgerber freebsd_committer freebsd_triage 2022-05-16 16:54:20 UTC
Yes, this is me killing the hanging process.

No, I don't have a -current system to try this on.  I suggest you proceed with your scheduled MFC of OpenSSH 9.0, and we'll see if it makes a difference.  The ktrace points to libfido2, though.
Comment 6 Ed Maste freebsd_committer freebsd_triage 2022-05-16 17:40:43 UTC
Ah, libfido2 1.9 includes in its release notes:
Support for device timeouts; see fido_dev_set_timeout().
as well as "reliability fixes" (which are also listed for 1.10.0, 1.11.0)
Comment 7 Christian Weisgerber freebsd_committer freebsd_triage 2022-05-16 18:28:59 UTC
FWIW, I cannot reproduce the problem on OpenBSD with OpenSSH 9.0, libfido2 1.8.0.
Comment 8 Christian Weisgerber freebsd_committer freebsd_triage 2022-05-16 22:10:42 UTC
When ssh-sk-helper hangs, it has sent a single CTAPHID_INIT command to the authenticator and then sits in ppoll(), waiting for a response that never comes.
Comment 9 Christian Weisgerber freebsd_committer freebsd_triage 2022-05-18 17:56:41 UTC
So, I don't see how this can be a bug in ssh-sk-helper (libfido2). I have also confirmed with truss(1) that ssh-sk-helper ppoll()s the correct fd.

I'm wondering if this is actually a USB problem. I have now set hw.usb.uhid.debug=1. uhid(4) is not very chatty, however:

===> Successful authentication
May 18 19:36:11 lorvorc ssh-agent[2836]: error: Fssh_notify_start: exec(/usr/local/bin/ssh-askpass): No such file or directory
May 18 19:36:11 lorvorc kernel: uhid_intr_read_callback: transferred!
May 18 19:36:12 lorvorc syslogd: last message repeated 14 times

===> Hung authentication
May 18 19:45:06 lorvorc ssh-agent[3036]: error: Fssh_notify_start: exec(/usr/local/bin/ssh-askpass): No such file or directory
May 18 19:45:28 lorvorc ssh-agent[1532]: error: Fssh_client_converse: receive: unexpected internal error
May 18 19:45:28 lorvorc ssh-agent[1532]: error: reap_helper: helper exited abnormally
May 18 19:45:28 lorvorc ssh-agent[1532]: error: process_sign_request2: sshkey_sign: unexpected internal error

(Again, this is me terminating the hung process.)

The absence of any "uhid_intr_read_callback: transferred!" in the hung case makes me think that there might indeed be a problem at the USB level.
Comment 10 Michael Gmelin freebsd_committer freebsd_triage 2022-06-21 15:47:29 UTC
(In reply to Christian Weisgerber from comment #9)

Hi Chris,

I tried to debug this further, but I ended up deep in the USB rabbit hole where I didn't feel like this would be efficient use of my time.

What I realized while playing with `ykman fido`: Every second attempt get information on the key hung, which is an easy way to reproduce the problem. The traces you somewhere down to usbd_transmit.

Anyway, using the patch below, I could make things work (it starts an interrupt handler when the device attaches, therefore hid_intr_start/stop on open/close don't stop the interrupt handler). This is certainly not a solution, but it allows using the feature:

diff --git a/sys/dev/hid/hidraw.c b/sys/dev/hid/hidraw.c
index e71b2e2c7d5..e50c6aa75cb 100644
--- a/sys/dev/hid/hidraw.c
+++ b/sys/dev/hid/hidraw.c
@@ -215,6 +215,7 @@ hidraw_attach(device_t self)
 
        hidbus_set_lock(self, &sc->sc_mtx);
        hidbus_set_intr(self, hidraw_intr, sc);
+       hidbus_intr_start(sc->sc_dev);
 

You've got to be careful though - interrupting a process while the Yubikey is waiting for a touch will cause the kernel to panic (didn't look into why this is happening). Would be cool if you could play a bit with the patch (requires using hidraw and creating a symlink or one of the other options we discussed in private mail).
Comment 11 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-21 16:00:10 UTC
Did you check the USB traffic using usbdump ?

It will tell when transfers are started / stopped .

--HPS
Comment 12 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-21 16:03:24 UTC
I do recall some USB MIDI devices that crash and stop working unless you start polling the for data right away.

Maybe something similar.

--HPS
Comment 13 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-21 16:54:33 UTC
s/the for/the device for/
Comment 14 Michael Gmelin freebsd_committer freebsd_triage 2022-06-21 16:56:41 UTC
(In reply to Hans Petter Selasky from comment #12)

Additional comment for @emaste: This also happens with py-fido2 (so no ssh or libfido2 involved, even though python-fido2 is very similar to libfido2).

@Hans Petter: Didn't check with usbdump yet. The interesting part here is, that the device doesn't works on every second run of `ỳkman fido info` (so I don't have to `usbconfig -d x.x. reset` it). This can also be minutes after connecting it and it still works on the first attempt, but not on the second (the ssh case Chris was reporting uses more complex commands, so it almost always fails).

See below for dumps:
(Successful run and hanging run alternate)


Successful run:

16:42:43.751197 usbus0.2 SUBM-INTR-EP=00000082,SPD=HIGH,NFR=1,SLEN=0,IVAL=4
 frame[0] READ 64 bytes
16:42:43.751716 usbus0.2 DONE-INTR-EP=00000082,SPD=HIGH,NFR=0,SLEN=0,IVAL=4,ERR=CANCELLED
16:42:43.751793 usbus0.3 SUBM-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=8,IVAL=0
 frame[0] WRITE 8 bytes
 0000  81 06 00 22 00 00 47 00  -- -- -- -- -- -- -- --  |..."..G.        |
 frame[1] READ 71 bytes
16:42:43.752093 usbus0.3 DONE-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=72,IVAL=0,ERR=0
 frame[0] WRITE 8 bytes
 frame[1] READ 71 bytes
16:42:43.752159 usbus0.3 SUBM-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 64 bytes
16:42:43.752181 usbus0.3 SUBM-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=8,IVAL=0
 frame[0] WRITE 8 bytes
 0000  81 06 00 22 01 00 22 00  -- -- -- -- -- -- -- --  |..."..".        |
 frame[1] READ 34 bytes
16:42:43.752342 usbus0.3 DONE-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=36,IVAL=0,ERR=0
 frame[0] WRITE 8 bytes
 frame[1] READ 34 bytes
 0000  06 D0 F1 09 01 A1 01 09  20 15 00 26 FF 00 75 08  |........ ..&..u.|
 0010  95 40 81 02 09 21 15 00  26 FF 00 75 08 95 40 91  |.@...!..&..u..@.|
 0020  02 C0 -- -- -- -- -- --  -- -- -- -- -- -- -- --  |..              |
16:42:43.752926 usbus0.3 DONE-INTR-EP=00000084,SPD=FULL,NFR=0,SLEN=0,IVAL=2,ERR=CANCELLED
16:42:43.753318 usbus0.3 SUBM-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 64 bytes
16:42:43.753373 usbus0.3 SUBM-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=8,IVAL=0
 frame[0] WRITE 8 bytes
 0000  81 06 00 22 01 00 22 00  -- -- -- -- -- -- -- --  |..."..".        |
 frame[1] READ 34 bytes
16:42:43.753530 usbus0.3 DONE-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=36,IVAL=0,ERR=0
 frame[0] WRITE 8 bytes
 frame[1] READ 34 bytes
 0000  06 D0 F1 09 01 A1 01 09  20 15 00 26 FF 00 75 08  |........ ..&..u.|
 0010  95 40 81 02 09 21 15 00  26 FF 00 75 08 95 40 91  |.@...!..&..u..@.|
 0020  02 C0 -- -- -- -- -- --  -- -- -- -- -- -- -- --  |..              |
16:42:43.754112 usbus0.3 DONE-INTR-EP=00000084,SPD=FULL,NFR=0,SLEN=0,IVAL=2,ERR=CANCELLED
16:42:43.754392 usbus0.3 SUBM-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 64 bytes
16:42:43.754436 usbus0.3 SUBM-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=8,IVAL=0
 frame[0] WRITE 8 bytes
 0000  81 06 00 22 01 00 22 00  -- -- -- -- -- -- -- --  |..."..".        |
 frame[1] READ 34 bytes
16:42:43.754592 usbus0.3 DONE-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=36,IVAL=0,ERR=0
 frame[0] WRITE 8 bytes
 frame[1] READ 34 bytes
 0000  06 D0 F1 09 01 A1 01 09  20 15 00 26 FF 00 75 08  |........ ..&..u.|
 0010  95 40 81 02 09 21 15 00  26 FF 00 75 08 95 40 91  |.@...!..&..u..@.|
 0020  02 C0 -- -- -- -- -- --  -- -- -- -- -- -- -- --  |..              |
16:42:43.755174 usbus0.3 DONE-INTR-EP=00000084,SPD=FULL,NFR=0,SLEN=0,IVAL=2,ERR=CANCELLED
16:42:43.755269 usbus0.3 SUBM-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 64 bytes
16:42:43.755367 usbus0.3 SUBM-INTR-EP=00000004,SPD=FULL,NFR=1,SLEN=64,IVAL=2
 frame[0] WRITE 64 bytes
16:42:43.756282 usbus0.3 DONE-INTR-EP=00000004,SPD=FULL,NFR=1,SLEN=0,IVAL=2,ERR=0
 frame[0] WRITE 64 bytes
16:42:43.757325 usbus0.3 DONE-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=64,IVAL=2,ERR=0
 frame[0] READ 64 bytes
16:42:43.757328 usbus0.3 SUBM-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 64 bytes
16:42:43.757475 usbus0.3 SUBM-INTR-EP=00000004,SPD=FULL,NFR=1,SLEN=64,IVAL=2
 frame[0] WRITE 64 bytes
16:42:43.758327 usbus0.3 DONE-INTR-EP=00000004,SPD=FULL,NFR=1,SLEN=0,IVAL=2,ERR=0
 frame[0] WRITE 64 bytes
16:42:43.759325 usbus0.3 DONE-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=64,IVAL=2,ERR=0
 frame[0] READ 64 bytes
16:42:43.759326 usbus0.3 SUBM-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 64 bytes
16:42:43.759768 usbus0.3 SUBM-INTR-EP=00000004,SPD=FULL,NFR=1,SLEN=64,IVAL=2
 frame[0] WRITE 64 bytes
16:42:43.760327 usbus0.3 DONE-INTR-EP=00000004,SPD=FULL,NFR=1,SLEN=0,IVAL=2,ERR=0
 frame[0] WRITE 64 bytes
16:42:43.761325 usbus0.3 DONE-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=64,IVAL=2,ERR=0
 frame[0] READ 64 bytes
16:42:43.761326 usbus0.3 SUBM-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 64 bytes
16:42:43.762325 usbus0.3 DONE-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=64,IVAL=2,ERR=0
 frame[0] READ 64 bytes
16:42:43.762326 usbus0.3 SUBM-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 64 bytes
16:42:43.763324 usbus0.3 DONE-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=64,IVAL=2,ERR=0
 frame[0] READ 64 bytes
16:42:43.763325 usbus0.3 SUBM-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 64 bytes
16:42:43.764324 usbus0.3 DONE-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=64,IVAL=2,ERR=0
 frame[0] READ 64 bytes
16:42:43.764325 usbus0.3 SUBM-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 64 bytes
16:42:43.766127 usbus0.3 SUBM-INTR-EP=00000004,SPD=FULL,NFR=1,SLEN=64,IVAL=2
 frame[0] WRITE 64 bytes
16:42:43.767297 usbus0.3 DONE-INTR-EP=00000004,SPD=FULL,NFR=1,SLEN=0,IVAL=2,ERR=0
 frame[0] WRITE 64 bytes
16:42:43.768334 usbus0.3 DONE-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=64,IVAL=2,ERR=0
 frame[0] READ 64 bytes
16:42:43.768338 usbus0.3 SUBM-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 64 bytes
16:42:43.769302 usbus0.3 DONE-INTR-EP=00000084,SPD=FULL,NFR=0,SLEN=0,IVAL=2,ERR=CANCELLED



Hanging run (identical to successful, until it hangs):

16:42:53.317064 usbus0.2 SUBM-INTR-EP=00000082,SPD=HIGH,NFR=1,SLEN=0,IVAL=4
 frame[0] READ 64 bytes
16:42:53.317580 usbus0.2 DONE-INTR-EP=00000082,SPD=HIGH,NFR=0,SLEN=0,IVAL=4,ERR=CANCELLED
16:42:53.317629 usbus0.3 SUBM-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=8,IVAL=0
 frame[0] WRITE 8 bytes
 0000  81 06 00 22 00 00 47 00  -- -- -- -- -- -- -- --  |..."..G.        |
 frame[1] READ 71 bytes
16:42:53.317960 usbus0.3 DONE-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=72,IVAL=0,ERR=0
 frame[0] WRITE 8 bytes
 frame[1] READ 71 bytes
16:42:53.318052 usbus0.3 SUBM-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 64 bytes
16:42:53.318079 usbus0.3 SUBM-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=8,IVAL=0
 frame[0] WRITE 8 bytes
 0000  81 06 00 22 01 00 22 00  -- -- -- -- -- -- -- --  |..."..".        |
 frame[1] READ 34 bytes
16:42:53.318359 usbus0.3 DONE-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=36,IVAL=0,ERR=0
 frame[0] WRITE 8 bytes
 frame[1] READ 34 bytes
 0000  06 D0 F1 09 01 A1 01 09  20 15 00 26 FF 00 75 08  |........ ..&..u.|
 0010  95 40 81 02 09 21 15 00  26 FF 00 75 08 95 40 91  |.@...!..&..u..@.|
 0020  02 C0 -- -- -- -- -- --  -- -- -- -- -- -- -- --  |..              |
16:42:53.318943 usbus0.3 DONE-INTR-EP=00000084,SPD=FULL,NFR=0,SLEN=0,IVAL=2,ERR=CANCELLED
16:42:53.319310 usbus0.3 SUBM-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 64 bytes
16:42:53.319365 usbus0.3 SUBM-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=8,IVAL=0
 frame[0] WRITE 8 bytes
 0000  81 06 00 22 01 00 22 00  -- -- -- -- -- -- -- --  |..."..".        |
 frame[1] READ 34 bytes
16:42:53.319522 usbus0.3 DONE-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=36,IVAL=0,ERR=0
 frame[0] WRITE 8 bytes
 frame[1] READ 34 bytes
 0000  06 D0 F1 09 01 A1 01 09  20 15 00 26 FF 00 75 08  |........ ..&..u.|
 0010  95 40 81 02 09 21 15 00  26 FF 00 75 08 95 40 91  |.@...!..&..u..@.|
 0020  02 C0 -- -- -- -- -- --  -- -- -- -- -- -- -- --  |..              |
16:42:53.320103 usbus0.3 DONE-INTR-EP=00000084,SPD=FULL,NFR=0,SLEN=0,IVAL=2,ERR=CANCELLED
16:42:53.320385 usbus0.3 SUBM-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 64 bytes
16:42:53.320434 usbus0.3 SUBM-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=8,IVAL=0
 frame[0] WRITE 8 bytes
 0000  81 06 00 22 01 00 22 00  -- -- -- -- -- -- -- --  |..."..".        |
 frame[1] READ 34 bytes
16:42:53.320595 usbus0.3 DONE-CTRL-EP=00000080,SPD=FULL,NFR=2,SLEN=36,IVAL=0,ERR=0
 frame[0] WRITE 8 bytes
 frame[1] READ 34 bytes
 0000  06 D0 F1 09 01 A1 01 09  20 15 00 26 FF 00 75 08  |........ ..&..u.|
 0010  95 40 81 02 09 21 15 00  26 FF 00 75 08 95 40 91  |.@...!..&..u..@.|
 0020  02 C0 -- -- -- -- -- --  -- -- -- -- -- -- -- --  |..              |
16:42:53.321176 usbus0.3 DONE-INTR-EP=00000084,SPD=FULL,NFR=0,SLEN=0,IVAL=2,ERR=CANCELLED
16:42:53.321263 usbus0.3 SUBM-INTR-EP=00000084,SPD=FULL,NFR=1,SLEN=0,IVAL=2
 frame[0] READ 64 bytes
16:42:53.321377 usbus0.3 SUBM-INTR-EP=00000004,SPD=FULL,NFR=1,SLEN=64,IVAL=2
 frame[0] WRITE 64 bytes
16:42:53.322284 usbus0.3 DONE-INTR-EP=00000004,SPD=FULL,NFR=1,SLEN=0,IVAL=2,ERR=0
 frame[0] WRITE 64 bytes


Recovery (happens after hitting Ctrl-C on a hanging run):

16:53:33.069916 usbus0.3 SUBM-INTR-EP=00000004,SPD=FULL,NFR=1,SLEN=64,IVAL=2
 frame[0] WRITE 64 bytes
16:53:33.070709 usbus0.3 DONE-INTR-EP=00000004,SPD=FULL,NFR=1,SLEN=0,IVAL=2,ERR=0
 frame[0] WRITE 64 bytes
16:53:33.101737 usbus0.3 DONE-INTR-EP=00000084,SPD=FULL,NFR=0,SLEN=0,IVAL=2,ERR=CANCELLED
Comment 15 Michael Gmelin freebsd_committer freebsd_triage 2022-06-22 14:52:40 UTC
Created attachment 234858 [details]
Don't drop endpoint in xhci driver

@Hans Petter:
FIDO used to work on 13.0 (I compared on the same hardware to be sure).

Looks like commit https://cgit.freebsd.org/src/commit/?id=3644b92099938b is where things broke.

When applying the super simple patch attached, FIDO is working ok (as it disables dropping the endpoint).

I guess that the commit mentioned above just uncovered some other issue that should be addressed, but for Christian it is certainly a viable workaround (it also stops the kernel from panicking when hitting CTRL-C, which happened with the previous patch).
Comment 16 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-22 15:01:12 UTC
This commit has later been fixed, to exclude dropping the endpoint context for USB interrupt endpoints, which is what I believe this HW is using.

https://cgit.freebsd.org/src/commit/?id=3644b92099938b

Does the problem reproduce on 13-stable aswell?

Regarding the logs:

Hanging run (identical to successful, until it hangs):

16:42:53.317064 usbus0.2 SUBM-INTR-EP=00000082,SPD=HIGH,NFR=1,SLEN=0,IVAL=4
 frame[0] READ 64 bytes
16:42:53.317580 usbus0.2 DONE-INTR-EP=00000082,SPD=HIGH,NFR=0,SLEN=0,IVAL=4,ERR=CANCELLED

The CANCELLED error code only happens when the software triggers a usbd_transfer_stop(), which may cause data-loss. It is very strange this happens right after the start of the hanging run.

--HPS
Comment 17 Michael Gmelin freebsd_committer freebsd_triage 2022-06-22 15:29:51 UTC
(In reply to Hans Petter Selasky from comment #16)

> This commit has later been fixed, to exclude dropping the endpoint context for USB interrupt endpoints, which is what I believe this HW is using.

Tested on 13.1-RELEASE and latest 13-STABLE (patch is against latest 13-STABLE).

I assume you are referring to:
https://cgit.freebsd.org/src/commit/?id=e276d281503160ba3648

which excludes dropping the endpoint context for everything **but** bulk and interrupt endpoints. ("xhci(4): Only drop BULK and INTERRUPT endpoints to reset data toggle.").

This HW is using an interrupt endpoint, correct.
Comment 18 Michael Gmelin freebsd_committer freebsd_triage 2022-06-22 15:40:21 UTC
(In reply to Hans Petter Selasky from comment #16)

p.s.
> The CANCELLED error code only happens when the software triggers a 
> usbd_transfer_stop(), which may cause data-loss. It is very strange this
> happens right after the start of the hanging run.

This also happens right after the start of the successful run, like I wrote, successful and hanging are **exactly** the same up until the point the hanging run.. hangs ;) (if you remove timestamps and diff, all you find is additional lines from the successful run).
Comment 19 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-22 16:08:13 UTC
Created attachment 234867 [details]
Patch to try.

I see one issue there. The usbhid driver uses usbd_transfer_stop() to make USB transfers synchronous, but that function is async. So we need to use usbd_transfer_drain() to ensure we don't screw up I think.

Can you test this patch, and provide the output from usbdump as previously done?
Comment 20 Michael Gmelin freebsd_committer freebsd_triage 2022-06-23 10:33:42 UTC
Created attachment 234882 [details]
Output when applying "Patch to try"

Hi Hans Petter,

I tried the patch, but it didn't improve the situation.

Please see the attached file (withpatch.debugoutput.txt) for output of a successful and a failed run (back to back), timing information is in the file.

This also contains syslog from running with xhci debug level 5 - maybe it helps (not sure if remainder=64 is relevant in there).
Comment 21 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-23 10:54:58 UTC
Jun 23 10:20:25 jailhost-b kernel: xhci_check_transfer: slot=1 epno=5 remainder=0 status=27

Endpoint 5 and 9 are receiving status 27 (XHCI_TRB_ERROR_LENGTH), which likely means these endpoints are receiving more data than expected.

Can you dump the configuration descriptor of your device using usbdump:

usbdump -d ugenX.Y dump_curr_config_desc

I want to check wMaxPacketSize, if that is 64 or 512.
Comment 22 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-23 10:55:33 UTC
XHCI endpoint 5 and 9 => 0x82 and 0x84 (USB endpoint values).
Comment 23 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-23 11:05:59 UTC
usbdump -> usbconfig

Can you dump the configuration descriptor of your device using usbconfig:

usbconfig -d ugenX.Y dump_curr_config_desc
Comment 24 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-23 11:50:00 UTC
One more command:

usbconfig -d ugenX.Y dump_stats

--HPS
Comment 25 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-23 12:33:18 UTC
Created attachment 234886 [details]
Patch to try #2.

Collect logs from dmesg with this patch enabled.
Comment 26 Michael Gmelin freebsd_committer freebsd_triage 2022-06-23 12:48:40 UTC
(In reply to Hans Petter Selasky from comment #24)

Output of usbconfig:

```
ugen0.3: <Yubico YubiKey OTP+FIDO+CCID> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (30mA)


 Configuration index 0

    bLength = 0x0009 
    bDescriptorType = 0x0002 
    wTotalLength = 0x0096 
    bNumInterfaces = 0x0003 
    bConfigurationValue = 0x0001 
    iConfiguration = 0x0000  <no string>
    bmAttributes = 0x0080 
    bMaxPower = 0x000f 

    Interface 0
      bLength = 0x0009 
      bDescriptorType = 0x0004 
      bInterfaceNumber = 0x0000 
      bAlternateSetting = 0x0000 
      bNumEndpoints = 0x0001 
      bInterfaceClass = 0x0003  <HID device>
      bInterfaceSubClass = 0x0001 
      bInterfaceProtocol = 0x0001 
      iInterface = 0x0000  <no string>

      Additional Descriptor

      bLength = 0x09
      bDescriptorType = 0x21
      bDescriptorSubType = 0x10
       RAW dump: 
       0x00 | 0x09, 0x21, 0x10, 0x01, 0x00, 0x01, 0x22, 0x47, 
       0x08 | 0x00

     Endpoint 0
        bLength = 0x0007 
        bDescriptorType = 0x0005 
        bEndpointAddress = 0x0081  <IN>
        bmAttributes = 0x0003  <INTERRUPT>
        wMaxPacketSize = 0x0008 
        bInterval = 0x000a 
        bRefresh = 0x0000 
        bSynchAddress = 0x0000 


    Interface 1
      bLength = 0x0009 
      bDescriptorType = 0x0004 
      bInterfaceNumber = 0x0001 
      bAlternateSetting = 0x0000 
      bNumEndpoints = 0x0002 
      bInterfaceClass = 0x0003  <HID device>
      bInterfaceSubClass = 0x0000 
      bInterfaceProtocol = 0x0000 
      iInterface = 0x0000  <no string>

      Additional Descriptor

      bLength = 0x09
      bDescriptorType = 0x21
      bDescriptorSubType = 0x10
       RAW dump: 
       0x00 | 0x09, 0x21, 0x10, 0x01, 0x00, 0x01, 0x22, 0x22, 
       0x08 | 0x00

     Endpoint 0
        bLength = 0x0007 
        bDescriptorType = 0x0005 
        bEndpointAddress = 0x0004  <OUT>
        bmAttributes = 0x0003  <INTERRUPT>
        wMaxPacketSize = 0x0040 
        bInterval = 0x0002 
        bRefresh = 0x0000 
        bSynchAddress = 0x0000 

     Endpoint 1
        bLength = 0x0007 
        bDescriptorType = 0x0005 
        bEndpointAddress = 0x0084  <IN>
        bmAttributes = 0x0003  <INTERRUPT>
        wMaxPacketSize = 0x0040 
        bInterval = 0x0002 
        bRefresh = 0x0000 
        bSynchAddress = 0x0000 


    Interface 2
      bLength = 0x0009 
      bDescriptorType = 0x0004 
      bInterfaceNumber = 0x0002 
      bAlternateSetting = 0x0000 
      bNumEndpoints = 0x0003 
      bInterfaceClass = 0x000b  <Smart card>
      bInterfaceSubClass = 0x0000 
      bInterfaceProtocol = 0x0000 
      iInterface = 0x0000  <no string>

      Additional Descriptor

      bLength = 0x36
      bDescriptorType = 0x21
      bDescriptorSubType = 0x00
       RAW dump: 
       0x00 | 0x36, 0x21, 0x00, 0x01, 0x00, 0x07, 0x02, 0x00, 
       0x08 | 0x00, 0x00, 0xa0, 0x0f, 0x00, 0x00, 0xa0, 0x0f, 
       0x10 | 0x00, 0x00, 0x00, 0x00, 0xb0, 0x04, 0x00, 0x00, 
       0x18 | 0xb0, 0x04, 0x00, 0x00, 0xf6, 0x0b, 0x00, 0x00, 
       0x20 | 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 
       0x28 | 0xfe, 0x00, 0x04, 0x00, 0x00, 0x0c, 0x00, 0x00, 
       0x30 | 0xff, 0xff, 0x00, 0x00, 0x00, 0x01


     Endpoint 0
        bLength = 0x0007 
        bDescriptorType = 0x0005 
        bEndpointAddress = 0x0002  <OUT>
        bmAttributes = 0x0002  <BULK>
        wMaxPacketSize = 0x0040 
        bInterval = 0x0000 
        bRefresh = 0x0000 
        bSynchAddress = 0x0000 

     Endpoint 1
        bLength = 0x0007 
        bDescriptorType = 0x0005 
        bEndpointAddress = 0x0082  <IN>
        bmAttributes = 0x0002  <BULK>
        wMaxPacketSize = 0x0040 
        bInterval = 0x0000 
        bRefresh = 0x0000 
        bSynchAddress = 0x0000 

     Endpoint 2
        bLength = 0x0007 
        bDescriptorType = 0x0005 
        bEndpointAddress = 0x0083  <IN>
        bmAttributes = 0x0003  <INTERRUPT>
        wMaxPacketSize = 0x0008 
        bInterval = 0x0020 
        bRefresh = 0x0000 
        bSynchAddress = 0x0000 



ugen0.3: <Yubico YubiKey OTP+FIDO+CCID> at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON (30mA)

{
    UE_CONTROL_OK       : 30
    UE_ISOCHRONOUS_OK   : 0
    UE_BULK_OK          : 0
    UE_INTERRUPT_OK     : 13
    UE_CONTROL_FAIL     : 0
    UE_ISOCHRONOUS_FAIL : 0
    UE_BULK_FAIL        : 0
    UE_INTERRUPT_FAIL   : 0
}
```

I can turn off features of the yubikey, so that it appears as "Yubico YubiKey FIDO". In this case, only one HID device shows up (as OTP is gone). In this case, my workaround by patching drop out of xhci.c doesn **not** work anymore.
Comment 27 Michael Gmelin freebsd_committer freebsd_triage 2022-06-23 12:53:00 UTC
(In reply to Michael Gmelin from comment #26)

> I can turn off features of the yubikey, so that it appears as "Yubico YubiKey
> FIDO". In this case, only one HID device shows up (as OTP is gone). In this
> case, my workaround by patching drop out of xhci.c doesn **not** work anymore.

Scrap that last comment, the workaround still works in this case (one has to boot the right kernel when testing *sigh*). I'll keep the key in this configuration now (less noise).

Will try the new patch to collect debug info...
Comment 28 Michael Gmelin freebsd_committer freebsd_triage 2022-06-23 13:03:42 UTC
Created attachment 234888 [details]
Backtraces of Patch to try #2

This contains two attempts, first successful, second failed.
(ran on a yubikey that had all features but fido disabled now).
Comment 29 Michael Gmelin freebsd_committer freebsd_triage 2022-06-23 13:05:24 UTC
Comment on attachment 234888 [details]
Backtraces of Patch to try #2

13:01:04 first attempt (successful)
13:01:06 second attempt (hangs)
13:01:09 cancelled (SIGINT)
Comment 30 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-23 13:22:31 UTC
Do you have:

HIDRAW_MAKE_UHID_ALIAS

set in the kernel configuration file?

I suspect that /dev/uhid<N> is conflicting with /dev/hidraw<N> . They will steal data from eachother if active at the same time.

Could you check and see if you have any /dev/hidraw<N> devices, and use that instead of uhid<N> ?

Or add:

options HIDRAW_MAKE_UHID_ALIAS

to your kernel configuration file?

--HPS
Comment 31 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-23 13:26:33 UTC
And you also need to set:

sysctl hw.usb.usbhid.enable=1

--HPS
Comment 32 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-23 13:29:00 UTC
Or:

Disable:

sysctl hw.usb.usbhid.enable=0

Might fix it too, so that uhid has exclusive access to the device!

--HPS
Comment 33 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-23 13:38:46 UTC
Created attachment 234891 [details]
Patch for libfido2 in FreeBSD to use hidraw<N> when available

Here is a patch to use the newer hidraw<N> device when available.
Comment 34 Michael Gmelin freebsd_committer freebsd_triage 2022-06-23 14:19:18 UTC
(In reply to Hans Petter Selasky from comment #33)

It might have been lost in the long stream of comments:

- I do not use HIDRAW_MAKE_UHID_ALIAS
- I do use sysctl hw.usb.usbhid.enable=1
- The traces you get are from using the python-fido2 code through `ykman fido 
  info`[0], as it removes libfido2 and ssh from the picture and keeps things
  simple.
- I also tested with ssh (removing python-fido2 from the picture), as I 
  described in the comments (which confirms my findings, the xhci.c patch also 
  works for this use case)
- For testing with ssh, I already use a patched version of libfido2. I'm also in 
  the process of upstreaming the patch, see 
  https://github.com/Yubico/libfido2/pull/597

Christian also tested both with uhid(4)and hidraw(4) and could confirm the findings.

If you want me to, I can also run all tests again using uhid(4).

[0] this uses the patched version of ykman from bug #263916
Comment 35 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-23 15:00:41 UTC
To eliminate other issues I would like to do the following:

1)
sysctl sysctl hw.usb.usbhid.enable=0

2) Verify this command should be empty:
sysctl -a kern.conftxt | grep UHID

3) Physically re-plug the device.

4) Enable uhid debugging (needs options USB_DEBUG):
sysctl hw.usb.uhid.debug=16

5) Start usbdump to capture all traffic.

6) Reproduce the two cases.

If the failure still happens, also enable xhci debug:

7) sysctl hw.usb.xhci.debug=16

Thank you!

Hopefully we are not that far away from a solution :-)

--HPS
Comment 36 Michael Gmelin freebsd_committer freebsd_triage 2022-06-23 15:55:36 UTC
Created attachment 234892 [details]
Long log requested in comment #35

(In reply to Hans Petter Selasky from comment #35)
[x] Checked kernel source - stable/13 + Patch to try #2
Patch makes no difference, since we're not using hidbus(4).

[x] sysctl sysctl hw.usb.usbhid.enable=0
I actually rebooted the machine with the setting in /boot/loader.conf to be absolutely safe.

[x] Verify this command should be empty: sysctl -a kern.conftxt | grep UHID

[x] Physically re-plug the device.

[x] Enable uhid debugging (needs options USB_DEBUG):
sysctl hw.usb.uhid.debug=16

[x] Start usbdump to capture all traffic.
[x] Reproduce the two cases.
Failed, see attachment

[x] If the failure still happens, also enable xhci debug:
sysctl hw.usb.xhci.debug=16
Failed, see attachment

I also added an additional file doing it eight times in a row (success, fail, success, fail, success, fail, success, fail, sucess, fail), see attachment.

usbdump includes hitting CTRL-C on failed runs.
Comment 37 Michael Gmelin freebsd_committer freebsd_triage 2022-06-23 16:06:56 UTC
(In reply to Michael Gmelin from comment #36)

I can also confirm that - as to be expected - running in this setup, disabling drop in xhci works around the problem like it did with hidraw:


-	if (drop != 0 &&
+	if (0 && drop != 0 &&
Comment 38 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-23 16:36:51 UTC
Hi,

I'm aware about the drop patch, but that is no solution, because the stop endpoint is there to prevent freed DMA memory from being used after usbd_transfer_unsetup() finishes. Let me explain.

Here is my analysis:

It appears that on endpoint 0x84, there is some kind of "HELLO" packet appearing at the very first:

Jun 23 15:37:29 jailhost-b kernel: xhci_check_transfer: slot=4 epno=9 remainder=0 status=1

The length appears to be 64-bytes:

Jun 23 15:37:29 jailhost-b kernel: xhci_generic_done_sub: xfer=0xfffffe00ca4dd278[0/1] rem=0/64 status=1

The USB interrupt job is asynchronous, which means it goes on trying to read the next USB packet after it has completed.

Now in the case we are closing the UHID file descriptor a function called usbd_transfer_unsetup() is called, which basically invoke the XHCI stop endpoint, with drop=1, because we are pulling out all active USB transfers and free memory.

The problem is that the last USB packet has already been received, and the USB stack is waiting for another one, which accidentially is received, but because we are in the process of tearing down the USB transfers, get lost:

Jun 23 15:37:29 jailhost-b kernel: xhci_check_transfer: slot=4 epno=9 remainder=0 status=27 (XHCI_TRB_ERROR_LENGTH and data was received)

I will make a patch shortly to disable USB read ahead on the interrupt endpoint, and that will hopefully fix the problem!

--HPS
Comment 39 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-23 16:43:49 UTC
Created attachment 234895 [details]
Patch to try #3.

Can you test this patch using the uhid driver as a backend?

--HPS
Comment 40 Michael Gmelin freebsd_committer freebsd_triage 2022-06-23 17:36:58 UTC
(In reply to Hans Petter Selasky from comment #39)

Using this patch (and making sure all other workarounds are disables) both `ykman fido info` and all the ssh commands work.
Comment 41 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-23 17:41:44 UTC
Cool!

So what do we do about hidraw?

It seems like a bigger job to fix.

--HPS
Comment 42 Michael Gmelin freebsd_committer freebsd_triage 2022-06-23 17:44:09 UTC
(In reply to Hans Petter Selasky from comment #41)

Well, the question is where to start. Maybe wulf@ could help?
Comment 43 Christian Weisgerber freebsd_committer freebsd_triage 2022-06-23 18:05:55 UTC
(In reply to Hans Petter Selasky from comment #39)

[Patch to try #3]
I can also confirm that this fixes the problem for me as reported with ssh(1) and uhid(4) on 13-STABLE.
Comment 44 commit-hook freebsd_committer freebsd_triage 2022-06-23 19:12:59 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=b6f615255d8bcdf40604005b11998eee86872364

commit b6f615255d8bcdf40604005b11998eee86872364
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2022-06-23 17:39:21 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2022-06-23 19:11:24 +0000

    uhid(4): Don't read-ahead from the USB IN endpoint.

    This avoids an issue where IN endpoint data received from the device right
    before the file handle is closed, gets lost.

    PR:             263995
    MFC after:      1 week
    Sponsored by:   NVIDIA Networking

 sys/dev/usb/input/uhid.c | 6 ++++++
 1 file changed, 6 insertions(+)
Comment 45 Vladimir Kondratyev freebsd_committer freebsd_triage 2022-06-24 10:11:04 UTC
(In reply to Michael Gmelin from comment #42)
> Well, the question is where to start. Maybe wulf@ could help?

I do not know why usbd_transfer_unsetup() is called at all. The only place it called is detach handler of uhid(4) which is not invoked on UHID file descriptor closing. Should it be read as usbd_transfer_stop()?

If so, hidraw(4)/usbhid(4) usbd_transfer_start/stop logic differs from uhid(4) one. hidraw(4)/usbhid(4) invokes open/drain pair on open/dtor character device handlers rather then open/close on read_start/stop usb_fifo methods like uhid(4) does.

The question is: Should usbd_transfer_drain() handle the case of "USB stack is waiting for another one, which accidentially is received, but because we are in the process of tearing down the USB transfers, get lost" gracefully or not?
Comment 46 Hans Petter Selasky freebsd_committer freebsd_triage 2022-06-24 10:52:12 UTC
> The question is: Should usbd_transfer_drain() handle the case of "USB stack is waiting for another one, which accidentially is received, but because we are in the process of tearing down the USB transfers, get lost" gracefully or not?

I think this will depend too much on the USB hardware. Might work with XHCI but not EHCI/OHCI/UHCI and so on.

Right now uhid will only start a new usb transfer iff, read syscall, or POLL w/POLL IN. When a USB IN transaction completes, it will then again wait for read syscall or POLL w/POLL IN.

Could hidbus do something similar?

The problem is when closing the FD, that the device sends some initial HELLO world packet which gets lost. First uhid<N> is opened for probing, and then it is opened for I/O, and then it is closed/opened again for the next I/O and this frequent open/close causes a problem / data loss.

--HPS
Comment 47 Vladimir Kondratyev freebsd_committer freebsd_triage 2022-06-24 11:28:18 UTC
(In reply to Hans Petter Selasky from comment #46)
> Could hidbus do something similar?
It is up to hidraw and evdev. All hidbus does is broadcasting of input reports to subscribed children here.

Implementation of this feature is good place to start regardless of the issue. I'll create a patch soon.
Comment 48 commit-hook freebsd_committer freebsd_triage 2022-06-30 09:40:33 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=20d32249196ea6122b481bd61841337a67855337

commit 20d32249196ea6122b481bd61841337a67855337
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2022-06-23 17:39:21 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2022-06-30 09:39:43 +0000

    uhid(4): Don't read-ahead from the USB IN endpoint.

    This avoids an issue where IN endpoint data received from the device right
    before the file handle is closed, gets lost.

    PR:             263995
    Sponsored by:   NVIDIA Networking

    (cherry picked from commit b6f615255d8bcdf40604005b11998eee86872364)

 sys/dev/usb/input/uhid.c | 6 ++++++
 1 file changed, 6 insertions(+)
Comment 49 commit-hook freebsd_committer freebsd_triage 2022-06-30 09:42:34 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=e81b8c36e7578b5986c4837868eaee00711d6293

commit e81b8c36e7578b5986c4837868eaee00711d6293
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2022-06-23 17:39:21 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2022-06-30 09:41:26 +0000

    uhid(4): Don't read-ahead from the USB IN endpoint.

    This avoids an issue where IN endpoint data received from the device right
    before the file handle is closed, gets lost.

    PR:             263995
    Sponsored by:   NVIDIA Networking

    (cherry picked from commit b6f615255d8bcdf40604005b11998eee86872364)

 sys/dev/usb/input/uhid.c | 6 ++++++
 1 file changed, 6 insertions(+)
Comment 50 Vladimir Kondratyev freebsd_committer freebsd_triage 2022-12-24 23:41:07 UTC
Created attachment 239011 [details]
usbhid-14CURRENT.patch

This patch introduces new driver fido.ko which is a stripped down version of hidraw(4). It provides only functions required by yubikey to work.

Unlike hidraw(4) it is able to load automatically with devd, disables usbhid readahead on USB interface it attached to, and exposes USB version of USB_GET_DEVICEINFO ioctl.

The patch can be applied to very recent 13-STABLE too, but probably requires minor tweaking of DRIVER_MODULE parameters to be compiled successfully.
Comment 51 Vladimir Kondratyev freebsd_committer freebsd_triage 2022-12-27 01:47:44 UTC
Created attachment 239052 [details]
usbhid.patch

13STABLE - compatible patch for usbhid(4). build-tested.
Comment 52 Christian Weisgerber freebsd_committer freebsd_triage 2022-12-28 18:26:02 UTC
(In reply to Vladimir Kondratyev from comment #51)

Two cosmetic notes about the patch:

+static SYSCTL_NODE(_hw_hid, OID_AUTO, fido, CTLFLAG_RW, 0, "Yubico FIDO key");

A more generic name would be better, e.g. "FIDO authenticator".

+/* A match on these entries will load hms */

... will load fido?


Now, I briefly tried the patch on 13-STABLE with FIDO-backed ssh keys.

It seems to work fine with a "HS HyperFIDO Token".

With the "Yubico YubiKey FIDO", only some accesses to the authenticator work, the others hang.

(I will not have time to test this further.)
Comment 53 Vladimir Kondratyev freebsd_committer freebsd_triage 2022-12-29 10:31:59 UTC
(In reply to Christian Weisgerber from comment #52)
> With the "Yubico YubiKey FIDO", only some accesses to the authenticator work, the others hang.

How can I reproduce hangs? I do have "Yubico YubiKey OTP+FIDO+CCID" 0x1050/0x0407 key to test.
Comment 54 Christian Weisgerber freebsd_committer freebsd_triage 2022-12-30 18:20:22 UTC
(In reply to Vladimir Kondratyev from comment #53)
Generate a FIDO-backed ssh key
$ ssh-keygen -t ecdsa-sk
and use it to login somewhere.
Comment 55 Romain Tartière freebsd_committer freebsd_triage 2023-02-28 04:41:55 UTC
I just updated my FreeBSD 13.1-RELEASE to 13.2-BETA3 and can now enjoy working ed25519-sk <3  Thanks a lot for fixing this!
Comment 56 Ed Maste freebsd_committer freebsd_triage 2023-02-28 12:09:46 UTC
(In reply to Romain Tartière from comment #55)
Thanks for following up!
Comment 57 Vladimir Kondratyev freebsd_committer freebsd_triage 2023-08-29 22:48:39 UTC
I created https://reviews.freebsd.org/D41639 for usbhid(4) users. I hope it helps.