The ccp(4) driver hangs when I try to use it on a freshly installed 13.0-BETA3 system. I can load the driver, but it hangs the first time I use it, which is when creating a geli device.
> sudo kldload ccp
> sudo mdconfig -a -t swap -s 64m md0
> sudo geli onetime -e aes-xts -l 256 -s 4096 /dev/md0
<hangs>
In another terminal, dmesg shows that ccp is indeed being used:
> dmesg
...
ccp0: <AMD CCP-5a> mem 0xfc000000-0xfc0fffff,0xfc1cc000-0xfc1cdfff irq 54 at device 0.2 on pci12
random: registering fast source AMD CCP TRNG
GEOM_ELI: Device md0.eli created.
GEOM_ELI: Encryption: AES-XTS 256
GEOM_ELI: Crypto: hardware
> uname -a
FreeBSD XXX.YYY 13.0-BETA3 FreeBSD 13.0-BETA3 #0 releng/13.0-n244525-150b4388d3b: Fri Feb 19 04:04:34 UTC 2021 root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
> sysctl hw.model
hw.model: AMD Ryzen 3 3200G with Radeon Vega Graphics
I think there is an earlier ccp/geli bug you can dupe this to. I suggest not loading ccp.ko.
I found an earlier bug about a panic; I didn't find any about a hang. Should we disable ccp from the GENERIC build?
Sure, let’s disable it.
I can also add that on my Ryzen 3 3200U with Radeon Vega Mobile the hardware looked like it worked but it never generated an interrupt (or, at least, it was never received). Tested with both our ccp and on Linux (5.6).
Just ran into this on my NAS box which has an AMD Ryzen 3 2200G CPU, running 13.0-RELEASE-p6. If this hardware didn't generate an interrupt under both FreeBSD and Linux, is it a hardware bug of some kind?
(In reply to Joshua Kinard from comment #5) Who knows... there is no public documentation for it.
FYI, it looks like ccp(4) got some kind of fixing in 14.0-RELEASE and it now appears to work on my Ryzen 3 2200G:
dmesg:
> # dmesg | grep ccp
> ccp0: <AMD CCP-5a> mem 0xfc700000-0xfc7fffff,0xfc884000-0xfc885fff irq 54 at device 0.2 on pci10
> [26] GEOM_ELI: Device da0p2.eli created.
> [26] GEOM_ELI: Encryption: AES-XTS 256
> [26] GEOM_ELI: Crypto: hardware
pciconf -lvcV
> ccp0@pci0:10:0:2: class=0x108000 rev=0x00 hdr=0x00 vendor=0x1022 device=0x15df subvendor=0x1043 subdevice=0x876b
>     vendor = 'Advanced Micro Devices, Inc. [AMD]'
>     device = 'Family 17h (Models 10h-1fh) Platform Security Processor'
>     class = encrypt/decrypt
>     cap 09[48] = vendor (length 8)
>     cap 01[50] = powerspec 3 supports D0 D3 current D0
>     cap 10[64] = PCI-Express 2 endpoint max data 256(256) RO NS
>                  max read 512
>                  link x16(x16) speed 8.0(8.0) ASPM disabled(L0s/L1)
>     cap 05[a0] = MSI supports 2 messages, 64 bit
>     cap 11[c0] = MSI-X supports 2 messages, enabled
>                  Table in map 0x24[0x0], PBA in map 0x24[0x1000]
>     ecap 000b[100] = Vendor [1] ID 0001 Rev 1 Length 16
I'll update if any oddities/errors/crashes happen, but this is a positive sign!
Humm, the only fix I can see that would be relevant for GELI in particular (and might have resulted in a hang if the hardware was waiting for more data due to an S/G list being too small) is this commit:
commit 70efe1a2fe13642732e56c7f040fe63f62bc6a6b
Author: John Baldwin <jhb@FreeBSD.org>
Date:   Mon Feb 6 13:51:57 2023 -0800
    ccr,ccp: Fix argument order to sglist_append_vmpages.
    The offset comes before the byte count.
    Reported by: br
    Reviewed by: asomers, markj
    MFC after: 1 week
    Sponsored by: DARPA
    Differential Revision: https://reviews.freebsd.org/D38375
It would have worked fine with cryptocheck or other use cases, just not GELI when using unmapped disk I/O (which I think is the only crypto consumer that uses the VMPAGES buffer type). That commit has been merged to stable/13 and will be in 13.3. It was not included in 13.2. I'm going to optimistically close this bug, but if anyone reports issues on 13.3 or newer we can reopen it.
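To make the argument-order problem from that commit concrete, here is a small userland sketch in C. It is not the kernel code: `model_range`, `model_append`, `model_shortfall`, and `model_example` are hypothetical names, and the only detail taken from the commit message is that the offset comes before the byte count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a scatter/gather description of a page run. */
struct model_range {
	size_t pgoff;	/* byte offset into the first page */
	size_t len;	/* number of bytes described */
};

/*
 * Hypothetical model of an append routine that takes the offset BEFORE
 * the byte count (the order the commit message describes). It only
 * records its arguments; it does not touch real pages.
 */
struct model_range
model_append(size_t pgoff, size_t len)
{
	struct model_range r = { pgoff, len };

	return (r);
}

/* Bytes a consumer would still be waiting for after using the list. */
size_t
model_shortfall(struct model_range r, size_t wanted)
{
	return (wanted > r.len ? wanted - r.len : 0);
}

/* Worked example: a page-aligned 4096-byte request. */
void
model_example(void)
{
	size_t wanted = 4096;

	/* Correct order (offset, length): the whole request is described. */
	assert(model_shortfall(model_append(0, wanted), wanted) == 0);

	/* Swapped order (length, offset): zero bytes are described. */
	assert(model_shortfall(model_append(wanted, 0), wanted) == wanted);
}
```

With the arguments swapped, a page-aligned request ends up describing zero bytes in this model, which is the kind of too-small S/G list that could plausibly leave the hardware waiting for more data.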
(In reply to John Baldwin from comment #8) It didn't take long, but I've actually found a new problem w/ ccp(4), and that's "sysctl -a" hanging if ccp is loaded and is being used by something (in my case, encrypted swap). It seems to be hanging when probing the "kern.geom.conftxt" OID. When the hang happens, sysctl won't respond to any process signals and becomes stuck in D+ state. A reboot seems to be the only way to clear it. Want that in a new bug?
Perhaps stay with this bug for now, but can you get 'procstat -kk <pid>' for the sysctl process when it hangs as the next debugging step?
(In reply to John Baldwin from comment #10) Already opened Bug #276587 for my issue, since this one was closed, so my bad. I've added the procstat -kk output there.
^Triage: re-opened by submitter request.