Created attachment 203644: dmesg of 8.2 booting on the same machine

This problem affects both 11.2 and 12.0 on my old laptop. The machine boots fine into 8.2 (dmesg attached). I may be misreading the boot messages, but it looks like it identifies 3 storage devices:

1. An SSD (ada0)
2. A CD/DVD (cd0)
3. Sony's "memory stick" reader -- with no media inserted

The boot reports:

GEOM: new disk cd0
GEOM: new disk ada0

and then goes into an infinite cycle of (retyping):

(aprobe0:ata1:0:1:0): ATAPI_IDENTIFY. ACB: a1 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ata1:0:1:0): CAM status: Command timeout
(aprobe0:ata1:0:1:0): Retrying command
...
(aprobe0:ata1:0:1:0): ATAPI_IDENTIFY. ACB: a1 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ata1:0:1:0): CAM status: Command timeout
(aprobe0:ata1:0:1:0): Retries exhausted
...

It then resets ata1 and tries again... And again...

Because 8.2 continues to boot fine, I do not believe anything is wrong with the hardware.

Bug #202712 is similar, but over there the hang is over the SETFEATURES SET TRANSFER MODE -- my laptop can't get through the ATAPI_IDENTIFY...
Thank you for the report. Can you please attach dmesg of 12.0 boot (ideally with -v), up to the point of the infinite cycle?
(In reply to Conrad Meyer from comment #1)
> Can you please attach dmesg of 12.0 boot

Cannot -- it never finishes booting. I retyped the most relevant part by hand... I suppose I can take pictures of -- or video-record -- the boot and attach that. 21st century, eh?
(In reply to Conrad Meyer from comment #1)

Ok, the video is almost 90 MB, so I don't want to upload it. But you can view it here: https://oc.virtual-estates.net:8443/index.php/s/wmTpNXT9tjPDQyX
(In reply to Mikhail Teterin from comment #2)

I wonder how long you waited at most? I suspect that after some, quite long, time the system would eventually continue to boot. The ATAPI_IDENTIFY timeout is quite large and there are a number of retries on a couple of levels.

The problem seems to be that for some reason the system "detects" a phantom ATAPI slave on the same channel as the CD-ROM device (devices=0x30000 -- this mask contains two devices on the channel). Maybe the older code had a way to check that it is a phantom device, or maybe it just failed the phantom much faster.

Could you please attach a verbose dmesg from FreeBSD 8?
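For readers following the mask argument: a minimal sketch of how devices=0x30000 decodes into two ATAPI devices. The bit values are an assumption taken from memory of the legacy ata(4) driver header (ata-all.h) and should be checked against the source tree; the helper name decode_devices is hypothetical.

```python
# Bit assignments ASSUMED from the legacy ata(4) driver (ata-all.h).
# Verify against the actual source before relying on them.
ATA_ATA_MASTER = 0x00000001
ATA_ATA_SLAVE = 0x00000002
ATA_PORTMULTIPLIER = 0x00008000
ATA_ATAPI_MASTER = 0x00010000
ATA_ATAPI_SLAVE = 0x00020000

DEVICE_BITS = [
    (ATA_ATA_MASTER, "ATA master"),
    (ATA_ATA_SLAVE, "ATA slave"),
    (ATA_PORTMULTIPLIER, "port multiplier"),
    (ATA_ATAPI_MASTER, "ATAPI master"),
    (ATA_ATAPI_SLAVE, "ATAPI slave"),
]


def decode_devices(mask):
    """Return the names of all device bits set in a channel's device mask."""
    return [name for bit, name in DEVICE_BITS if mask & bit]


if __name__ == "__main__":
    # The value reported for ata1 in this thread.
    print(decode_devices(0x30000))
```

Under these assumptions the mask names both an ATAPI master (the real CD-ROM drive) and an ATAPI slave (the phantom).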
Created attachment 203818: Verbose dmesg.boot from 8.2

> I wonder how long you waited at most?

Left it trying overnight -- after about 12 hours it was still at it...

Verbose dmesg attached.
(In reply to Mikhail Teterin from comment #5)

Thank you! So, we can see that the old stack also sees the phantom device, tries to identify it and fails. But I guess that that happens rather quickly (?) and, certainly, there are no endless retries. That seems to be the main difference between the old code and the new one.

I will try to draw the attention of CAM experts like Scott Long, Alexander Motin or Kenneth Merry to this bug. I'll re-assign this bug to scsi@ as well.

Here are the relevant bits from the log:

ata1: <ATA channel 1> on atapci0
ata1: reset tp1 mask=03 ostat0=50 ostat1=00
ata1: stat0=0x00 err=0x01 lsb=0x14 msb=0xeb
ata1: stat1=0x00 err=0x01 lsb=0x14 msb=0xeb
ata1: reset tp2 stat0=00 stat1=00 devices=0x30000
ata1: Identifying devices: 00030000
ata1: New devices: 00030000
ata1: reiniting channel ..
ata1: reset tp1 mask=03 ostat0=00 ostat1=00
ata1: stat0=0x00 err=0x01 lsb=0x14 msb=0xeb
ata1: stat1=0x00 err=0x01 lsb=0x14 msb=0xeb
ata1: reset tp2 stat0=00 stat1=00 devices=0x30000
ata1: reinit done ..
unknown: FAILURE - ATAPI_IDENTIFY timed out LBA=0
ata1: reiniting channel ..
ata1: reset tp1 mask=03 ostat0=00 ostat1=00
ata1: stat0=0x00 err=0x01 lsb=0x14 msb=0xeb
ata1: stat1=0x00 err=0x01 lsb=0x14 msb=0xeb
ata1: reset tp2 stat0=00 stat1=00 devices=0x30000
ata1: reinit done ..
unknown: FAILURE - ATAPI_IDENTIFY timed out LBA=0

And then success for the real device:

ata1-master: pio=PIO4 wdma=WDMA2 udma=UDMA33 cable=40 wire
acd0: setting UDMA33
acd0: <UJDA755 DVD/CDRW/1.00> CDRW drive at ata1 as master
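The log excerpt shows why both positions on ata1 are counted as ATAPI devices: after the reset, stat0 and stat1 both come back with lsb=0x14 msb=0xeb, which is the ATAPI signature in the cylinder-low/high registers. The sketch below is a simplified model of such a post-reset check, not the driver's actual code; the function name classify and the exact order of tests are assumptions.

```python
# ATAPI signature left in the cylinder-low / cylinder-high registers after reset.
ATAPI_MAGIC_LSB = 0x14
ATAPI_MAGIC_MSB = 0xEB

# Standard ATA status register bits.
ATA_S_BUSY = 0x80   # BSY
ATA_S_READY = 0x40  # DRDY


def classify(stat, lsb, msb):
    """Simplified guess at what sits on one channel position after a reset."""
    if stat & ATA_S_BUSY:
        return "busy"
    if lsb == ATAPI_MAGIC_LSB and msb == ATAPI_MAGIC_MSB:
        return "ATAPI"
    if stat & ATA_S_READY:
        return "ATA"
    return "none"


if __name__ == "__main__":
    # Values from the ata1 lines in the log: master (stat0) and slave (stat1).
    print(classify(0x00, 0x14, 0xEB), classify(0x00, 0x14, 0xEB))
```

With the logged values, master and slave are indistinguishable at this stage, so a signature check alone cannot tell the real drive from the phantom; only the later ATAPI_IDENTIFY does.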
(In reply to Andriy Gapon from comment #6)

Ok, so why does it never give up with the new code? It claims that "Retries exhausted", but comes right back to the same exhaustion again and again...
I suppose it is not really a command retry, but a restart of the probe process, triggered by an ATA bus reset, which is in turn triggered by the ATAPI_IDENTIFY command timeout for the phantom CD-ROM. That should probably be worked around, but honestly I have no big wish to workaround PATA hardware issue in year 2019.
PATA or not, an endless loop in device detection like this is a bug on its own, isn't it? There has got to be a limit on the number of iterations. What if it were a real device -- with a broken controller?

> I have no big wish to workaround PATA hardware issue in year 2019.

A small wish, maybe? Things got broken in 9.0, it seems -- much earlier than 2019...
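The limit asked for here can be sketched as a per-target budget on probe restarts. This is a toy model only, not CAM code: probe_channel, the identify callback and MAX_PROBE_RESTARTS are hypothetical names, and the real fix would have to live in the kernel's probe and reset handling.

```python
MAX_PROBE_RESTARTS = 3  # hypothetical budget, not a value from the kernel


def probe_channel(identify, targets, max_restarts=MAX_PROBE_RESTARTS):
    """Probe each target; give up on one after a bounded number of restarts.

    identify(target) models an IDENTIFY attempt: True on success,
    False on a command timeout (which would trigger a bus reset).
    """
    present = []
    for target in targets:
        # One initial attempt plus at most max_restarts restarts.
        for _attempt in range(1 + max_restarts):
            if identify(target):
                present.append(target)
                break
            # Timeout: a bus reset and probe restart would happen here.
        # If the loop ends without success, the target is treated as absent.
    return present


if __name__ == "__main__":
    # Target 0 is the real drive; target 1 is a phantom that always times out.
    print(probe_channel(lambda target: target == 0, [0, 1]))
```

The point of the design is that a timeout-triggered reset counts against the same budget as the probe it restarts, so a device that never answers costs a bounded amount of time instead of looping forever.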