Bug 161901

Summary: [cam] [patch] cam / ata timeout limited to 2147 due to overflow
Product: Base System
Reporter: Steven Hartland <killing>
Component: kern
Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED
Severity: Affects Only Me
CC: eugen, schaap
Priority: Normal    
Version: 8.2-RELEASE   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
file.txt none

Description Steven Hartland 2011-10-22 15:10:13 UTC
I'm working on adding security methods to camcontrol and have
come up against a strange issue. It seems that the timeout
value for cam, at least on ata (ahci), is limited to less than
2148 seconds.

This can be seen by running:-
camcontrol identify ada0 -t 2148 -v
(pass0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(pass0:ahcich0:0:0:0): CAM status: Command timeout

Also seen in /var/log/messages at this time is:-
Aug  4 23:29:51 cfdev kernel: ahcich0: Timeout on slot 24
Aug  4 23:29:51 cfdev kernel: ahcich0: is 00000000 cs 01000000 ss 00000000 rs 01000000 tfd d0 serr 00000000

Dropping the timeout down to 2147 and the command runs fine.

I've done some digging and it seems like this is implemented via:-
sys/dev/ahci/ahci.c
ahci_execute_transaction(struct ahci_slot *slot)
{
..
    /* Start command execution timeout */
    callout_reset(&slot->timeout, (int)ccb->ccb_h.timeout * hz / 2000,
        (timeout_t*)ahci_timeout, slot);

Now it's documented that:-
"Non-positive values of ticks are silently converted to the value 1"

So I suspect that this is what's happening, resulting in an extremely
small timeout instead of a large one. I know that the value passed in
for the timeout is seconds * 1000, so we should be seeing 2148000
for ccb->ccb_h.timeout. Multiply that by 1000 (hz) and you're over
the int wrap point of 2147483647.

So instead of the wrap point being 2147483 seconds (24 days), I suspect
that because of the way this is structured it's actually 2147 seconds
(about 36 minutes).
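
The arithmetic above can be checked with a small userland model. This is only a sketch, not kernel code: ticks_as_written() is a hypothetical helper that mimics the 32-bit expression (int)ccb->ccb_h.timeout * hz / 2000, assuming a 32-bit two's-complement int and hz = 1000, and effective_ticks() is a hypothetical model of the documented conversion of non-positive tick counts to 1.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userland model of the expression used in
 * ahci_execute_transaction():
 *     (int)ccb->ccb_h.timeout * hz / 2000
 * The multiplication is done in 64 bits and then truncated to 32 bits.
 * This mimics two's-complement wraparound without relying on signed
 * overflow, which is undefined behaviour in C.
 */
static int32_t
ticks_as_written(uint32_t timeout_ms, int32_t hz)
{
	assert(hz > 0);
	int64_t product = (int64_t)(int32_t)timeout_ms * hz;
	/* Assumes the usual wrapping conversion from uint32_t to int32_t. */
	int32_t wrapped = (int32_t)(uint32_t)product;
	return (wrapped / 2000);
}

/* Models "Non-positive values of ticks are silently converted to the value 1". */
static int32_t
effective_ticks(int32_t ticks)
{
	return (ticks <= 0 ? 1 : ticks);
}
```

Under these assumptions a 2147 second timeout (2147000 ms) yields 1073500 ticks, while 2148 seconds (2148000 ms) wraps to a negative value and is therefore treated as a single tick.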

If this is the case the fix is likely to be something like:-
 callout_reset(&slot->timeout, (int)(ccb->ccb_h.timeout * (hz / 2000)),

Does this sound reasonable? What I don't understand is why the /2000?
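
One thing to watch with the reordered expression is that hz / 2000 is an integer division, so it evaluates to 0 whenever hz is below 2000 (for example hz = 1000). A possible alternative, sketched below on the assumption that widening to 64 bits is acceptable here, keeps the original order of operations and clamps the result to INT_MAX. Both helper names are hypothetical, and this is not necessarily the change that was eventually committed.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* The reordered form: hz / 2000 truncates to 0 when hz < 2000. */
static int
ticks_reordered(uint32_t timeout_ms, int hz)
{
	assert(hz > 0);
	return ((int)(timeout_ms * (uint32_t)(hz / 2000)));
}

/* Hypothetical alternative: multiply in 64 bits, divide, then clamp to int. */
static int
ticks_widened(uint32_t timeout_ms, int hz)
{
	assert(hz > 0);
	int64_t ticks = (int64_t)timeout_ms * hz / 2000;
	return (ticks > INT_MAX ? INT_MAX : (int)ticks);
}
```

With hz = 1000, the reordered form returns 0 for any timeout, whereas the widened form gives 1074000 ticks for 2148 seconds and 12000000 ticks for a 400 minute (24000000 ms) timeout.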

For reference, the reason for wanting a large timeout is that a
secure erase of large media could take many hours, so I'm using
the erase time reported by the drive for this, which in my case
is 400 minutes.

Currently this instantly fails with a Command timeout which is
clearly not right.

Additional discussion can be found here:-
http://lists.freebsd.org/pipermail/freebsd-hackers/2011-August/036060.html

Updated patches may be found here:-
http://blog.multiplay.co.uk/2011/08/timeout-overflow-in-cam-drivers-under-freebsd-8-2/

Fix: Apply the attached patch.
Original patch updated by Eygene Ryabinkin, then revised by myself to include changes to the mps driver added to stable.

Patch attached with submission follows:
How-To-Repeat: Request a cam timeout larger than 2147 seconds.
Comment 1 Steven Hartland 2012-02-07 09:26:51 UTC
Any update on this?
Comment 2 Steven Hartland 2012-05-15 13:00:59 UTC
Looks like this still hasn't been committed. Could someone please investigate
so cam timeouts work correctly? :)

    Regards
    Steve

Comment 3 Steven Hartland freebsd_committer freebsd_triage 2012-12-11 14:12:57 UTC
Responsible Changed
From-To: freebsd-bugs->smh

I'll take it.
Comment 4 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 08:00:21 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped
Comment 5 Eugene Grosbein freebsd_committer freebsd_triage 2020-08-30 13:13:40 UTC
Fixed since 10.2-RELEASE: https://svnweb.freebsd.org/base?view=revision&revision=275982
Comment 6 Eugene Grosbein freebsd_committer freebsd_triage 2020-08-30 13:14:29 UTC
*** Bug 187900 has been marked as a duplicate of this bug. ***