Created attachment 187229 [details]
sw_nblks offset by 2
As also noted in the patch,
swaponsomething() skips the first 2 pages.
Subtract these 2 pages from sw_nblcks as well as from swap_pager_avail and swap_total to keep track the correct numbers.
Given sw_used and sw_nblcks now keeps track of correct pages,
we can use these to check remaining pages blist_alloc().
This improves situation where some of swap devices are filled.
Testing "(sp->sw_nblks - sp->sw_used >= npages)" before every call to blist_alloc() is of no real benefit. In the common case, where there is available swap space, it is added overhead, albeit a tiny amount of overhead. Only in the rare case, where there is no swap space available will it be of any benefit. But, in that case, the gain is minimal. The hinting mechanism in the blist code will quickly conclude that there is no available space, and return failure. In summary, the tiny amount of overhead in the common case is going to outweigh the gain in the rare case.
(In reply to Alan Cox from comment #1)
I've seen 2 conditions when blist_alloc() failed massively.
One case is when sizes of swap devices are imbalanced especially when one of them are significantly small or large, one or some of others gets filled.
The other case is when more devices are added to avoid swap space exhaustion. As soon as all existing devices are fulled, new ones will pick up all I/Os.
In any cases, if a machine is busy swapping out pages, I also agree that failures of blist_alloc() are minor in performance as well.
Also, blist_alloc needs to find a contentious region. Even if remaining blocks is more than requested, blist_alloc can fail due to fragmentation.