|
Lines 312-318
Link Here
|
| 312 |
However, the swap management structure has had problems |
312 |
However, the swap management structure has had problems |
| 313 |
historically.</para> |
313 |
historically.</para> |
| 314 |
|
314 |
|
| 315 |
<para>Under FreeBSD 3.x the swap management structure preallocates an |
315 |
<para>Under FreeBSD 3.X the swap management structure preallocates an |
| 316 |
array that encompasses the entire object requiring swap backing |
316 |
array that encompasses the entire object requiring swap backing |
| 317 |
store—even if only a few pages of that object are swap-backed. |
317 |
store—even if only a few pages of that object are swap-backed. |
| 318 |
This creates a kernel memory fragmentation problem when large objects |
318 |
This creates a kernel memory fragmentation problem when large objects |
|
Lines 329-335
Link Here
|
| 329 |
fly for additional swap management structures when a swapout occurs. It |
329 |
fly for additional swap management structures when a swapout occurs. It |
| 330 |
is evident that there was plenty of room for improvement.</para> |
330 |
is evident that there was plenty of room for improvement.</para> |
| 331 |
|
331 |
|
| 332 |
<para>For FreeBSD 4.x, I completely rewrote the swap subsystem. With this |
332 |
<para>For FreeBSD 4.X, I completely rewrote the swap subsystem. With this |
| 333 |
rewrite, swap management structures are allocated through a hash table |
333 |
rewrite, swap management structures are allocated through a hash table |
| 334 |
rather than a linear array giving them a fixed allocation size and much |
334 |
rather than a linear array giving them a fixed allocation size and much |
| 335 |
finer granularity. Rather then using a linearly linked list to keep |
335 |
finer granularity. Rather then using a linearly linked list to keep |
|
Lines 558-564
Link Here
|
| 558 |
<qandaentry> |
558 |
<qandaentry> |
| 559 |
<question> |
559 |
<question> |
| 560 |
<para>What is <quote>the interleaving algorithm</quote> that you |
560 |
<para>What is <quote>the interleaving algorithm</quote> that you |
| 561 |
refer to in your listing of the ills of the FreeBSD 3.x swap |
561 |
refer to in your listing of the ills of the FreeBSD 3.X swap |
| 562 |
arrangements?</para> |
562 |
arrangements?</para> |
| 563 |
</question> |
563 |
</question> |
| 564 |
|
564 |
|
|
Lines 574-580
Link Here
|
| 574 |
|
574 |
|
| 575 |
<literallayout>A B C D A B C D A B C D A B C D</literallayout> |
575 |
<literallayout>A B C D A B C D A B C D A B C D</literallayout> |
| 576 |
|
576 |
|
| 577 |
<para>FreeBSD 3.x uses a <quote>sequential list of free |
577 |
<para>FreeBSD 3.X uses a <quote>sequential list of free |
| 578 |
regions</quote> approach to accounting for the free swap areas. |
578 |
regions</quote> approach to accounting for the free swap areas. |
| 579 |
The idea is that large blocks of free linear space can be |
579 |
The idea is that large blocks of free linear space can be |
| 580 |
represented with a single list node |
580 |
represented with a single list node |
|
Lines 593-608
Link Here
|
| 593 |
it is to try to put that sophistication elsewhere.</para> |
593 |
it is to try to put that sophistication elsewhere.</para> |
| 594 |
|
594 |
|
| 595 |
<para>The fragmentation causes other problems. Being a linear list |
595 |
<para>The fragmentation causes other problems. Being a linear list |
| 596 |
under 3.x, and having such a huge amount of inherent |
596 |
under 3.X, and having such a huge amount of inherent |
| 597 |
fragmentation, allocating and freeing swap winds up being an O(N) |
597 |
fragmentation, allocating and freeing swap winds up being an O(N) |
| 598 |
algorithm instead of an O(1) algorithm. Combined with other |
598 |
algorithm instead of an O(1) algorithm. Combined with other |
| 599 |
factors (heavy swapping) and you start getting into O(N^2) and |
599 |
factors (heavy swapping) and you start getting into O(N^2) and |
| 600 |
O(N^3) levels of overhead, which is bad. The 3.x system may also |
600 |
O(N^3) levels of overhead, which is bad. The 3.X system may also |
| 601 |
need to allocate KVM during a swap operation to create a new list |
601 |
need to allocate KVM during a swap operation to create a new list |
| 602 |
node which can lead to a deadlock if the system is trying to |
602 |
node which can lead to a deadlock if the system is trying to |
| 603 |
pageout pages in a low-memory situation.</para> |
603 |
pageout pages in a low-memory situation.</para> |
| 604 |
|
604 |
|
| 605 |
<para>Under 4.x we do not use a sequential list. Instead we use a |
605 |
<para>Under 4.X we do not use a sequential list. Instead we use a |
| 606 |
radix tree and bitmaps of swap blocks rather than ranged list |
606 |
radix tree and bitmaps of swap blocks rather than ranged list |
| 607 |
nodes. We take the hit of preallocating all the bitmaps required |
607 |
nodes. We take the hit of preallocating all the bitmaps required |
| 608 |
for the entire swap area up front but it winds up wasting less |
608 |
for the entire swap area up front but it winds up wasting less |