interview question and answer

July 18, 2012

Understanding swap files in Linux




To appease some of my hungrier applications and support heftier development efforts, I recently upgraded the memory on my system from 4 GiB to 8 GiB. That gave me occasion to tinker around with the swapping behavior of my system and check things out.
If you run anything close to a modern operating system, you almost certainly interact with a swap file. You might be familiar with the basics of how these work: they allow your OS to prioritize more frequently-used pages in main memory and move the less frequently-used ones to disk. But there’s a lot going on underneath the covers. Here’s a simple guide to the theory and practice of swap files in Linux, and how you can tweak things for your benefit.

An abstract memory model

It’ll be useful to have a mental model of the way memory works in general, so we’ll start with a simple one here.
In general, all computers have access to physical memory, where the actual bits are stored and manipulated. Most modern operating systems present physical memory to higher-level applications as an abstraction called virtual memory. This allows applications to see memory as if it were a contiguous block, even though the underlying physical memory may theoretically be drawn from many arbitrary, heterogeneous sources: multiple memory chips, flash memory, disk drives, and so on.
The memory manager divides virtual memory into chunks of identical size called pages. A page is the smallest amount that the OS will allocate in response to requests from programs. It is also the smallest unit of transfer between main memory and any other location, such as a hard disk. The size of a page is usually fixed by the operating system’s kernel.
Applications see virtual memory as a contiguous resource divided into units called pages. A typical page size for a modern desktop system is about 4 kilobytes.
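To see the page size on your own system, you can ask getconf. This is a minimal sketch, assuming a POSIX-like system that exposes the PAGESIZE configuration variable; on most x86 desktops it reports 4096 bytes.

```shell
# Print the kernel's page size in bytes and in KiB.
page_bytes=$(getconf PAGESIZE)
echo "page size: ${page_bytes} bytes ($((page_bytes / 1024)) KiB)"
```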
As pages are allocated to applications, each virtual page is assigned a page in the physical memory space through a special mapping called address translation. Applications don’t know where their pages sit in the physical space; they see only the virtual pages they use.
Address translation maps virtual pages onto physical pages. This mapping is transparent to applications.
The memory manager knows how to aggregate different backing stores to provide the abstraction of contiguous virtual memory. By updating the address translation mechanism so that a virtual page always points to the correct physical page, the memory manager may move pages around in physical memory without directly impacting applications.
The backing stores for pages can be quite heterogeneous.

Wide variation in different kinds of physical memory

Not every backing store displays the same storage and speed characteristics. If different types of physical memory display different characteristics, the memory manager can exploit these differences to optimize system performance. It can make intelligent decisions about which pages should go where.
Consider the difference between main memory and a hard drive, for example.
characteristic       on disk                  in main memory
                     absolute   relative      absolute   relative
access time          5 ms                     10 ns      500,000× faster
peak transfer rate   80 MB/s                  8 GB/s     100× faster
storage space        100 GB     25× larger    4 GB
price/GB storage     $0.60/GB   20× cheaper   $12/GB
Some comparisons of a typical hard drive and typical main memory on a consumer-grade desktop.
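The relative figures in the table follow directly from the absolute ones. As a quick check, here is the arithmetic in shell, with every quantity converted to a common unit first. The variable names are only for illustration.

```shell
# Recompute the "relative" columns of the table with integer arithmetic.
disk_access_ns=5000000   # 5 ms expressed in nanoseconds
mem_access_ns=10         # 10 ns
disk_rate_mb=80          # 80 MB/s
mem_rate_mb=8000         # 8 GB/s expressed in MB/s
disk_space_gb=100
mem_space_gb=4
disk_price_cents=60      # $0.60 per GB
mem_price_cents=1200     # $12 per GB

access_ratio=$((disk_access_ns / mem_access_ns))
rate_ratio=$((mem_rate_mb / disk_rate_mb))
space_ratio=$((disk_space_gb / mem_space_gb))
price_ratio=$((mem_price_cents / disk_price_cents))

echo "access time:   memory is ${access_ratio}x faster"
echo "transfer rate: memory is ${rate_ratio}x faster"
echo "storage space: disk is ${space_ratio}x larger"
echo "price per GB:  disk is ${price_ratio}x cheaper"
```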
Because disk-backed storage has such different characteristics from main memory, there’s significant room to optimize based on how pages are accessed. As the table shows, a hard disk is far slower at retrieving data than main memory: roughly 500,000× in access time and 100× in peak transfer rate. Thus, the memory manager needs to carefully balance the demand for memory against the different kinds of supply.

Flavors of pages

If all pages were of the same kind, deciding which pages should go where would be a simpler decision. For example, one strategy is to make the access time quickest for the pages that are accessed the most frequently. Unfortunately, it’s not just the types of memory that are heterogeneous, but also the contents of the pages that are stored there. On Linux, there are four different kinds of pages.
  • Kernel pages. Pages holding the program contents of the kernel itself. Unlike the other flavors, these are fixed in memory once the operating system has loaded and are never moved.
  • Program pages. Pages storing the contents of programs and libraries. These are read-only, so no updates to disk are needed.
  • File-backed pages. Pages storing the contents of files on disk. If this page has been changed in memory (for example, if it’s a document you’re working on), it will eventually need to be written out to disk to synchronize the changes.
  • Anonymous pages. Pages not backed by anything on disk. When a program requests memory be allocated to perform computations or record information, the information resides in anonymous pages.
When the in-memory version of a page is the same as the one on disk, we say that the page is clean; the contents are the same. But sometimes the contents of a page have been updated since the last time they were read. When this happens, the page becomes dirty.
A clean page can be repurposed for something else easily; no updates need to be made, and the page can simply be recycled. But a dirty page has to be written back to disk before it can be used again. For file pages, this is an expensive operation, so the kernel tries to avoid the overhead of flushing back to disk when it can.
For anonymous pages, there’s a different problem. Effectively, they’re always dirty: the very act of creating an anonymous page means that there is now data in memory that isn’t on disk. If the kernel wants to use anonymous pages for something else, it must first reclaim them. But anonymous pages have no files to back them. How can you flush something back to disk when there’s nowhere to flush it to?
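You can get a rough picture of how these flavors are distributed on a running system from /proc/meminfo. The sketch below assumes the AnonPages, Cached, and Dirty fields are present (they are on current kernels), and the page_summary helper is a hypothetical name. Treat the numbers as approximate: Cached, for instance, also counts some memory that is not strictly file-backed.

```shell
# Summarize page accounting from a meminfo-style file (default: /proc/meminfo).
# Values in /proc/meminfo are reported in kB.
page_summary() {
    awk '
        $1 == "AnonPages:" { anon  = $2 }
        $1 == "Cached:"    { cache = $2 }
        $1 == "Dirty:"     { dirty = $2 }
        END {
            printf "anonymous=%d kB file-cache=%d kB dirty=%d kB\n", anon, cache, dirty
        }
    ' "${1:-/proc/meminfo}"
}

# Only query the live system when the file is actually available.
if [ -r /proc/meminfo ]; then
    page_summary
fi
```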

Swap files

The use of swap can resolve many of these issues. Swap is a disk-backed area that’s treated as an extension of main memory. It serves as a holding area for pages that have been evicted by the kernel. Let’s use an illustrative example to show how swap files help make memory work better.
Legend for the next few diagrams. Unused pages have dotted borders; dirty pages have an alert symbol; and anonymous and file pages are colored orange and green, respectively.
When a moderately loaded system gets additional requests for memory, the kernel generally draws from the pool of free pages first to fulfill these requests. If there are few free pages remaining, the kernel tries to drop clean pages to make room for the new requests.
The kernel prefers to go after unused pages first.
If the clean pages also become depleted, the kernel is forced to clean a dirty page by writing it back to disk, and only then can it reuse the page. This is an expensive operation. For this reason, the kernel tries to maintain at least some clean pages all the time.
When anonymous and dirty pages make up most of memory and few clean pages remain, the kernel is running out of memory it can reclaim cheaply. Without swap, this situation will require a number of costly disk writes. Consider a request for allocation when a number of dirty pages are already present.
Without swap space, it’s easier for systems to get overloaded.
In the figure above, a request for two anonymous pages has come in. There are no more unused pages, so the kernel must drop one of the existing pages to satisfy the request. The kernel can use one page freely: the single clean file page in slot 6.
But to allocate the second page, the kernel now has to flush one of the dirty file pages (in slots 1, 3, or 4) back to disk to make room. It cannot move the page in slot 5 anywhere, because it is anonymous and has no backing store; there’s nowhere else to put it. Even if this page has not been used in a very long time, it must still occupy space in memory until the process using it has released the page.
When space is tight and there’s no swap, the kernel must make room by cleaning dirty pages and freeing them. In this example, the kernel is forced to write page #1 back to disk to make room for the second allocated page.
Without swap, the kernel gets boxed into this unfortunate corner more easily.
With swap, however, the kernel gets an additional tool to use in its arsenal. Instead of being forced to clean one of the dirty pages, it can instead evict one of the anonymous pages to the swap region.
When swap is available, the kernel doesn’t need to clean dirty pages, and can instead move anonymous pages to swap.
As in the earlier non-swap scenario, the kernel can use the clean page in slot 6 for the first requested page. The clean page is dropped and its slot is allocated to the request.
For the second requested page, the kernel no longer has to clean a dirty page to make room. Instead, it can simply move one of the anonymous pages to the swap region. The code required to do this is generally much simpler than cleaning a dirty page, and the kernel prefers swapping to cleaning dirty file-backed pages.
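The preference order in this walkthrough can be written down as a toy model. The sketch below is only an illustration of the simplified description above, not the kernel’s real reclaim algorithm. The pick_victim function and its slot:state notation are hypothetical, and slot 2 is left out because the text does not describe it.

```shell
# Toy model of the reclaim preference described above; NOT the kernel's real algorithm.
# Usage: pick_victim <swap: yes|no> <slot:state>...
# States: free, clean (file page), dirty (file page), anon.
pick_victim() (
    swap=$1
    shift
    if [ "$swap" = yes ]; then
        order="free clean anon dirty"   # with swap, evict anonymous pages before cleaning
    else
        order="free clean dirty"        # without swap, anonymous pages cannot be evicted
    fi
    for want in $order; do
        for page in "$@"; do
            if [ "${page#*:}" = "$want" ]; then
                echo "${page%%:*} ($want)"
                exit 0
            fi
        done
    done
    echo "no reclaimable page" >&2
    exit 1
)

# First request: the clean page in slot 6 is taken in either scenario.
pick_victim no  1:dirty 3:dirty 4:dirty 5:anon 6:clean   # prints "6 (clean)"
# Second request, once slot 6 is gone:
pick_victim no  1:dirty 3:dirty 4:dirty 5:anon           # prints "1 (dirty)"
pick_victim yes 1:dirty 3:dirty 4:dirty 5:anon           # prints "5 (anon)"
```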

Optimizing your swap settings

Linux provides a number of ways to interact with your swap. Two are detailed here:
  • Aggressiveness of swapping
  • Adding and removing additional swap containers

Controlling aggressiveness of swapping

The more aggressively the kernel swaps, the more efficiently existing memory can be put to use. Pages that look like they’re not being used will be swapped out rapidly. If the kernel swaps too often, though, applications that were using those pages will take longer to become responsive again as the kernel swaps their memory back into main memory.
For a desktop user, responsiveness of applications can be important, so an aggressive swap may not be desirable, even if it results in less efficient use of memory. For servers and other non-interactive systems, more aggressive swapping may be appropriate and acceptable.
On Linux, this careful balancing act can be configured to meet your personal preferences. The kernel swaps out pages with a zealousness controlled by a swappiness setting.
Swappiness is an integer that ranges from 0 to 100 (kernels 5.8 and later accept values up to 200), and indicates the degree to which the kernel favors swap space over main memory. Higher swappiness means that the kernel will move things to swap more frequently. Lower swappiness means that the kernel tries to avoid using swap. A swappiness of zero causes the kernel to avoid swap for as long as possible.
Ubuntu and several other Linux distributions have a default swappiness of 60. You can check your swap setting by reading a /proc/sys value:
$ cat /proc/sys/vm/swappiness
60
To temporarily modify your swappiness, set a new value with sysctl:
$ sudo sysctl vm.swappiness=40
vm.swappiness = 40
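If you script this change, it may be worth validating the value before handing it to sysctl. The following is a minimal sketch under a few assumptions: set_swappiness is a hypothetical helper, APPLY is a made-up switch for this example, and the upper bound of 100 follows the range described above. By default it only prints the command it would run.

```shell
# Validate a swappiness value, then print (or, with APPLY=1, run) the sysctl command.
set_swappiness() (
    value=$1
    max=100    # upper bound assumed in this sketch
    case $value in
        ''|*[!0-9]*|????*)
            echo "swappiness must be an integer between 0 and $max" >&2
            exit 1 ;;
    esac
    if [ "$value" -gt "$max" ]; then
        echo "swappiness must be an integer between 0 and $max" >&2
        exit 1
    fi
    if [ "${APPLY:-0}" = 1 ]; then
        sudo sysctl "vm.swappiness=$value"
    else
        echo "sudo sysctl vm.swappiness=$value"
    fi
)

set_swappiness 40    # dry run: prints the command without changing anything
```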
This setting lasts until you reboot or change it again with another sysctl vm.swappiness invocation. To make the setting persist across reboots, edit your /etc/sysctl.conf configuration file.
$ gksudo gedit /etc/sysctl.conf
Find the vm.swappiness line; if none exists, add it.
vm.swappiness = 40
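If you would rather not edit the file by hand, the same change can be scripted. This is a sketch, not a definitive tool: persist_swappiness is a hypothetical helper that replaces any existing vm.swappiness line in a sysctl.conf-style file and leaves commented-out lines alone.

```shell
# Set vm.swappiness in a sysctl.conf-style file, replacing an existing line if present.
# Usage: persist_swappiness <value> [file]   (default file: /etc/sysctl.conf)
persist_swappiness() (
    value=$1
    conf=${2:-/etc/sysctl.conf}
    tmp=$(mktemp) || exit 1
    # Copy every line except existing vm.swappiness settings, then append the new one.
    grep -v '^[[:space:]]*vm\.swappiness[[:space:]]*=' "$conf" > "$tmp" || true
    echo "vm.swappiness = $value" >> "$tmp"
    cat "$tmp" > "$conf"    # overwrite in place to keep the file's owner and mode
    rm -f "$tmp"
)
```

To change the real file you would need to run this as root. Afterwards, sudo sysctl -p loads /etc/sysctl.conf without a reboot.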

Adding swap containers

Modern operating systems generally have either a swap partition or a swap file. In a swap partition, part of the hard drive is sliced off and becomes dedicated to swap. A swap file is just an ordinary file that holds up to its file size in swapped pages.
A swap file is considerably less complicated than a swap partition to establish. There is no speed difference between the two, so swap files are favorable in this respect. However, if you want to be able to hibernate or suspend your computer, using a swap partition is required in some cases. (These suspend/hibernate managers usually cannot handle writing to an active file system.)
Making a new swap file is a simple process. In this example, we’ll make a 2 GiB swap file and make it available to the system as additional swap space. We’ll use primary.swap as the name of the example swap file, but there is nothing special about the name of the file or its extension. You may use anything you wish.
First, we need to create the swap file itself. We’ll use a stream of zeroes as the input source (if=/dev/zero), and write it out to a file named primary.swap in the /mnt directory (of=/mnt/primary.swap). We will write 2048 (count=2048) blocks each 1 MiB in size (bs=1M). Depending on the speed of your hard disks, this may take a little while.
$ sudo dd if=/dev/zero of=/mnt/primary.swap bs=1M count=2048
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 30.1085 s, 71.3 MB/s
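One detail the dd command leaves out is file permissions. A swap file holds copies of memory pages, so it is usually made readable only by its owner, and recent versions of mkswap warn when the permissions are looser than 0600. The sketch below wraps the creation step with that in mind. The make_swap_file helper is hypothetical, and the chmod step is an addition to the procedure shown here.

```shell
# Create a zero-filled file of <size_mib> MiB readable only by its owner.
# Usage: make_swap_file <path> <size_mib>
make_swap_file() (
    path=$1
    size_mib=$2
    umask 077                      # a newly created file gets mode 0600
    dd if=/dev/zero of="$path" bs=1048576 count="$size_mib" 2>/dev/null || exit 1
    chmod 600 "$path"              # also tighten a file that already existed
)

# Try it on a small scratch file rather than a real 2 GiB swap file.
scratch=$(mktemp)
make_swap_file "$scratch" 4
wc -c < "$scratch"                 # size in bytes
rm -f "$scratch"
```

For the example in this article, the equivalent call would be make_swap_file /mnt/primary.swap 2048, run as root.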
Next, we need to format this file and prepare it for use as a swapping space. The mkswap utility sets up a swap area on a device or file.
$ sudo mkswap /mnt/primary.swap
Setting up swapspace version 1, size = 2097148 KiB
no label, UUID=7be2b3b6-83b0-4afd-8537-197cf12f8c59
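The size mkswap reports is slightly smaller than the file itself, because the first page of the swap area holds the swap header rather than swapped data. Assuming a 4 KiB page size, the arithmetic works out as follows:

```shell
# Usable swap = file size minus one header page (page size assumed to be 4 KiB).
file_kib=$((2048 * 1024))      # 2048 blocks of 1 MiB, expressed in KiB
page_kib=4
usable_kib=$((file_kib - page_kib))
echo "file: ${file_kib} KiB, usable swap: ${usable_kib} KiB"
```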
After formatting it, the swap can now be added to our system. Use the swapon utility to activate the swap region.
$ sudo swapon /mnt/primary.swap
You can verify that your swap space is now 2 GiB larger.
$ cat /proc/meminfo | grep SwapTotal
SwapTotal:     2097144 kB
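For a per-area breakdown rather than a single total, the kernel also lists active swap areas in /proc/swaps, with sizes in KiB. The sketch below formats that listing; list_swaps is a hypothetical helper, and swapon -s prints a similar summary.

```shell
# List active swap areas from a /proc/swaps-style file: name, type, size and usage in MiB.
list_swaps() {
    awk 'NR > 1 { printf "%s %s size=%dMiB used=%dMiB\n", $1, $2, $3 / 1024, $4 / 1024 }' \
        "${1:-/proc/swaps}"
}

# Only query the live system when the file is actually available.
if [ -r /proc/swaps ]; then
    list_swaps
fi
```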
Your changes will be lost at reboot, so if you want to make them permanent you’ll need to edit your filesystem table in /etc/fstab.
$ gksudo gedit /etc/fstab
Now add your swap file to the list of filesystems to mount at boot by appending a line to the file.
/mnt/primary.swap  none  swap  sw  0 0
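If you prefer to script the fstab change, it helps to make it idempotent so that running it twice does not add a duplicate line. A minimal sketch, where add_swap_entry is a hypothetical helper that appends the entry only when no uncommented line already starts with the swap path:

```shell
# Append a swap entry to an fstab-style file unless the path is already listed.
# Usage: add_swap_entry <swap_path> [fstab_file]   (default file: /etc/fstab)
add_swap_entry() (
    swap_path=$1
    fstab=${2:-/etc/fstab}
    # Look for a line whose first field is exactly the swap path.
    if awk -v p="$swap_path" '$1 == p { found = 1 } END { exit !found }' "$fstab"; then
        echo "already present: $swap_path"
    else
        printf '%s  none  swap  sw  0 0\n' "$swap_path" >> "$fstab"
        echo "added: $swap_path"
    fi
)
```

Run against the real /etc/fstab, this would need root privileges. It is worth testing on a copy of the file first.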

Removing a swap file

Removal works much the same way, but in reverse. If you’ve added your swap to the /etc/fstab list, you need to remove it there first.
To disable your running swaps, run the swapoff utility. You can either specify the swap you’d like to disable, or use the -a parameter to disable all of them.
$ sudo swapoff /mnt/primary.swap
When you disable swap, you force the kernel to move every page held in that swap area back into main memory. If there is not enough space to squeeze everything in, you may receive out-of-memory errors from the kernel, so use this judiciously.
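One way to be judicious is to compare the amount of swap in use against the memory that is still available before calling swapoff. The sketch below does this with /proc/meminfo. It assumes the MemAvailable field, which only newer kernels (3.14 and later) report, and can_swapoff is a hypothetical helper. It is a rough estimate rather than a guarantee.

```shell
# Report whether the swap currently in use would fit into available main memory.
# Reads a meminfo-style file (default: /proc/meminfo); values are in kB.
can_swapoff() {
    awk '
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "SwapTotal:"    { total = $2 }
        $1 == "SwapFree:"     { free  = $2 }
        END {
            used = total - free
            printf "swap in use: %d kB, memory available: %d kB\n", used, avail
            if (used > avail) exit 1    # not enough room to pull everything back in
        }
    ' "${1:-/proc/meminfo}"
}
```

A cautious invocation might then look like can_swapoff && sudo swapoff /mnt/primary.swap.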

Conclusion

Swap files are an essential part of the memory-management modules of operating systems. In Linux, adding and removing swap partitions and files is simple, and you can control how the kernel interacts with swap through configurable parameters. Through the use of these and other techniques, and with an understanding of the basics of swap, you can tweak your system’s use of memory to your heart’s content.
