Whenever a process is created, the kernel provides a chunk of physical memory which can be located anywhere at all. However, through the magic of virtual memory (VM), the process believes it has all the memory on the computer. You might have heard "virtual memory" in the context of using hard drive space as memory when RAM runs out. That’s called virtual memory too, but is largely unrelated to what we’re talking about. The VM we’re concerned with consists of the following principles:
- Each process is given physical memory called the process’s virtual memory space.
- A process is unaware of the details of its physical memory (i.e. where it physically resides). All the process knows is how big the chunk is and that its chunk begins at address 0.
- Each process is unaware of any other chunks of VM belonging to other processes.
- Even if the process did know about other chunks of VM, it’s physically prevented from accessing that memory.
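To see these principles in action, here is a minimal sketch, assuming a POSIX system. It forks a child, lets the child overwrite a global variable, and has the child report the variable's address and value back through a pipe. The names `vm_report`, `counter`, and `compare_parent_and_child` are made up for illustration; `fork()`, `pipe()`, and `waitpid()` are the standard POSIX calls.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Illustrative type: what one process sees at the address of `counter`. */
struct vm_report {
    uintptr_t addr;   /* virtual address of `counter` */
    int value;        /* value stored there */
};

static int counter = 1;

/* Hypothetical helper: fills in both reports, returns 0 on success. */
int compare_parent_and_child(struct vm_report *parent, struct vm_report *child)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    counter = 1;
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: the write below lands in the child's own VM space only. */
        close(fds[0]);
        counter = 99;
        struct vm_report r = { (uintptr_t)&counter, counter };
        ssize_t n = write(fds[1], &r, sizeof r);
        _exit(n == (ssize_t)sizeof r ? 0 : 1);
    }

    /* Parent: collect the child's report, then take its own. */
    close(fds[1]);
    ssize_t n = read(fds[0], child, sizeof *child);
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || n != (ssize_t)sizeof *child)
        return -1;

    parent->addr = (uintptr_t)&counter;
    parent->value = counter;
    return 0;
}
```

After a successful call, both reports should carry the same virtual address but different values (1 in the parent, 99 in the child). The two processes use identical addresses that are backed by different physical memory.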
Each time a process wants to read or write to memory, its request must be translated from a VM address to a physical memory address. Conversely, when the kernel needs to access the VM of a process, it must translate a physical memory address into a VM address. There are two major issues with this:
- Computers constantly access memory, so translations are very common; they must be lightning fast.
- How can the OS ensure that a process doesn’t trample on another process’s VM?
The answer to both questions lies in the fact that the OS doesn’t manage VM by itself; it gets help from the CPU. Many CPUs contain a device called an MMU: a memory management unit. The MMU and the OS are jointly responsible for managing VM, translating between virtual and physical addresses, enforcing permissions on which processes are allowed to access which memory locations, and enforcing read/write permissions on sections of a VM space, even for the process that owns that space.
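The last point, read/write permissions that apply even to the owning process, can be observed from user space. The sketch below assumes a POSIX system with `mmap()` and `mprotect()`. It maps one page, marks it read-only, and lets a child process attempt a write so that the parent can see how the child died. The function name `signal_from_readonly_write` is invented for this example.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper. Returns the number of the signal that killed the
 * child, 0 if the child's write unexpectedly succeeded, -1 on setup error.
 */
int signal_from_readonly_write(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = 'a';                            /* fine: page is still writable */
    if (mprotect(p, len, PROT_READ) != 0) {
        munmap(p, len);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(p, len);
        return -1;
    }
    if (pid == 0) {
        *(volatile char *)p = 'b';         /* write to a read-only page */
        _exit(0);                          /* reached only if the write worked */
    }

    int status = 0;
    pid_t done = waitpid(pid, &status, 0);
    munmap(p, len);
    if (done < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux the return value is typically `SIGSEGV`: the hardware refuses the write and the kernel turns the fault into a signal. Other systems may report a different signal, such as `SIGBUS`. The write is done in a child so the calling program survives the experiment.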
It used to be the case that Linux could only be ported to architectures that had an MMU (so Linux wouldn’t run on, say, a 286). However, in 1998, Linux was ported to the 68000, which had no MMU. This paved the way for embedded Linux and Linux on devices such as the Palm Pilot.
- Read a short Wikipedia blurb on the MMU
- Optional: If you want to know more about VM, here’s a link. This is much more than you need to know.
That’s how VM works. For the most part, each process’s VM space is laid out in a similar and predictable manner:
| Address | Segment | Contents |
|---|---|---|
| High address | Args and env vars | Command line arguments and environment variables |
| | Stack | Grows downward |
| | Heap | Grows upward |
| | Uninitialized Data Segment (bss) | Initialized to zero by exec. |
| | Initialized Data Segment | Read from the program file by exec. |
| Low address | Text Segment | Read from the program file by exec. |
- Text Segment: The text segment contains the actual code to be executed. It’s usually sharable, so multiple instances of a program can share the text segment to lower memory requirements. This segment is usually marked read-only so a program can’t modify its own instructions.
- Initialized Data Segment: This segment contains global variables which are initialized by the programmer.
- Uninitialized Data Segment: Also named "bss" (block started by symbol), after an operator used by an old assembler. This segment contains uninitialized global variables. All variables in this segment are initialized to 0 or NULL pointers before the program begins to execute.
- The stack: The stack is a collection of stack frames, which will be described in the next section. When a new frame needs to be added (as a result of a newly called function), the stack grows downward.
- The heap: Most dynamic memory, whether requested via C’s malloc() and friends or C++’s new, is doled out to the program from the heap. The C library also gets dynamic memory for its own workspace from the heap. As more memory is requested "on the fly", the heap grows upward.
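The layout described above can be probed from inside a program. The following is a rough sketch, not a definitive test. It takes one sample address from each region and checks the direction of stack growth by comparing a local variable in a caller with one in a callee. The names `layout`, `sample_layout`, and `stack_grows_downward` are hypothetical, and `__attribute__((noinline))` is a GCC/Clang extension used here to keep the callee in its own stack frame.

```c
#include <stdint.h>
#include <stdlib.h>

int initialized_global = 42;  /* initialized data segment */
int uninitialized_global;     /* bss: zero before the program starts */

/* One sample address from each region (illustrative type). */
struct layout {
    uintptr_t text, data, bss, heap, stack;
};

/* Fills in `out`; returns 0 on success, -1 if malloc() fails. */
int sample_layout(struct layout *out)
{
    int local = 0;              /* lives in this function's stack frame */
    void *block = malloc(16);   /* small request served from the heap */
    if (block == NULL)
        return -1;

    out->text  = (uintptr_t)&sample_layout;
    out->data  = (uintptr_t)&initialized_global;
    out->bss   = (uintptr_t)&uninitialized_global;
    out->heap  = (uintptr_t)block;
    out->stack = (uintptr_t)&local;

    free(block);
    return 0;
}

/* Callee: is my local at a lower address than my caller's local? */
__attribute__((noinline))
static int callee_is_below(volatile char *caller_local)
{
    volatile char callee_local = 0;
    return (uintptr_t)&callee_local < (uintptr_t)caller_local;
}

/* Returns 1 if a newly pushed frame sits below its caller's frame. */
int stack_grows_downward(void)
{
    volatile char caller_local = 0;
    return callee_is_below(&caller_local);
}
```

On a typical Linux system the five sampled addresses come out in ascending order: text, data, bss, heap, stack. `stack_grows_downward()` returns 1 there. The exact numbers usually change from run to run because of address space randomization, and other platforms or allocators may arrange things differently.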