This week focused on virtual memory fundamentals and practical memory management in C. The lectures and assignments mostly dealt with understanding how virtual memory works at the OS and hardware level. The main idea was that the OS uses virtual memory to create the illusion that each process has its own private memory, while in reality all processes share the same physical RAM.
Topics covered this week
Some of the topics we covered this week:
- OSTEP 13: Address spaces
- An address space is the memory view a process thinks it has. The process uses virtual addresses, while the hardware and OS translate those into physical addresses in RAM. This lets many processes run safely at once without each one needing to know where it really sits in physical memory.
- A typical address space has three main regions
- Program code: the compiled instructions of the program. This region is static (it doesn’t grow at runtime), so it sits at a fixed location at the low end of the address space (address 0, which OSTEP draws at the top of its diagrams).
- Heap: memory for dynamically-allocated data. The heap grows upward (toward higher addresses) as the program requests more memory.
- Stack: used for function calls, local variables, and return values. The stack grows downward (toward lower addresses) with each function call and shrinks as functions return.
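One way to see the three regions is to print a sample address from each. This is a minimal sketch: `print_region_addresses` is a hypothetical helper name, and the exact values (and even their relative order) depend on the platform and on address space layout randomization, so treat the output as illustrative only.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from OSTEP): prints one sample address from each
 * region. Returns 0 on success, -1 if the heap allocation fails. */
int print_region_addresses(void) {
    int local = 0;                          /* local variable: stack */
    int *dynamic = malloc(sizeof *dynamic); /* malloc'd block: heap */
    if (dynamic == NULL) {
        return -1;
    }

    /* Converting a function pointer to an integer is implementation-defined,
     * but it works on mainstream platforms. */
    printf("code : 0x%" PRIxPTR "\n", (uintptr_t)print_region_addresses);
    printf("heap : %p\n", (void *)dynamic);
    printf("stack: %p\n", (void *)&local);

    free(dynamic);
    return 0;
}
```

Every address printed here is a virtual address. The program never sees where the data actually sits in physical RAM.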
- Memory management goals
- Transparency: programs should be able to act as if they have their own memory.
- Efficiency: translation should be fast and not waste too much memory.
- Protection: one process should not read/write another process’s memory.
- OSTEP 14: C Memory API
- Stack vs. heap allocation
- Stack memory is managed automatically by the compiler, e.g., when declaring a local variable inside a function. Once the function returns, the stack frame is gone, and any pointer to it becomes invalid.
- Heap memory is for data that needs to outlive a single function call. It is managed manually: allocated with malloc() and released with free().
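The lifetime difference can be sketched as follows. `copy_to_heap` is a hypothetical helper name chosen for illustration, and the broken stack version is shown only inside a comment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* BROKEN (shown only as a comment): `buf` lives in this function's stack
 * frame, so the returned pointer is invalid as soon as the function returns.
 *
 *   char *copy_on_stack(void) {
 *       char buf[64] = "hello";
 *       return buf;
 *   }
 */

/* Hypothetical helper: returns a heap-allocated copy of `src`.
 * The copy outlives this call; the caller must release it with free().
 * Returns NULL if the allocation fails. */
char *copy_to_heap(const char *src) {
    assert(src != NULL);
    size_t len = strlen(src) + 1; /* +1 for the terminating '\0' */
    char *copy = malloc(len);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, src, len);
    return copy;
}
```

A caller would use the returned string for as long as needed and then call free() on it exactly once.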
- Common memory bugs
- Forgetting to allocate memory: if you declare a pointer but never call malloc(), the pointer holds an indeterminate address, and using it is undefined behavior (often a segmentation fault).
- Allocating too little memory: writing past the end of an undersized allocation is a buffer overflow, which results in undefined behavior.
- Not initializing allocated memory: malloc() does not clear the memory it returns, so its contents should be set to the intended data (and any pointers inside it to valid addresses or NULL) before they are read.
- Memory leaks: allocated memory that is never explicitly freed is leaked. In a long-running process, leaks accumulate and can eventually exhaust the available memory.
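A small sketch of habits that avoid the bugs listed above. The helper names `make_zeroed_ints` and `release_ints` are hypothetical; calloc() is a standard library function that, unlike malloc(), zero-fills the block it returns.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates room for exactly `n` ints, all set to zero.
 * Sizing by `sizeof *values` avoids allocating too little, and calloc()
 * avoids reading uninitialized memory. Returns NULL on failure. */
int *make_zeroed_ints(size_t n) {
    assert(n > 0);
    int *values = calloc(n, sizeof *values);
    return values;
}

/* Hypothetical helper: frees the block and resets the caller's pointer to
 * NULL, so a later accidental use fails loudly instead of touching freed
 * memory. */
void release_ints(int **values) {
    assert(values != NULL);
    free(*values);
    *values = NULL;
}
```

Pairing every successful allocation with exactly one release is what prevents the leak case.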
- Garbage Collection
- Many programming languages (but not C) perform automatic garbage collection by freeing unreachable memory objects.
- OSTEP 15: Address Translation
- Base-and-bounds. The memory management unit (MMU) has two special registers: a base register and a bounds register.
- Base register: where memory for a process starts
- Bounds register: how much memory is allocated for a process
- Address translation
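- physical address = virtual address + base, computed only after the hardware checks that the virtual address is within bounds (otherwise it raises an exception)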
- OSTEP 16: Segmentation
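- With base-and-bounds, the whole address space, including the unused gap between the heap and the stack, must sit in one contiguous block of physical memory. To avoid that waste, the OS places smaller parts of each process’s address space in physical memory independently.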
- Segmentation treats the code, heap, and stack as separate parts called segments, each with its own base and bounds, which results in better memory utilization.
- Segmentation introduces a bin packing problem: variable-sized segments have to be fitted into physical memory, and over time the free space between them gets chopped into small, hard-to-use holes (external fragmentation).
- OSTEP 18: Paging
- Paging sidesteps the bin packing problem by dividing memory into fixed-size chunks instead of variable-sized segments.
- Virtual memory is divided into pages, physical memory into page frames, and a per-process page table maps virtual page numbers (VPNs) to physical frame numbers (PFNs), allowing flexible placement in physical memory.
- This avoids external fragmentation, but introduces extra overhead: page tables take up memory, and every memory access needs a page-table lookup.
Reflection
The biggest challenge this week was just dealing with the sheer volume of information. There are a lot of things that operating systems and hardware do to handle memory, and it’s a lot of dense technical information to unpack.
I think the most valuable takeaway this week was gaining more exposure to low-level systems programming and design. It’s been interesting to learn about the design challenges and technical solutions that are involved in making virtual memory work, and it looks like we will continue studying this topic in the next module, so we’ll have some more time to process it all.