Supplementary Materials for OS Chapter 9

## Background

在处理用户程序的时候，分为以下三个步骤：

1. Source code: Symbolic address
2. Complied code: **Bind to** relocatable address (relative to the start of the code)
3. **Linker/Loader:**  Relocatable address (**Bing to**) absolute address

因此，接下来要补充这中间的两个步骤：**Binding 和 Linking**

### Binding

1. **编译时绑定 (Compile time): Absolute address is used in code**
   - 如果程序在编译时已知将驻留在内存的**固定位置**，编译器可以生成**绝对地址**的代码。
   - 这种绑定发生在编译时。
   - 例如，没有内核的嵌入式系统可能使用这种方法。
   - 缺点是，如果程序在内存中的起始位置发生变化，就需要**重新编译**代码。
   - 在这种方案下，**逻辑地址和物理地址是相同的**。
2. **加载时绑定 (Load time): : Relocatable code is generated if memory location is not known at compile time**
   - 如果在编译时不知道进程将驻留在内存的哪个位置，编译器会生成**可重定位代码**。
   - 最终的绑定（将可重定位地址转换为绝对地址）会延迟到**加载时**发生。
   - 程序代码会有一个相对地址的重定位表。实际的内存地址会根据加载位置计算得出。
   - 这曾用于非常旧的多道程序设计系统。
   - 缺点是，一旦代码被加载，它在内存中的地址就固定了。
   - 在这种方案下，**逻辑地址和物理地址是相同的**。
3. **执行时绑定 (Execution time): : Binding is delayed until run time. **
   - 如果进程在执行过程中可以被移动到内存中的不同位置，那么绑定必须延迟到**运行时**进行。
   - 这种方案需要**特殊的硬件支持**来进行地址映射，例如内存管理单元（MMU）。
   - **大多数现代操作系统使用这种方法**。
   - 在这种方案下，**逻辑地址（或称虚拟地址）和物理地址是不同的**。逻辑地址是 CPU 生成和程序使用的地址，而物理地址是内存单元看到的实际地址。MMU 在运行时将逻辑地址转换为物理地址。

### Linking

Linking: 程序的地址绑定（binding）在程序的生命周期中，会在不同的阶段以不同的方式发生。源代码中的地址通常是符号地址（symbolic addresses），比如变量名 `int count; &count`。编译后的代码会将这些符号地址绑定到可重定位地址（relocatable addresses），这些地址是相对于代码模块起始位置的偏移量。**链接器（Linker）** 或加载器（Loader）会进一步将这些可重定位地址绑定到绝对地址（absolute addresses），即相对于内存最低地址 0000 的物理地址。每一次绑定都是从一个地址空间到另一个地址空间的映射。

**链接（Linking）** 就是在这个过程中，将程序代码和所需的库代码结合起来，准备好执行。根据链接发生的时间和方式，主要分为两种类型：静态链接和动态链接。

1. **静态链接 (Static linking)**
   - 在静态链接中，系统库和程序代码会由**加载器（loader）** 合并到单个二进制镜像文件中。
   - 这意味着**每个程序都包含它自己所需库函数的副本**。
   - 静态链接通常发生在程序被加载到内存**之前**，作为生成最终可执行文件的一部分。
   - 缺点是：
     - **浪费内存空间**，因为许多程序会包含相同通用系统库函数的副本。内存中会有多份相同的库代码副本。
     - 如果库有更新（例如 bug 修复），所有使用该库的程序都需要**重新链接**才能使用新版本。
   - 在一些简单系统（例如，没有内核的嵌入式系统）中，静态链接可能用于编译时绑定（compile time binding），在这种情况下，如果程序在内存中的起始位置改变，就需要重新编译代码。
2. **动态链接 (Dynamic linking)**
   - 动态链接将链接过程**推迟到执行时（execution time）或运行时（run time）**。
   - 当程序运行时需要调用动态链接库中的**例程（routine，子程序）**时，操作系统会**检查**该例程是否已经在进程的内存地址空间中。
   - 如果不在，操作系统或加载器会**将其添加到进程的地址空间中**。
   - 通常使用一个小的代码片段，称为 **stub（存根）**，来帮助定位内存中相应的库例程。Stub 代码会替换自身为例程的地址并执行该例程。
   - 动态链接**特别适用于许多进程都使用的库**。
   - 优点是：
     - **显著节省内存空间和磁盘空间**，因为库只需要在内存中保留一个副本，供所有进程共享。操作系统确保多个进程可以访问相同的内存地址（即共享库）。
     - 库的更新（如 bug 修复）可以在不重新链接所有使用该库的程序的情况下进行。
     - 避免加载和链接可能最终不会被使用的库。
   - 动态链接和共享库通常需要**操作系统的帮助**来管理和保护。

### Logical and Physical Address Space

| **Comparison Dimension**      | **Logical Address**                                          | **Physical Address**                                         |
| ----------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| **Definition/Source**         | **Address generated by the program/CPU**                     | **Address actually used by the memory hardware**             |
| **Alternative Name**          | Virtual Address                                              | No specific alternative name                                 |
| **User Entity**               | **CPU (address used by all CPUs)**                           | **Memory hardware** (address operated by the memory hardware) |
| **Addressing Unit**           | Byte (addressing by byte)                                    | Byte (addressing by byte)                                    |
| **Determinant of Space Size** | CPU bit - width (depending on the width of the CPU address bus or the design of the virtual address space) | **Size of the actual physical memory** (depending on the installed memory capacity of the hardware) |
| **Core Difference**           | It belongs to the "virtual" address space from the program/CPU perspective and needs to be mapped to the physical address through address translation (such as MMU) | It belongs to the "real" address space of the memory hardware and directly corresponds to the physical memory storage units |

### Dynamic Loading

Motivation: Code is loaded from the hard disk into memory only when it is needed, on demand

#### Definition

**The loading of a process routine when it is called rather than when the process is started.**

#### Advantage

- 更好地利用内存空间 (Better memory-space utilization)
- 未使用的例程 (unused routine) 永远不会被加载，从而减少了不必要的 I/O 操作和所需的内存空间
- Some large amounts of code handle infrequently occurring cases

#### How

- 所有的例程都保存在磁盘上，采用可重定位的加载格式 (relocatable load format)
- 动态加载不需要操作系统的特殊支持 (does not require special support from the operating system)
- 它是通过用户设计他们的程序来利用这种方法来实现的 (It is the responsibility of the users to design their programs to take advantage of such a method)
- 然而，操作系统可以通过提供库例程来帮助程序员实现动态加载 (Operating systems may help the programmer, however, by providing library routines to implement dynamic loading)

#### Comparison between Dynamic Loading and Dynamic Linking

动态加载与动态链接 (Dynamic Linking) 相似，但有所不同。动态链接是将库的链接推迟到执行时 (execution time)，而动态加载是将例程的 *加载* 推迟到被调用时。动态链接尤其适用于被许多进程使用的系统库，可以将库的唯一副本保留在内存中供所有进程共享。与动态加载不同，动态链接和共享库通常需要操作系统的帮助。动态加载侧重于代码的加载时机，而动态链接侧重于库的链接时机及其共享.

## Memory Allocation

### Continuous Partition

#### Fixed Partitioning

**Definition: Memory is divided into multiple fixed size partitions**

**Disadvantages: **

- Process might not fully utilize its partition so that the memory is wasted.
- Degree of multiprogramming is limited by number of partitions 
  - 同时驻留内存并运行的进程数量，被严格限制为内存中预先划分的固定分区的数量

#### Variable Partitioning

Kernel maintains information about:

- allocated partitions

- free partitions (hole)

##### 什么是 Hole?

- block of available memory 
- holes of various size are scatteredthroughout memory
- When a process arrives, it is allocated memory from a hole large enough to accommodate it
- When a process exits, its partition is freed, adjacent free partitions are combined (coalesced/merged)

![Hole](https://tuchuang.org.cn/imgs/2025/05/26/72ae88b31557f6c2.png)

##### Allocation Algorithms

- **首次适应 (First Fit)**：扫描空闲分区链（或列表），找到第一个足够大的空洞，将进程分配到该空洞中。
- **最佳适应 (Best Fit)**：扫描整个空闲分区链，找到大小最接近进程需求的空洞，将进程分配到该空洞中。这会产生最小的剩余空洞。
- **最差适应 (Worst Fit)**：扫描整个空闲分区链，找到最大的空洞，将进程分配到该空洞中。这会产生最大的剩余空洞，理论上这有助于为后续的小进程提供更大的空洞。

#### Framentation

| Fragmentation          | Proterties                                                   |
| ---------------------- | ------------------------------------------------------------ |
| External Fragmentation | 这是可变分区的主要问题。随着进程的加载和退出，内存中会散布着许多小的空闲块（空洞）。这些空洞的总和可能足够满足一个新的进程，但由于它们不是连续的，无法满足需要**连续**空间的进程。这些零散的空闲空间就是外部碎片。 |
| Internal Fragmentation | 在可变分区中，如果分配给进程的内存块略大于进程需求，多余的空间浪费在分配块内部，形成内部碎片 [类似于固定分区，虽然来源 在描述可变分区时没有直接使用这个词，但分配算法的选择（如 Best Fit 产生最小剩余空洞）隐含了分配块可能大于需求的情况]。 |

![Fragmentation](https://tuchuang.org.cn/imgs/2025/05/26/8a6c7b40f04c6bdf.png)

实例：

First fit analysis reveals that given $N$ blocks allocated, $0.5N$ blocks lost to fragmentation

##### Solutions to fragmentation problems

For external fragmentation:

- Shuffle memory contents to place all free memory together in one large block
- Possible only if relocation is dynamic, and is done at execution time.
- Pay attention to I/O Problem while there is I/O operations on the affected memory
- **Or use non-contiguous memory allocation**

For internal fragmentation:

- **No solution** 
  - Explaination: Memory is allocated in the block of $2^N$ bytes, rather than bytes by bytes.

### Non-continuous Partition

**Definition:**

- Physical address space of a process can be non-contiguous

- Process is allocated physical memory whenever the total free memory blocks are enough

**Frame （Frame是在 Physical Memory 中的）:**

- Divide physical memory into fixed-sized blocks. Each block is called a frame
- The size of each frame is power of 2, between 512 bytes and 16 Mbytes
- Kernel keeps track of all free frames

**Pages （Page是在 Logical Memory 中的）:**

- Divide logical memory into blocks of same size called pages
- Page size = frame size
- To run a program of size $N$ pages, need to find $N$ free frames and load program
- Frames can be anywhere (non-contiguous) in RAM

**A Page table** is needed to translate logical to physical addresses

- Page table is a part of the process's **PCB** and it is invisible to process itself.
- Keep inside the kernel's own memory
- Two registers are in the MMU to find page table in the main memory
  - Page-table base register (PTBR) points to the start position of page table in memory
  - Page-table length register (PTLR) indicates lengthof the page table

### **地址转换 (Address Translation)**

地址转换由 **MMU（内存管理单元）** 硬件执行，是将CPU生成的逻辑地址映射到物理内存地址的核心过程。其具体流程与关键机制如下：

#### **1. 地址转换的基本步骤**

- **步骤1：生成逻辑地址**
  CPU生成一个逻辑地址（logical address），该地址是进程视角下的虚拟地址。

- **步骤2：拆分逻辑地址**
  MMU将逻辑地址分为两部分：
  - **页号 (page number, p)**：用于索引页表的高位部分。
  - **页内偏移 (page offset, d)**：指示页内具体位置的低位部分（由页大小决定位数）。

- **步骤3：查找页表获取帧号**
  页号 (p) 作为页表（Page Table）的索引，通过查找页表可获得该页对应的物理帧号 (frame number, f)（如图10.4示意）。

- **步骤4：生成物理地址**
  物理地址由帧号 (f) 和页内偏移 (d) 组合而成，即：
  **物理地址 = 帧号 (f) + 页内偏移 (d)**

![](https://tuchuang.org.cn/imgs/2025/05/26/2957d514ce1be064.png)

课件 P31 练习题参考答案：

![](https://tuchuang.org.cn/imgs/2025/05/26/c8a18c78e2ba3efb.png)

#### **2. TLB（转换后备缓冲区）的优化作用**

为提高地址转换速度，MMU通常配合使用硬件缓存 **TLB（Translation Look-aside Buffer）**，其核心机制如下：

- **TLB缓存映射**：TLB缓存了最近使用的“页号到帧号”的映射关系。
- **TLB命中**：若逻辑地址的页号 (p) 在TLB中存在映射（TLB命中），则直接通过TLB获取帧号 (f)，快速完成地址转换。
- **TLB未命中（TLB miss）**：若TLB中无对应映射，则需访问内存中的页表，查找帧号 (f)，可能导致转换延迟。

![](https://tuchuang.org.cn/imgs/2025/05/26/8f0fa3affec3bbb9.png)

- **Effective Access Time**
  $$
  EAT = \alpha \times (c+m) + (1 - \alpha) \times (c + 2m)
  $$

### Solutions to big page table

#### Hierarchical Page Tables

**两级页表 (Two-Level Page Table)**

最常见的分层页表是两级页表。 在这种方案中：

- 逻辑地址被分为三部分：一个用于**外部页表 (Outer Page Table)** 的页号 (p1)，一个用于**内部页表 (Inner Page Table)** 的偏移量 (p2)，以及页内偏移量 (d)。
- 外部页表中的每个条目指向一个内部页表。
- 内部页表（也称为“页表页”）具有页大小，它们可以分散存储在物理内存的不同位置。

**地址转换过程 (Address Translation)**

使用两级页表进行地址转换的步骤如下：

1. CPU 生成一个逻辑地址，该地址被硬件（MMU）分为 p1、p2 和 d 三部分。
2. 使用 p1 作为索引来访问外部页表。 外部页表通常由一个寄存器（例如页表基寄存器 PTBR）指向其在物理内存中的起始位置。
3. 外部页表中索引为 p1 的条目包含了对应内部页表（页表页）的**物理基地址**。
4. 使用 p2 作为索引来访问上一步找到的内部页表。
5. 内部页表中索引为 p2 的条目包含了该逻辑页对应的**物理帧号 (f)**。
6. 将物理帧号 (f) 与页内偏移量 (d) 组合起来，形成最终的物理地址：物理地址 = f + d。

图 9.16 和课件中的类似图示 形象地展示了这个转换过程：从逻辑地址的 p1 指向外部页表，外部页表的条目指向内部页表，再通过 p2 索引内部页表找到帧号 f，最后 f 与 d 组合形成物理地址。

以一个 Two-Level Paging Tables 为例

`p1 (10 bits) | p2 (10 bits) | d (12 bits)`

![](https://tuchuang.org.cn/imgs/2025/05/26/e6bc7e7652d1a08b.png)

#### Inverted Page Table

![](https://tuchuang.org.cn/imgs/2025/05/26/91b4c29a03b757e4.png)

**优点**

- **节省物理内存：** 倒排页表的大小取决于物理内存的大小，而不是逻辑地址空间的大小。这对于具有大型逻辑地址空间的系统尤其有利。系统中只需要一个页表。

**缺点**

- **地址转换速度慢：** 查找一个虚拟地址对应的物理帧号需要搜索整个倒排页表。因为倒排页表是按物理地址（帧号）索引的，而查找是根据虚拟地址（虚拟页号和进程 ID）进行的，可能需要线性扫描整个表，这会非常耗时。
- **需要辅助结构加速查找：** 为了提高查找速度，倒排页表通常与**散列表（hash table）**结合使用。散列表使用 `<process-id, page-number>` 作为键（key）来计算散列值，指向倒排页表中的一个或少数几个条目，从而限制了需要搜索的范围。
- **地址转换所需内存访问次数：** 在结合散列表的情况下（不考虑 TLB），进行一次地址转换至少需要两次内存访问：一次访问散列表，另一次访问倒排页表中的条目。
- **共享内存实现复杂：** 由于倒排页表中一个物理帧只能对应一个虚拟地址和进程 ID 的组合，直接实现多个进程共享同一个物理内存页（通过不同的虚拟地址映射）比较困难。虽然只保存一份映射关系可以节省页表空间，但这要求共享该页的其他进程在访问时，系统需要临时更新倒排页表中的映射关系，或者采用其他机制处理。
- **处理页不在内存（Page Fault）需要额外信息：** 倒排页表只记录了当前在物理内存中的虚拟页信息。如果一个进程试图访问的虚拟页不在物理内存中（导致页错误 Page Fault），系统需要知道这个虚拟页存储在二级存储（如硬盘）的什么位置。倒排页表本身不包含这些不在内存的页的信息。因此，通常还需要维护一个额外的、可能是基于进程的“外部页表”来记录这些信息，这些外部页表可以在需要时被调入内存。处理页错误时，可能需要先调入这个外部页表，这可能导致二次页错误。