<?xml version="1.0" encoding="utf-8"?><?xml-stylesheet type="text/xsl" href="/assets/rss-20b3285f.xsl"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Tag: csapp - ouuan's blog</title>
        <link>https://ouuan.moe/tag/csapp</link>
        <description>Posts tagged csapp - ouuan's blog</description>
        <lastBuildDate>Mon, 26 Dec 2022 06:44:19 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <language>zh-CN</language>
        <copyright>Copyright © 2022 - 2026 ouuan
Licensed under CC BY-SA 4.0</copyright>
        <atom:link href="https://ouuan.moe/tag/csapp/feed.xml" rel="self" type="application/rss+xml"/>
        <item>
            <title><![CDATA[CS:APP Chapter 9 Study Notes]]></title>
            <link>https://ouuan.moe/post/2022/11/csapp-9</link>
            <guid>https://ouuan.moe/post/2022/11/csapp-9</guid>
            <pubDate>Mon, 26 Dec 2022 06:44:19 GMT</pubDate>
            <description><![CDATA[

<p>Study notes on Chapter 9, "Virtual Memory", of <a href="https://csapp.cs.cmu.edu/">CS:APP</a>.</p>
<p>The chapter mainly covers page tables, address translation, memory mapping, and dynamic allocation.</p>
]]></description>
            <content:encoded><![CDATA[

<p>Study notes on Chapter 9, "Virtual Memory", of <a href="https://csapp.cs.cmu.edu/">CS:APP</a>.</p>
<p>The chapter mainly covers page tables, address translation, memory mapping, and dynamic allocation.</p>

<p>Virtual memory is an abstraction of main memory. It serves three main purposes:</p>
<ul>
<li>It uses main memory as a cache for the disk: only the active parts are kept in main memory, and data is transferred between disk and memory as needed.</li>
<li>It simplifies memory management by giving every application a uniform address space.</li>
<li>It protects each process's data from corruption by other processes by giving each process its own address space.</li>
</ul>
<p>Virtual memory is central to how a system works. Studying it teaches you how to use some of its powerful features (such as mapping a file into memory), and it also helps you avoid memory-management bugs.</p>
<h2 id="physical-and-virtual-addressing" class="heading"><a href="#physical-and-virtual-addressing" class="heading-anchor" aria-label="章节： Physical and Virtual Addressing" tabindex="-1"></a><span>Physical and Virtual Addressing</span></h2>
<p>There are two ways to address memory: physical addressing and virtual addressing.</p>
<p>Main memory can be viewed as an array of <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>M</mi></mrow><annotation encoding="application/x-tex">M</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.10903em;">M</span></span></span></span></span> contiguous bytes with addresses <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>0</mn><mo>∼</mo><mi>M</mi><mo>−</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">0 \sim M-1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">0</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">∼</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.7667em;vertical-align:-0.0833em;"></span><span class="mord mathnormal" style="margin-right:0.10903em;">M</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">1</span></span></span></span></span>. With physical addressing, the CPU sends the address it wants directly to main memory, which returns the data to the CPU.</p>
<p>Virtual addressing requires cooperation between the hardware and the operating system. The CPU sends a virtual address to the <i>memory management unit</i> (MMU), which translates it into a physical address and passes that to main memory; the translation uses a lookup table whose contents are managed by the operating system.</p>
<h2 id="address-spaces" class="heading"><a href="#address-spaces" class="heading-anchor" aria-label="章节： Address Spaces" tabindex="-1"></a><span>Address Spaces</span></h2>
<p>A (linear) address space is a set of consecutive non-negative integers. A system has one physical address space <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">[</mo><mn>0</mn><mo separator="true">,</mo><mi>M</mi><mo>−</mo><mn>1</mn><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">[0, M-1]</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">[</span><span class="mord">0</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal" style="margin-right:0.10903em;">M</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">1</span><span class="mclose">]</span></span></span></span></span> and a number of virtual address spaces <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">[</mo><mn>0</mn><mo separator="true">,</mo><mi>N</mi><mo>−</mo><mn>1</mn><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">[0, N-1]</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">[</span><span class="mord">0</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal" style="margin-right:0.10903em;">N</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">1</span><span class="mclose">]</span></span></span></span></span>, where <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>N</mi><mo>=</mo><msup><mn>2</mn><mi>n</mi></msup></mrow><annotation encoding="application/x-tex">N = 2^n</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.10903em;">N</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6644em;"></span><span class="mord"><span class="mord">2</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6644em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span></span></span></span></span></span></span></span></span>. Such a space is called an <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi></mrow><annotation encoding="application/x-tex">n</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">n</span></span></span></span></span>-bit address space, and it is typically 32-bit or 64-bit.</p>
<p>A basic idea of virtual memory is that the same piece of data can have different addresses in different address spaces.</p>
<h2 id="vm-as-a-tool-for-caching" class="heading"><a href="#vm-as-a-tool-for-caching" class="heading-anchor" aria-label="章节： VM as a Tool for Caching" tabindex="-1"></a><span>VM as a Tool for Caching</span></h2>
<p>Conceptually, virtual memory is stored on disk, and physical memory is a cache for it. (In practice, much of virtual memory lives only in this cache and is written to disk only when necessary.)</p>
<h3 id="page" class="heading"><a href="#page" class="heading-anchor" aria-label="章节： page" tabindex="-1"></a><span>page</span></h3>
<p>Virtual memory is partitioned into fixed-size blocks called <i>virtual pages</i>, and physical memory is partitioned into blocks of the same size called <i>physical pages</i>. In the context of caches such a unit is usually called a block; in the context of virtual memory it is called a page.</p>
<p>Because DRAM is far faster than disk, which makes a miss extremely expensive, and because sequential disk access is much faster than random access:</p>
<ul>
<li>A page is relatively large, typically 4 KB to 2 MB.</li>
<li>Physical memory, viewed as a cache for virtual pages, is a <a href="/post/2022/12/csapp-6#cache-%E7%9A%84%E5%88%86%E7%B1%BB">fully associative cache</a>: any virtual page can be placed in any physical page.</li>
<li>The operating system manages virtual memory with replacement policies that are more sophisticated than those used by SRAM caches.</li>
</ul>
<p>A virtual page is in one of three states: unallocated, cached, or uncached.</p>
<h3 id="page-table" class="heading"><a href="#page-table" class="heading-anchor" aria-label="章节： page table" tabindex="-1"></a><span>page table</span></h3>
<p>A <i>page table</i> is stored in physical memory, and every page in the virtual address space has a corresponding entry in it (a <i>page table entry</i>, PTE). Each entry contains a valid bit and an address:</p>
<ul>
<li>cached: the valid bit is set, and the address is the physical address of the cached page</li>
<li>uncached: the valid bit is not set, and the address points to the virtual page on disk</li>
<li>unallocated: the valid bit is not set, and the address is null</li>
</ul>
<h3 id="page-fault" class="heading"><a href="#page-fault" class="heading-anchor" aria-label="章节： page fault" tabindex="-1"></a><span>page fault</span></h3>
<p>During address translation, the MMU looks up the PTE for the incoming virtual address. If the page is cached, this is a <i>page hit</i>, and the MMU sends the physical address stored in the PTE to main memory. Otherwise it is a cache miss, which in virtual memory terms is called a <i>page fault</i>.</p>
<p>A page fault is an exception that invokes the page fault handler in the kernel. The handler selects a physical page (the victim page) to hold the page that faulted. It writes the victim page's old contents back to disk if necessary, loads the new data into the victim page, and updates the two PTEs involved: the virtual page previously stored in the victim page becomes uncached, and the newly loaded virtual page becomes cached with its address pointing to the victim page. The net effect is that the requested virtual page is now cached, so when control returns to the instruction that caused the exception, the access is a page hit and reads the data normally.</p>
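<p>The hit / fault / evict cycle described above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the names (<code>toy_vm_t</code>, <code>toy_vm_access</code>), the table sizes, and the round-robin victim choice are all hypothetical, and no real disk I/O or dirty-page tracking is modelled.</p>

```c
#include <stdbool.h>

#define NUM_VPAGES 8   /* hypothetical: virtual pages in the toy address space */
#define NUM_PPAGES 2   /* hypothetical: physical pages acting as the cache */

typedef struct {
    bool valid;   /* set when the page is cached in physical memory */
    int  ppn;     /* physical page number, meaningful only if valid */
} pte_t;

typedef struct {
    pte_t table[NUM_VPAGES];  /* one PTE per virtual page */
    int   owner[NUM_PPAGES];  /* VPN held by each physical page, -1 if free */
    int   next_victim;        /* round-robin replacement pointer (an assumption) */
    int   faults;             /* number of page faults so far */
} toy_vm_t;

void toy_vm_init(toy_vm_t *vm) {
    for (int i = 0; i < NUM_VPAGES; i++) {
        vm->table[i].valid = false;
        vm->table[i].ppn = -1;
    }
    for (int i = 0; i < NUM_PPAGES; i++) vm->owner[i] = -1;
    vm->next_victim = 0;
    vm->faults = 0;
}

/* Returns the PPN backing vpn, handling a page fault first if needed.
 * The caller must pass 0 <= vpn < NUM_VPAGES. */
int toy_vm_access(toy_vm_t *vm, int vpn) {
    if (!vm->table[vpn].valid) {                 /* page fault */
        vm->faults++;
        int victim = vm->next_victim;            /* choose the victim page */
        vm->next_victim = (victim + 1) % NUM_PPAGES;
        int old = vm->owner[victim];
        if (old >= 0) vm->table[old].valid = false;  /* old page becomes uncached */
        vm->owner[victim] = vpn;                 /* "swap in" the new page */
        vm->table[vpn].valid = true;
        vm->table[vpn].ppn = victim;
    }
    return vm->table[vpn].ppn;                   /* page hit */
}
```

<p>With two physical pages, accessing virtual pages 3, 5, 3 causes two faults and one hit; a following access to page 7 evicts page 3.</p>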
<p>Transferring data between disk and memory is called <i>swapping</i> or <i>paging</i>:</p>
<blockquote>
<p>Pages are <i>swapped in</i> (<i>paged in</i>) from disk to DRAM, and <i>swapped out</i> (<i>paged out</i>) from DRAM to disk.</p>
</blockquote>
<p>A miss in virtual memory is very expensive, but because programs access memory with locality, page faults are usually rare and performance stays acceptable. When the working set does not fit in physical memory, pages are swapped in and out continuously; this is called <i>thrashing</i>, and it severely degrades performance.</p>
<h2 id="vm-as-a-tool-for-memory-management" class="heading"><a href="#vm-as-a-tool-for-memory-management" class="heading-anchor" aria-label="章节： VM as a Tool for Memory Management" tabindex="-1"></a><span>VM as a Tool for Memory Management</span></h2>
<p>In fact, a system has not one page table but one per process, and the same physical page can be mapped to multiple virtual pages in different processes.</p>
<p>Virtual memory makes memory management easier in the following ways:</p>
<ul>
<li>It simplifies linking: the linker need not consider physical addresses, and different programs can use the same virtual address layout.</li>
<li>It simplifies loading: to load a program, the loader only has to map the sections of the executable file into virtual memory, without copying any data; a page is paged in only when it is first accessed. Mapping file contents into virtual memory in this way is called <a href="#memory-mapping">memory mapping</a>, and Linux provides the <code>mmap</code> system call for it.</li>
<li>It simplifies sharing: the operating system can map process-private data to distinct physical pages and shared data to the same physical pages.</li>
<li>It simplifies memory allocation: when an application requests a run of contiguous virtual pages, the operating system can map them to non-contiguous physical pages.</li>
</ul>
<h2 id="vm-as-a-tool-for-memory-protection" class="heading"><a href="#vm-as-a-tool-for-memory-protection" class="heading-anchor" aria-label="章节： VM as a Tool for Memory Protection" tabindex="-1"></a><span>VM as a Tool for Memory Protection</span></h2>
<ul>
<li>Virtual memory makes it easy to give each process its own private address space.</li>
<li>Adding the permission bits <code>SUP</code>, <code>READ</code>, and <code>WRITE</code> to each PTE makes it possible to mark a page read-only or accessible only in kernel mode. An access that violates these permissions triggers a general protection exception in the CPU, and the exception handler then sends SIGSEGV to the process.</li>
</ul>
<h2 id="address-translation" class="heading"><a href="#address-translation" class="heading-anchor" aria-label="章节： Address Translation" tabindex="-1"></a><span>Address Translation</span></h2>
<p>A memory address is split into two parts. A virtual address consists of a high-order <i>virtual page number</i> (VPN) and a low-order <i>virtual page offset</i> (VPO); a physical address consists of a physical page number (PPN) and a physical page offset (PPO).</p>
<p>The CPU has a <i>page table base register</i> (PTBR) that points to the start of the page table. During translation, the MMU computes the PTE address from the PTBR and the VPN and fetches the PTE from main memory. Depending on the valid bit, it either triggers a page fault or obtains the PPN; since PPO = VPO, concatenating the PPN and the VPO yields the physical address.</p>
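<p>As a small illustration of the split, the sketch below extracts the VPN and VPO from a virtual address and rebuilds a physical address from a PPN. It assumes 4 KB pages (a 12-bit offset); the helper names are made up for this example.</p>

```c
#include <stdint.h>

#define PAGE_SHIFT 12u                                   /* assumes 4 KB pages */
#define PAGE_MASK  ((UINT64_C(1) << PAGE_SHIFT) - 1)

/* High-order bits of the virtual address: the virtual page number. */
static inline uint64_t vpn_of(uint64_t va) { return va >> PAGE_SHIFT; }

/* Low-order bits of the virtual address: the offset within the page. */
static inline uint64_t vpo_of(uint64_t va) { return va & PAGE_MASK; }

/* PPO equals VPO, so the physical address is the PPN followed by the offset. */
static inline uint64_t make_pa(uint64_t ppn, uint64_t vpo) {
    return (ppn << PAGE_SHIFT) | vpo;
}
```

<p>For example, the virtual address <code>0x12345678</code> has VPN <code>0x12345</code> and VPO <code>0x678</code>; if the PTE held PPN <code>0xABC</code>, the physical address would be <code>0xABC678</code>.</p>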
<p>SRAM caches are usually physically addressed, so both fetching a PTE by its address and accessing main memory by a physical address go through the SRAM cache first.</p>
<p>Fetching the PTE from memory on every access is too slow even when it hits in the L1 cache, so the MMU also contains a small cache of PTEs called the <i>translation lookaside buffer</i> (TLB). The VPN is split into two parts: the low-order TLBI (index), which selects the cache set, and the high-order TLBT (tag), which is used for line matching. Translation consults the TLB first and falls back to the page table only on a miss.</p>
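<p>The TLBI / TLBT split works the same way as the index / tag split of any set-associative cache. The sketch below assumes a hypothetical TLB with 4 sets (2 index bits); the constant and function names are illustrative only.</p>

```c
#include <stdint.h>

#define TLB_INDEX_BITS 2u   /* hypothetical: a TLB with 4 sets */

/* Low-order VPN bits select the TLB set. */
static inline uint64_t tlbi_of(uint64_t vpn) {
    return vpn & ((UINT64_C(1) << TLB_INDEX_BITS) - 1);
}

/* The remaining high-order VPN bits are the tag compared within the set. */
static inline uint64_t tlbt_of(uint64_t vpn) {
    return vpn >> TLB_INDEX_BITS;
}
```

<p>With this layout, VPN <code>0x0F</code> maps to set 3 with tag 3.</p>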
<p>Address spaces tend to be large, so a single page table would itself take up a great deal of space. The page table can therefore be organized as a hierarchy in which each level points to a table of the next level, until the last level points to the VP / PP. Lower-level tables for unallocated ranges of the address space do not need to exist at all, which is where the space saving comes from.</p>
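<p>To see how large a flat page table would be, the arithmetic can be written out as follows. The function name is hypothetical; the formula is simply the number of virtual pages times the PTE size.</p>

```c
#include <stdint.h>

/* Bytes needed by a flat (single-level) page table:
 * one PTE for each of the 2^(va_bits - page_shift) virtual pages. */
static inline uint64_t flat_page_table_bytes(unsigned va_bits,
                                             unsigned page_shift,
                                             uint64_t pte_bytes) {
    return (UINT64_C(1) << (va_bits - page_shift)) * pte_bytes;
}
```

<p>A 32-bit address space with 4 KB pages and 4-byte PTEs needs 4 MB per process; a 48-bit address space with 4 KB pages and 8-byte PTEs would need 2<sup>39</sup> bytes, i.e. 512 GB.</p>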
<h2 id="case-study-core-i7-address-translation" class="heading"><a href="#case-study-core-i7-address-translation" class="heading-anchor" aria-label="章节： Case Study: Core i7 Address Translation" tabindex="-1"></a><span>Case Study: Core i7 Address Translation</span></h2>
<p>The Core i7 memory system is shown in CS:APP Figure 9.21:</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig9.21.282f25d8.webp" loading="lazy" src="/assets/csapp-fig9.21.282f25d8.webp" width="1207" height="934" alt="The Core i7 memory system"></picture></p>
<p>The Core i7 uses a 48-bit virtual address space and a 52-bit physical address space, with four levels of page tables. According to CS:APP the page size can be set to 4 KB or 4 MB (in x86-64 long mode the large-page sizes are actually 2 MB and 1 GB; 4 MB pages belong to 32-bit paging).</p>
<p>Each PTE contains the following fields (among others):</p>
<ul>
<li>P: present bit, i.e. the valid bit</li>
<li>R/W: whether the page is read-only or read/write</li>
<li>U/S: whether the page is accessible in user mode or only in kernel (supervisor) mode</li>
<li>XD: execute disable; when set, instructions cannot be fetched from the page (the page is not executable)</li>
<li>A: reference bit, set by the MMU on access and cleared by software (can be used by the replacement algorithm)</li>
<li>Base addr: the upper 40 bits of the physical address of the child page table / physical page (the remaining 12 bits correspond to 4 KB, so the address must be 4 KB aligned, and the page size is normally 4 KB)</li>
</ul>
<p>L1 page table entries also have a PS field that specifies the page size.</p>
<p>L4 page table entries also have a dirty bit D, which indicates that the page has been written to and must be swapped out (written back) when evicted, and a G bit, which marks a global page that is not evicted from the TLB on a process switch.</p>
<p>The VPN has 36 bits, and each group of 9 bits indexes one level of the page table.</p>
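<p>The 9-bits-per-level split can be sketched like this, assuming 4 KB pages and the level numbering used above (L1 is the top-level table, L4 the last). The function name is made up for the example.</p>

```c
#include <stdint.h>

/* Index into the page table at the given level (1..4) for a 48-bit
 * virtual address: 12 offset bits, then four 9-bit fields, with the
 * field for level 1 in the highest position. */
static inline unsigned pt_index(uint64_t va, int level) {
    unsigned shift = 12u + 9u * (unsigned)(4 - level);
    return (unsigned)((va >> shift) & 0x1FFu);
}
```

<p>Each index ranges over 0 to 511, which matches a 4 KB table holding 512 eight-byte entries.</p>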
<p>The L1 cache is 32 KB and 8-way set associative with 64-byte blocks, so it has 64 sets: 6 bits select the cache set and 6 bits give the block offset. These 12 bits are exactly the VPO, so while the PPN is still being obtained, the VPO can already be sent to the L1 cache to select the set in advance.</p>
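<p>The bit count can be checked with a few lines of arithmetic. The 64-byte block size is taken from CS:APP's description of the Core i7 caches; the helper names are hypothetical.</p>

```c
/* Integer log2 for powers of two. */
static inline unsigned ilog2(unsigned x) {
    unsigned n = 0;
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Number of sets in a set-associative cache. */
static inline unsigned cache_sets(unsigned size_bytes, unsigned ways,
                                  unsigned block_bytes) {
    return size_bytes / (ways * block_bytes);
}
```

<p>For a 32 KB, 8-way cache with 64-byte blocks, <code>cache_sets</code> gives 64, so the set index and block offset take 6 + 6 = 12 bits, the same width as the page offset of a 4 KB page.</p>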
<h2 id="linux-virtual-memory-system" class="heading"><a href="#linux-virtual-memory-system" class="heading-anchor" aria-label="章节： Linux Virtual Memory System" tabindex="-1"></a><span>Linux Virtual Memory System</span></h2>
<p>The kernel's virtual memory contains:</p>
<ul>
<li>
<p>the kernel's code and global data structures</p>
</li>
<li>
<p>a contiguous mapping of all of physical memory, which makes it easy to access any specific physical address</p>
</li>
<li>
<p>data structures associated with each process, such as page tables, the kernel stack, and <code>task_struct</code></p>
<p>(P.S. Although this part relates to individual processes, it does not differ from process to process. CS:APP gets this wrong here, as noted in its errata.)</p>
</li>
</ul>
<p>Linux organizes virtual memory as a collection of <i>areas</i> (also called <i>segments</i>), such as the code segment, data segment, heap, and shared library segment. Each area is a contiguous chunk of virtual memory.</p>
<p>The kernel maintains a <code>task_struct</code> for each process, whose <code>mm</code> field points to an <code>mm_struct</code>. In <code>mm_struct</code>, the <code>pgd</code> field is the address of the L1 page table, and <code>mmap</code> points to a <code>vm_<wbr>area_<wbr>struct</code>. Each <code>vm_<wbr>area_<wbr>struct</code> describes one area and has the following fields (among others):</p>
<ul>
<li><code>vm_start</code> / <code>vm_end</code>: the beginning / end of the area</li>
<li><code>vm_page_prot</code>: the access permissions for all pages in the area</li>
<li><code>vm_flags</code>: various flags, for example whether the pages in the area are shared with other processes</li>
<li><code>vm_prev</code> / <code>vm_next</code>: pointers to the neighbouring <code>vm_<wbr>area_<wbr>struct</code>, forming a linked list</li>
</ul>
<p>When handling a page fault, the handler first checks whether the address lies within some area (if not, it triggers a segmentation fault), then checks whether the access is permitted (if not, it triggers a protection exception). If everything is fine, it selects a victim page according to the replacement algorithm, swaps it out if it is dirty, swaps in the new page, and finally updates the page table and returns.</p>
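<p>The first two checks of the handler can be sketched as a walk over the area list. The <code>struct toy_area</code> below is a simplified stand-in invented for this example, not the kernel's real <code>vm_area_struct</code>, and only a single "writable" permission is modelled.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for vm_area_struct (hypothetical layout). */
struct toy_area {
    uintptr_t start;           /* first address of the area */
    uintptr_t end;             /* first address past the area */
    int writable;              /* nonzero if writes are allowed */
    struct toy_area *next;     /* next area in the list */
};

enum fault_kind { FAULT_SEGV, FAULT_PROT, FAULT_OK };

/* Classifies a faulting access: outside every area, inside an area but
 * not permitted, or legal (so the page can simply be swapped in). */
enum fault_kind classify_fault(const struct toy_area *list,
                               uintptr_t addr, int is_write) {
    for (const struct toy_area *a = list; a != NULL; a = a->next) {
        if (addr >= a->start && addr < a->end) {
            if (is_write && !a->writable) return FAULT_PROT;
            return FAULT_OK;
        }
    }
    return FAULT_SEGV;
}
```

<p>A linear scan is enough for a sketch; a real kernel has many areas per process and uses a faster lookup structure.</p>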
<a id="segmentation-fault-vs-protection-exception" name="segmentation-fault-vs-protection-exception" aria-hidden="true"></a>
<aside role="note" data-v-a2ab257f><div class="shadow-md rd-1 b-l-6 my-6 bg-purple-2 dark:bg-purple-9 b-purple-5" data-v-a2ab257f><div class="p-3 flex justify-between items-center" data-v-a2ab257f><h3 class="flex items-center gap-1 font-bold" data-v-a2ab257f><span class="text-5 i-mdi-help-circle-outline text-purple" data-v-a2ab257f></span><span class="sr-only" data-v-a2ab257f>Question: </span><span data-v-a2ab257f>segmentation fault vs protection exception</span></h3><!--v-if--></div><div class="overflow-auto rd-br-1 bg-card px-6 dark:bg-bghover" data-v-a2ab257f><p>Is there a difference between a segmentation fault and a protection exception? Isn't a general protection exception raised by the CPU, so how can the page fault handler trigger it? And how are a segmentation fault and SIGSEGV related?</p><p>My understanding is that the kernel sends SIGSEGV to the process when it receives a general protection exception from the CPU, but here CS:APP uses "segmentation fault" and "protection exception" in two adjacent paragraphs.</p></div></div></aside>
<h2 id="memory-mapping" class="heading"><a href="#memory-mapping" class="heading-anchor" aria-label="章节： Memory Mapping" tabindex="-1"></a><span>Memory Mapping</span></h2>
<p>Associating an area of virtual memory with an <i>object</i> that supplies its initial contents is called <i>memory mapping</i>. The object can be a section of a file in the file system (<i>file-backed</i>) or an <i>anonymous file</i> that is initially all zeros (<i>demand-zero</i>).</p>
<p>Mapping does not immediately bring the data into physical memory; a page is swapped in only when it is first accessed, which is called <i>demand paging</i>. The operating system uses a <i>swap file</i> for swapping, but a page needs to be swapped out only if it has been modified; otherwise it can simply be swapped in again from the mapped file.</p>
<p>If different processes map the same section of the same file, physical memory holds only one copy of the data.</p>
<p>A memory mapping is either shared or private:</p>
<ul>
<li>mapped as a shared object: writes are visible to other processes, and for a file-backed mapping they are also propagated to the file on disk.</li>
<li>mapped as a private object: writes are not visible to other processes and are not propagated to disk. Private mappings are copy-on-write: the PTE is initially marked read-only; when a write triggers a protection exception, the exception handler sees that the area is writable but private, creates a new page, copies the original page into it, points the PTE at the new copy, and marks the PTE writable.</li>
</ul>
<h3 id="fork-的原理" class="heading"><a href="#fork-的原理" class="heading-anchor" aria-label="Section: How fork works" tabindex="-1"></a><span>How fork works</span></h3>
<p>fork makes a copy of the parent's <code>mm_struct</code> and page tables, but in the private areas it turns every writable PTE back into a read-only one, so that the next write triggers copy-on-write again. As a result, parent and child start with identical data, but their later writes are independent. A shared area created before the fork is shared by both processes, which can be used for communication between parent and child.</p>
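<p>The difference between private and shared areas across a fork can be observed directly. The sketch below is an illustration for Linux-like systems: the child writes 42 to both an ordinary global (private, copy-on-write) and a <code>MAP_SHARED | MAP_ANONYMOUS</code> region, and the parent then reports what it sees. The function and variable names are invented for this example.</p>

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std=c11 on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int private_value;   /* ordinary global: private, copy-on-write after fork */

/* Returns 0 on success and stores the values the parent observes after
 * the child has written 42 to both locations; returns -1 on error. */
int fork_visibility_demo(int *shared_seen, int *private_seen) {
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED) return -1;
    *shared = 1;
    private_value = 1;

    pid_t pid = fork();
    if (pid == 0) {                 /* child: write to both, then exit */
        *shared = 42;
        private_value = 42;
        _exit(0);
    }
    int ok = (pid > 0 && waitpid(pid, NULL, 0) == pid);
    if (ok) {
        *shared_seen = *shared;         /* the child's write is visible */
        *private_seen = private_value;  /* the child only changed its own copy */
    }
    munmap(shared, sizeof *shared);
    return ok ? 0 : -1;
}
```

<p>The parent is expected to see 42 in the shared region and still 1 in the global, because the child's write to the global went to its own copy of the page.</p>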
<h3 id="execve-的原理" class="heading"><a href="#execve-的原理" class="heading-anchor" aria-label="Section: How execve works" tabindex="-1"></a><span>How execve works</span></h3>
<ol>
<li>Delete all existing areas (<code>vm_<wbr>area_<wbr>struct</code>) of the current process.</li>
<li>Perform memory mapping according to the program header table:
<ul>
<li><code>.init</code>, <code>.text</code>, <code>.rodata</code>: private, file-backed, read-only</li>
<li><code>.data</code>: private, file-backed, read/write</li>
<li><code>.bss</code>, heap, stack: private, demand-zero, read/write</li>
</ul>
</li>
<li>If the program is linked against shared libraries, link them dynamically, mapping each shared library as private and file-backed.</li>
<li>Set the program counter.</li>
</ol>
<a id="关于共享库的-map-方式" name="关于共享库的-map-方式" aria-hidden="true"></a>
<aside role="note" data-v-a2ab257f><details class="shadow-md rd-1 b-l-6 my-6 bg-blue-1 dark:bg-blue-9 b-blue" data-v-a2ab257f><summary class="p-3 flex justify-between items-center cursor-pointer" data-v-a2ab257f><h4 class="flex items-center gap-1 font-bold" data-v-a2ab257f><span class="text-5 i-mdi-info-circle-outline text-blue" data-v-a2ab257f></span><span class="sr-only" data-v-a2ab257f>Info: </span><span data-v-a2ab257f>How shared libraries are mapped</span></h4><span class="details-icon text-5" data-v-a2ab257f></span></summary><div class="overflow-auto rd-br-1 bg-card px-6 dark:bg-bghover" data-v-a2ab257f><p>Below is an email sent on 2022-12-18 that has not yet received a reply:</p><blockquote>
<p>Dear Drs. Randy Bryant and Dave O'Hallaron,</p>
<p>I am a student at Tsinghua University and I am writing to ask a question about the book CS:APP3e.</p>
<p>In <span class="mojikumi">“</span>9.8.3 The execve Function Revisited<span class="mojikumi">”</span> on page 837, it is stated that shared libraries are <span class="mojikumi">“</span>mapped into the shared region of the user<span class="mojikumi-narrow-left">’</span>s virtual address space<span class="mojikumi">”</span>. In Figure 9.31, it is stated that <span class="mojikumi">“</span>Memory-mapped region for shared libraries<span class="mojikumi">”</span> are <span class="mojikumi">“</span>Shared, file-backed<span class="mojikumi">”</span>.</p>
<p>However, I believe that shared libraries are actually mapped as private objects rather than shared objects. I have come to this conclusion for the following reasons:</p>
<ol>
<li>If there is data in the shared library, it should be copy-on-write, and should not be shared among different processes.</li>
<li>/proc/self/maps shows that all mappings to shared libraries of my shell are private.</li>
<li>The source code and comments of dl-load indicate that the mapping should be private. (See <a href="https://github.com/bminor/glibc/blob/71e408e45dcacf429a94b2807f75aaadd8d37cb9/elf/dl-load.h#L32-L49" class="break-all">https://github.com/bminor/glibc/blob/71e408e45dcacf429a94b2807f75aaadd8d37cb9/elf/dl-load.h#L32-L49</a> and <a href="https://github.com/bminor/glibc/commit/9b8a44cd18fbf1aedeb03e19f4bcdb06b0ee409b" class="break-all">https://github.com/bminor/glibc/commit/9b8a44cd18fbf1aedeb03e19f4bcdb06b0ee409b</a>.)</li>
</ol>
<p>I have checked the errata but did not find this issue addressed. I am writing to you in the hope that you can provide an explanation of this statement or add it to the errata. Thank you for your attention to this matter.</p>
<p>Sincerely,<br>
Yufan You</p>
</blockquote></div></details></aside>
<h3 id="mmap" class="heading"><a href="#mmap" class="heading-anchor" aria-label="章节： mmap" tabindex="-1"></a><span>mmap</span></h3>
<p><code>void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)</code></p>
<ul>
<li><code>addr</code>: the starting address of the area; it is only a hint (unless <code>MAP_FIXED</code> is given), and <code>NULL</code> is usually fine</li>
<li><code>length</code>: the length of the area in bytes</li>
<li><code>prot</code>: a bitwise OR of <code>PROT_EXEC</code>, <code>PROT_READ</code>, and <code>PROT_WRITE</code>, or <code>PROT_NONE</code></li>
<li><code>flags</code>: there are many; the common ones are <code>MAP_SHARED</code>, <code>MAP_PRIVATE</code>, and <code>MAP_<wbr>ANONYMOUS</code></li>
<li><code>fd</code>: the file descriptor of the file to map</li>
<li><code>offset</code>: the offset into the file at which the mapping starts; it must be a multiple of the page size</li>
</ul>
<p>With <code>MAP_<wbr>ANONYMOUS</code>, it is best to set <code>fd</code> to -1 and <code>offset</code> to 0 (some implementations require this).</p>
<p>On failure, <code>mmap</code> returns <code>MAP_FAILED</code> (not <code>NULL</code>).</p>
<p><code>int<wbr> <wbr>munmap<wbr>(<wbr>void<wbr> *<wbr>addr<wbr>, <wbr>size_t<wbr> <wbr>length<wbr>)</code>: removes the mappings in the range of <code>length</code> bytes starting at <code>addr</code>; any later access to that range causes a segmentation fault. <code>addr</code> must be a multiple of the page size.</p>
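<p>As a rough illustration of the two calls above, the following sketch maps a private anonymous area, writes to it, and unmaps it again. The helper names <code>map_anonymous</code> and <code>mmap_roundtrip</code> are made up for this example, and the code assumes a Linux-like system where <code>MAP_ANONYMOUS</code> is available.</p>

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes (glibc) */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: map `length` bytes of private anonymous memory.
 * Returns NULL on failure so that callers can use an ordinary null check. */
void *map_anonymous(size_t length)
{
    assert(length > 0);
    void *p = mmap(NULL, length, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical demo: write to a fresh mapping, read it back, unmap it.
 * Returns 0 on success and -1 on any failure. */
int mmap_roundtrip(void)
{
    size_t length = 4096;
    char *buf = map_anonymous(length);
    if (buf == NULL)
        return -1;

    int was_zero = (buf[0] == 0); /* anonymous pages start out zero-filled */
    strcpy(buf, "hello");
    int ok = was_zero && strcmp(buf, "hello") == 0;

    if (munmap(buf, length) != 0)
        return -1;
    return ok ? 0 : -1;
}
```

<p>Note that the result is compared against <code>MAP_FAILED</code> rather than <code>NULL</code>, and that <code>fd</code> and <code>offset</code> are passed as -1 and 0 as recommended above.</p>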
<h2 id="dynamic-memory-allocation" class="heading"><a href="#dynamic-memory-allocation" class="heading-anchor" aria-label="章节： Dynamic Memory Allocation" tabindex="-1"></a><span>Dynamic Memory Allocation</span></h2>
<h3 id="动态分配的相关函数" class="heading"><a href="#动态分配的相关函数" class="heading-anchor" aria-label="章节： 动态分配的相关函数" tabindex="-1"></a><span>Functions for dynamic allocation</span></h3>
<p>In C, <code>malloc</code> and <code>free</code> obtain and release dynamically allocated memory. <code>calloc</code> zero-initializes the allocated memory and also detects overflow when the size is computed by multiplication. <code>realloc</code> resizes a dynamically allocated block. See <code>man malloc</code> for details.</p>
<p>So that dynamically allocated memory can hold any data type, addresses are double-word aligned: to a multiple of 8 on 32-bit systems and to a multiple of 16 on 64-bit systems.</p>
<p>The operating system uses the <code>brk</code> pointer to mark the end of the heap, and the <code>sbrk</code> function can be used to grow the heap.</p>
<h3 id="allocator-的要求和目标" class="heading"><a href="#allocator-的要求和目标" class="heading-anchor" aria-label="章节： allocator 的要求和目标" tabindex="-1"></a><span>Allocator requirements and goals</span></h3>
<p>A dynamic memory allocator divides the heap into blocks of varying sizes; each block is either allocated or free.</p>
<p>An allocator must:</p>
<ul>
<li>handle allocate and free requests that arrive in any order (it may assume nothing about the order)</li>
<li>respond to each request immediately (it cannot work offline, for example by buffering or reordering requests)</li>
<li>store its own data only in the heap (not elsewhere in virtual memory)</li>
<li>satisfy the alignment requirement (so that a block can hold any type of data)</li>
<li>never modify or move allocated blocks (it may modify free blocks and heap regions that are not part of any block)</li>
</ul>
<p>An allocator also has two performance goals:</p>
<ol>
<li>respond to requests faster (higher throughput)</li>
<li>use memory more efficiently</li>
</ol>
<p>The main cause of poor memory utilization is <i>fragmentation</i>:</p>
<ul>
<li>internal fragmentation: an allocated block is larger than the size that was requested</li>
<li>external fragmentation: the free blocks are large enough in total, but no single free block is large enough, so more heap space has to be used</li>
</ul>
<h3 id="一种简单的-allocator-实现方式" class="heading"><a href="#一种简单的-allocator-实现方式" class="heading-anchor" aria-label="章节： 一种简单的 allocator 实现方式" tabindex="-1"></a><span>A simple allocator implementation</span></h3>
<h4 id="block-header" class="heading"><a href="#block-header" class="heading-anchor" aria-label="章节： block header" tabindex="-1"></a><span>block header</span></h4>
<p>The allocator has to record information about each block but may only use heap space, so it stores a block header at the start of each block, holding the block size and whether the block is allocated.</p>
<p>Because of the alignment requirement, the lowest few bits of the block size are always 0, so the lowest bit can be used to store the allocated bit.</p>
<p>The block size effectively plays the role of the link in a singly linked list. To reach the free blocks, one has to visit every block and check whether it is free, which is why this structure is called an <i>implicit free list</i>.</p>
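<p>A minimal sketch of such a header, assuming a one-word (32-bit) header and 8-byte alignment; the type <code>header_t</code> and the helper names are illustrative and not taken from CS:APP:</p>

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t header_t; /* assumed: one 4-byte word per header */

#define ALIGNMENT 8u /* assumed alignment, so the low 3 bits of a size are 0 */

/* Combine a block size and the allocated flag into one header word. */
header_t pack(uint32_t size, int allocated)
{
    assert(size % ALIGNMENT == 0);
    return size | (allocated ? 1u : 0u);
}

/* Extract the block size by masking off the low bits. */
uint32_t block_size(header_t h)
{
    return h & ~(ALIGNMENT - 1u);
}

/* Extract the allocated bit. */
int is_allocated(header_t h)
{
    return (int)(h & 1u);
}

/* Implicit-list step: the next header lies block_size bytes after this one. */
header_t *next_block(header_t *hp)
{
    return (header_t *)((char *)hp + block_size(*hp));
}
```

<p>For example, <code>pack(24, 1)</code> yields the header value 25: size 24 with the allocated bit set.</p>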
<h4 id="placement-policy" class="heading"><a href="#placement-policy" class="heading-anchor" aria-label="章节： placement policy" tabindex="-1"></a><span>placement policy</span></h4>
<p>To allocate, the allocator has to find a free block that is large enough; the way it performs this search is called the <i>placement policy</i>:</p>
<ul>
<li>first fit: search from the beginning until a large enough free block is found</li>
<li>next fit: search from where the previous search ended until a large enough free block is found</li>
<li>best fit: examine every free block and use the smallest one that is large enough</li>
</ul>
<p>With an implicit free list, next fit has higher throughput than first fit but lower memory utilization; best fit has the best memory utilization but the lowest throughput.</p>
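<p>The three policies can be compared on a simplified model in which the free blocks are just an array of sizes. This is only a sketch: a real allocator walks block headers instead of an array, and the function names and the <code>rover</code> parameter are assumptions made for this example.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Each function returns the index of the chosen free block, or -1 if none fits. */

/* First fit: take the first block that is large enough. */
int first_fit(const size_t *free_sizes, int n, size_t request)
{
    for (int i = 0; i < n; i++)
        if (free_sizes[i] >= request)
            return i;
    return -1;
}

/* Next fit: like first fit, but start at *rover (where the previous search
 * ended), wrap around, and remember where this search stopped. */
int next_fit(const size_t *free_sizes, int n, size_t request, int *rover)
{
    assert(n >= 0 && (n == 0 || (*rover >= 0 && *rover < n)));
    for (int k = 0; k < n; k++) {
        int i = (*rover + k) % n;
        if (free_sizes[i] >= request) {
            *rover = i;
            return i;
        }
    }
    return -1;
}

/* Best fit: scan everything and keep the smallest block that is large enough. */
int best_fit(const size_t *free_sizes, int n, size_t request)
{
    int best = -1;
    for (int i = 0; i < n; i++)
        if (free_sizes[i] >= request &&
            (best < 0 || free_sizes[i] < free_sizes[best]))
            best = i;
    return best;
}
```

<p>With free sizes 32, 16, 64, 24 and a request of 20, first fit picks the 32-byte block while best fit picks the 24-byte one, which illustrates the utilization difference described above.</p>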
<h4 id="分割-free-block" class="heading"><a href="#分割-free-block" class="heading-anchor" aria-label="章节： 分割 free block" tabindex="-1"></a><span>Splitting free blocks</span></h4>
<p>If the free block chosen for an allocation is larger than needed, and the excess is at least the minimum block size (a double word), the block can be split in two: one part becomes the allocated block and the remainder stays a free block.</p>
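<p>The splitting decision can be sketched as follows. The function <code>place</code> and the constant <code>MIN_BLOCK_SIZE</code> (8 bytes, one double word in a 32-bit layout) are assumptions for illustration; only the size arithmetic is shown, not the header updates.</p>

```c
#include <assert.h>
#include <stddef.h>

#define MIN_BLOCK_SIZE 8u /* assumed minimum block size: one double word */

/* Decide how a free block of free_size bytes serves a request that needs
 * `need` bytes (header included, already rounded up for alignment).
 * Returns the size of the allocated part and stores the size of the
 * remaining free block in *remainder (0 means the block was not split). */
size_t place(size_t free_size, size_t need, size_t *remainder)
{
    assert(need <= free_size);
    size_t excess = free_size - need;
    if (excess >= MIN_BLOCK_SIZE) {
        *remainder = excess; /* split: the tail stays free */
        return need;
    }
    *remainder = 0;          /* too small to split: the excess is wasted */
    return free_size;
}
```

<p>When the block is not split, the unused excess becomes internal fragmentation.</p>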
<h4 id="获取更多的-heap-空间" class="heading"><a href="#获取更多的-heap-空间" class="heading-anchor" aria-label="章节： 获取更多的 heap 空间" tabindex="-1"></a><span>Getting more heap space</span></h4>
<p>If the existing heap cannot satisfy an allocate request, the allocator can call <code>sbrk</code> to obtain more heap space and mark the new space as a free block.</p>
<h4 id="合并-free-block" class="heading"><a href="#合并-free-block" class="heading-anchor" aria-label="章节： 合并 free block" tabindex="-1"></a><span>Coalescing free blocks</span></h4>
<p>When several free blocks sit next to each other, they can cause <i>false fragmentation</i>: a request would fit in the merged block but not in any single one. Adjacent free blocks therefore need to be merged (coalesced).</p>
<p>There are two coalescing strategies:</p>
<ul>
<li>immediate coalescing: on every free, merge the newly freed block with its adjacent free blocks, so that there are never two adjacent free blocks</li>
<li>deferred coalescing: postpone merging until some later point, for example when no large enough free block can be found</li>
</ul>
<p>Immediate coalescing is simpler to implement and runs in constant time, but it can lead to repeated coalescing and splitting, which costs performance unnecessarily.</p>
<p>Coalescing needs information about the previous block. This can be provided by a footer at the end of each block that duplicates the header, a technique known as <i>boundary tags</i>. Since only free blocks need a footer, the footer of allocated blocks can be omitted and the previous block's allocated bit stored in the header instead, which saves space.</p>
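<p>Below is a small sketch of immediate coalescing with boundary tags. It uses the simple variant in which every block, allocated or free, carries a footer, and it models the heap as an array of 4-byte words; the names <code>set_block</code> and <code>free_and_coalesce</code> and the <code>first</code>/<code>end</code> bounds are assumptions made for this example.</p>

```c
#include <assert.h>
#include <stdint.h>

#define WSIZE 4u /* assumed: bytes per header/footer word */

static uint32_t size_of(uint32_t tag) { return tag & ~7u; }
static int alloc_of(uint32_t tag) { return (int)(tag & 1u); }

/* Write matching header and footer for the block whose header is heap[i].
 * `size` is the block size in bytes, header and footer included. */
void set_block(uint32_t *heap, uint32_t i, uint32_t size, int alloc)
{
    assert(size % 8 == 0 && size >= 8);
    uint32_t tag = size | (alloc ? 1u : 0u);
    heap[i] = tag;                    /* header */
    heap[i + size / WSIZE - 1] = tag; /* footer */
}

/* Free the block whose header is heap[i] and merge it with free neighbours.
 * Blocks occupy the word range [first, end). Returns the header index of
 * the resulting free block. Runs in constant time. */
uint32_t free_and_coalesce(uint32_t *heap, uint32_t i,
                           uint32_t first, uint32_t end)
{
    uint32_t size = size_of(heap[i]);
    uint32_t next = i + size / WSIZE;

    if (next < end && !alloc_of(heap[next]))
        size += size_of(heap[next]);  /* absorb the next block */

    if (i > first && !alloc_of(heap[i - 1])) {
        /* heap[i - 1] is the previous block's footer */
        uint32_t prev_size = size_of(heap[i - 1]);
        i -= prev_size / WSIZE;       /* move back to the previous header */
        size += prev_size;
    }

    set_block(heap, i, size, 0);
    return i;
}
```

<p>The footer is what makes the backward step possible: the word just before a block's header tells the allocator how large the previous block is and whether it is free.</p>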
<h3 id="explicit-free-list" class="heading"><a href="#explicit-free-list" class="heading-anchor" aria-label="章节： explicit free list" tabindex="-1"></a><span>explicit free list</span></h3>
<p>Free blocks can store pointers to their predecessor and successor, forming a linked list of free blocks called an <i>explicit free list</i>.</p>
<p>The list can be kept in LIFO order or in address order. A LIFO list completes a free in constant time, whereas an address-ordered list needs linear time to find a block's position in the list but achieves better memory utilization.</p>
<p>Because each free block needs room for the predecessor and successor pointers, an explicit free list has a larger minimum block size, which can worsen internal fragmentation and reduce memory utilization.</p>
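<p>The LIFO variant can be sketched as an ordinary doubly linked list whose nodes live inside the free blocks. The struct <code>free_node</code> and the function names are hypothetical; in a real allocator the node would sit right after the block header.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout stored in the otherwise unused part of a free block. */
struct free_node {
    struct free_node *prev;
    struct free_node *next;
};

/* LIFO policy: a newly freed block goes to the head of the list, in O(1). */
void list_push(struct free_node **head, struct free_node *node)
{
    node->prev = NULL;
    node->next = *head;
    if (*head != NULL)
        (*head)->prev = node;
    *head = node;
}

/* Unlink a block (when it is allocated or coalesced away), also in O(1). */
void list_remove(struct free_node **head, struct free_node *node)
{
    if (node->prev != NULL) {
        node->prev->next = node->next;
    } else {
        assert(*head == node); /* a node without predecessor must be the head */
        *head = node->next;
    }
    if (node->next != NULL)
        node->next->prev = node->prev;
}
```

<p>The two pointers are the reason the minimum block size grows: on a 64-bit system they alone take 16 bytes.</p>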
<h3 id="segregated-free-lists" class="heading"><a href="#segregated-free-lists" class="heading-anchor" aria-label="章节： segregated free lists" tabindex="-1"></a><span>segregated free lists</span></h3>
<p>Blocks can be grouped into size classes, for example by powers of 2, with one list maintained per class. There are many ways to implement this, such as simple segregated storage and segregated fits.</p>
<h4 id="simple-segregated-storage" class="heading"><a href="#simple-segregated-storage" class="heading-anchor" aria-label="章节： simple segregated storage" tabindex="-1"></a><span>simple segregated storage</span></h4>
<p>Every block in a class has that class's maximum size. When a class runs out of blocks, new heap space is requested; a freed block goes straight back to its list, with no coalescing and no splitting.</p>
<p>This way neither a header nor a footer is needed, and a free block only has to hold a successor pointer, but both internal and external fragmentation are severe.</p>
<h4 id="segregated-fit" class="heading"><a href="#segregated-fit" class="heading-anchor" aria-label="章节： segregated fit" tabindex="-1"></a><span>segregated fit</span></h4>
<p>Each class holds blocks of different sizes, with splitting and coalescing. An allocation starts searching in the matching class and moves on to the next class if nothing fits there. This approximates a best-fit search while remaining fast.</p>
<p>Segregated fits perform well overall, so allocators, including the <code>malloc</code> function in libc, often choose this approach.</p>
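<p>Mapping a request size to its class is the first step of either scheme. A possible sketch with power-of-2 classes follows; the number of classes and the smallest class size are arbitrary assumptions, not values from CS:APP or any particular <code>malloc</code>.</p>

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CLASSES 8       /* assumed number of size classes */
#define MIN_CLASS_SIZE 16u  /* assumed upper bound of class 0 */

/* Class 0 holds sizes up to 16 bytes, class k holds sizes in
 * (16 * 2^(k-1), 16 * 2^k], and the last class holds everything larger. */
int size_class(size_t size)
{
    assert(size > 0);
    int k = 0;
    size_t limit = MIN_CLASS_SIZE;
    while (k < NUM_CLASSES - 1 && size > limit) {
        limit <<= 1;
        k++;
    }
    return k;
}
```

<p>An allocation would start at list <code>size_class(request)</code> and continue with higher classes if that list has no block that fits.</p>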
<h4 id="buddy-system" class="heading"><a href="#buddy-system" class="heading-anchor" aria-label="章节： buddy system" tabindex="-1"></a><span>buddy system</span></h4>
<p>Every block size is a power of 2. Splitting halves a block repeatedly until the size fits, and coalescing only ever merges a block with its “buddy”.</p>
<p>A precise description is somewhat involved, so here is the intuition: all blocks form a tree like the one in the figure below (somewhat reminiscent of a binary indexed tree), and two blocks with the same parent are buddies.</p>
<p><picture><img type="image/webp" srcset="/assets/buddy-system.955de85d.webp" loading="lazy" src="/assets/buddy-system.955de85d.webp" width="321" height="158" alt="buddy system"></picture></p>
<p>This makes searching and coalescing fast, but because all block sizes are powers of 2, internal fragmentation can be severe.</p>
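<p>One reason coalescing is fast is that a block's buddy can be computed directly from its address. The sketch below works with byte offsets from the start of the heap; the function names are made up for this example, and the heap is assumed to start at an offset aligned to its (power-of-2) total size.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a request up to the block size actually handed out:
 * the smallest power of two that is >= n. */
size_t round_up_pow2(size_t n)
{
    assert(n > 0 && n <= SIZE_MAX / 2 + 1); /* avoid overflow in the loop */
    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Offset of the buddy of the block at `offset` with the given size.
 * The two buddies differ only in the bit whose value equals the size. */
size_t buddy_of(size_t offset, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0); /* size is a power of two */
    assert(offset % size == 0);                    /* block is size-aligned */
    return offset ^ size;
}
```

<p>For instance, the 32-byte blocks at offsets 64 and 96 are buddies, and a 33-byte request is rounded up to 64 bytes, wasting almost half of the block.</p>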
<h3 id="平衡树维护-free-block" class="heading"><a href="#平衡树维护-free-block" class="heading-anchor" aria-label="章节： 平衡树维护 free block" tabindex="-1"></a><span>Maintaining free blocks with a balanced tree</span></h3>
<p>CS:APP does not mention this approach, but it is easy to understand once the free lists above are clear: free blocks do not have to be kept in a linked list and can be kept in a balanced tree instead. Storing the children, parent, and other fields a tree node needs inside each free block makes an efficient, strict best fit possible, and the complexity does not degrade in extreme cases the way it can with segregated fits. However, a tree node usually needs more fields than a list node, which may raise the minimum block size to 6 words.</p>
<p>When I did the malloc lab myself I tried a splay tree and found that it was usually considerably slower than segregated fits, with no clear gain in memory utilization; I do not know how other balanced trees, or special workloads, would fare. I did see claims online that a red-black tree can score high on the malloc lab (<s>who gets a high score with a good algorithm? Isn't the lab a test of tuning parameters against the traces? I think my segregated fit, overfitted to the data, already scores high enough</s>) (<s>fake segregated fit: segregate by block size to find a fit; real segregated fit: segregate by test data and fit each part separately</s>).</p>
<h3 id="malloc-lab" class="heading"><a href="#malloc-lab" class="heading-anchor" aria-label="章节： malloc lab" tabindex="-1"></a><span>malloc lab</span></h3>
<p>The code for the classic CS:APP labs can apparently be made public, so here it is:</p>
<p><a href="https://github.com/ouuan/course-assignments/tree/master/csapp/malloc-lab">https<wbr>://<wbr>github<wbr>.<wbr>com<wbr>/<wbr>ouuan<wbr>/<wbr>course<wbr>-<wbr>assignments<wbr>/<wbr>tree<wbr>/<wbr>master<wbr>/<wbr>csapp<wbr>/<wbr>malloc<wbr>-<wbr>lab</a></p>
<h2 id="garbage-collection" class="heading"><a href="#garbage-collection" class="heading-anchor" aria-label="章节： Garbage Collection" tabindex="-1"></a><span>Garbage Collection</span></h2>
<p>Unreachable blocks can be found, and then garbage-collected, by following the references between blocks and the references to blocks from the stack, registers, and global variables.</p>
<p>In C there is no type information, so non-pointer data may be mistaken for a reference to a block, and an unreachable block may be treated as reachable. Garbage collection for C can therefore only be conservative.</p>]]></content:encoded>
            <category domain="https://ouuan.moe/tag/csapp">csapp</category>
            <category domain="https://ouuan.moe/tag/%E5%AD%A6%E4%B9%A0%E7%AC%94%E8%AE%B0">学习笔记</category>
        </item>
        <item>
            <title><![CDATA[CS:APP Chapter 8 Study Notes]]></title>
            <link>https://ouuan.moe/post/2022/11/csapp-8</link>
            <guid>https://ouuan.moe/post/2022/11/csapp-8</guid>
            <pubDate>Sat, 17 Dec 2022 06:27:26 GMT</pubDate>
            <description><![CDATA[
<p>Study notes on Chapter 8 of <a href="https://csapp.cs.cmu.edu/">CS:APP</a>, “Exceptional Control Flow”.</p>
<p>This chapter mainly covers exceptions, system calls, processes, signals, and longjmp.</p>
]]></description>
            <content:encoded><![CDATA[
<p>Study notes on Chapter 8 of <a href="https://csapp.cs.cmu.edu/">CS:APP</a>, “Exceptional Control Flow”.</p>
<p>This chapter mainly covers exceptions, system calls, processes, signals, and longjmp.</p>

<p>Normally the PC changes according to the sequence of instructions and the jump instructions among them. Often, however, this kind of control flow is not enough, and <i>exceptional control flow</i> (ECF) is needed as a supplement to jump instructions, to handle changes that are “exceptional” or that come from “outside”.</p>
<p>ECF exists at every level, for example:</p>
<ul>
<li>hardware invokes an exception handler when it detects an event</li>
<li>the operating system performs a <a href="#context-switch">context switch</a> between processes</li>
<li>one process sends a <a href="#signals">signal</a> to another, which invokes the receiver's signal handler</li>
<li>a program uses a <a href="#nonlocal-jumps">nonlocal jump</a> internally for error handling</li>
</ul>
<h2 id="exceptions" class="heading"><a href="#exceptions" class="heading-anchor" aria-label="章节： Exceptions" tabindex="-1"></a><span>Exceptions</span></h2>
<p>An <i>exception</i> is an abrupt change in control flow caused by some “change in state” (possibly the result of executing an instruction, a change coming from external I/O, and so on).</p>
<p>When the processor detects such a change, it calls the <i>exception handler</i> and then either returns to the instruction that was executing when the exception occurred, returns to the next instruction, or terminates the whole program.</p>
<h3 id="exception-handling" class="heading"><a href="#exception-handling" class="heading-anchor" aria-label="章节： Exception Handling" tabindex="-1"></a><span>Exception Handling</span></h3>
<p>Each kind of exception has an <i>exception number</i>. Some exception numbers are assigned by the hardware, others by the operating system.</p>
<p>Memory holds an <i>exception table</i> indexed by exception number, in which each entry is the address of the corresponding exception handler. The processor has an <i>exception table base register</i> that holds the starting address of the exception table; combined with the exception number, it is used to address each entry.</p>
<p>The main differences between an exception and a procedure call are:</p>
<ul>
<li>A procedure call returns to the return address stored on top of the stack, whereas an exception returns to the triggering instruction or to the next instruction, or terminates the program.</li>
<li>When an exception handler is invoked, some processor state, including the condition codes, is saved and then restored on return.</li>
<li>Exception handlers run in <a href="#user-kernel-mode">kernel mode</a> and use the kernel's stack.</li>
</ul>
<h3 id="classes-of-exceptions" class="heading"><a href="#classes-of-exceptions" class="heading-anchor" aria-label="章节： Classes of Exceptions" tabindex="-1"></a><span>Classes of Exceptions</span></h3>
<p>There are generally four classes of exceptions:</p>
<ul>
<li>interrupt: asynchronous (the exception is not caused by executing an instruction); returns to the next instruction. It is usually triggered by an external I/O device: the device signals the processor through the interrupt pin and places the exception number on the system bus, and the processor checks the interrupt pin after finishing each instruction. The interrupt handler then runs, and execution resumes at the next instruction.</li>
<li>trap: synchronous; returns to the next instruction. A <a href="#linux-%E4%B8%AD%E7%9A%84-system-call">system call</a> is a common kind of trap: the <code>syscall</code> instruction deliberately triggers an exception. It looks like a function call but runs in kernel mode.</li>
<li>fault: synchronous; returns to the faulting instruction or exits. The fault handler generally tries to fix the problem that caused the fault. If it succeeds, control returns to the faulting instruction, which can then proceed without triggering the exception again; if it fails, the program is aborted.</li>
<li>abort: synchronous; always exits. It generally indicates a severe, unrecoverable error.</li>
</ul>
<h3 id="exceptions-in-linuxx86-64-systems" class="heading"><a href="#exceptions-in-linuxx86-64-systems" class="heading-anchor" aria-label="章节： Exceptions in Linux/x86-64 Systems" tabindex="-1"></a><span>Exceptions in Linux/x86-64 Systems</span></h3>
<h4 id="x86-64-中的-fault-abort" class="heading"><a href="#x86-64-中的-fault-abort" class="heading-anchor" aria-label="章节： x86-64 中的 fault / abort" tabindex="-1"></a><span>Faults and aborts on x86-64</span></h4>
<ul>
<li>Divide Error Exception (Interrupt 0): division by zero, or a quotient too large for the destination operand. It is a fault, but Linux does not actually try to recover from a divide error and aborts the program directly, usually reporting “floating point exception”.</li>
<li>General Protection Exception (Interrupt 13): has many possible causes, such as accessing an undefined area of memory or trying to write to a read-only segment. Linux does not try to recover from it either and aborts the program directly, usually reporting “segmentation fault”.</li>
<li>Page-Fault Exception (Interrupt 14): a page fault is a fault in the true sense, and the handler tries to recover; see <a href="/post/2022/11/csapp-9">Chapter 9</a>.</li>
<li>Machine-Check Exception (Interrupt 18): a severe hardware error; it is an abort.</li>
</ul>
<p>(For the full list, see the section “6.15 EXCEPTION AND INTERRUPT REFERENCE” in Volume 3A of the <a href="https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html">Intel® 64 and IA-32 Architectures Software Developer Manuals</a>.)</p>
<h4 id="linux-中的-system-call" class="heading"><a href="#linux-中的-system-call" class="heading-anchor" aria-label="章节： Linux 中的 system call" tabindex="-1"></a><span>System calls in Linux</span></h4>
<p>Some commonly used Linux system calls are shown in CS:APP Figure 8.10:</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig8.10.f8581cd0.webp" loading="lazy" src="/assets/csapp-fig8.10.f8581cd0.webp" width="1094" height="320" alt="Some commonly used Linux system calls"></picture></p>
<p>(For more system calls, see <code>man syscalls</code>.)</p>
<p>In C, the <code>syscall</code> function can be used to invoke a system call directly, but this is rarely done; normally one uses the wrapper function that exists for each system call. <code>syscall</code> and the wrapper functions are collectively called <i>system-level functions</i>.</p>
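<p>To make the difference concrete, here is a small sketch that performs the same <code>write</code> system call once through <code>syscall</code> and once through the wrapper. The function names <code>raw_write</code> and <code>wrapped_write</code> are invented for this example, and the code assumes Linux with glibc.</p>

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes syscall() under strict -std= modes (glibc) */
#endif
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke write through the generic syscall() function. */
long raw_write(int fd, const char *msg)
{
    assert(msg != NULL);
    return syscall(SYS_write, fd, msg, strlen(msg));
}

/* Invoke write through its usual wrapper function. */
long wrapped_write(int fd, const char *msg)
{
    assert(msg != NULL);
    return (long)write(fd, msg, strlen(msg));
}
```

<p>Both functions return the number of bytes written, or -1 with <code>errno</code> set on failure; the wrapper is preferable because it is type-checked and portable.</p>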
<h2 id="processes" class="heading"><a href="#processes" class="heading-anchor" aria-label="章节： Processes" tabindex="-1"></a><span>Processes</span></h2>
<p>Many processes run at the same time in a system, yet each process is given the illusion that it has exclusive use of the processor and memory.</p>
<p>The illusion of exclusive memory is provided by each process's private address space; see <a href="/post/2022/11/csapp-9">Chapter 9</a>.</p>
<h3 id="logical-concurrent-flow" class="heading"><a href="#logical-concurrent-flow" class="heading-anchor" aria-label="章节： Logical / Concurrent Flow" tabindex="-1"></a><span>Logical / Concurrent Flow</span></h3>
<p>The control flow determined by a program's instructions is called a <i>logical (control) flow</i>. The system switches back and forth between processes; switching away from a process is called <i>preempting</i> it.</p>
<p>If the lifetimes of two control flows overlap, they are called <i>concurrent flows</i> and are said to <i>run concurrently</i>. This phenomenon is called <i>concurrency</i>, also known as <i>multitasking</i>. Each period during which one logical flow executes without interruption is called a <i>time slice</i>, so multitasking is also called <i>time slicing</i>. If two logical flows run on different processor cores, they are called <i>parallel flows</i> and are said to <i>run in parallel</i>.</p>
<h3 id="user-kernel-mode" class="heading"><a href="#user-kernel-mode" class="heading-anchor" aria-label="章节： User / Kernel Mode" tabindex="-1"></a><span>User / Kernel Mode</span></h3>
<p>The processor holds a <i>mode bit</i> that indicates whether it is currently in user mode or kernel mode. Only in kernel mode is it possible to execute certain <i>privileged instructions</i>, change the mode bit, or access the kernel's region of the address space.</p>
<p>A user-mode program can enter kernel mode only through an exception, in order to execute privileged instructions or access kernel data. On Linux, some kernel data can also be read from user mode through <code>/proc</code> and <code>/sys</code>.</p>
<h3 id="context-switch" class="heading"><a href="#context-switch" class="heading-anchor" aria-label="章节： Context Switch" tabindex="-1"></a><span>Context Switch</span></h3>
<p>Each process has a <i>context</i>, which includes the register contents, PC, user stack, kernel stack, condition codes, page table, process table, file table, and so on.</p>
<p>The operating system switches between processes with a <i>context switch</i>: it saves the context of the current process, restores the context of the process being switched to, and finally transfers control to it. Context switches happen during exceptions: while an exception is being handled, the <i>scheduler</i> in the operating system decides whether to perform a context switch and which process to schedule. For example:</p>
<ul>
<li>A context switch can occur when a file is read through a system call, so that other processes run while the read is pending; once the data has arrived, an interrupt triggers a context switch back.</li>
<li>The system triggers an interrupt periodically (for example, every 1 ms), so that a context switch can take place after a process has run for a while.</li>
</ul>
<p>Because a program does not know how the operating system will schedule, the execution order of different processes is generally not guaranteed.</p>
<h2 id="system-call-error-handling" class="heading"><a href="#system-call-error-handling" class="heading-anchor" aria-label="章节： System Call Error Handling" tabindex="-1"></a><span>System Call Error Handling</span></h2>
<p>System-level functions generally return -1 to indicate an error and record the specific error in the global integer variable <code>errno</code> (<code>#include &#x3C;errno.h></code>); the function <code>strerror</code> turns an <code>errno</code> value into a text error message.</p>
<p>Errors should be checked whenever a system-level function is called. To make error handling more convenient, wrapper functions like the following can be used:</p>
<section class="code-block relative my-6 shadow" itemprop="hasPart" itemscope itemtype="https://schema.org/SoftwareSourceCode" data-v-c675dba6><div class="h-6 items-center rd-t-1 bg-area px-4 dark:bg-#2A313A media-screen:important-flex" style="display:none;" data-v-c675dba6><h3 class="text-3 text-footer" itemprop="programmingLanguage" aria-label="C 代码块" data-v-c675dba6>C</h3><ile-root id="ile-1"><button title="复制到剪贴板" class="copy-button b-footer text-footer" data-v-63dfb2af><span class="i-mdi-content-copy" data-v-63dfb2af></span><span class="sr-only" role="status" data-v-63dfb2af></span></button></ile-root><!--ISLAND_HYDRATION_PLACEHOLDER_ile-1--></div><div class="dark:hidden" itemprop="text" data-v-c675dba6><pre class="shiki light" style="background-color: #FBFBFB" tabindex="0"><code><span><span style="color: #994CC3">#include</span><span style="color: #403F53"> </span><span style="color: #111111">&lt;</span><span style="color: #C96765">errno.h</span><span style="color: #111111">&gt;</span></span>
<span><span style="color: #994CC3">#include</span><span style="color: #403F53"> </span><span style="color: #111111">&lt;</span><span style="color: #C96765">stdio.h</span><span style="color: #111111">&gt;</span></span>
<span><span style="color: #994CC3">#include</span><span style="color: #403F53"> </span><span style="color: #111111">&lt;</span><span style="color: #C96765">stdlib.h</span><span style="color: #111111">&gt;</span></span>
<span><span style="color: #994CC3">#include</span><span style="color: #403F53"> </span><span style="color: #111111">&lt;</span><span style="color: #C96765">string.h</span><span style="color: #111111">&gt;</span></span>
<span><span style="color: #994CC3">#include</span><span style="color: #403F53"> </span><span style="color: #111111">&lt;</span><span style="color: #C96765">unistd.h</span><span style="color: #111111">&gt;</span></span>
<span></span>
<span><span style="color: #994CC3">void</span><span style="color: #403F53"> </span><span style="color: #4876D6">unix_error</span><span style="color: #403F53">(</span><span style="color: #994CC3">char</span><span style="color: #403F53"> </span><span style="color: #0C969B">*</span><span style="color: #403F53">msg)</span></span>
<span><span style="color: #403F53">{</span></span>
<span><span style="color: #403F53">    </span><span style="color: #4876D6">fprintf(stderr, </span><span style="color: #111111">&quot;</span><span style="color: #4876D6">%s</span><span style="color: #C96765">: </span><span style="color: #4876D6">%s</span><span style="color: #AA0982">\n</span><span style="color: #111111">&quot;</span><span style="color: #4876D6">, msg, strerror(errno))</span><span style="color: #403F53">;</span></span>
<span><span style="color: #403F53">    </span><span style="color: #4876D6">exit(errno)</span><span style="color: #403F53">;</span></span>
<span><span style="color: #403F53">}</span></span>
<span></span>
<span><span style="color: #994CC3">pid_t</span><span style="color: #403F53"> </span><span style="color: #4876D6">Fork</span><span style="color: #403F53">(</span><span style="color: #994CC3">void</span><span style="color: #403F53">)</span></span>
<span><span style="color: #403F53">{</span></span>
<span><span style="color: #403F53">    </span><span style="color: #994CC3">pid_t</span><span style="color: #403F53"> pid </span><span style="color: #994CC3">=</span><span style="color: #403F53"> </span><span style="color: #4876D6">fork()</span><span style="color: #403F53">;</span></span>
<span></span>
<span><span style="color: #403F53">    </span><span style="color: #994CC3">if</span><span style="color: #403F53"> (pid </span><span style="color: #994CC3">&lt;</span><span style="color: #403F53"> </span><span style="color: #AA0982">0</span><span style="color: #403F53">)</span></span>
<span><span style="color: #403F53">        </span><span style="color: #4876D6">unix_error(</span><span style="color: #111111">&quot;</span><span style="color: #C96765">Fork error</span><span style="color: #111111">&quot;</span><span style="color: #4876D6">)</span><span style="color: #403F53">;</span></span>
<span></span>
<span><span style="color: #403F53">    </span><span style="color: #994CC3">return</span><span style="color: #403F53"> pid;</span></span>
<span><span style="color: #403F53">}</span></span></code></pre></div><div class="dark:important-block" style="display:none;" data-v-c675dba6><pre class="shiki dark" style="background-color: #011627" tabindex="0"><code><span><span style="color: #C792EA">#include</span><span style="color: #D6DEEB"> </span><span style="color: #D9F5DD">&lt;</span><span style="color: #ECC48D">errno.h</span><span style="color: #D9F5DD">&gt;</span></span>
<span><span style="color: #C792EA">#include</span><span style="color: #D6DEEB"> </span><span style="color: #D9F5DD">&lt;</span><span style="color: #ECC48D">stdio.h</span><span style="color: #D9F5DD">&gt;</span></span>
<span><span style="color: #C792EA">#include</span><span style="color: #D6DEEB"> </span><span style="color: #D9F5DD">&lt;</span><span style="color: #ECC48D">stdlib.h</span><span style="color: #D9F5DD">&gt;</span></span>
<span><span style="color: #C792EA">#include</span><span style="color: #D6DEEB"> </span><span style="color: #D9F5DD">&lt;</span><span style="color: #ECC48D">string.h</span><span style="color: #D9F5DD">&gt;</span></span>
<span><span style="color: #C792EA">#include</span><span style="color: #D6DEEB"> </span><span style="color: #D9F5DD">&lt;</span><span style="color: #ECC48D">unistd.h</span><span style="color: #D9F5DD">&gt;</span></span>
<span></span>
<span><span style="color: #C792EA">void</span><span style="color: #D6DEEB"> </span><span style="color: #82AAFF">unix_error</span><span style="color: #D6DEEB">(</span><span style="color: #C792EA">char</span><span style="color: #D6DEEB"> </span><span style="color: #7FDBCA">*</span><span style="color: #D7DBE0">msg</span><span style="color: #D6DEEB">)</span></span>
<span><span style="color: #D6DEEB">{</span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #82AAFF">fprintf(stderr, </span><span style="color: #D9F5DD">&quot;</span><span style="color: #82AAFF">%s</span><span style="color: #ECC48D">: </span><span style="color: #82AAFF">%s</span><span style="color: #F78C6C">\n</span><span style="color: #D9F5DD">&quot;</span><span style="color: #82AAFF">, msg, strerror(errno))</span><span style="color: #D6DEEB">;</span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #82AAFF">exit(errno)</span><span style="color: #D6DEEB">;</span></span>
<span><span style="color: #D6DEEB">}</span></span>
<span></span>
<span><span style="color: #C792EA">pid_t</span><span style="color: #D6DEEB"> </span><span style="color: #82AAFF">Fork</span><span style="color: #D6DEEB">(</span><span style="color: #C792EA">void</span><span style="color: #D6DEEB">)</span></span>
<span><span style="color: #D6DEEB">{</span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #C792EA">pid_t</span><span style="color: #D6DEEB"> pid </span><span style="color: #C792EA">=</span><span style="color: #D6DEEB"> </span><span style="color: #82AAFF">fork()</span><span style="color: #D6DEEB">;</span></span>
<span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #C792EA">if</span><span style="color: #D6DEEB"> (pid </span><span style="color: #C792EA">&lt;</span><span style="color: #D6DEEB"> </span><span style="color: #F78C6C">0</span><span style="color: #D6DEEB">)</span></span>
<span><span style="color: #D6DEEB">        </span><span style="color: #82AAFF">unix_error(</span><span style="color: #D9F5DD">&quot;</span><span style="color: #ECC48D">Fork error</span><span style="color: #D9F5DD">&quot;</span><span style="color: #82AAFF">)</span><span style="color: #D6DEEB">;</span></span>
<span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #C792EA">return</span><span style="color: #D6DEEB"> pid;</span></span>
<span><span style="color: #D6DEEB">}</span></span></code></pre></div></section>
<h2 id="process-control" class="heading"><a href="#process-control" class="heading-anchor" aria-label="章节： Process Control" tabindex="-1"></a><span>Process Control</span></h2>
<p>Unix exposes a number of system calls, callable from C, for controlling processes.</p>
<h3 id="获取-pid" class="heading"><a href="#获取-pid" class="heading-anchor" aria-label="章节： 获取 PID" tabindex="-1"></a><span>Getting the PID</span></h3>
<p>Every process has a process ID (PID).</p>
<ul>
<li><code>pid_t<wbr> <wbr>getpid<wbr>(<wbr>void<wbr>)</code>: returns the PID of the calling process</li>
<li><code>pid_t<wbr> <wbr>getppid<wbr>(<wbr>void<wbr>)</code>: returns the PID of the calling process's parent</li>
</ul>
<h3 id="进程的状态" class="heading"><a href="#进程的状态" class="heading-anchor" aria-label="章节： 进程的状态" tabindex="-1"></a><span>Process states</span></h3>
<p>A process is in one of three states:</p>
<ol>
<li>Running: the process is either executing on the CPU or waiting to be scheduled.</li>
<li>Stopped: the process is suspended and will not be scheduled. A process stops when it receives SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU, and resumes when it receives SIGCONT.</li>
<li>Terminated: the process has stopped permanently, because it returned from <code>main</code>, called <code>exit</code>, or received a signal whose action is to terminate it.</li>
</ol>
<ul>
<li><code>void exit(int status)</code>: terminates the calling process with the given exit status</li>
</ul>
<h3 id="fork" class="heading"><a href="#fork" class="heading-anchor" aria-label="章节： fork" tabindex="-1"></a><span>fork</span></h3>
<ul>
<li><code>pid_t fork(void)</code>: creates a child process</li>
</ul>
<p>fork creates a new process by copying the entire state of the calling process. The new process has the same code, data, and open files (for example <code>stdout</code>) as the original, but a different PID, and later changes to its data are independent of the original process.</p>
<p>fork is called once but returns twice, once in each process: in the parent it returns the child's PID, and in the child it returns 0. On error it returns -1 and no child is created.</p>
<p>After the fork, both processes go on to execute the same code, so programs usually test whether the return value of <code>fork</code> is 0 to send the two processes down different branches.</p>
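<p>The branching on the return value and the independence of the two copies can be sketched as follows. This is a minimal illustration, not code from CS:APP; the helper name <code>parent_value_after_fork</code> is made up for this example.</p>

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not from the original notes: forks once, lets the
 * child change its own copy of x, and returns the parent's x afterwards. */
int parent_value_after_fork(void)
{
    int x = 1;
    pid_t pid = fork();

    if (pid < 0)                /* fork failed: no child was created */
        return -1;

    if (pid == 0) {             /* child: fork returned 0 */
        x += 100;               /* changes only the child's private copy */
        _exit(x == 101 ? 0 : 1);
    }

    /* parent: fork returned the child's PID */
    if (waitpid(pid, NULL, 0) < 0)
        return -1;
    return x;                   /* unchanged by the child's write */
}
```

<p>The child leaves with <code>_exit</code> rather than <code>exit</code> so that it does not flush stdio buffers it inherited from the parent.</p>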
<h3 id="process-group" class="heading"><a href="#process-group" class="heading-anchor" aria-label="章节： process group" tabindex="-1"></a><span>process group</span></h3>
<p>Every process belongs to exactly one process group, and every process group has an ID.</p>
<p>By default, a child belongs to the same process group as its parent.</p>
<ul>
<li><code>pid_t<wbr> <wbr>getpgrp<wbr>(<wbr>void<wbr>)</code>: returns the process group ID of the calling process</li>
<li><code>int<wbr> <wbr>setpgid<wbr>(<wbr>pid_t<wbr> <wbr>pid<wbr>, <wbr>pid_t<wbr> <wbr>pgid<wbr>)</code>: changes the process group ID of process <code>pid</code> to <code>pgid</code>; a <code>pid</code> of 0 means the calling process, and a <code>pgid</code> of 0 means that the PID of process <code>pid</code> is used as the group ID</li>
</ul>
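<p>As a small sketch of the two zero arguments, the child below calls <code>setpgid(0, 0)</code> to move itself into a new process group whose ID is its own PID. The helper name <code>child_gets_own_group</code> is an assumption made for illustration.</p>

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that moves itself into a new process
 * group and checks that its group ID now equals its own PID.
 * Returns 1 if it does, 0 if it does not, -1 on error. */
int child_gets_own_group(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* pid 0 = this process, pgid 0 = use this process's PID */
        if (setpgid(0, 0) < 0)
            _exit(2);
        _exit(getpgrp() == getpid() ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 2)   /* setpgid failed in the child */
        return -1;
    return WEXITSTATUS(status) == 0;
}
```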
<h3 id="wait" class="heading"><a href="#wait" class="heading-anchor" aria-label="章节： wait" tabindex="-1"></a><span>wait</span></h3>
<ul>
<li><code>pid_t<wbr> <wbr>waitpid<wbr>(<wbr>pid_t<wbr> <wbr>pid<wbr>, <wbr>int<wbr> *<wbr>statusp<wbr>, <wbr>int<wbr> <wbr>options<wbr>)</code>: waits for a child process to terminate</li>
<li><code>pid_t wait(int *statusp)</code>: equivalent to <code>waitpid<wbr>(-<wbr>1<wbr>, <wbr>statusp<wbr>, <wbr>0<wbr>)</code></li>
</ul>
<h4 id="waitpid-的-pid-参数" class="heading"><a href="#waitpid-的-pid-参数" class="heading-anchor" aria-label="章节： waitpid 的 pid 参数" tabindex="-1"></a><span>The pid argument of waitpid</span></h4>
<p>The <code>pid</code> argument determines which children to wait for:</p>
<ul>
<li>-1: any child process</li>
<li>> 0: the child whose PID is <code>pid</code></li>
<li>0: any child in the same process group as the calling process</li>
<li>&#x3C; -1: any child whose process group ID is <code>-pid</code></li>
</ul>
<h4 id="waitpid-的行为-options" class="heading"><a href="#waitpid-的行为-options" class="heading-anchor" aria-label="章节： waitpid 的行为 (options)" tabindex="-1"></a><span>waitpid behavior (options)</span></h4>
<p>By default, <code>waitpid</code> blocks until one of the children being waited for terminates. <code>options</code> changes this behavior; it is a bitwise OR of the following flags:</p>
<ul>
<li><code>WNOHANG</code>: return immediately; if none of the children being waited for has terminated yet, the return value is 0</li>
<li><code>WUNTRACED</code>: also return when a child stops, not only when one terminates</li>
<li><code>WCONTINUED</code>: also return when a stopped child is resumed, not only when one terminates</li>
</ul>
<h4 id="reap" class="heading"><a href="#reap" class="heading-anchor" aria-label="章节： reap" tabindex="-1"></a><span>reap</span></h4>
<p>Besides waiting, wait also <i>reaps</i> the terminated child, that is, removes it from the system completely. A process that has terminated but has not yet been reaped is called a <i>zombie</i> and still consumes some system resources. In <code>ps</code>, a zombie is marked <code>&lt;defunct&gt;</code>.</p>
<p>If a parent terminates, its children that are still running are adopted by the <code>init</code> process (PID 1), and its zombie children are reaped by <code>init</code>.</p>
<h4 id="wait-获取子进程的-status" class="heading"><a href="#wait-获取子进程的-status" class="heading-anchor" aria-label="章节： wait 获取子进程的 status" tabindex="-1"></a><span>Getting the child's status from wait</span></h4>
<p>If the <code>statusp</code> argument is not <code>NULL</code>, then when <code>waitpid</code> returns, <code>*statusp</code> holds information about the child that caused the return.</p>
<p>Several macros extract information from the status (they take the value <code>*statusp</code>, not the pointer):</p>
<ul>
<li><code>WIFEXITED<wbr>(<wbr>status<wbr>)</code>: true if the child exited normally (returned from <code>main</code> or called <code>exit</code>)</li>
<li><code>WEXITSTATUS<wbr>(<wbr>status<wbr>)</code>: the exit status (return value of <code>main</code> / argument of <code>exit</code>); only meaningful if the child exited normally</li>
<li><code>WIFSIGNALED<wbr>(<wbr>status<wbr>)</code>: true if the child was terminated by a signal</li>
<li><code>WTERMSIG<wbr>(<wbr>status<wbr>)</code>: the signal that terminated the child; only meaningful if it was terminated by a signal</li>
<li><code>WIFSTOPPED<wbr>(<wbr>status<wbr>)</code>: true if the child is stopped</li>
<li><code>WSTOPSIG<wbr>(<wbr>status<wbr>)</code>: the signal that stopped the child; only meaningful if it is stopped</li>
<li><code>WIFCONTINUED<wbr>(<wbr>status<wbr>)</code>: true if the child was resumed</li>
</ul>
<h4 id="wait-的报错" class="heading"><a href="#wait-的报错" class="heading-anchor" aria-label="章节： wait 的报错" tabindex="-1"></a><span>Errors from wait</span></h4>
<p>On error, wait returns -1 and sets <code>errno</code>: <code>ECHILD</code> means there are no children to wait for, and <code>EINTR</code> means the call was interrupted by a signal.</p>
<p>wait returns once for each child that terminates and fails with <code>ECHILD</code> once no children are left, so a <code>while</code> loop around it can wait for all children to terminate.</p>
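<p>The loop described above might look like the sketch below, which also uses the status macros. The helper <code>spawn_and_reap</code> and the exit statuses <code>10 + i</code> are invented for the example.</p>

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: creates n children, where child i exits with status
 * 10 + i, then reaps them all. Returns the sum of the exit statuses of the
 * children that exited normally, or -1 on error. */
int spawn_and_reap(int n)
{
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(10 + i);          /* child terminates immediately */
    }

    int sum = 0;
    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, 0);    /* any child, blocking */

        if (pid > 0) {                          /* reaped one child */
            if (WIFEXITED(status))
                sum += WEXITSTATUS(status);
        } else if (errno != EINTR) {            /* retry only on EINTR */
            break;
        }
    }

    /* ECHILD is the expected way for the loop to end: no children left. */
    return errno == ECHILD ? sum : -1;
}
```

<p>The order in which the children are reaped is not specified, which is why the sketch only adds up the statuses.</p>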
<h3 id="sleep" class="heading"><a href="#sleep" class="heading-anchor" aria-label="章节： sleep" tabindex="-1"></a><span>sleep</span></h3>
<ul>
<li><code>unsigned<wbr> <wbr>int<wbr> <wbr>sleep<wbr>(<wbr>unsigned<wbr> <wbr>int<wbr> <wbr>secs<wbr>)</code>: sleeps for <code>secs</code> seconds and returns the number of seconds left to sleep (0 unless the sleep was interrupted by a signal)</li>
<li><code>int<wbr> <wbr>pause<wbr>(<wbr>void<wbr>)</code>: sleeps until interrupted by a signal; always returns -1</li>
</ul>
<h3 id="execve" class="heading"><a href="#execve" class="heading-anchor" aria-label="章节： execve" tabindex="-1"></a><span>execve</span></h3>
<ul>
<li><code>int<wbr> <wbr>execve<wbr>(<wbr>const<wbr> <wbr>char<wbr> *<wbr>filename<wbr>, <wbr>char<wbr> *<wbr>const<wbr> <wbr>argv<wbr>[], <wbr>char<wbr> *<wbr>const<wbr> <wbr>envp<wbr>[])</code></li>
</ul>
<p><code>execve</code> runs the executable object file <code>filename</code> in the current process, with <code>argv</code> as its arguments and <code>envp</code> as its environment variables. Combined with <code>fork</code>, it lets a child process run a different program.</p>
<p><code>argv</code> is a <code>NULL</code>-terminated array of strings holding the arguments; the first one is usually the name of the program.</p>
<p><code>envp</code> is also a <code>NULL</code>-terminated array of strings, each of the form <code>name=value</code>.</p>
<p>A few functions read and modify environment variables:</p>
<ul>
<li><code>char<wbr> *<wbr>getenv<wbr>(<wbr>const<wbr> <wbr>char<wbr> *<wbr>name<wbr>)</code>: returns the value of the environment variable, or <code>NULL</code> if it does not exist</li>
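<p>A rough sketch of the <code>fork</code> plus <code>execve</code> pattern follows. It assumes a POSIX shell exists at <code>/bin/sh</code>; the helper <code>run_with_greeting</code> and the variable <code>GREETING</code> are made up for the example.</p>

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs "/bin/sh -c cmd" in a child whose environment
 * contains only GREETING=hello. Returns the child's exit status, or -1 if
 * the child could not be created or did not exit normally. */
int run_with_greeting(const char *cmd)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Both arrays end with NULL; argv[0] is the program name. */
        char *const argv[] = { "sh", "-c", (char *)cmd, NULL };
        char *const envp[] = { "GREETING=hello", NULL };

        execve("/bin/sh", argv, envp);
        _exit(127);             /* reached only if execve failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

<p>For instance, <code>run_with_greeting("exit 7")</code> reports the shell's exit status, and a command that tests <code>$GREETING</code> can confirm that the environment was passed through.</p>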
<li><code>int<wbr> <wbr>setenv<wbr>(<wbr>const<wbr> <wbr>char<wbr> *<wbr>name<wbr>, <wbr>const<wbr> <wbr>char<wbr> *<wbr>newvalue<wbr>, <wbr>int<wbr> <wbr>overwrite<wbr>)</code>: returns 0 on success and -1 on error; if <code>name</code> already exists and <code>overwrite</code> is 0, the existing value is kept and the call still succeeds</li>
<li><code>int<wbr> <wbr>unsetenv<wbr>(<wbr>const<wbr> <wbr>char<wbr> *<wbr>name<wbr>)</code>: removes <code>name</code> from the environment; POSIX declares the return type as <code>int</code> (0 on success, -1 on error)</li>
</ul>
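<p>The behavior of the <code>overwrite</code> argument of <code>setenv</code> can be checked with a short sketch like this one. The variable name <code>CSAPP_DEMO_VAR</code> and the helper name are arbitrary choices for the example.</p>

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if setenv with overwrite == 0 keeps an
 * existing value and still reports success, 0 otherwise. */
int setenv_keeps_existing(void)
{
    const char *name = "CSAPP_DEMO_VAR";    /* made-up variable name */

    if (setenv(name, "first", 1) != 0)      /* create or overwrite */
        return 0;

    int rc = setenv(name, "second", 0);     /* overwrite == 0: no change */
    const char *value = getenv(name);
    int kept = (rc == 0 && value != NULL && strcmp(value, "first") == 0);

    unsetenv(name);                         /* clean up */
    return kept && getenv(name) == NULL;
}
```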
<h2 id="signals" class="heading"><a href="#signals" class="heading-anchor" aria-label="章节： Signals" tabindex="-1"></a><span>Signals</span></h2>
<h3 id="signal-的种类" class="heading"><a href="#signal-的种类" class="heading-anchor" aria-label="章节： signal 的种类" tabindex="-1"></a><span>Kinds of signals</span></h3>
<p><code>man signal.7</code> lists the signals (name, meaning, number, default action).</p>
<p>In particular:</p>
<ul>
<li>Dividing by zero sends SIGFPE to the process</li>
<li>Executing an illegal instruction sends SIGILL</li>
<li>An illegal memory access sends SIGSEGV</li>
<li>Pressing Ctrl+C sends SIGINT to the foreground process group</li>
<li>When a child terminates, SIGCHLD is sent to its parent</li>
<li>SIGKILL forcibly terminates a process</li>
</ul>
<h3 id="signal-的工作流程" class="heading"><a href="#signal-的工作流程" class="heading-anchor" aria-label="章节： signal 的工作流程" tabindex="-1"></a><span>How signals work</span></h3>
<ul>
<li>For each process, the kernel records whether each signal is <i>pending</i> and whether it is <i>blocked</i></li>
<li>Sending a signal marks that signal as pending in the receiver</li>
<li>A process can change the blocked state of each signal</li>
<li>When the kernel switches to user mode to run a process, a signal that is pending and not blocked is received, and it is marked as no longer pending</li>
</ul>
<p>This implies:</p>
<ul>
<li>Only the pending state is recorded, not how many times a signal was sent; a signal sent several times before it is received is received only once</li>
<li>A signal sent while it is blocked is received when it is unblocked</li>
</ul>
<h3 id="发送-signal" class="heading"><a href="#发送-signal" class="heading-anchor" aria-label="章节： 发送 signal" tabindex="-1"></a><span>Sending signals</span></h3>
<h4 id="kill-命令" class="heading"><a href="#kill-命令" class="heading-anchor" aria-label="章节： kill 命令" tabindex="-1"></a><span>The kill command</span></h4>
<p>The <code>kill</code> command sends a signal to the specified process from the shell. Most shells have a builtin <code>kill</code>, and there is also a <code>kill</code> at <code>/<wbr>usr<wbr>/<wbr>bin<wbr>/<wbr>kill</code>; the two may differ somewhat.</p>
<p>The basic form is <code>kill -sig pid</code>, where <code>-sig</code> can be written as <code>-INT</code>/<code>-SIGINT</code>/<code>-2</code>, and <code>pid</code> selects where the signal is sent:</p>
<ul>
<li>> 0: the process whose PID is <code>pid</code></li>
<li>0: every process in the same process group as the calling process</li>
<li>-1: every process the caller is allowed to signal, except <code>init</code> (PID 1)</li>
<li>&#x3C; -1: every process whose process group ID is <code>-pid</code></li>
</ul>
<p>This mirrors <a href="#waitpid-%E7%9A%84-pid-%E5%8F%82%E6%95%B0">the pid argument of waitpid</a>.</p>
<h4 id="在-shell-中使用键盘发送-sigint-sigtstp" class="heading"><a href="#在-shell-中使用键盘发送-sigint-sigtstp" class="heading-anchor" aria-label="章节： 在 shell 中使用键盘发送 SIGINT / SIGTSTP" tabindex="-1"></a><span>Sending SIGINT / SIGTSTP from the keyboard in a shell</span></h4>
<p>A shell has at most one foreground job and zero or more background jobs. The shell gives each job its own process group, shared by all processes in that job.</p>
<p>Ctrl+C sends SIGINT to the foreground process group, and Ctrl+Z sends SIGTSTP to it.</p>
<h4 id="使用函数发送-signal" class="heading"><a href="#使用函数发送-signal" class="heading-anchor" aria-label="章节： 使用函数发送 signal" tabindex="-1"></a><span>Sending signals with functions</span></h4>
<ul>
<li><code>int kill(pid_t pid, int sig)</code>: works like the <a href="#kill-%E5%91%BD%E4%BB%A4">kill command</a></li>
<li><code>unsigned<wbr> <wbr>int<wbr> <wbr>alarm<wbr>(<wbr>unsigned<wbr> <wbr>int<wbr> <wbr>secs<wbr>)</code>: asks the kernel to send SIGALRM to the calling process in <code>secs</code> seconds. Any alarm that has not fired yet is canceled; if <code>secs</code> is 0, no new SIGALRM is scheduled after the cancellation. The return value is 0 if there was no outstanding alarm, otherwise the number of seconds that remained on the canceled alarm</li>
</ul>
<h3 id="设置-signal-handler" class="heading"><a href="#设置-signal-handler" class="heading-anchor" aria-label="章节： 设置 signal handler" tabindex="-1"></a><span>Installing a signal handler</span></h3>
<p>The action taken for every signal except SIGKILL and SIGSTOP can be changed.</p>
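<p>A minimal sketch of the <code>kill</code> function: the parent sends a signal to one child (the <code>pid > 0</code> case) and then uses <code>WTERMSIG</code> to see which signal ended it. The helper name <code>kill_child_with</code> is an assumption for illustration, and the sketch assumes the signal's action in the child is to terminate it.</p>

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that sleeps forever, sends it sig,
 * and returns the number of the signal that terminated it, or -1. */
int kill_child_with(int sig)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        for (;;)
            pause();            /* child: sleep until a signal arrives */
    }

    if (kill(pid, sig) < 0)     /* pid > 0: signal that single process */
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFSIGNALED(status))
        return -1;
    return WTERMSIG(status);
}
```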
<section class="code-block relative my-6 shadow" itemprop="hasPart" itemscope itemtype="https://schema.org/SoftwareSourceCode" data-v-c675dba6><div class="h-6 items-center rd-t-1 bg-area px-4 dark:bg-#2A313A media-screen:important-flex" style="display:none;" data-v-c675dba6><h4 class="text-3 text-footer" itemprop="programmingLanguage" aria-label="C 代码块" data-v-c675dba6>C</h4><ile-root id="ile-2"><button title="复制到剪贴板" class="copy-button b-footer text-footer" data-v-63dfb2af><span class="i-mdi-content-copy" data-v-63dfb2af></span><span class="sr-only" role="status" data-v-63dfb2af></span></button></ile-root><!--ISLAND_HYDRATION_PLACEHOLDER_ile-2--></div><div class="dark:hidden" itemprop="text" data-v-c675dba6><pre class="shiki light" style="background-color: #FBFBFB" tabindex="0"><code><span><span style="color: #994CC3">#include</span><span style="color: #403F53"> </span><span style="color: #111111">&lt;</span><span style="color: #C96765">signal.h</span><span style="color: #111111">&gt;</span></span>
<span><span style="color: #994CC3">typedef</span><span style="color: #403F53"> </span><span style="color: #994CC3">void</span><span style="color: #403F53"> (</span><span style="color: #0C969B">*</span><span style="color: #4876D6">sighandler_t</span><span style="color: #403F53">)(</span><span style="color: #994CC3">int</span><span style="color: #403F53">);</span></span>
<span><span style="color: #4876D6">sighandler_t</span><span style="color: #403F53"> </span><span style="color: #4876D6">signal</span><span style="color: #403F53">(</span><span style="color: #994CC3">int</span><span style="color: #403F53"> signum, </span><span style="color: #4876D6">sighandler_t</span><span style="color: #403F53"> handler);</span></span></code></pre></div><div class="dark:important-block" style="display:none;" data-v-c675dba6><pre class="shiki dark" style="background-color: #011627" tabindex="0"><code><span><span style="color: #C792EA">#include</span><span style="color: #D6DEEB"> </span><span style="color: #D9F5DD">&lt;</span><span style="color: #ECC48D">signal.h</span><span style="color: #D9F5DD">&gt;</span></span>
<span><span style="color: #C792EA">typedef</span><span style="color: #D6DEEB"> </span><span style="color: #C792EA">void</span><span style="color: #D6DEEB"> (</span><span style="color: #7FDBCA">*</span><span style="color: #C5E478">sighandler_t</span><span style="color: #D6DEEB">)(</span><span style="color: #C792EA">int</span><span style="color: #D6DEEB">);</span></span>
<span><span style="color: #C5E478">sighandler_t</span><span style="color: #D6DEEB"> </span><span style="color: #82AAFF">signal</span><span style="color: #D6DEEB">(</span><span style="color: #C792EA">int</span><span style="color: #D6DEEB"> </span><span style="color: #D7DBE0">signum</span><span style="color: #D6DEEB">, </span><span style="color: #C5E478">sighandler_t</span><span style="color: #D6DEEB"> </span><span style="color: #D7DBE0">handler</span><span style="color: #D6DEEB">);</span></span></code></pre></div></section>
<p>The <code>signal</code> function changes how signal <code>signum</code> is handled. <code>handler</code> can be a pointer to a function, <code>SIG_IGN</code> to ignore the signal, or <code>SIG_DFL</code> to use the signal's default action.</p>
<p>When a handler is installed, receiving the signal interrupts the program and runs the handler; when the handler returns, execution usually resumes at the instruction that was interrupted.</p>
<p>While a handler runs, the signal it is handling is blocked, but the handler can still be interrupted by a signal of another type; once that other signal has been handled, control returns to the first handler.</p>
<h3 id="block-unblock-signal" class="heading"><a href="#block-unblock-signal" class="heading-anchor" aria-label="章节： block / unblock signal" tabindex="-1"></a><span>block / unblock signal</span></h3>
<p>A process can explicitly block and unblock selected signals:</p>
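<p>The three kinds of <code>handler</code> argument can be tried out with a sketch like the one below, which sends SIGUSR2 to itself with <code>raise</code> (a standard C function not covered in these notes). The names <code>on_usr2</code> and <code>handler_then_ignore</code> are made up for the example.</p>

```c
#include <signal.h>

static volatile sig_atomic_t got_usr2 = 0;

/* Handler: only records that the signal was received. */
static void on_usr2(int sig)
{
    (void)sig;
    got_usr2 = 1;
}

/* Hypothetical helper: returns 1 if SIGUSR2 ran the handler while it was
 * installed and was discarded once the action was SIG_IGN, -1 on error. */
int handler_then_ignore(void)
{
    void (*old)(int) = signal(SIGUSR2, on_usr2);    /* function pointer */
    if (old == SIG_ERR)
        return -1;

    got_usr2 = 0;
    raise(SIGUSR2);                 /* the handler runs before raise returns */
    int handled = (got_usr2 == 1);

    got_usr2 = 0;
    signal(SIGUSR2, SIG_IGN);       /* ignore the signal */
    raise(SIGUSR2);                 /* discarded, the handler does not run */
    int ignored = (got_usr2 == 0);

    signal(SIGUSR2, old);           /* put back whatever was there before */
    return handled && ignored;
}
```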
<ul>
<li><code>int<wbr> <wbr>sigprocmask<wbr>(<wbr>int<wbr> <wbr>how<wbr>, <wbr>const<wbr> <wbr>sigset_t<wbr> *<wbr>set<wbr>, <wbr>sigset_t<wbr> *<wbr>oldset<wbr>)</code></li>
</ul>
<p><code>how</code> is one of <code>SIG_BLOCK</code> / <code>SIG_UNBLOCK</code> / <code>SIG_SETMASK</code>, which respectively block the signals in <code>set</code> / unblock the signals in <code>set</code> / replace the blocked set with <code>set</code>.</p>
<p>If <code>oldset</code> is not <code>NULL</code>, the blocked set from before the change is stored in it.</p>
<p>Some helper functions manipulate a <code>sigset_t</code>:</p>
<ul>
<li><code>int<wbr> <wbr>sigemptyset<wbr>(<wbr>sigset_t<wbr> *<wbr>set<wbr>)</code>: makes <code>set</code> empty</li>
<li><code>int<wbr> <wbr>sigfillset<wbr>(<wbr>sigset_t<wbr> *<wbr>set<wbr>)</code>: makes <code>set</code> contain every signal</li>
<li><code>int<wbr> <wbr>sigaddset<wbr>(<wbr>sigset_t<wbr> *<wbr>set<wbr>, <wbr>int<wbr> <wbr>signum<wbr>)</code>: adds <code>signum</code> to <code>set</code></li>
<li><code>int<wbr> <wbr>sigdelset<wbr>(<wbr>sigset_t<wbr> *<wbr>set<wbr>, <wbr>int<wbr> <wbr>signum<wbr>)</code>: removes <code>signum</code> from <code>set</code></li>
<li><code>int<wbr> <wbr>sigismember<wbr>(<wbr>const<wbr> <wbr>sigset_t<wbr> *<wbr>set<wbr>, <wbr>int<wbr> <wbr>signum<wbr>)</code>: tests whether <code>signum</code> is in <code>set</code>; returns 0 or 1, or -1 on error</li>
</ul>
<h3 id="编写、使用-signal-handler" class="heading"><a href="#编写、使用-signal-handler" class="heading-anchor" aria-label="章节： 编写、使用 signal handler" tabindex="-1"></a><span>Writing and using signal handlers</span></h3>
<h4 id="编写安全的-signal-handler" class="heading"><a href="#编写安全的-signal-handler" class="heading-anchor" aria-label="章节： 编写安全的 signal handler" tabindex="-1"></a><span>Writing safe signal handlers</span></h4>
<p>A signal handler runs concurrently with the main program and shares its data, and the main program can be interrupted by a signal at unexpected points, so writing a safe handler is hard. The usual guidelines are:</p>
<ol start="0">
<li>Keep the handler as simple as possible, for example by only setting a flag that the main program checks and acts on, rather than doing the work in the handler itself</li>
<li>Call only async-signal-safe functions in the handler (see <code>man<wbr> <wbr>signal<wbr>-<wbr>safety</code> for the list); the common <code>printf</code>, <code>sprintf</code>, <code>malloc</code>, and <code>exit</code> are not async-signal-safe</li>
<li>Save and restore <code>errno</code>, so that it has the same value before and after the handler runs</li>
<li>Block signals while accessing data shared between the handler and the main program, so that the access cannot be interrupted halfway</li>
<li>Declare global variables that are modified in the handler and read in the main program as <code>volatile</code>, so that the compiler does not wrongly assume they are unchanged and optimize accordingly</li>
<li>Declare flags with type <code>sig_atomic_t</code>, for which a single access is atomic and cannot be interrupted (a read followed by a write is two accesses and can be interrupted in between)</li>
</ol>
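<p>Put together, a handler that follows these guidelines might look like the sketch below. It is an illustration, not the code from CS:APP; the names <code>on_sigint</code>, <code>sigint_flag</code>, and <code>sigint_sets_flag</code> are assumptions.</p>

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Guidelines 4 and 5: shared flag is volatile and of type sig_atomic_t. */
static volatile sig_atomic_t sigint_flag = 0;

/* Guideline 0: the handler only reports and records the event. */
static void on_sigint(int sig)
{
    static const char msg[] = "caught SIGINT\n";
    int saved_errno = errno;        /* guideline 2: save errno */

    (void)sig;
    /* Guideline 1: write is async-signal-safe, printf is not. */
    if (write(STDERR_FILENO, msg, sizeof msg - 1) < 0) {
        /* nothing safe to do about a failed write in a handler */
    }
    sigint_flag = 1;
    errno = saved_errno;            /* guideline 2: restore errno */
}

/* Hypothetical helper: installs the handler, sends SIGINT to itself, and
 * returns the flag value the main program observes afterwards. */
int sigint_sets_flag(void)
{
    void (*old)(int) = signal(SIGINT, on_sigint);
    if (old == SIG_ERR)
        return -1;

    sigint_flag = 0;
    raise(SIGINT);
    int seen = sigint_flag;         /* main program checks the flag */

    signal(SIGINT, old);
    return seen;
}
```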
<a id="为什么函数会不-async-signal-safe" name="为什么函数会不-async-signal-safe" aria-hidden="true"></a>
<aside role="note" data-v-a2ab257f><div class="shadow-md rd-1 b-l-6 my-6 bg-blue-1 dark:bg-blue-9 b-blue" data-v-a2ab257f><div class="p-3 flex justify-between items-center" data-v-a2ab257f><h5 class="flex items-center gap-1 font-bold" data-v-a2ab257f><span class="text-5 i-mdi-pencil text-blue" data-v-a2ab257f></span><span class="sr-only" data-v-a2ab257f>Note: </span><span data-v-a2ab257f>Why a function may not be async-signal-safe</span></h5><!--v-if--></div><div class="overflow-auto rd-br-1 bg-card px-6 dark:bg-bghover" data-v-a2ab257f><p>Take <code>printf</code> as an example. <code>printf</code> uses a statically allocated buffer. If it is interrupted midway and <code>printf</code> is called again before control returns to the point of interruption, the buffer is left in an inconsistent intermediate state, which leads to undefined behavior.</p><p>In other words, a function is usually not async-signal-safe because it uses global state and can be interrupted while executing. To call such a function safely, once it has been interrupted, the next execution must continue from the point of interruption rather than start from the beginning.</p><p>Instead of calling only async-signal-safe functions in the handler, one can also block, whenever the main program calls a non-async-signal-safe function, every signal whose handler uses that function, but this is hard to get right.</p></div></div></aside>
<h4 id="正确处理多次发送的-signal" class="heading"><a href="#正确处理多次发送的-signal" class="heading-anchor" aria-label="章节： 正确处理多次发送的 signal" tabindex="-1"></a><span>Handling repeated signals correctly</span></h4>
<p>A signal sent several times may be received only once, so a handler must not assume that the number of signals received equals the number sent.</p>
<p>For example, a SIGCHLD handler that reaps children should reap every child that has already terminated, not just one.</p>
<h4 id="不同系统上-signal-handling-的差异" class="heading"><a href="#不同系统上-signal-handling-的差异" class="heading-anchor" aria-label="章节： 不同系统上 signal handling 的差异" tabindex="-1"></a><span>Differences in signal handling across systems</span></h4>
<p>The semantics of signal handling differ between systems:</p>
<ul>
<li>
<p>On some systems, a signal reverts to its default action once its handler has been invoked, and the handler has to call <code>signal</code> again to stay installed.</p>
</li>
<li>
<p>On some systems, a system call that takes a long time fails with EINTR when it is interrupted, whereas modern systems restart the system call automatically where possible; see the section "Interruption of system calls and library functions by signal handlers" in <code>man signal.7</code>.</p>
<p>(P.S. This is the "PC loser-ing problem" used as the example in <a href="https://www.dreamsongs.com/RiseOfWorseIsBetter.html">Rise of Worse Is Better</a>; Unix, which originally chose worse-is-better, has now evolved into the right thing.)
(P.P.S. I did not understand that passage at all when I first read the essay, and I am surprised that I still remember it now.)</p>
</li>
</ul>
<p>The <code>sigaction</code> function lets a program choose the signal-handling semantics it wants.</p>
<h4 id="注意-handler-被调用的时机" class="heading"><a href="#注意-handler-被调用的时机" class="heading-anchor" aria-label="章节： 注意 handler 被调用的时机" tabindex="-1"></a><span>Mind when the handler is called</span></h4>
<p>A handler can be called at unexpected moments. To avoid errors (races), it may be necessary to block signals temporarily so that the handler runs only at the right time. See the examples in CS:APP.</p>
<h3 id="等待-signal" class="heading"><a href="#等待-signal" class="heading-anchor" aria-label="Section: Waiting for a signal" tabindex="-1"></a><span>Waiting for a signal</span></h3>
<ul>
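<li><code>int<wbr> <wbr>sigsuspend<wbr>(<wbr>const<wbr> <wbr>sigset_t<wbr> *<wbr>mask<wbr>)</code>: temporarily sets the blocked set to <code>mask</code> and suspends the process until a signal is delivered; it returns once that signal's handler has returned, restoring the previous blocked set</li>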
</ul>
<p>You can keep a signal blocked in the rest of the program and unblock it only in the mask passed to <code>sigsuspend</code>, which amounts to waiting for that signal. Because <code>sigsuspend</code> returns after any handled signal, not just one particular signal, it is usually wrapped in a <code>while</code> loop that checks a flag set by the handler.</p>
<p>The effect of <code>sigsuspend</code> is similar to the following code:</p>
<section class="code-block relative my-6 shadow" itemprop="hasPart" itemscope itemtype="https://schema.org/SoftwareSourceCode" data-v-c675dba6><div class="h-6 items-center rd-t-1 bg-area px-4 dark:bg-#2A313A media-screen:important-flex" style="display:none;" data-v-c675dba6><h4 class="text-3 text-footer" itemprop="programmingLanguage" aria-label="C 代码块" data-v-c675dba6>C</h4><ile-root id="ile-3"><button title="复制到剪贴板" class="copy-button b-footer text-footer" data-v-63dfb2af><span class="i-mdi-content-copy" data-v-63dfb2af></span><span class="sr-only" role="status" data-v-63dfb2af></span></button></ile-root><!--ISLAND_HYDRATION_PLACEHOLDER_ile-3--></div><div class="dark:hidden" itemprop="text" data-v-c675dba6><pre class="shiki light" style="background-color: #FBFBFB" tabindex="0"><code><span><span style="color: #4876D6">sigprocmask</span><span style="color: #403F53">(SIG_SETMASK, </span><span style="color: #0C969B">&amp;</span><span style="color: #403F53">mask, </span><span style="color: #0C969B">&amp;</span><span style="color: #403F53">prev);</span></span>
<span><span style="color: #4876D6">pause</span><span style="color: #403F53">();</span></span>
<span><span style="color: #4876D6">sigprocmask</span><span style="color: #403F53">(SIG_SETMASK, </span><span style="color: #0C969B">&amp;</span><span style="color: #403F53">prev, </span><span style="color: #4876D6">NULL</span><span style="color: #403F53">);</span></span></code></pre></div><div class="dark:important-block" style="display:none;" data-v-c675dba6><pre class="shiki dark" style="background-color: #011627" tabindex="0"><code><span><span style="color: #82AAFF">sigprocmask</span><span style="color: #D6DEEB">(SIG_SETMASK, </span><span style="color: #7FDBCA">&amp;</span><span style="color: #D7DBE0">mask</span><span style="color: #D6DEEB">, </span><span style="color: #7FDBCA">&amp;</span><span style="color: #D7DBE0">prev</span><span style="color: #D6DEEB">);</span></span>
<span><span style="color: #82AAFF">pause</span><span style="color: #D6DEEB">();</span></span>
<span><span style="color: #82AAFF">sigprocmask</span><span style="color: #D6DEEB">(SIG_SETMASK, </span><span style="color: #7FDBCA">&amp;</span><span style="color: #D7DBE0">prev</span><span style="color: #D6DEEB">, </span><span style="color: #82AAFF">NULL</span><span style="color: #D6DEEB">);</span></span></code></pre></div></section>
<p>The difference is that with the code above, a signal may arrive right after <code>sigprocmask</code> and before <code>pause</code>. That signal then never interrupts <code>pause</code>, and the process may sleep forever. <code>sigsuspend</code> is atomic, so it does not have this problem.</p>
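<p>The flag-and-loop pattern can be sketched as follows, assuming a single-threaded process. SIGUSR1 is raised while it is blocked so that it is pending when <code>sigsuspend</code> is called; <code>got_usr1</code>, <code>wait_for_usr1</code> and <code>demo_sigsuspend</code> are hypothetical names.</p>

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Flag set by the handler and polled by the waiting loop. */
static volatile sig_atomic_t got_usr1 = 0;

static void on_usr1(int sig)
{
    (void)sig;
    got_usr1 = 1;
}

/* Sleeps until on_usr1 has run. The caller must already have SIGUSR1
 * blocked; wait_mask is the mask to sleep with (SIGUSR1 unblocked). */
static void wait_for_usr1(const sigset_t *wait_mask)
{
    while (!got_usr1)
        sigsuspend(wait_mask);
}

/* Hypothetical demo: returns 1 once the signal has been observed. */
static int demo_sigsuspend(void)
{
    struct sigaction sa;
    sigset_t block, prev, wait_mask;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) < 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &prev);

    got_usr1 = 0;
    raise(SIGUSR1);             /* stays pending: SIGUSR1 is blocked */
    assert(!got_usr1);          /* the handler has not run yet */

    wait_mask = prev;
    sigdelset(&wait_mask, SIGUSR1);
    wait_for_usr1(&wait_mask);  /* handler runs inside sigsuspend */

    sigprocmask(SIG_SETMASK, &prev, NULL);
    return got_usr1;
}
```

<p>Since the signal is only ever unblocked inside <code>sigsuspend</code>, there is no window in which it can be delivered between the flag check and the sleep.</p>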
<h2 id="nonlocal-jumps" class="heading"><a href="#nonlocal-jumps" class="heading-anchor" aria-label="章节： Nonlocal Jumps" tabindex="-1"></a><span>Nonlocal Jumps</span></h2>
<ul>
<li><code>int<wbr> <wbr>setjmp<wbr>(<wbr>jmp_buf<wbr> <wbr>env<wbr>)</code></li>
<li><code>void<wbr> <wbr>longjmp<wbr>(<wbr>jmp_buf<wbr> <wbr>env<wbr>, <wbr>int<wbr> <wbr>val<wbr>)</code></li>
</ul>
<p><code>setjmp</code> saves the current PC, registers, and related state in <code>env</code>; <code>longjmp</code> restores what was saved in <code>env</code> and jumps back to the point where <code>setjmp</code> was called.</p>
<p>This means <code>setjmp</code> can return more than once, while <code>longjmp</code> never returns. The direct call to <code>setjmp</code> returns 0; each later <code>longjmp</code> makes <code>setjmp</code> return again with the value of the argument <code>val</code> (as a special case, if <code>val</code> is 0 it returns 1, so that it can always be told apart from the first return).</p>
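<p>A small sketch of these return values. The function <code>setjmp_result</code> is hypothetical; it only distinguishes the values 0, 1 and 7, and it uses <code>setjmp</code> directly as the controlling expression of a <code>switch</code> rather than storing its result.</p>

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;

/* Returns what setjmp yields after longjmp(env, val). */
static int setjmp_result(int val)
{
    assert(val == 0 || val == 1 || val == 7);   /* only these are told apart */
    switch (setjmp(env)) {
    case 0:                 /* direct return from setjmp */
        longjmp(env, val);  /* never returns; setjmp returns a second time */
    case 1:                 /* reached for val == 1 and for val == 0 */
        return 1;
    case 7:
        return 7;
    default:
        return -1;
    }
}
```

<p>Calling <code>setjmp_result(0)</code> yields 1, not 0, which is the special case described above.</p>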
<p>Because <code>setjmp</code> / <code>longjmp</code> only restore the PC and registers (including <code>%rsp</code>):</p>
<ul>
<li>When <code>longjmp</code> is called, the function that called <code>setjmp</code> must not have returned yet; otherwise the stack frame that <code>setjmp</code> ran in is no longer valid.</li>
<li>The return value of <code>setjmp</code> may only appear in <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/setjmp.html">a few simple kinds of expression</a>; anything else is UB. In particular, the return value of <code>setjmp</code> should not be assigned to a variable, but it may be used as the condition of an <code>if</code> or <code>switch</code>. The reasoning is that evaluating a complex expression may involve intermediate values and dynamic stack allocation, and these may not be restored correctly when <code>longjmp</code> comes back, so the expression might not be evaluated correctly.</li>
<li>A local variable that lives in memory and is modified after <code>setjmp</code> keeps its modified value after the jump instead of the original one, whereas a value held in a register is restored. To guarantee that a variable is not kept in a register, it must be declared <code>volatile</code>; otherwise (even if it is declared <code>register</code> or <code>auto</code>) the compiler may place it in memory or in a register as it sees fit, which leaves its value after the jump indeterminate.</li>
</ul>
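<p>The last point can be illustrated with a counter that is modified between <code>setjmp</code> and <code>longjmp</code>. This is only a sketch; <code>count_attempts</code> and <code>retry_env</code> are hypothetical names.</p>

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf retry_env;

/* Jumps back to the setjmp point until it has been passed limit times and
 * returns how many passes were made. */
static int count_attempts(int limit)
{
    /* volatile: the variable changes between setjmp and longjmp and is read
     * afterwards, so it must live in memory rather than in a register. */
    volatile int attempts = 0;

    assert(limit >= 1);
    (void)setjmp(retry_env);    /* result discarded; every pass continues here */
    attempts++;
    if (attempts < limit)
        longjmp(retry_env, 1);
    return attempts;
}
```

<p>Without <code>volatile</code>, the value of <code>attempts</code> after the jump would be indeterminate, and an optimizing compiler would be free to keep it in a register that <code>longjmp</code> rolls back.</p>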
<a id="volatile-vs-取地址" name="volatile-vs-取地址" aria-hidden="true"></a>
<aside role="note" data-v-a2ab257f><div class="shadow-md rd-1 b-l-6 my-6 bg-purple-2 dark:bg-purple-9 b-purple-5" data-v-a2ab257f><div class="p-3 flex justify-between items-center" data-v-a2ab257f><h3 class="flex items-center gap-1 font-bold" data-v-a2ab257f><span class="text-5 i-mdi-help-circle-outline text-purple" data-v-a2ab257f></span><span class="sr-only" data-v-a2ab257f>Question: </span><span data-v-a2ab257f>volatile vs. taking the address</span></h3><!--v-if--></div><div class="overflow-auto rd-br-1 bg-card px-6 dark:bg-bghover" data-v-a2ab257f><p>Both the C99 rationale and <code>man setjmp</code> say that <code>volatile</code> is needed to make sure a local variable lives on the stack. If a local variable has its address taken, can it still be kept in a register? If so, is that something the standard permits but that never happens in practice, or can gcc really be made to do it?</p></div></div></aside>
<ul>
<li><code>int<wbr> <wbr>sigsetjmp<wbr>(<wbr>sigjmp_buf<wbr> <wbr>env<wbr>, <wbr>int<wbr> <wbr>savesigs<wbr>)</code></li>
<li><code>void<wbr> <wbr>siglongjmp<wbr>(<wbr>sigjmp_buf<wbr> <wbr>env<wbr>, <wbr>int<wbr> <wbr>val<wbr>)</code></li>
</ul>
<p><code>sigsetjmp</code> / <code>siglongjmp</code> additionally save and restore the signal mask (the blocked set), provided <code>sigsetjmp</code> is called with a nonzero <code>savesigs</code>, so they can be used with signal handlers.</p>
<p>Nonlocal jumps have two main uses:</p>
<ul>
<li>On an error, jumping straight to one central place that handles it, instead of returning up the call chain level by level</li>
<li>When handling a signal, jumping to a chosen location instead of returning to the point that was interrupted</li>
</ul>
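<p>The first use can be sketched like this. The call chain <code>parse_level1</code> → <code>parse_level2</code> → <code>parse_level3</code> and the error code 2 are invented for the example.</p>

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf error_env;

/* Deepest level: reports an error by jumping, not by returning a code. */
static void parse_level3(int input)
{
    if (input < 0)
        longjmp(error_env, 2);  /* 2 is an arbitrary, hypothetical error code */
}

static void parse_level2(int input) { parse_level3(input); }
static void parse_level1(int input) { parse_level2(input); }

/* Returns 0 on success and -1 if any level reported an error. */
static int process(int input)
{
    switch (setjmp(error_env)) {
    case 0:                     /* normal path */
        parse_level1(input);
        return 0;
    case 2:                     /* central error handler */
        return -1;
    default:
        return -2;              /* unexpected code */
    }
}
```

<p>The intermediate levels need no error-propagation code at all, which is also why anything they allocated is never released unless the central handler takes care of it.</p>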
<p>When using nonlocal jumps in a signal handler, keep in mind:</p>
<ul>
<li>Call <code>sigsetjmp</code> before installing the signal handler; otherwise there may be a race</li>
<li>The code that runs after <code>siglongjmp</code> lands may only call async-signal-safe functions</li>
</ul>
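<p>A minimal sketch that follows both rules, assuming a single-threaded process. The names <code>restart_env</code>, <code>on_usr2</code> and <code>demo_restart</code> are hypothetical. The handler is installed only after <code>sigsetjmp</code> has filled in the buffer, and the code after the jump calls only async-signal-safe functions.</p>

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf restart_env;

/* Handler: do not return to the interrupted code, jump to the restart point. */
static void on_usr2(int sig)
{
    (void)sig;
    siglongjmp(restart_env, 1);
}

/* Returns 1 if control came back through siglongjmp with SIGUSR2 unblocked
 * again, and -1 otherwise. */
static int demo_restart(void)
{
    struct sigaction sa;
    sigset_t current;

    if (sigsetjmp(restart_env, 1) == 0) {   /* nonzero: also save the mask */
        /* First pass: restart_env is valid, so the handler may be installed. */
        sa.sa_handler = on_usr2;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        if (sigaction(SIGUSR2, &sa, NULL) < 0)
            return -1;
        raise(SIGUSR2);                     /* handler jumps away from here */
        return -1;                          /* not reached */
    }

    /* Second pass, reached via siglongjmp. SIGUSR2 was blocked while the
     * handler ran; the saved mask has been restored by siglongjmp. */
    signal(SIGUSR2, SIG_DFL);               /* restart_env is about to go stale */
    sigprocmask(SIG_BLOCK, NULL, &current); /* query the current mask only */
    return sigismember(&current, SIGUSR2) ? -1 : 1;
}
```

<p>With plain <code>setjmp</code> / <code>longjmp</code>, whether the mask is restored after leaving the handler this way is not specified, which is the reason for the <code>sig</code> variants.</p>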
<p>Nonlocal jumps can hurt readability, and because they skip the returns of many intermediate functions they can also cause problems such as memory leaks, so use them with care.</p>
<h2 id="tools-for-manipulating-processes" class="heading"><a href="#tools-for-manipulating-processes" class="heading-anchor" aria-label="章节： Tools for Manipulating Processes" tabindex="-1"></a><span>Tools for Manipulating Processes</span></h2>
<ul>
<li><code>strace</code>: prints every system call a program makes; linking the program statically avoids a flood of output related to shared libraries</li>
<li><code>ps</code>: lists information about processes</li>
<li><code>top</code>: shows the resource usage of processes (<code>htop</code> is an alternative)</li>
<li><code>pmap</code>: shows the memory map of a process</li>
<li><code>/proc</code>: exposes all kinds of process-related information (<code>man proc.5</code>)</li>
</ul>]]></content:encoded>
            <category domain="https://ouuan.moe/tag/csapp">csapp</category>
            <category domain="https://ouuan.moe/tag/%E5%AD%A6%E4%B9%A0%E7%AC%94%E8%AE%B0">学习笔记</category>
        </item>
        <item>
            <title><![CDATA[CS:APP Chapter 6 Study Notes]]></title>
            <link>https://ouuan.moe/post/2022/12/csapp-6</link>
            <guid>https://ouuan.moe/post/2022/12/csapp-6</guid>
            <pubDate>Mon, 05 Dec 2022 07:47:43 GMT</pubDate>
            <description><![CDATA[





<p>Study notes on Chapter 6, "The Memory Hierarchy", of <a href="https://csapp.cs.cmu.edu/">CS:APP</a>.</p>
<p>The chapter covers the characteristics of various storage devices (RAM, ROM, HDD, SSD), locality in programs, the structure and principles of caches, and the effect of caches on program performance.</p>
]]></description>
            <content:encoded><![CDATA[





<p>Study notes on Chapter 6, "The Memory Hierarchy", of <a href="https://csapp.cs.cmu.edu/">CS:APP</a>.</p>
<p>The chapter covers the characteristics of various storage devices (RAM, ROM, HDD, SSD), locality in programs, the structure and principles of caches, and the effect of caches on program performance.</p>

<p>Since I was short on time, I originally meant to skip this chapter and come back to it later, but while studying Chapter 9 I found it could not really be skipped; otherwise parts of Chapter 9 only half made sense. <s>A small part of this chapter would be enough for Chapter 9, but I have decided to stop caring: if I cannot finish what I am supposed to study, so be it, and I will study whatever I feel like.</s></p>
<h2 id="storage-technologies" class="heading"><a href="#storage-technologies" class="heading-anchor" aria-label="章节： Storage Technologies" tabindex="-1"></a><span>Storage Technologies</span></h2>
<h3 id="ram" class="heading"><a href="#ram" class="heading-anchor" aria-label="章节： RAM" tabindex="-1"></a><span>RAM</span></h3>
<p><i>Random access memory</i> comes in two kinds, SRAM and DRAM. SRAM is faster to access but more expensive.</p>
<h4 id="sram" class="heading"><a href="#sram" class="heading-anchor" aria-label="章节： SRAM" tabindex="-1"></a><span>SRAM</span></h4>
<p>SRAM (Static RAM) stores each bit in a <i>bistable</i> memory cell built from 6 transistors. A cell has two possible stable states and quickly settles back into one of them after a small disturbance.</p>
<h4 id="dram" class="heading"><a href="#dram" class="heading-anchor" aria-label="章节： DRAM" tabindex="-1"></a><span>DRAM</span></h4>
<p>DRAM (Dynamic RAM) stores each bit in a very small capacitor, which is sensitive to disturbances and leaks its charge over time. It therefore has to be refreshed periodically by reading the data out and writing it back, and may additionally be combined with error-correcting codes to keep the data correct.</p>
<p>This design gives DRAM a higher storage density but slower access; SRAM is faster but less dense, more expensive, and more power-hungry. Accessing DRAM takes roughly 10 times as long as accessing SRAM, while SRAM costs roughly 1000 times as much as DRAM.</p>
<h4 id="conventional-dram" class="heading"><a href="#conventional-dram" class="heading-anchor" aria-label="章节： Conventional DRAM" tabindex="-1"></a><span>Conventional DRAM</span></h4>
<p>A DRAM chip is divided into a number of <i>supercells</i>, each storing one word, typically 1 byte. The supercells are arranged in a two-dimensional array and are addressed by coordinates <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">(</mo><mi>i</mi><mo separator="true">,</mo><mi>j</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">(i, j)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">(</span><span class="mord mathnormal">i</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal" style="margin-right:0.05724em;">j</span><span class="mclose">)</span></span></span></span></span>.</p>
<p>A DRAM chip talks to the outside world through <i>pins</i> connected to a <i>memory controller</i>. To read the supercell at <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">(</mo><mi>i</mi><mo separator="true">,</mo><mi>j</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">(i, j)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">(</span><span class="mord mathnormal">i</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal" style="margin-right:0.05724em;">j</span><span class="mclose">)</span></span></span></span></span>, the memory controller sends the <i>row access strobe</i> (RAS) <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>i</mi></mrow><annotation encoding="application/x-tex">i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6595em;"></span><span class="mord mathnormal">i</span></span></span></span></span> followed by the <i>column access strobe</i> (CAS) <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>j</mi></mrow><annotation encoding="application/x-tex">j</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.854em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.05724em;">j</span></span></span></span></span>. On receiving the RAS, the DRAM copies row <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>i</mi></mrow><annotation encoding="application/x-tex">i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6595em;"></span><span class="mord mathnormal">i</span></span></span></span></span> into an internal row buffer; on receiving the CAS, it sends column <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>j</mi></mrow><annotation encoding="application/x-tex">j</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.854em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.05724em;">j</span></span></span></span></span> of the row buffer to the memory controller.</p>
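<p>The two-step read can be modelled in a few lines. This is a toy model rather than a description of real hardware: the 4×4 array size and the names <code>dram_chip</code>, <code>ras</code>, <code>cas</code> and <code>read_supercell</code> are assumptions made for the illustration.</p>

```c
#include <assert.h>
#include <string.h>

enum { ROWS = 4, COLS = 4 };    /* hypothetical 4x4 array of 1-byte supercells */

typedef struct {
    unsigned char cells[ROWS][COLS];
    unsigned char row_buffer[COLS];
} dram_chip;

/* RAS i: copy the whole row i into the internal row buffer. */
static void ras(dram_chip *d, unsigned i)
{
    assert(i < ROWS);
    memcpy(d->row_buffer, d->cells[i], COLS);
}

/* CAS j: return column j of whatever row is currently buffered. */
static unsigned char cas(const dram_chip *d, unsigned j)
{
    assert(j < COLS);
    return d->row_buffer[j];
}

/* A full read of supercell (i, j) is a RAS followed by a CAS. */
static unsigned char read_supercell(dram_chip *d, unsigned i, unsigned j)
{
    ras(d, i);
    return cas(d, j);
}
```

<p>In this model, a second read from the same row could call <code>cas</code> alone and reuse the buffered row.</p>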
<h4 id="memory-module" class="heading"><a href="#memory-module" class="heading-anchor" aria-label="章节： Memory Module" tabindex="-1"></a><span>Memory Module</span></h4>
<p>DRAM chips are packaged into <i>memory modules</i> that plug into the motherboard.</p>
<p>A DIMM is one kind of memory module. For example, a DIMM may contain 8 DRAM chips; each 64-bit word is stored one byte per chip, at the same address in every chip, so that the DIMM as a whole exchanges data with the outside world in 64-bit units.</p>
<h4 id="enhanced-dram" class="heading"><a href="#enhanced-dram" class="heading-anchor" aria-label="章节： Enhanced DRAM" tabindex="-1"></a><span>Enhanced DRAM</span></h4>
<p>Plain DRAM is fairly slow, and conventional DRAM has seen a number of optimizations over time:</p>
<ol>
<li>FPM (fast page mode) DRAM: when consecutive accesses use the same RAS, the repeated RAS can be omitted and only the CAS is sent</li>
<li>EDO (extended data out) DRAM: keeps the output data valid for longer, which helps pipelining</li>
<li>SDRAM (synchronous): communicates synchronously on the rising edge of a clock signal instead of asynchronously through RAS/CAS control signals</li>
<li>DDR (double data-rate) SDRAM: uses both the rising edge and the falling edge of the clock to reach double data rate; generations include DDR, DDR2, DDR3, DDR4, and DDR5</li>
<li>VRAM (video): typically used in graphics cards, frame buffers, and the like; its output is produced by shifting out the entire buffer, and it can be read and written in parallel</li>
</ol>
<h3 id="rom" class="heading"><a href="#rom" class="heading-anchor" aria-label="章节： ROM" tabindex="-1"></a><span>ROM</span></h3>
<p>RAM loses its data when power is cut, so it is <i>volatile</i>. In contrast, there are nonvolatile memories, collectively called <i>read-only memory</i> (ROM) even though some ROMs can be written. Writing to a ROM is called <i>reprogramming</i>.</p>
<ul>
<li>PROM (programmable ROM) can be written only once.</li>
<li>EPROM (erasable PROM) needs a special device to be written and can be written about 1000 times.</li>
<li>EEPROM (electrically erasable PROM) can be written without a special device, about <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mn>10</mn><mn>5</mn></msup></mrow><annotation encoding="application/x-tex">10^5</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8141em;"></span><span class="mord">1</span><span class="mord"><span class="mord">0</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">5</span></span></span></span></span></span></span></span></span></span></span></span> times.</li>
<li>Flash memory is a nonvolatile memory based on EEPROM. It is widely used, including in <a href="#ssd">SSDs</a>.</li>
<li>Firmware is usually stored in ROM.</li>
</ul>
<h3 id="访问-main-memory" class="heading"><a href="#访问-main-memory" class="heading-anchor" aria-label="Section: Accessing main memory" tabindex="-1"></a><span>Accessing main memory</span></h3>
<p>A <i>bus</i> is a set of wires used for communication; it can carry addresses, data, control signals, and so on. The CPU and main memory communicate through <i>bus transactions</i>.</p>
<p>The CPU is connected to the I/O bridge by the system bus, and the I/O bridge is connected to main memory by the memory bus. The I/O bridge translates between system bus signals and memory bus signals.</p>
<h3 id="hdd" class="heading"><a href="#hdd" class="heading-anchor" aria-label="章节： HDD" tabindex="-1"></a><span>HDD</span></h3>
<h4 id="磁盘的结构" class="heading"><a href="#磁盘的结构" class="heading-anchor" aria-label="Section: Disk structure" tabindex="-1"></a><span>Disk structure</span></h4>
<p>A disk is made up of several <i>platters</i>. Each platter has two <i>surfaces</i>, each coated with magnetic recording material. The platters are driven by a <i>spindle</i> at their center and rotate at a fixed rate, typically between 5,400 and 15,000 RPM.</p>
<p>Each surface is divided into concentric rings called <i>tracks</i>, and each track is divided into <i>sectors</i>. Every sector holds the same amount of data (typically 512 bytes). Adjacent sectors are separated by <i>gaps</i>, which store no data and are used to identify sectors.</p>
<p>A disk usually consists of several platters stacked on a shared spindle. For a given distance <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>k</mi></mrow><annotation encoding="application/x-tex">k</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal" style="margin-right:0.03148em;">k</span></span></span></span></span>, the set of tracks on all surfaces of the disk that lie at distance <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>k</mi></mrow><annotation encoding="application/x-tex">k</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal" style="margin-right:0.03148em;">k</span></span></span></span></span> from the spindle is called a <i>cylinder</i>.</p>
<p>The overall structure is shown in CS:APP Figure 6.9:</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig6.9.930d8843.webp" loading="lazy" src="/assets/csapp-fig6.9.930d8843.webp" width="1517" height="537" alt="Diagram of disk structure"></picture></p>
<h4 id="磁盘的容量" class="heading"><a href="#磁盘的容量" class="heading-anchor" aria-label="Section: Disk capacity" tabindex="-1"></a><span>Disk capacity</span></h4>
<p>Disk capacity is characterized by three measures:</p>
<ul>
<li>recording density: the number of bits stored per unit length of track</li>
<li>track density: the number of tracks per unit length of radius</li>
<li>areal density: the number of bits stored per unit area</li>
</ul>
<p>On early disks every track had the same number of sectors, which made the sectors on the outer tracks more sparsely spaced. Later, to increase capacity, the cylinders were partitioned into <i>recording zones</i>. Each recording zone consists of a number of adjacent cylinders, and all tracks within one recording zone have the same number of sectors.</p>
<h4 id="磁盘的读写" class="heading"><a href="#磁盘的读写" class="heading-anchor" aria-label="Section: Reading and writing a disk" tabindex="-1"></a><span>Reading and writing a disk</span></h4>
<p>A disk reads and writes through read/write heads attached to an actuator arm. Before each access, the head has to be moved to the right track (a seek) and then wait for the target sector to rotate under it; only then does the transfer start.</p>
<p>Seek time depends on the distance between the head's previous position and the target, while rotational latency is a matter of luck. In the example given in CS:APP, the average seek time is 9 ms, the average rotational latency is 4 ms, and transferring one sector takes 20 μs.</p>
<p>In other words, disk access time is dominated by seek time and rotational latency, that is, by the cost of reaching the first sector of a contiguous run, and depends only weakly on how many contiguous sectors are then accessed. For a single sector, a disk access can take about <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mn>10</mn><mn>4</mn></msup></mrow><annotation encoding="application/x-tex">10^4</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8141em;"></span><span class="mord">1</span><span class="mord"><span class="mord">0</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">4</span></span></span></span></span></span></span></span></span></span></span></span> times as long as SRAM and <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mn>10</mn><mn>3</mn></msup></mrow><annotation encoding="application/x-tex">10^3</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8141em;"></span><span class="mord">1</span><span class="mord"><span class="mord">0</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">3</span></span></span></span></span></span></span></span></span></span></span></span> times as long as DRAM, but reading or writing contiguous sectors takes less than ten times as long as DRAM.</p>
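<p>The figures quoted above can be reproduced with a short calculation. The rotation rate of 7200 RPM and the average of 400 sectors per track are assumptions here (they are not stated in these notes), and the function names are hypothetical. The model is: average rotational latency is half a revolution, and transferring one sector takes one revolution divided by the number of sectors per track.</p>

```c
#include <assert.h>

/* Average rotational latency in ms: half of one full revolution. */
static double avg_rotation_ms(double rpm)
{
    assert(rpm > 0);
    return 0.5 * (60.0 / rpm) * 1000.0;
}

/* Time in ms to transfer one sector: one revolution / sectors per track. */
static double avg_transfer_ms(double rpm, double sectors_per_track)
{
    assert(rpm > 0 && sectors_per_track > 0);
    return (60.0 / rpm) / sectors_per_track * 1000.0;
}

/* Estimated time in ms to access one sector: seek + rotation + transfer. */
static double avg_access_ms(double seek_ms, double rpm, double sectors_per_track)
{
    return seek_ms + avg_rotation_ms(rpm)
                   + avg_transfer_ms(rpm, sectors_per_track);
}
```

<p>Under these assumptions, <code>avg_rotation_ms(7200)</code> is about 4.17 ms and <code>avg_transfer_ms(7200, 400)</code> is about 0.02 ms, so with a 9 ms seek almost all of the roughly 13 ms goes into positioning the head rather than moving data.</p>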
<h4 id="logical-disk-blocks" class="heading"><a href="#logical-disk-blocks" class="heading-anchor" aria-label="章节： Logical Disk Blocks" tabindex="-1"></a><span>Logical Disk Blocks</span></h4>
<p>A disk exposes <i>logical blocks</i> as an abstraction over sectors. Each logical block is the size of one sector and is indexed by consecutive non-negative integers; the <i>disk controller</i> translates a block number into a coordinate of the form <i>(surface, track, sector)</i>.</p>
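<p>As a rough illustration of such a translation, the sketch below assumes a deliberately simplified geometry: every track has the same number of sectors, there are no recording zones or spare cylinders, and blocks are numbered so that one cylinder is filled before moving to the next. Real disk controllers do not work this simply; <code>disk_geometry</code>, <code>disk_coord</code> and <code>block_to_coord</code> are hypothetical names.</p>

```c
#include <assert.h>

typedef struct {
    unsigned surfaces;              /* recording surfaces on the disk */
    unsigned tracks_per_surface;
    unsigned sectors_per_track;     /* assumed uniform (no recording zones) */
} disk_geometry;

typedef struct {
    unsigned surface, track, sector;
} disk_coord;

/* Maps a logical block number to (surface, track, sector), filling all
 * sectors of one track, then the same track on the next surface, and only
 * then moving on to the next cylinder. */
static disk_coord block_to_coord(const disk_geometry *g, unsigned long block)
{
    disk_coord c;
    unsigned long total = (unsigned long)g->surfaces
                        * g->tracks_per_surface * g->sectors_per_track;

    assert(block < total);
    c.sector = (unsigned)(block % g->sectors_per_track);
    block /= g->sectors_per_track;
    c.surface = (unsigned)(block % g->surfaces);
    c.track = (unsigned)(block / g->surfaces);
    return c;
}
```

<p>Numbering cylinder by cylinder keeps consecutive blocks reachable without a seek, which fits the observation that contiguous accesses are cheap.</p>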
<a id="磁盘格式化" name="磁盘格式化" aria-hidden="true"></a>
<aside role="note" data-v-a2ab257f><div class="shadow-md rd-1 b-l-6 my-6 bg-blue-1 dark:bg-blue-9 b-blue" data-v-a2ab257f><div class="p-3 flex justify-between items-center" data-v-a2ab257f><h5 class="flex items-center gap-1 font-bold" data-v-a2ab257f><span class="text-5 i-mdi-pencil text-blue" data-v-a2ab257f></span><span class="sr-only" data-v-a2ab257f>Note: </span><span data-v-a2ab257f>Disk formatting</span></h5><!--v-if--></div><div class="overflow-auto rd-br-1 bg-card px-6 dark:bg-bghover" data-v-a2ab257f><p>A disk has to be formatted before use: sector identification is written into the gaps, faulty cylinders are identified, and some cylinders are set aside as spares in case others fail. Because of the spare cylinders, the formatted capacity is smaller than the maximum capacity.</p></div></div></aside>
<h3 id="io-bus" class="heading"><a href="#io-bus" class="heading-anchor" aria-label="章节： I/O bus" tabindex="-1"></a><span>I/O bus</span></h3>
<p>I/O devices are connected to the I/O bridge through the I/O bus. Examples of what attaches to the I/O bus are graphics cards, USB controllers that connect all sorts of devices, and host bus adapters that connect disks through interfaces such as SCSI or SATA.</p>
<h3 id="访问磁盘" class="heading"><a href="#访问磁盘" class="heading-anchor" aria-label="Section: Accessing a disk" tabindex="-1"></a><span>Accessing a disk</span></h3>
<p>To read from a disk, the CPU sends it three instructions:</p>
<ol>
<li>A command that tells the disk to perform a read</li>
<li>The logical block number to read</li>
<li>The main memory address where the data should be placed</li>
</ol>
<p>After sending these instructions, the CPU goes on with other work. Once the disk has read the data, it transfers the data over the I/O bus straight into main memory without involving the CPU; this is called <i>direct memory access</i> (DMA). When the data is in place, the disk sends an interrupt signal to the CPU, which then jumps to the interrupt handler that deals with the completed disk read.</p>
<h3 id="ssd" class="heading"><a href="#ssd" class="heading-anchor" aria-label="章节： SSD" tabindex="-1"></a><span>SSD</span></h3>
<p>An SSD packages one or more flash memory chips together with a <i>flash translation layer</i>, which translates requests for logical block numbers into accesses to the flash memory, so the SSD presents an interface similar to that of an HDD.</p>
<p>Flash memory consists of blocks, each block consists of 32 to 128 pages, and a page is typically 512 B to 4 KB. The page is the smallest unit of data transfer.</p>
<p>Writing to an SSD is unusual: a page can be written only once after the entire block it belongs to has been erased (set to all 1s), and writing it a second time requires erasing the whole block again. To erase a block during a write, the data already stored in that block may first have to be copied to another block. Erasing is relatively slow, taking about 1 ms, and each block wears out after roughly <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mn>10</mn><mn>5</mn></msup></mrow><annotation encoding="application/x-tex">10^5</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8141em;"></span><span class="mord">1</span><span class="mord"><span class="mord">0</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">5</span></span></span></span></span></span></span></span></span></span></span></span> erasures.</p>
<p>As a result, SSD writes are somewhat slower than reads, and an SSD may wear out after many writes. The flash translation layer uses <i>wear-leveling logic</i> to spread erasures as evenly as possible across all blocks, which extends the SSD's lifetime.</p>
<p>CS:APP Figure 6.16 shows how the speed gap between disk, RAM, and CPU has changed over time. In the figure, CPU cycle time is for a single core, and effective CPU cycle time accounts for multiple cores:</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig6.16.737e4bc7.webp" loading="lazy" src="/assets/csapp-fig6.16.737e4bc7.webp" width="1017" height="464" alt="How the speed gap between disk, RAM, and CPU has changed over time"></picture></p>
<h2 id="locality" class="heading"><a href="#locality" class="heading-anchor" aria-label="章节： Locality" tabindex="-1"></a><span>Locality</span></h2>
<p>Good programs exhibit good <i>locality</i>. Locality takes two forms: <i>temporal locality</i> means that recently accessed data is likely to be accessed again in the near future, and <i>spatial locality</i> means that once a location has been accessed, nearby data is likely to be accessed in the near future.</p>
<p>Programs with good locality run faster, because every level of a computer system is designed with optimizations that exploit locality.</p>
<p>Some examples of locality:</p>
<ul>
<li>A program that repeatedly references the same variable has good locality.</li>
<li>Visiting each element of a contiguous region of memory (an array) in order is called a <i>stride-1 reference pattern</i>; skipping <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>k</mi><mo>−</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">k-1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7778em;vertical-align:-0.0833em;"></span><span class="mord mathnormal" style="margin-right:0.03148em;">k</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">1</span></span></span></span></span> elements between accesses is called a <i>stride-k reference pattern</i>. The smaller <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>k</mi></mrow><annotation encoding="application/x-tex">k</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal" style="margin-right:0.03148em;">k</span></span></span></span></span> is, the better the locality. Traversal order matters especially for multidimensional arrays.</li>
<li>Because a loop fetches the same instructions over and over, loops have good locality with respect to instruction fetches.</li>
</ul>
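<p>To make the point about traversal order concrete, here is a minimal sketch that lists the stride (in elements) between consecutive accesses when a row-major two-dimensional array is visited row by row versus column by column. The function name <code>traversal_strides</code> and the 3 by 4 array shape are made up for illustration; they do not come from the book.</p>

```python
def traversal_strides(rows, cols, order):
    """Return the strides (in elements) between consecutive accesses when a
    row-major rows x cols array is visited in the given order."""
    if order == "row":
        # Inner loop walks along a row: neighbours in memory.
        indices = [r * cols + c for r in range(rows) for c in range(cols)]
    elif order == "col":
        # Inner loop walks down a column: jumps of `cols` elements.
        indices = [r * cols + c for c in range(cols) for r in range(rows)]
    else:
        raise ValueError("order must be 'row' or 'col'")
    return [b - a for a, b in zip(indices, indices[1:])]


row_strides = traversal_strides(3, 4, "row")
col_strides = traversal_strides(3, 4, "col")
print(row_strides)  # every stride is 1
print(col_strides)  # mostly strides of 4 (the row length)
```

<p>Row-by-row traversal is a stride-1 pattern, while column-by-column traversal behaves like a stride-4 pattern here, and the stride grows with the row length.</p>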
<h2 id="the-memory-hierarchy" class="heading"><a href="#the-memory-hierarchy" class="heading-anchor" aria-label="章节： The Memory Hierarchy" tabindex="-1"></a><span>The Memory Hierarchy</span></h2>
<p>In hardware, storage technologies trade off performance, price, and capacity; in software, programs exhibit locality. These two properties complement each other, which leads memory systems to be organized as a <i>memory hierarchy</i>, shown in CS:APP Figure 6.21:</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig6.21.26bcdff1.webp" loading="lazy" src="/assets/csapp-fig6.21.26bcdff1.webp" width="1047" height="624" alt="The memory hierarchy"></picture></p>
<p>A real memory hierarchy need not match the figure exactly: there may not be exactly three levels of SRAM cache, there may be an SSD between DRAM and HDD, and tape can serve as a level below HDD.</p>
<h3 id="cache" class="heading"><a href="#cache" class="heading-anchor" aria-label="章节： Cache" tabindex="-1"></a><span>Cache</span></h3>
<p>Caching means using a relatively small, fast storage device to hold the most active part of a relatively large, slow storage device; the small device is called a cache for the large one.</p>
<p>In the memory hierarchy, each level serves as a cache for the level below it. Data moves continually between adjacent levels, and each pair of adjacent levels uses its own block size as the basic unit of transfer.</p>
<h3 id="从-cache-获取数据" class="heading"><a href="#从-cache-获取数据" class="heading-anchor" aria-label="章节： 从 cache 获取数据" tabindex="-1"></a><span>Fetching data from the cache</span></h3>
<p>When a program needs data from some level of the memory hierarchy, it first looks in that level's cache. Finding the data there is a <i>cache hit</i>; otherwise it is a <i>cache miss</i>.</p>
<p>On a cache miss, the data is usually first copied from the lower level into the cache, so the access is ultimately still served from the cache. If the cache is full, some existing data must be removed to make room. The block chosen for removal is called the <i>victim block</i>, removing it is called <i>evicting</i> the victim block, and the victim is chosen by a <i>replacement policy</i>, such as a random replacement policy or a least recently used (LRU) replacement policy.</p>
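<p>As an illustration of an LRU replacement policy, here is a minimal sketch of a cache that holds a fixed number of blocks and evicts the least recently used one when it is full. The class name <code>LRUCache</code> and its fields are assumptions made for this example; real hardware caches typically approximate LRU rather than tracking it exactly like this.</p>

```python
from collections import OrderedDict


class LRUCache:
    """Toy cache of whole blocks with an LRU replacement policy."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block id -> data, least recent first
        self.hits = 0
        self.misses = 0
        self.evicted = []  # victim blocks, in eviction order

    def access(self, block_id):
        if block_id in self.blocks:
            # Cache hit: mark the block as most recently used.
            self.hits += 1
            self.blocks.move_to_end(block_id)
            return
        # Cache miss: make room if needed, then bring the block in.
        self.misses += 1
        if len(self.blocks) == self.capacity:
            victim, _ = self.blocks.popitem(last=False)
            self.evicted.append(victim)
        self.blocks[block_id] = None  # the data itself is not modelled


cache = LRUCache(capacity=2)
for block_id in [1, 2, 1, 3, 2]:
    cache.access(block_id)
print(cache.hits, cache.misses, cache.evicted)  # 1 4 [2, 1]
```

<p>In this trace, the second access to block 1 is the only hit; block 2 is evicted when block 3 arrives because block 1 was used more recently.</p>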
<h3 id="cache-的管理" class="heading"><a href="#cache-的管理" class="heading-anchor" aria-label="章节： Cache 的管理" tabindex="-1"></a><span>Cache management</span></h3>
<p>Caches may be managed by hardware, the OS, software, or some combination of these. Most of the time this happens automatically, and application programmers do not need to worry about it.</p>
<p>CS:APP Figure 6.23 lists the caches at each level:</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig6.23.2f4f3b1d.webp" loading="lazy" src="/assets/csapp-fig6.23.2f4f3b1d.webp" width="1593" height="609" alt="The many kinds of caches found throughout a system"></picture></p>
<h3 id="cache-对-locality-的利用" class="heading"><a href="#cache-对-locality-的利用" class="heading-anchor" aria-label="章节： Cache 对 locality 的利用" tabindex="-1"></a><span>How caches exploit locality</span></h3>
<p>Temporal locality keeps repeatedly used data in the cache, making cache hits more likely. Storing cached data in blocks exploits spatial locality: when one piece of data is cached, the nearby data in the same block is cached along with it.</p>
<h2 id="cache-memories" class="heading"><a href="#cache-memories" class="heading-anchor" aria-label="章节： Cache Memories" tabindex="-1"></a><span>Cache Memories</span></h2>
<p>As the speed gap between the CPU and DRAM has widened, SRAM has been used to bridge it.</p>
<p>To keep things simple, the discussion below assumes there is only an L1 cache, with no L2 or L3 cache. (Alternatively, it can be read as a description of how an L3 cache works.)</p>
<h3 id="cache-的结构与读取" class="heading"><a href="#cache-的结构与读取" class="heading-anchor" aria-label="章节： Cache 的结构与读取" tabindex="-1"></a><span>Cache organization and reads</span></h3>
<p>Suppose main memory has <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mn>2</mn><mi>m</mi></msup></mrow><annotation encoding="application/x-tex">2^m</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6644em;"></span><span class="mord"><span class="mord">2</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6644em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">m</span></span></span></span></span></span></span></span></span></span></span></span> addresses, each holding one byte. Its cache is divided into <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mn>2</mn><mi>s</mi></msup></mrow><annotation encoding="application/x-tex">2^s</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6644em;"></span><span class="mord"><span class="mord">2</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6644em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">s</span></span></span></span></span></span></span></span></span></span></span></span> <i>cache sets</i>, each cache set contains <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi></mrow><annotation encoding="application/x-tex">E</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.05764em;">E</span></span></span></span></span> <i>cache lines</i>, and each cache line holds a data block of <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mn>2</mn><mi>b</mi></msup></mrow><annotation encoding="application/x-tex">2^b</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8491em;"></span><span class="mord"><span class="mord">2</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8491em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">b</span></span></span></span></span></span></span></span></span></span></span></span> bytes, a <i>valid bit</i>, and <i>tag bits</i> of length <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi><mo>=</mo><mi>m</mi><mo>−</mo><mi>b</mi><mo>−</mo><mi>s</mi></mrow><annotation encoding="application/x-tex">t = m-b-s</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6151em;"></span><span class="mord mathnormal">t</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6667em;vertical-align:-0.0833em;"></span><span class="mord mathnormal">m</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.7778em;vertical-align:-0.0833em;"></span><span class="mord mathnormal">b</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">s</span></span></span></span></span>.</p>
<p>Each address is split into three parts: the high-order <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi></mrow><annotation encoding="application/x-tex">t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6151em;"></span><span class="mord mathnormal">t</span></span></span></span></span> bits are the tag, the middle <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>s</mi></mrow><annotation encoding="application/x-tex">s</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">s</span></span></span></span></span> bits are the set index, and the low-order <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>b</mi></mrow><annotation encoding="application/x-tex">b</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal">b</span></span></span></span></span> bits are the block offset. To read the data at an address, the cache first uses the set index to select a cache set, then looks within that set for a cache line whose valid bit is 1 and whose tag matches, and finally uses the block offset to extract the requested byte from the block.</p>
<p>On a cache miss, the data must be fetched from the next level and stored in the cache. If every cache line in the corresponding cache set is already in use, one of the existing lines must be evicted.</p>
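<p>The address split described above can be sketched in a few lines. The function name <code>split_address</code> is hypothetical, and the example parameters (4-bit addresses, 4 sets, 2-byte blocks) are chosen only to keep the numbers small.</p>

```python
def split_address(addr, s, b):
    """Split an address into (tag, set index, block offset) for a cache
    with 2**s sets and 2**b-byte blocks."""
    offset = addr % 2**b                # low-order b bits
    set_index = (addr // 2**b) % 2**s   # middle s bits
    tag = addr // 2**(s + b)            # remaining high-order bits
    return tag, set_index, offset


# Address 13 is 0b1101: tag 1, set index 0b10, block offset 1.
print(split_address(13, s=2, b=1))  # (1, 2, 1)
# Addresses 0 and 8 fall into the same set and differ only in their tag.
print(split_address(0, s=2, b=1))   # (0, 0, 0)
print(split_address(8, s=2, b=1))   # (1, 0, 0)
```

<p>Two addresses refer to the same cached block exactly when their tag and set index both match; only the block offset distinguishes bytes within a block.</p>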
<h3 id="conflict-miss" class="heading"><a href="#conflict-miss" class="heading-anchor" aria-label="章节： Conflict Miss" tabindex="-1"></a><span>Conflict Miss</span></h3>
<p>The cache set design rests on an assumption: addresses accessed close together in time tend to differ in their low-order bits. That is not always the case. If a program accesses data at addresses separated by a multiple of <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mn>2</mn><mrow><mi>s</mi><mo>+</mo><mi>b</mi></mrow></msup></mrow><annotation encoding="application/x-tex">2^{s+b}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8491em;"></span><span class="mord"><span class="mord">2</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8491em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">s</span><span class="mbin mtight">+</span><span class="mord mathnormal mtight">b</span></span></span></span></span></span></span></span></span></span></span></span></span>, consecutive accesses map to the same cache set and can keep evicting each other, which causes conflict misses (the smaller <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi></mrow><annotation encoding="application/x-tex">E</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.05764em;">E</span></span></span></span></span> is, and especially when <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">E=1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.05764em;">E</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">1</span></span></span></span></span>, the more likely this becomes). For example, it can happen when array sizes are powers of <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>2</mn></mrow><annotation encoding="application/x-tex">2</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">2</span></span></span></span></span> and the program alternates between the same index of adjacent arrays. (I think this was covered at APIO2019; naturally I understood none of it at the time and only remembered not to make array sizes a power of <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>2</mn></mrow><annotation encoding="application/x-tex">2</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">2</span></span></span></span></span>.)</p>
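<p>The effect can be reproduced with a toy simulation. The sketch below models a direct-mapped cache with 2 sets and 16-byte blocks, and replays the address trace of a dot-product loop over two arrays of eight 4-byte elements. The helper names <code>count_misses</code> and <code>dot_product_trace</code> and the base addresses are assumptions chosen for illustration, not measurements of a real machine.</p>

```python
def count_misses(addresses, s, b):
    """Count misses in an initially empty direct-mapped cache (one line per
    set) with 2**s sets and 2**b-byte blocks."""
    lines = {}  # set index -> tag currently stored (missing key = invalid)
    misses = 0
    for addr in addresses:
        set_index = (addr // 2**b) % 2**s
        tag = addr // 2**(s + b)
        if lines.get(set_index) != tag:
            misses += 1
            lines[set_index] = tag  # evict whatever was there
    return misses


def dot_product_trace(x_base, y_base, n, elem_size=4):
    """Addresses touched by a loop that reads x[i] then y[i] for each i."""
    trace = []
    for i in range(n):
        trace.append(x_base + i * elem_size)
        trace.append(y_base + i * elem_size)
    return trace


# Arrays 32 bytes apart (a multiple of 2**(s+b)): x[i] and y[i] share a set.
thrashing = count_misses(dot_product_trace(0, 32, 8), s=1, b=4)
# Padding x by 16 bytes moves y to address 48, so x[i] and y[i] use different sets.
padded = count_misses(dot_product_trace(0, 48, 8), s=1, b=4)
print(thrashing, padded)  # 16 4
```

<p>With the arrays exactly 32 bytes apart, all 16 accesses miss because the two blocks keep evicting each other; shifting one array by a single block leaves only the 4 unavoidable cold misses.</p>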
<h3 id="cache-的分类" class="heading"><a href="#cache-的分类" class="heading-anchor" aria-label="章节： Cache 的分类" tabindex="-1"></a><span>Types of caches</span></h3>
<p>A cache with <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">E=1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.05764em;">E</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">1</span></span></span></span></span> is called a <i>direct-mapped cache</i>. (<s>The book spends ages explaining this in detail; it feels like a lot of filler.</s>)</p>
<p>A cache with <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi><mo>></mo><mn>1</mn></mrow><annotation encoding="application/x-tex">E > 1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7224em;vertical-align:-0.0391em;"></span><span class="mord mathnormal" style="margin-right:0.05764em;">E</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">></span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">1</span></span></span></span></span> is called a <i>set associative cache</i>. Among these, one with <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>s</mi><mo>></mo><mn>0</mn></mrow><annotation encoding="application/x-tex">s > 0</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5782em;vertical-align:-0.0391em;"></span><span class="mord mathnormal">s</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">></span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">0</span></span></span></span></span> is called an E-way set associative cache, and one with <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>s</mi><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">s = 0</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">s</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">0</span></span></span></span></span> (a single set holding every line) is called a fully associative cache.</p>
<h3 id="cache-的写入" class="heading"><a href="#cache-的写入" class="heading-anchor" aria-label="章节： Cache 的写入" tabindex="-1"></a><span>Cache writes</span></h3>
<p>On a write hit, there are two ways to proceed:</p>
<ul>
<li><i>write-through</i>: update both the cache and the next level</li>
<li><i>write-back</i>: update only the cache, and add a <i>dirty bit</i> to each cache line to record whether it has been modified; when a line is evicted, it is written to the next level only if it is dirty</li>
</ul>
<p>On a write miss, there are also two ways to proceed:</p>
<ul>
<li><i>write-allocate</i>: first fetch the block from the next level, then handle the write the same way as a write hit</li>
<li><i>no-write-allocate</i>: write directly to the next level without loading the block into the cache</li>
</ul>
<p>Typically write-through is paired with no-write-allocate, and write-back with write-allocate.</p>
<p>In practice, optimizing cache writes is a very complex problem, and this is only a brief introduction. As a programmer, you can assume that caches are write-back and write-allocate.</p>
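<p>To see why write-back can reduce traffic to the next level, here is a minimal sketch that counts how many write transfers reach the next level for a sequence of stores, under the two usual pairings. It models a direct-mapped cache at block granularity and counts transfers rather than bytes (a write-through transfer is a single store, a write-back transfer is a whole block). The function name <code>next_level_writes</code> and the traces are made up for illustration.</p>

```python
def next_level_writes(block_writes, num_sets, policy):
    """Count write transfers to the next level for a sequence of stores,
    each given as the block number it targets.

    policy is "write-through" (paired with no-write-allocate) or
    "write-back" (paired with write-allocate).
    """
    if policy == "write-through":
        # Every store is forwarded to the next level, hit or miss.
        return len(block_writes)
    if policy != "write-back":
        raise ValueError("unknown policy: " + policy)

    lines = {}  # set index -> (block number, dirty bit)
    writes = 0
    for block in block_writes:
        set_index = block % num_sets
        resident = lines.get(set_index)
        if resident is not None and resident[0] != block and resident[1]:
            writes += 1  # evicting a dirty line writes it back
        lines[set_index] = (block, True)  # allocate if needed, mark dirty
    # Flush remaining dirty lines so both policies leave the same final state.
    writes += sum(1 for _, dirty in lines.values() if dirty)
    return writes


repeated = [0, 0, 0, 1, 1, 0]  # many stores to the same two blocks
print(next_level_writes(repeated, 4, "write-through"))  # 6
print(next_level_writes(repeated, 4, "write-back"))     # 2
```

<p>Write-back wins when stores revisit the same blocks; when every store evicts a dirty line (for example, alternating between blocks 0 and 4 with 4 sets), it transfers just as often as write-through.</p>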
<h3 id="i-cache-和-d-cache" class="heading"><a href="#i-cache-和-d-cache" class="heading-anchor" aria-label="章节： i-cache 和 d-cache" tabindex="-1"></a><span>i-cache and d-cache</span></h3>
<p>A cache that holds only instructions is called an <i>i-cache</i>, one that holds only data is called a <i>d-cache</i>, and one that holds both is called a <i>unified cache</i>.</p>
<p>Separating the i-cache from the d-cache lets each be optimized on its own: for example, the i-cache is read-only, and the two can differ in size and in cache set configuration. The separation also helps avoid conflict misses between instruction accesses and data accesses.</p>
<p>In the Core i7 processor, each core has its own L1 i-cache, L1 d-cache, and L2 unified cache, and all cores share one L3 unified cache.</p>
<h3 id="cache-的性能" class="heading"><a href="#cache-的性能" class="heading-anchor" aria-label="章节： Cache 的性能" tabindex="-1"></a><span>Cache performance</span></h3>
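<p>Cache performance is measured by:</p>
<ul>
<li>miss rate: the fraction of accesses that miss</li>
<li>hit rate: the fraction of accesses that hit, equal to 1 minus the miss rate</li>
<li>hit time: the time an access takes on a cache hit</li>
<li>miss penalty: the additional time an access takes on a cache miss, which depends on which level finally supplies the data</li>
</ul>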
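<p>These metrics combine into an average access time. The sketch below takes the miss penalty to mean the extra time a miss adds on top of the hit time, so the average is the hit time plus the miss rate times the miss penalty. The function name <code>average_access_time</code> and the numbers (a 1-cycle hit time and a 100-cycle miss penalty) are illustrative assumptions, not measurements.</p>

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average time per access, with miss_penalty counted as the extra time
    a miss costs beyond the hit time."""
    return hit_time + miss_rate * miss_penalty


# Compare a 97% hit rate with a 99% hit rate.
at_97 = average_access_time(hit_time=1, miss_rate=0.03, miss_penalty=100)
at_99 = average_access_time(hit_time=1, miss_rate=0.01, miss_penalty=100)
print(at_97, at_99)
```

<p>Under these assumptions, a 97% hit rate averages 4 cycles per access and a 99% hit rate averages 2 cycles, so a two-point change in hit rate halves the average access time. This is why the miss rate is usually the more telling number.</p>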
<p>In general, cache parameters affect performance as follows:</p>
<ul>
<li>A larger cache size raises the hit rate, but a larger cache also tends to have a longer hit time.</li>
<li>A larger block size exploits spatial locality better, but for a fixed cache size it leaves fewer cache lines, which can lower the hit rate, and it increases the miss penalty because more data is transferred on each miss.</li>
<li>A larger <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>E</mi></mrow><annotation encoding="application/x-tex">E</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.05764em;">E</span></span></span></span></span> reduces the likelihood of conflict misses, but makes tag matching and victim line selection more complex, which increases hit time and miss penalty. In the Core i7 processor, the L1 and L2 caches are 8-way and the L3 cache is 16-way.</li>
<li>Write-through is simpler to implement, and a read miss never triggers a write. Write-back reduces the total number of transfers, which leaves more memory bandwidth for I/O devices that perform DMA and may also reduce the time spent transferring data. In general, levels lower in the memory hierarchy are more likely to use write-back.</li>
</ul>
<h2 id="the-impact-of-caches-on-program-performance" class="heading"><a href="#the-impact-of-caches-on-program-performance" class="heading-anchor" aria-label="章节： The Impact of Caches on Program Performance" tabindex="-1"></a><span>The Impact of Caches on Program Performance</span></h2>
<h3 id="the-memory-mountain" class="heading"><a href="#the-memory-mountain" class="heading-anchor" aria-label="章节： The Memory Mountain" tabindex="-1"></a><span>The Memory Mountain</span></h3>
<p>Access a data set of a given size with a given stride, plot throughput as a function of size and stride as a three-dimensional surface, and the result is a <i>memory mountain</i>.</p>
<p>CS:APP Figure 6.41 shows the memory mountain of a Core i7 (it is also the cover of CS:APP):</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig6.41.869a6222.webp" loading="lazy" src="/assets/csapp-fig6.41.869a6222.webp" width="1285" height="847" alt="Memory mountain of a Core i7"></picture></p>
<p>A memory mountain gives a fairly complete picture of a memory system's performance, and of how temporal locality and spatial locality affect it.</p>
<p>Throughput changes sharply at sizes that match the capacity of each cache level.</p>
<p>For a fixed size, a smaller stride gives higher throughput. The effect is especially pronounced as the stride approaches 1, which is closely tied to prefetching in the Core i7 system: the processor recognizes a stride-1 reference pattern and prefetches data before it is actually accessed.</p>
<h3 id="矩阵乘法的循环顺序" class="heading"><a href="#矩阵乘法的循环顺序" class="heading-anchor" aria-label="章节： 矩阵乘法的循环顺序" tabindex="-1"></a><span>Loop order in matrix multiplication</span></h3>
<p>(The book goes on about this at length, <s>which feels like a lot of filler</s>, so I will just post the test results.) (CS:APP Figure 6.46)</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig6.46.827ccc19.webp" loading="lazy" src="/assets/csapp-fig6.46.827ccc19.webp" width="1233" height="767" alt="Matrix multiplication performance on a Core i7"></picture></p>]]></content:encoded>
            <category domain="https://ouuan.moe/tag/csapp">csapp</category>
            <category domain="https://ouuan.moe/tag/%E5%AD%A6%E4%B9%A0%E7%AC%94%E8%AE%B0">学习笔记</category>
        </item>
        <item>
            <title><![CDATA[CS:APP Chapter 7 Study Notes]]></title>
            <link>https://ouuan.moe/post/2022/10/csapp-7</link>
            <guid>https://ouuan.moe/post/2022/10/csapp-7</guid>
            <pubDate>Mon, 31 Oct 2022 10:39:06 GMT</pubDate>
            <description><![CDATA[

<p>Study notes on Chapter 7, "Linking", of <a href="https://csapp.cs.cmu.edu/">CS:APP</a>.</p>
<p>This chapter covers program linking. Understanding linking helps you make sense of linker errors, avoid linking-related bugs, understand the scope of variables and functions, follow the linking-related steps that happen while a program runs, and learn how to use shared libraries (dynamic link libraries).</p>
]]></description>
            <content:encoded><![CDATA[

<p>Study notes on Chapter 7, "Linking", of <a href="https://csapp.cs.cmu.edu/">CS:APP</a>.</p>
<p>This chapter covers program linking. Understanding linking helps you make sense of linker errors, avoid linking-related bugs, understand the scope of variables and functions, follow the linking-related steps that happen while a program runs, and learn how to use shared libraries (dynamic link libraries).</p>

<h2 id="compiler-drivers" class="heading"><a href="#compiler-drivers" class="heading-anchor" aria-label="章节： Compiler Drivers" tabindex="-1"></a><span>Compiler Drivers</span></h2>
<p>Compiling a source file actually involves several steps. A compiler driver (such as gcc) invokes them in order, and <code>gcc -v</code> shows the details of each step.</p>
<ol>
<li><code>cpp</code>: preprocessing, source code <code>.c</code> -> intermediate file <code>.i</code></li>
<li><code>cc1</code>: <code>.i</code> -> assembly code <code>.s</code></li>
<li><code>as</code>: <code>.s</code> -> relocatable object file <code>.o</code></li>
<li><code>ld</code>: linking, multiple <code>.o</code> files (or libraries) -> executable object file</li>
</ol>
<p>P.S. Files from the intermediate steps can also be passed to <code>gcc</code>, e.g. <code>gcc a.s -o a</code>.</p>
<h2 id="static-linking" class="heading"><a href="#static-linking" class="heading-anchor" aria-label="章节： Static Linking" tabindex="-1"></a><span>Static Linking</span></h2>
<p>Static linking has two main tasks:</p>
<ol>
<li><i>Symbol resolution</i>: a relocatable object file contains many symbols, including functions, global variables, and static variables; the linker must associate each symbol reference with exactly one symbol definition.</li>
<li><i>Relocation</i>: addresses in a relocatable object file start at 0; the linker must assign each symbol definition its final address and update every symbol reference accordingly.</li>
</ol>
<h2 id="object-files" class="heading"><a href="#object-files" class="heading-anchor" aria-label="章节： Object Files" tabindex="-1"></a><span>Object Files</span></h2>
<p>There are three kinds of object files:</p>
<ol>
<li>Relocatable object file</li>
<li>Executable object file</li>
<li>Shared object file: a special kind of relocatable object file that can be dynamically linked at load time or run time</li>
</ol>
<p>Object files come in different formats: Windows uses the Portable Executable (PE) format, macOS uses Mach-O, and modern x86-64 Linux/Unix systems use the Executable and Linkable Format (ELF). This chapter is based on ELF-64.</p>
<h2 id="relocatable-object-files" class="heading"><a href="#relocatable-object-files" class="heading-anchor" aria-label="章节： Relocatable Object Files" tabindex="-1"></a><span>Relocatable Object Files</span></h2>
<p>An ELF relocatable object file typically contains the following sections:</p>
<ol>
<li><code>.text</code>: the program's machine code</li>
<li><code>.rodata</code>: read-only data</li>
<li><code>.data</code>: global and static variables that need an initial value stored in the file</li>
<li><code>.bss</code>: global and static variables that are uninitialized or initialized to zero; they start out as zero at run time, so they take up no space in the object file</li>
<li><code>.symtab</code>: the symbol table, holding information about symbols (functions and global variables); it is present without the <code>-g</code> option but contains no information about local variables</li>
<li><code>.rel.text</code>: lists the locations in <code>.text</code> that must be modified at link time, typically calls to external functions or references to global variables, whereas <a href="/post/2022/09/csapp-3#jump-%E6%8C%87%E4%BB%A4%E7%BC%96%E7%A0%81">calls to local functions need no modification</a></li>
<li><code>.rel.data</code>: lists the locations in <code>.data</code> that must be modified at link time, typically where the value of a global variable is the address of another global variable or of an external function</li>
<li><code>.debug</code>: debugging information, including local variables, typedefs, and the source code; present only with the <code>-g</code> option</li>
<li><code>.line</code>: the mapping between line numbers in the source code and machine code; present only with the <code>-g</code> option</li>
<li><code>.strtab</code>: a pool of characters used by other sections; a string is denoted by pointing at a position in it (from that position up to the next <code>\0</code>)</li>
</ol>
<h2 id="symbols-and-symbol-tables" class="heading"><a href="#symbols-and-symbol-tables" class="heading-anchor" aria-label="章节： Symbols and Symbol Tables" tabindex="-1"></a><span>Symbols and Symbol Tables</span></h2>
<p>From the linker's point of view, there are three kinds of symbols:</p>
<ol>
<li>Defined in this module and visible to other modules: non-<code>static</code> functions and global variables in C</li>
<li>Defined in another module, e.g. global variables declared <code>extern</code> in C</li>
<li>Defined in this module and not visible to other modules: <code>static</code> functions and variables in C</li>
</ol>
<p>An ELF64 symbol contains the following information (CS:APP Figure 7.4):</p>
<section class="code-block relative my-6 shadow" itemprop="hasPart" itemscope itemtype="https://schema.org/SoftwareSourceCode" data-v-c675dba6><div class="h-6 items-center rd-t-1 bg-area px-4 dark:bg-#2A313A media-screen:important-flex" style="display:none;" data-v-c675dba6><h3 class="text-3 text-footer" itemprop="programmingLanguage" aria-label="C 代码块" data-v-c675dba6>C</h3><ile-root id="ile-4"><button title="复制到剪贴板" class="copy-button b-footer text-footer" data-v-63dfb2af><span class="i-mdi-content-copy" data-v-63dfb2af></span><span class="sr-only" role="status" data-v-63dfb2af></span></button></ile-root><!--ISLAND_HYDRATION_PLACEHOLDER_ile-4--></div><div class="dark:hidden" itemprop="text" data-v-c675dba6><pre class="shiki light" style="background-color: #FBFBFB" tabindex="0"><code><span><span style="color: #994CC3">typedef</span><span style="color: #403F53"> </span><span style="color: #994CC3">struct</span></span>
<span><span style="color: #403F53">{</span></span>
<span><span style="color: #403F53">    </span><span style="color: #994CC3">int</span><span style="color: #403F53">   name;</span><span style="color: #989FB1">      /* String table offset */</span></span>
<span><span style="color: #403F53">    </span><span style="color: #994CC3">char</span><span style="color: #403F53">  type:</span><span style="color: #AA0982">4</span><span style="color: #403F53">,</span><span style="color: #989FB1">    /* Function or data (4 bits) */</span></span>
<span><span style="color: #403F53">          binding:</span><span style="color: #AA0982">4</span><span style="color: #403F53">;</span><span style="color: #989FB1"> /* Local or global (4 bits) */</span></span>
<span><span style="color: #403F53">    </span><span style="color: #994CC3">char</span><span style="color: #403F53">  reserved;</span><span style="color: #989FB1">  /* Unused */</span></span>
<span><span style="color: #403F53">    </span><span style="color: #994CC3">short</span><span style="color: #403F53"> section;</span><span style="color: #989FB1">   /* Section header index */</span></span>
<span><span style="color: #403F53">    </span><span style="color: #994CC3">long</span><span style="color: #403F53">  value;</span><span style="color: #989FB1">     /* Section offset or absolute address */</span></span>
<span><span style="color: #403F53">    </span><span style="color: #994CC3">long</span><span style="color: #403F53">  size;</span><span style="color: #989FB1">      /* Object size in bytes */</span></span>
<span><span style="color: #403F53">} Elf64_Symbol;</span></span></code></pre></div><div class="dark:important-block" style="display:none;" data-v-c675dba6><pre class="shiki dark" style="background-color: #011627" tabindex="0"><code><span><span style="color: #C792EA">typedef</span><span style="color: #D6DEEB"> </span><span style="color: #C792EA">struct</span></span>
<span><span style="color: #D6DEEB">{</span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #C792EA">int</span><span style="color: #D6DEEB">   name;</span><span style="color: #637777">      /* String table offset */</span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #C792EA">char</span><span style="color: #D6DEEB">  type:</span><span style="color: #F78C6C">4</span><span style="color: #D6DEEB">,</span><span style="color: #637777">    /* Function or data (4 bits) */</span></span>
<span><span style="color: #D6DEEB">          binding:</span><span style="color: #F78C6C">4</span><span style="color: #D6DEEB">;</span><span style="color: #637777"> /* Local or global (4 bits) */</span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #C792EA">char</span><span style="color: #D6DEEB">  reserved;</span><span style="color: #637777">  /* Unused */</span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #C792EA">short</span><span style="color: #D6DEEB"> section;</span><span style="color: #637777">   /* Section header index */</span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #C792EA">long</span><span style="color: #D6DEEB">  value;</span><span style="color: #637777">     /* Section offset or absolute address */</span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #C792EA">long</span><span style="color: #D6DEEB">  size;</span><span style="color: #637777">      /* Object size in bytes */</span></span>
<span><span style="color: #D6DEEB">} Elf64_Symbol;</span></span></code></pre></div></section>
<p>In a relocatable object file, <code>value</code> is the symbol's offset from the start of its section; in an executable object file, it is the symbol's absolute address.</p>
<p><code>section</code> is (the index of) one of the object file's sections; in a relocatable object file it can also be a pseudosection:</p>
<ul>
<li>ABS: a symbol that should not be relocated</li>
<li>UNDEF: an undefined symbol (defined in another module)</li>
<li>COMMON: a symbol shared by multiple modules (see <a href="#symbol-resolution">Symbol Resolution</a>); in this case <code>value</code> gives the data alignment requirement and <code>size</code> gives the minimum size</li>
</ul>
<p>Uninitialized static variables, and global or static variables initialized to zero, go in <code>.bss</code>.</p>
<p>An uninitialized global variable goes in COMMON if the <code>-fcommon</code> option is in effect, and in <code>.bss</code> otherwise. <code>-fcommon</code> was the default up to gcc 9; since gcc 10 the default is <code>-<wbr>fno<wbr>-<wbr>common</code>. In C++ <code>-fcommon</code> has no effect, and uninitialized global variables always go in <code>.bss</code>.</p>
<p>Use <code>readelf -s a.o</code> to view the <code>.symtab</code> of <code>a.o</code>.</p>
<h2 id="symbol-resolution" class="heading"><a href="#symbol-resolution" class="heading-anchor" aria-label="章节： Symbol Resolution" tabindex="-1"></a><span>Symbol Resolution</span></h2>
<p>Symbol resolution means associating each symbol reference with exactly one symbol definition.</p>
<p>Resolving local symbols is easy, because compiling a single module already guarantees that each local symbol is unique.</p>
<p>A global symbol can run into several cases:</p>
<ul>
<li>If exactly one module defines the global symbol, that definition is used</li>
<li>If no module defines the global symbol, the linker reports an undefined reference error</li>
<li>If multiple modules define the global symbol:
<ul>
<li>If more than one of the definitions is outside COMMON, the linker reports a multiple definition error</li>
<li>If exactly one of them is outside COMMON, that one is used</li>
<li>If all of them are in COMMON, the one with the largest <code>size</code> is used (if the <code>size</code> values are equal, it makes no difference which one is used); if the <code>size</code> values differ, the linker also issues a warning</li>
</ul>
</li>
</ul>
<p>In other words, with <code>-fcommon</code>, defining the same global variable in multiple modules with at most one of the definitions initialized links without error and may produce surprising results. Put differently, a multiple definition error is in essence a multiple initialization error.</p>
<p>In C++, overloaded functions and class methods go through <i>mangling</i>, so that each overload gets a unique symbol name.</p>
<h2 id="static-libraries" class="heading"><a href="#static-libraries" class="heading-anchor" aria-label="章节： Static Libraries" tabindex="-1"></a><span>Static Libraries</span></h2>
<p>A static library is essentially a bundle of object files packaged together. Its advantages are:</p>
<ol>
<li>No need to recompile every time (compared with distributing source code)</li>
<li>The library is decoupled from the compiler (compared with building library functions into the compiler)</li>
<li>Only the object files actually used are copied into the final executable, avoiding wasted space (compared with providing a single object file)</li>
<li>The object files needed are selected automatically, so the compile command only has to name a few libraries (compared with providing a pile of object files)</li>
</ol>
<p>A command like <code>ar rcs libabc.a a.o b.o c.o</code> creates a static library.</p>
<p>There are two ways to use a static library when compiling:</p>
<ul>
<li>Pass the path of the static library directly as an argument: <code>libabc.a</code></li>
<li>Use <code>-lname</code> to link against <code>libname.a</code>; this requires <code>-Ldir</code> to add <code>dir</code> to the search path of <code>-l</code>: <code>-L. -labc</code></li>
</ul>
<p>As a special case, the compiler driver automatically passes <code>libc.a</code> to the linker, so it does not need to be specified manually.</p>
<p>At link time, the linker processes the arguments one by one:</p>
<ul>
<li>If an argument is an object file, it is always used</li>
<li>If it is a static library, the linker examines each object file it contains in turn; if an object file defines a symbol that is currently referenced but still undefined, that object file is used. This process repeats until no new object file is added (for example, if <code>main.c</code> references symbols in <code>b.o</code> but not in <code>a.o</code>, <code>b.o</code> references symbols in <code>a.o</code>, and <code>a.o</code> comes before <code>b.o</code> in <code>libabc.a</code>, then the first pass uses only <code>b.o</code>, the second pass uses <code>a.o</code>, and <code>c.o</code> is never used)</li>
</ul>
<p>Because of this process, the order of arguments on the command line and the order of object files within a static library can affect the result:</p>
<ul>
<li>Libraries generally belong at the end of the command line; otherwise, when a library is processed none of its symbols have been referenced yet, the corresponding object files are not used, and the link ends with an undefined reference error</li>
<li>If libraries depend on one another, a library that other libraries depend on must come later</li>
<li>If libraries have circular dependencies, the same library may need to be named more than once on the command line (alternatively, merge the two libraries into one, so that the repeated passes resolve the cycle)</li>
<li>Libraries should be designed to avoid multiple definitions, but in theory a different argument order, or a different order of object files within a static library, could lead to a multiple definition error</li>
</ul>
<h2 id="relocation" class="heading"><a href="#relocation" class="heading-anchor" aria-label="章节： Relocation" tabindex="-1"></a><span>Relocation</span></h2>
<p>Relocation consists of two steps:</p>
<ol>
<li>Assign new memory addresses to symbol definitions</li>
<li>Update symbol references accordingly</li>
</ol>
<p>The first step is simple: concatenate the sections of each kind from all the object files.</p>
<p>To update symbol references, the linker needs to know:</p>
<ol>
<li>Where each symbol reference that must be modified is located</li>
<li>What it should be changed to</li>
</ol>
<p>This information is stored in the <code>.rel.text</code> and <code>.rel.data</code> sections of a relocatable object file. Each piece of it is called a relocation entry and contains:</p>
<ul>
<li><code>offset</code>: the offset of the symbol reference within its section. That is, adding <code>offset</code> to the address of the section containing the reference gives the address of the reference.</li>
<li><code>type</code>: there are many relocation types; CS:APP covers only two of them, <code>R_<wbr>X86_<wbr>64_<wbr>PC32</code> and <code>R_X86_64_32</code>.</li>
<li><code>symbol</code>: the index of the referenced symbol in the symbol table.</li>
<li><code>addend</code>: a constant added at the end when computing the symbol's address (see below).</li>
</ul>
<p>In short, <code>R_X86_64_32</code> uses an absolute address and <code>R_<wbr>X86_<wbr>64_<wbr>PC32</code> uses a PC-relative address, and both relocation types support only 32-bit addresses (if a program is larger than 2 GB, the <code>-<wbr>mcmodel<wbr>=<wbr>medium<wbr>/<wbr>large</code> option is required).</p>
<ul>
<li><code>R_X86_64_32</code>: the updated reference is the symbol's address plus <code>addend</code></li>
<li><code>R_<wbr>X86_<wbr>64_<wbr>PC32</code>: the updated reference is the symbol's address minus the reference's address, plus <code>addend</code>; note that the subtraction uses the address of the reference, not the PC at the time the instruction containing the reference executes, so <code>addend</code> is usually needed as a correction</li>
</ul>
<p>Use <code>objdump -dx</code> to show relocation entries inline in the disassembly, or <code>readelf -r</code> to list all relocation entries.</p>
<p>For example, compiling the following with GCC 8.5:</p>
<section class="code-block relative my-6 shadow" itemprop="hasPart" itemscope itemtype="https://schema.org/SoftwareSourceCode" data-v-c675dba6><div class="h-6 items-center rd-t-1 bg-area px-4 dark:bg-#2A313A media-screen:important-flex" style="display:none;" data-v-c675dba6><h3 class="text-3 text-footer" itemprop="programmingLanguage" aria-label="C 代码块" data-v-c675dba6>C</h3><ile-root id="ile-5"><button title="复制到剪贴板" class="copy-button b-footer text-footer" data-v-63dfb2af><span class="i-mdi-content-copy" data-v-63dfb2af></span><span class="sr-only" role="status" data-v-63dfb2af></span></button></ile-root><!--ISLAND_HYDRATION_PLACEHOLDER_ile-5--></div><div class="dark:hidden" itemprop="text" data-v-c675dba6><pre class="shiki light" style="background-color: #FBFBFB" tabindex="0"><code><span><span style="color: #994CC3">int</span><span style="color: #403F53"> </span><span style="color: #4876D6">foo</span><span style="color: #403F53">(</span><span style="color: #994CC3">int</span><span style="color: #403F53"> </span><span style="color: #0C969B">*</span><span style="color: #403F53">arr);</span></span>
<span></span>
<span><span style="color: #994CC3">int</span><span style="color: #403F53"> </span><span style="color: #4876D6">a</span><span style="color: #403F53">[</span><span style="color: #AA0982">3</span><span style="color: #403F53">] </span><span style="color: #994CC3">=</span><span style="color: #403F53"> {</span><span style="color: #AA0982">1</span><span style="color: #403F53">, </span><span style="color: #AA0982">2</span><span style="color: #403F53">, </span><span style="color: #AA0982">3</span><span style="color: #403F53">};</span></span>
<span><span style="color: #994CC3">int</span><span style="color: #403F53"> </span><span style="color: #0C969B">*</span><span style="color: #403F53">b </span><span style="color: #994CC3">=</span><span style="color: #403F53"> </span><span style="color: #0C969B">&amp;</span><span style="color: #4876D6">a</span><span style="color: #403F53">[</span><span style="color: #AA0982">2</span><span style="color: #403F53">];</span></span>
<span></span>
<span><span style="color: #994CC3">int</span><span style="color: #403F53"> </span><span style="color: #4876D6">bar</span><span style="color: #403F53">()</span></span>
<span><span style="color: #403F53">{</span></span>
<span><span style="color: #403F53">    </span><span style="color: #994CC3">return</span><span style="color: #403F53"> </span><span style="color: #4876D6">foo(</span><span style="color: #0C969B">&amp;</span><span style="color: #4876D6">a[</span><span style="color: #AA0982">1</span><span style="color: #4876D6">])</span><span style="color: #403F53">;</span></span>
<span><span style="color: #403F53">}</span></span></code></pre></div><div class="dark:important-block" style="display:none;" data-v-c675dba6><pre class="shiki dark" style="background-color: #011627" tabindex="0"><code><span><span style="color: #C792EA">int</span><span style="color: #D6DEEB"> </span><span style="color: #82AAFF">foo</span><span style="color: #D6DEEB">(</span><span style="color: #C792EA">int</span><span style="color: #D6DEEB"> </span><span style="color: #7FDBCA">*</span><span style="color: #D7DBE0">arr</span><span style="color: #D6DEEB">);</span></span>
<span></span>
<span><span style="color: #C792EA">int</span><span style="color: #D6DEEB"> </span><span style="color: #C5E478">a</span><span style="color: #D6DEEB">[</span><span style="color: #F78C6C">3</span><span style="color: #D6DEEB">] </span><span style="color: #C792EA">=</span><span style="color: #D6DEEB"> {</span><span style="color: #F78C6C">1</span><span style="color: #D6DEEB">, </span><span style="color: #F78C6C">2</span><span style="color: #D6DEEB">, </span><span style="color: #F78C6C">3</span><span style="color: #D6DEEB">};</span></span>
<span><span style="color: #C792EA">int</span><span style="color: #D6DEEB"> </span><span style="color: #7FDBCA">*</span><span style="color: #D6DEEB">b </span><span style="color: #C792EA">=</span><span style="color: #D6DEEB"> </span><span style="color: #7FDBCA">&amp;</span><span style="color: #C5E478">a</span><span style="color: #D6DEEB">[</span><span style="color: #F78C6C">2</span><span style="color: #D6DEEB">];</span></span>
<span></span>
<span><span style="color: #C792EA">int</span><span style="color: #D6DEEB"> </span><span style="color: #82AAFF">bar</span><span style="color: #D6DEEB">()</span></span>
<span><span style="color: #D6DEEB">{</span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #C792EA">return</span><span style="color: #D6DEEB"> </span><span style="color: #82AAFF">foo(</span><span style="color: #7FDBCA">&amp;</span><span style="color: #C5E478">a</span><span style="color: #82AAFF">[</span><span style="color: #F78C6C">1</span><span style="color: #82AAFF">])</span><span style="color: #D6DEEB">;</span></span>
<span><span style="color: #D6DEEB">}</span></span></code></pre></div></section>
<p><code>readelf -r</code>:</p>
<section class="code-block relative my-6 shadow" data-v-c675dba6><div class="h-6 items-center rd-t-1 bg-area px-4 dark:bg-#2A313A media-screen:important-flex" style="display:none;" data-v-c675dba6><h3 class="text-3 text-footer" aria-label="plain text 代码块" data-v-c675dba6>plain text</h3><ile-root id="ile-6"><button title="复制到剪贴板" class="copy-button b-footer text-footer" data-v-63dfb2af><span class="i-mdi-content-copy" data-v-63dfb2af></span><span class="sr-only" role="status" data-v-63dfb2af></span></button></ile-root><!--ISLAND_HYDRATION_PLACEHOLDER_ile-6--></div><div class="dark:hidden" data-v-c675dba6><pre class="shiki light" style="background-color: #FBFBFB" tabindex="0"><samp><span><span style="color: #403f53">Relocation section &#39;.rela.text&#39; at offset 0x250 contains 2 entries:</span></span>
<span><span style="color: #403f53">  Offset          Info           Type           Sym. Value    Sym. Name + Addend</span></span>
<span><span style="color: #403f53">000000000001  000a0000000a R_X86_64_32       0000000000000008 a + 4</span></span>
<span><span style="color: #403f53">000000000006  000b00000002 R_X86_64_PC32     0000000000000000 foo - 4</span></span>
<span><span style="color: #403f53"></span></span>
<span><span style="color: #403f53">Relocation section &#39;.rela.data&#39; at offset 0x280 contains 1 entry:</span></span>
<span><span style="color: #403f53">  Offset          Info           Type           Sym. Value    Sym. Name + Addend</span></span>
<span><span style="color: #403f53">000000000000  000a00000001 R_X86_64_64       0000000000000008 a + 8</span></span></samp></pre></div><div class="dark:important-block" style="display:none;" data-v-c675dba6><pre class="shiki dark" style="background-color: #011627" tabindex="0"><samp><span><span style="color: #d6deeb">Relocation section &#39;.rela.text&#39; at offset 0x250 contains 2 entries:</span></span>
<span><span style="color: #d6deeb">  Offset          Info           Type           Sym. Value    Sym. Name + Addend</span></span>
<span><span style="color: #d6deeb">000000000001  000a0000000a R_X86_64_32       0000000000000008 a + 4</span></span>
<span><span style="color: #d6deeb">000000000006  000b00000002 R_X86_64_PC32     0000000000000000 foo - 4</span></span>
<span><span style="color: #d6deeb"></span></span>
<span><span style="color: #d6deeb">Relocation section &#39;.rela.data&#39; at offset 0x280 contains 1 entry:</span></span>
<span><span style="color: #d6deeb">  Offset          Info           Type           Sym. Value    Sym. Name + Addend</span></span>
<span><span style="color: #d6deeb">000000000000  000a00000001 R_X86_64_64       0000000000000008 a + 8</span></span></samp></pre></div></section>
<p>In <code>.rela.text</code>, the <code>addend</code> for <code>a</code> is <code>4</code>, which yields the address of <code>a[1]</code> directly rather than that of <code>a[0]</code>. The <code>addend</code> for <code>foo</code> is <code>-4</code> because the address of the reference is the address of the instruction following the <code>jmp</code> that contains it, minus 4; the PC plus the address of <code>foo</code> minus the address of the reference would therefore give the address of <code>foo</code> plus 4, and <code>addend</code> corrects for that.</p>
<h2 id="executable-object-files" class="heading"><a href="#executable-object-files" class="heading-anchor" aria-label="章节： Executable Object Files" tabindex="-1"></a><span>Executable Object Files</span></h2>
<p>An executable is broadly similar in content to a relocatable object file. The main differences are:</p>
<ul>
<li>The ELF header specifies the program's entry point</li>
<li>There is an <code>.init</code> section, which defines a small function used to initialize the program</li>
<li>There is a program header table, which describes how the file maps to memory: which part of the file maps to which part of memory, how addresses are aligned, and the permissions of each segment (<code>.init</code>, <code>.text</code>, and <code>.rodata</code> are <code>r-x</code>; <code>.data</code> and <code>.bss</code> are <code>rw-</code>)</li>
<li><code>.symtab</code>, <code>.debug</code>, <code>.line</code>, and <code>.strtab</code> are not loaded into memory when the program runs</li>
<li>If the file is fully linked, it has no <code>.rel</code> sections</li>
</ul>
<h2 id="loading-executable-object-files" class="heading"><a href="#loading-executable-object-files" class="heading-anchor" aria-label="章节： Loading Executable Object Files" tabindex="-1"></a><span>Loading Executable Object Files</span></h2>
<p>When a program runs, its run-time memory image looks roughly like the figure below (CS:APP Figure 7.15):</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig7.15.ce83adc4.webp" loading="lazy" src="/assets/csapp-fig7.15.ce83adc4.webp" width="648" height="619" alt="Linux x86-64 run-time memory image"></picture></p>
<p>(The <a href="https://csapp.cs.cmu.edu/3e/errata.html">errata</a> point out that the stack does not actually start at <span class="math math-inline"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mn>2</mn><mn>48</mn></msup><mo>−</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">2^{48}-1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8974em;vertical-align:-0.0833em;"></span><span class="mord"><span class="mord">2</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">48</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">1</span></span></span></span></span>.)</p>
<p>Because of address alignment, address-space layout randomization, and similar factors, the actual memory layout differs somewhat from the figure, but the relative positions of the regions match it.</p>
<p>When the loader loads an executable, it first creates the memory image, then maps the contents of the executable into memory according to the program header table, and finally jumps to the program's entry point. For a C program the entry point is the address of the <code>_start</code> function (defined in <code>crt1.o</code>); <code>_start</code> in turn calls <code>_<wbr>_<wbr>libc_<wbr>start_<wbr>main</code> (defined in <code>libc.so</code>), which initializes the execution environment, calls <code>main</code>, and finally handles its return value.</p>
<h2 id="dynamic-linking-with-shared-libraries" class="heading"><a href="#dynamic-linking-with-shared-libraries" class="heading-anchor" aria-label="Section: Dynamic Linking with Shared Libraries" tabindex="-1"></a><span>Dynamic Linking with Shared Libraries</span></h2>
<p>Static libraries have some drawbacks:</p>
<ol>
<li>Updating a static library requires relinking the programs that use it</li>
<li>Every program carries its own copy of the library, which wastes space</li>
</ol>
<p>Shared libraries solve these problems. A shared library can be dynamically linked at either load time or run time, and the linking is performed by the dynamic linker. Shared libraries are also called shared objects; on Linux they use the <code>.so</code> suffix, and on Windows they are called DLLs.</p>
<p>A shared library is shared in two ways:</p>
<ol>
<li>There is a single <code>.so</code> file in the file system, and executables contain no copy of the library</li>
<li>In memory, a single copy of the shared library's <code>.text</code> section can be used by multiple processes at the same time</li>
</ol>
<p>A shared library can be built with a command like <code>gcc -shared -fpic a.c b.c c.c -o libabc.so</code>. The <code>-shared</code> option tells the compiler to produce a shared object, and <code>-fpic</code> makes it generate <a href="#position-independent-code-pic">position-independent code</a>.</p>
<p>A program can then be linked against the shared library with a command like <code>gcc<wbr> <wbr>main<wbr>.<wbr>c<wbr> ./<wbr>libabc<wbr>.<wbr>so<wbr> -<wbr>o<wbr> <wbr>main</code>.</p>
<p>When <code>main</code> is run, the loader finds the dynamic linker <code>ld<wbr>-<wbr>linux<wbr>.<wbr>so</code> named in the <code>.interp</code> section before entering the entry point. It therefore lets the dynamic linker relocate the shared libraries and fix up the symbol references in the program, after which control is handed back to the program.</p>
<h2 id="loading-and-linking-shared-libraries-from-applications" class="heading"><a href="#loading-and-linking-shared-libraries-from-applications" class="heading-anchor" aria-label="Section: Loading and Linking Shared Libraries from Applications" tabindex="-1"></a><span>Loading and Linking Shared Libraries from Applications</span></h2>
<p>Besides naming the shared libraries at build time and linking them at load time, a program can also load and use shared libraries at run time.</p>
<p>The relevant C functions are declared in the <code>dlfcn.h</code> header, and the program has to be linked with <code>-ldl</code> to use them (glibc 2.34 and later provide these functions in libc itself, so <code>-ldl</code> is no longer required there):</p>
<ul>
<li>
<p><code>void<wbr> *<wbr>dlopen<wbr>(<wbr>const<wbr> <wbr>char<wbr> *<wbr>filename<wbr>, <wbr>int<wbr> <wbr>flag<wbr>)</code>: loads a shared library</p>
<ul>
<li>
<p>Return value: a handle if the library was loaded successfully, <code>NULL</code> otherwise</p>
</li>
<li>
<p><code>filename</code>: file name of the shared library</p>
</li>
<li>
<p><code>flag</code>: controls how external symbols referenced by the shared library are resolved; it must include exactly one of <code>RTLD_NOW</code> and <code>RTLD_LAZY</code></p>
<ul>
<li><code>RTLD_NOW</code>: resolve all external symbols immediately</li>
<li><code>RTLD_LAZY</code>: defer resolving external symbols until code in the shared library is executed</li>
<li><code>RTLD_GLOBAL</code>: make this shared library available for resolving external symbols of shared libraries loaded later</li>
</ul>
<p>If the program was compiled with the <code>-rdynamic</code> option, external symbols can be resolved not only against other shared libraries that were loaded with <code>RTLD_GLOBAL</code>, but also against the program's own global symbols.</p>
</li>
</ul>
</li>
<li>
<p><code>void<wbr> *<wbr>dlsym<wbr>(<wbr>void<wbr> *<wbr>handle<wbr>, <wbr>const<wbr> <wbr>char<wbr> *<wbr>symbol<wbr>)</code>: returns the address of a symbol in a shared library</p>
<ul>
<li>
<p><code>symbol</code>: name of the symbol</p>
</li>
<li>
<p>Return value: the address of the symbol on success, <code>NULL</code> otherwise</p>
</li>
</ul>
</li>
<li>
<p><code>int<wbr> <wbr>dlclose<wbr>(<wbr>void<wbr> *<wbr>handle<wbr>)</code>: closes a shared library</p>
<ul>
<li>Return value: <code>0</code> on success, <code>-1</code> on error</li>
</ul>
</li>
<li>
<p><code>const<wbr> <wbr>char<wbr> *<wbr>dlerror<wbr>(<wbr>void<wbr>)</code>: returns the error message of the most recent call to <code>dlopen</code> / <code>dlsym</code> / <code>dlclose</code>, or <code>NULL</code> if that call did not fail</p>
</li>
</ul>
<p>CS:APP provides a reference example: <a href="https://csapp.cs.cmu.edu/3e/ics3/code/link/dll.c">code/link/dll.c</a>.</p>
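<p>To show how the four functions fit together, here is a minimal sketch that loads a library at run time and calls one function from it. The library name <code>libm.so.6</code>, the symbol <code>cos</code>, and the helper name <code>call_cos_dynamically</code> are assumptions chosen for this example (they are not taken from CS:APP's <code>dll.c</code>), and it assumes a Linux system with glibc.</p>

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: load a library at run time and call its cos().
 * "libm.so.6" is an assumed library name (the glibc math library).
 * Returns 0 on success and -1 on any failure. */
int call_cos_dynamically(double x, double *result)
{
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (handle == NULL)
        return -1;                  /* dlerror() would describe the failure */

    dlerror();                      /* clear any stale error state */
    double (*cos_fn)(double);
    /* Assign through void ** to avoid a direct object-to-function cast. */
    *(void **)(&cos_fn) = dlsym(handle, "cos");
    if (dlerror() != NULL || cos_fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *result = cos_fn(x);            /* call the function found at run time */
    return dlclose(handle) == 0 ? 0 : -1;
}
```

<p>Checking <code>dlerror()</code> after <code>dlsym</code>, rather than only comparing the result with <code>NULL</code>, is the pattern suggested by the Linux manual page, because a symbol's value may legitimately be <code>NULL</code>.</p>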
<h2 id="position-independent-code-pic" class="heading"><a href="#position-independent-code-pic" class="heading-anchor" aria-label="Section: Position-Independent Code (PIC)" tabindex="-1"></a><span>Position-Independent Code (PIC)</span></h2>
<p>An important property of a shared library is that only one copy of its code segment exists in memory, shared by multiple processes. As a result, the symbol references in its code cannot be patched during dynamic linking, so the relocation scheme used for static linking does not work, and the code of a shared library has to be position-independent.</p>
<p>PIC rests on the following two observations:</p>
<ol>
<li>Although the code segment of a shared library is shared, each process has its own copy of the data segment</li>
<li>No matter where the shared library is placed in memory, the distance between the addresses of the code segment and the data segment is fixed (this does not contradict the previous point, presumably thanks to virtual memory)</li>
</ol>
<p>Information equivalent to relocations can therefore be stored in the data segment to achieve the effect of relocation indirectly. Put plainly: since the code segment cannot be modified, the addresses of symbols are kept in the data segment instead. Concretely, the data segment begins with a <i>global offset table</i> (GOT). Each entry of the table is an address that the dynamic linker can fill in, and because the distance between the code and data segments is fixed, the code can reach an entry with PC-relative addressing.</p>
<p>PIC data references are simple: the GOT gets one entry for each data symbol (global or static variable), the dynamic linker fills in these entries during dynamic linking, and the code performs the data reference indirectly through the entry. For example (CS:APP Figure 7.18, where <code>GOT[3]</code> holds the address of the global variable <code>x</code>):</p>
<section class="code-block relative my-6 shadow" itemprop="hasPart" itemscope itemtype="https://schema.org/SoftwareSourceCode" data-v-c675dba6><div class="h-6 items-center rd-t-1 bg-area px-4 dark:bg-#2A313A media-screen:important-flex" style="display:none;" data-v-c675dba6><h3 class="text-3 text-footer" itemprop="programmingLanguage" aria-label="Assembly 代码块" data-v-c675dba6>Assembly</h3><ile-root id="ile-7"><button title="复制到剪贴板" class="copy-button b-footer text-footer" data-v-63dfb2af><span class="i-mdi-content-copy" data-v-63dfb2af></span><span class="sr-only" role="status" data-v-63dfb2af></span></button></ile-root><!--ISLAND_HYDRATION_PLACEHOLDER_ile-7--></div><div class="dark:hidden" itemprop="text" data-v-c675dba6><pre class="shiki light" style="background-color: #FBFBFB" tabindex="0"><code><span><span style="color: #403F53">    </span><span style="color: #0C969B">movq</span><span style="color: #403F53"> 0x2008b9(%</span><span style="color: #4876D6">rip</span><span style="color: #403F53">), %</span><span style="color: #4876D6">rax</span><span style="color: #403F53">  </span><span style="color: #989FB1"> # %rax = *GOT[3] = &amp;x</span></span>
<span><span style="color: #403F53">    addl $0x1, (%</span><span style="color: #4876D6">rax</span><span style="color: #403F53">)          </span><span style="color: #989FB1"> # ++x</span></span></code></pre></div><div class="dark:important-block" style="display:none;" data-v-c675dba6><pre class="shiki dark" style="background-color: #011627" tabindex="0"><code><span><span style="color: #D6DEEB">    </span><span style="color: #7FDBCA">movq</span><span style="color: #D6DEEB"> 0x2008b9(%</span><span style="color: #82AAFF">rip</span><span style="color: #D6DEEB">), %</span><span style="color: #82AAFF">rax</span><span style="color: #D6DEEB">  </span><span style="color: #637777"> # %rax = *GOT[3] = &amp;x</span></span>
<span><span style="color: #D6DEEB">    addl $0x1, (%</span><span style="color: #82AAFF">rax</span><span style="color: #D6DEEB">)          </span><span style="color: #637777"> # ++x</span></span></code></pre></div></section>
<p>A locally defined variable can also be referenced directly with PC-relative addressing, so that only externally defined variables go through the GOT, but the compiler may choose not to make this distinction and handle both cases uniformly.</p>
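<p>The indirection through the GOT can be modelled in plain C. The sketch below is only a toy model: the array <code>got</code> and the functions <code>toy_dynamic_link</code>, <code>increment_x</code>, and <code>read_x</code> are hypothetical names invented for illustration, and a real GOT is filled in by the dynamic linker rather than by program code.</p>

```c
#include <stddef.h>

static int x;                 /* the global variable being referenced */
static int *got[4];           /* toy GOT: one address per data symbol */

/* Plays the role of the dynamic linker: fill in the GOT entry for x. */
void toy_dynamic_link(void)
{
    got[3] = &x;
}

/* Plays the role of the PIC code: never names &x directly,
 * only the GOT entry whose position relative to the code is fixed. */
void increment_x(void)
{
    int *addr = got[3];       /* movq GOT[3], %rax  */
    *addr += 1;               /* addl $0x1, (%rax)  */
}

int read_x(void)
{
    return x;
}
```

<p>The point of the design is that <code>increment_x</code> stays byte-for-byte identical no matter where <code>x</code> ends up; only the table entry changes.</p>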
<p>PIC procedure calls could be handled in the same way as data references (the <code>-fno-plt</code> compiler option does this), but in practice an optimization called <i>lazy binding</i> is used.</p>
<p>The reason is that a program linked against a shared library usually ends up calling only a small fraction of the many functions it provides. Resolving every external function used by the whole shared library at dynamic-link time could therefore be wasted work. With lazy binding, the address of an external function is bound the first time that function is called.</p>
<p>Lazy binding relies on a structure called the <i>procedure linkage table</i> (PLT). The PLT lives in the code segment, and each entry is really three instructions, as shown in CS:APP Figure 7.19:</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig7.19.389efa4c.webp" loading="lazy" src="/assets/csapp-fig7.19.389efa4c.webp" width="995" height="602" alt="Diagram of how the PLT works"></picture></p>
<p>The whole flow is:</p>
<ol>
<li>A call to <code>addvec</code> actually calls the address of <code>PLT[2]</code></li>
<li>The first instruction of <code>PLT[2]</code> jumps to the address stored in <code>GOT[4]</code>. Initially <code>GOT[4]</code> holds the address of the second instruction of <code>PLT[2]</code>, so on the first call control simply moves from the first instruction to the second</li>
<li>The second instruction pushes the ID of <code>addvec</code> onto the stack, which tells the dynamic linker which function is wanted</li>
<li>The third instruction jumps to <code>PLT[0]</code></li>
<li>The first instruction of <code>PLT[0]</code> pushes the address of the relocation entries onto the stack, and the second instruction jumps to the dynamic linker</li>
<li>The dynamic linker uses the function ID on the stack and the relocation entries to work out the address of <code>addvec</code>, stores it in <code>GOT[4]</code>, and then jumps to <code>addvec</code></li>
<li>Because every transfer along the way is a <code>jmp</code>, <code>addvec</code> returns normally to the place that called <code>PLT[2]</code></li>
<li>On the second call to <code>PLT[2]</code>, <code>GOT[4]</code> already holds the address of <code>addvec</code>, so the first instruction of <code>PLT[2]</code> jumps straight to <code>addvec</code></li>
</ol>
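<p>The steps above can be imitated with a function pointer that initially points at a resolver stub. This is only a model of the idea, not how the PLT is actually implemented: <code>got_slot</code>, <code>lazy_stub</code>, <code>addvec_like</code>, and <code>call_through_plt</code> are hypothetical names, and the "dynamic linker" here is a single assignment.</p>

```c
#include <stddef.h>

typedef int (*binop_fn)(int, int);

/* Stands in for the real library function (addvec in the figure). */
static int addvec_like(int a, int b)
{
    return a + b;
}

static int resolve_count = 0;          /* how often binding happened */
static int lazy_stub(int a, int b);

/* Models GOT[4]: initially it points back into the PLT stub. */
static binop_fn got_slot = lazy_stub;

/* Models the rest of PLT[2], PLT[0], and the dynamic linker. */
static int lazy_stub(int a, int b)
{
    ++resolve_count;
    got_slot = addvec_like;            /* patch the GOT entry */
    return got_slot(a, b);             /* then continue into the function */
}

/* Models the first instruction of PLT[2]: jump through the GOT. */
int call_through_plt(int a, int b)
{
    return got_slot(a, b);
}

int get_resolve_count(void)
{
    return resolve_count;
}
```

<p>After the first call the stub is never reached again, so the resolution cost is paid once per function and only for functions that are actually called.</p>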
<h2 id="library-interpositioning" class="heading"><a href="#library-interpositioning" class="heading-anchor" aria-label="Section: Library Interpositioning" tabindex="-1"></a><span>Library Interpositioning</span></h2>
<p>The Linux linker supports a technique called <i>library interpositioning</i>, which replaces a function of a shared library with something else. The replacement is usually a wrapper function used for tracing, but it can also be something completely different.</p>
<p>(I looked at the Chinese edition of CS:APP, and it translates this term as “库打桩”, of all things.)</p>
<p>Compile-time library interpositioning replaces a function using <code>#define</code><s> (anyone who has had <code>#<wbr>define<wbr> <wbr>sort<wbr> <wbr>random_<wbr>shuffle</code> done to them in the computer lab will be very familiar with this)</s>.</p>
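<p>A minimal sketch of the compile-time variant follows. It counts calls to <code>malloc</code> by redirecting them with a macro; <code>counting_malloc</code>, <code>make_buffer</code>, and <code>get_malloc_calls</code> are hypothetical names for this example. Redefining a standard library name as a macro works with common compilers such as GCC and Clang, but the C standard does not guarantee it, so treat this as an illustration only.</p>

```c
#include <stdlib.h>

static size_t malloc_calls = 0;

/* Defined before the macro below, so malloc here is still the real one. */
static void *counting_malloc(size_t size)
{
    ++malloc_calls;
    return malloc(size);
}

/* From here on, every textual call to malloc goes to the wrapper. */
#define malloc(size) counting_malloc(size)

/* Ordinary code that believes it is calling malloc. */
void *make_buffer(size_t size)
{
    return malloc(size);
}

size_t get_malloc_calls(void)
{
    return malloc_calls;
}
```

<p>In a real project the macro would normally live in a header that is placed ahead of the system headers on the include path, so that the source files themselves stay unchanged apart from recompilation.</p>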
<p>Link-time library interpositioning passes <code>--wrap foo</code> to the linker. A call to <code>foo</code> then actually calls <code>__wrap_foo</code>, while a call to <code>__real_foo</code> calls the original <code>foo</code>. The option is usually given to <code>gcc</code> rather than to <code>ld</code>, that is, <code>gcc<wbr> -<wbr>Wl<wbr>,--<wbr>wrap<wbr>,<wbr>foo</code> instead of <code>ld --wrap foo</code>, where <code>-Wl</code> passes arguments on to the linker and the commas that follow are replaced with spaces.</p>
<p>Run-time library interpositioning sets the environment variable <code>LD_<wbr>PRELOAD<wbr>="/<wbr>path<wbr>/<wbr>to<wbr>/<wbr>wrapper<wbr>.<wbr>so<wbr> /<wbr>path<wbr>/<wbr>to<wbr>/<wbr>anotherwrapper<wbr>.<wbr>so<wbr>"</code> when running the program. Before a function from any shared library is used, <code>wrapper.so</code> and <code>anotherwrapper<wbr>.<wbr>so</code> are then searched first. In this case, calling the original function from the wrapper function requires <a href="#loading-and-linking-shared-libraries-from-applications">loading the shared library at run time</a>.</p>
<p>For the implementation details, see CS:APP.</p>
<p>Compile-time library interpositioning requires changing the source code, and link-time library interpositioning requires access to the object files and relinking the executable. Run-time library interpositioning only requires setting an environment variable and no change to the executable at all, which makes it convenient for tracing a particular function call across many different programs.</p>]]></content:encoded>
            <category domain="https://ouuan.moe/tag/csapp">csapp</category>
            <category domain="https://ouuan.moe/tag/%E5%AD%A6%E4%B9%A0%E7%AC%94%E8%AE%B0">学习笔记</category>
        </item>
        <item>
            <title><![CDATA[CS:APP Chapter 4 Study Notes]]></title>
            <link>https://ouuan.moe/post/2022/10/csapp-4</link>
            <guid>https://ouuan.moe/post/2022/10/csapp-4</guid>
            <pubDate>Mon, 17 Oct 2022 02:30:45 GMT</pubDate>
            <description><![CDATA[









<p>Study notes on Chapter 4, “Processor Architecture”, of <a href="https://csapp.cs.cmu.edu/">CS:APP</a>.</p>
<p>The chapter mainly covers the design of a simplified instruction set, Y86-64, and the implementation of Y86-64 processors (a sequential implementation and a pipelined implementation).</p>
]]></description>
            <content:encoded><![CDATA[









<p>Study notes on Chapter 4, “Processor Architecture”, of <a href="https://csapp.cs.cmu.edu/">CS:APP</a>.</p>
<p>The chapter mainly covers the design of a simplified instruction set, Y86-64, and the implementation of Y86-64 processors (a sequential implementation and a pipelined implementation).</p>

<h2 id="the-y86-64-instruction-set-architecture" class="heading"><a href="#the-y86-64-instruction-set-architecture" class="heading-anchor" aria-label="Section: The Y86-64 Instruction Set Architecture" tabindex="-1"></a><span>The Y86-64 Instruction Set Architecture</span></h2>
<p>This part defines a toy ISA called “Y86-64”, which is used for demonstration throughout the chapter.</p>
<h3 id="y86-64-程序状态" class="heading"><a href="#y86-64-程序状态" class="heading-anchor" aria-label="Section: Y86-64 Program State" tabindex="-1"></a><span>Y86-64 Program State</span></h3>
<ul>
<li>15 registers (the x86-64 registers without <code>%r15</code>, to simplify the encoding)</li>
<li>3 status flags: <code>ZF</code>, <code>SF</code>, <code>OF</code></li>
<li>program counter: <code>PC</code></li>
<li>memory</li>
<li>status code: <code>Stat</code>, which indicates whether the program is running normally or an exception has occurred</li>
</ul>
<h3 id="y86-64-指令" class="heading"><a href="#y86-64-指令" class="heading-anchor" aria-label="Section: Y86-64 Instructions" tabindex="-1"></a><span>Y86-64 Instructions</span></h3>
<p>The Y86-64 instructions are largely a subset of x86-64, with some simplifications and differences, for example in the operands.</p>
<p>Operands differ from x86-64 as follows:</p>
<ul>
<li>Immediate, register, and memory operands exist only in 64-bit versions</li>
<li>There are only 15 registers</li>
<li>Memory operands do not support the <code>(, ri, s)</code> part (index register and scale factor); they can only be <code>Imm</code>/<code>(rb)</code>/<code>Imm(rb)</code></li>
</ul>
<p>Only six conditions are supported, namely those of signed comparison: <code>le</code>/<code>l</code>/<code>e</code>/<code>ne</code>/<code>ge</code>/<code>g</code></p>
<p>The instructions, and how they differ from x86-64:</p>
<ul>
<li><code>irmovq</code>/<code>rrmovq</code>/<code>mrmovq</code>/<code>rmmovq</code>: <code>movq</code> split into four instructions by operand type</li>
<li><code>addq</code>/<code>subq</code>/<code>andq</code>/<code>xorq</code>: they accept only registers as operands and set only the three status flags <code>ZF</code>, <code>SF</code>, <code>OF</code></li>
<li><code>jmp</code>/<code>jle</code>/<code>jl</code>/<code>je</code>/<code>jne</code>/<code>jge</code>/<code>jg</code>: all of them, including <code>jmp</code>, can only jump to a fixed address and do not accept a register operand, and the address is absolute rather than PC-relative</li>
<li><code>cmovle</code>/<code>cmovl</code>/<code>cmove</code>/<code>cmovne</code>/<code>cmovge</code>/<code>cmovg</code>: they accept only registers as operands</li>
<li><code>call</code>: the address is absolute</li>
<li><code>ret</code>, <code>pushq</code>, <code>popq</code>, <code>nop</code>: essentially the same as in x86-64</li>
<li><code>halt</code>: stops execution and sets the status code to <code>HLT</code></li>
</ul>
<h3 id="y86-64-指令编码" class="heading"><a href="#y86-64-指令编码" class="heading-anchor" aria-label="Section: Y86-64 Instruction Encoding" tabindex="-1"></a><span>Y86-64 Instruction Encoding</span></h3>
<p>By simplifying the instructions, Y86-64 also simplifies their encoding, but at the cost of a less compact encoding that wastes some space.</p>
<p>CS:APP Figure 4.2 gives a concise overview of the Y86-64 instruction encoding:</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig4.2.429f9e8d.webp" loading="lazy" src="/assets/csapp-fig4.2.429f9e8d.webp" width="976" height="730" alt="Y86-64 instruction encoding diagram"></picture></p>
<h4 id="指令类型的编码" class="heading"><a href="#指令类型的编码" class="heading-anchor" aria-label="Section: Encoding of the Instruction Type" tabindex="-1"></a><span>Encoding of the Instruction Type</span></h4>
<p>The first byte of an instruction's encoding identifies the instruction type. The high-order 4 bits of this byte are called the <i>code</i> and the low-order 4 bits the <i>function</i>; the function is only meaningful for some instructions. Notably, <code>rrmovq</code> and <code>cmovXX</code> share the same code, which means that <code>rrmovq</code> can be viewed as a special case of <code>cmovXX</code>.</p>
<p>Function values of the arithmetic operations: <code>add</code> 0, <code>sub</code> 1, <code>and</code> 2, <code>xor</code> 3</p>
<p>Function values of the conditions: <code>le</code> 1, <code>l</code> 2, <code>e</code> 3, <code>ne</code> 4, <code>ge</code> 5, <code>g</code> 6; the function value of <code>jmp</code> is 0</p>
<h4 id="register-specifier-byte" class="heading"><a href="#register-specifier-byte" class="heading-anchor" aria-label="Section: Register Specifier Byte" tabindex="-1"></a><span>Register Specifier Byte</span></h4>
<p>Except for <code>jXX</code> and <code>call</code>, the high-order and low-order 4 bits of the second byte of an encoding (if there is one) each hold a register identifier.</p>
<p>Register identifiers run from <code>0</code> for <code>%rax</code> to <code>E</code> for <code>%r14</code>; <code>F</code> means no register.</p>
<h4 id="constant-word" class="heading"><a href="#constant-word" class="heading-anchor" aria-label="Section: Constant Word" tabindex="-1"></a><span>Constant Word</span></h4>
<p><code>irmovq</code>, <code>rmmovq</code>/<code>mrmovq</code>, and <code>jXX</code>/<code>call</code> each contain an 8-byte constant word that holds an immediate value or an address, stored in little-endian byte order.</p>
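<p>To make the three parts of the encoding concrete, here is a minimal sketch in C that assembles one <code>rmmovq rA, D(rB)</code> instruction by hand. The function name <code>encode_rmmovq</code> and the enum constants are hypothetical; the code value <code>4</code> with function <code>0</code> for <code>rmmovq</code> is read off CS:APP Figure 4.2, and the register numbers used in the comments (<code>%rsp</code> = 4, <code>%rdx</code> = 2) follow the same numbering as x86-64.</p>

```c
#include <stddef.h>
#include <stdint.h>

enum { Y86_RDX = 0x2, Y86_RSP = 0x4, Y86_NO_REG = 0xF };

/* Encode "rmmovq rA, D(rB)" into out[0..9]; returns the length in bytes. */
size_t encode_rmmovq(uint8_t out[10], unsigned ra, unsigned rb, uint64_t disp)
{
    out[0] = 0x40;                                   /* code 4, function 0 */
    out[1] = (uint8_t)(((ra & 0xF) << 4) | (rb & 0xF)); /* rA high, rB low */
    for (int i = 0; i < 8; i++)                      /* little-endian word */
        out[2 + i] = (uint8_t)(disp >> (8 * i));
    return 10;
}
```

<p>For example, <code>rmmovq %rsp, 0x123456789abcd(%rdx)</code> comes out as the bytes <code>40 42 cd ab 89 67 45 23 01 00</code>: the displacement is padded to 8 bytes and written least significant byte first.</p>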
<h3 id="y86-64-异常" class="heading"><a href="#y86-64-异常" class="heading-anchor" aria-label="Section: Y86-64 Exceptions" tabindex="-1"></a><span>Y86-64 Exceptions</span></h3>
<p>The status code <code>Stat</code> has four possible values:</p>
<ul>
<li><code>AOK</code>: normal operation</li>
<li><code>HLT</code>: a <code>halt</code> instruction was executed</li>
<li><code>ADR</code>: an invalid address was accessed</li>
<li><code>INS</code>: an invalid instruction encoding was encountered</li>
</ul>
<p>In Y86-64, the processor stops executing as soon as an exception occurs.</p>
<h3 id="y86-64-程序" class="heading"><a href="#y86-64-程序" class="heading-anchor" aria-label="Section: Y86-64 Programs" tabindex="-1"></a><span>Y86-64 Programs</span></h3>
<p>CS:APP Figure 4.8 shows a complete Y86-64 program:</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig4.8.7c8b0ffb.webp" loading="lazy" src="/assets/csapp-fig4.8.7c8b0ffb.webp" width="930" height="1075" alt="Assembly and machine code of a complete Y86-64 program"></picture></p>
<p>You can download the <a href="http://csapp.cs.cmu.edu/3e/sim.tar">Y86-64 tools</a>, assemble programs with <code>yas</code>, and simulate them with <code>yis</code>. When compiling <code>yas</code>, <a href="https://stackoverflow.com/questions/63152352/fail-to-compile-the-y86-simulatur-csapp">the <code>-fcommon</code> compiler option has to be added</a>.</p>
<h3 id="对-rsp-进行-pushpop" class="heading"><a href="#对-rsp-进行-pushpop" class="heading-anchor" aria-label="Section: Pushing and Popping %rsp" tabindex="-1"></a><span>Pushing and Popping %rsp</span></h3>
<p>The instructions <code>pushq %rsp</code> and <code>popq %rsp</code> are of little practical use, but their behavior could be ambiguous, so an ISA design needs to specify it explicitly.</p>
<p>Y86-64 follows the same rules as x86-64: <code>pushq %rsp</code> pushes the old value of <code>%rsp</code> (before it is decremented by 8), and <code>popq %rsp</code> is equivalent to <code>mrmovq (%rsp), %rsp</code>.</p>
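<p>The two rules can be written down as a tiny C model. This is an illustrative sketch, not part of the Y86-64 tools: the <code>Machine</code> struct, its 64-byte memory, and the function names are hypothetical.</p>

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint64_t rsp;         /* stack pointer, used as an index into mem */
    uint8_t  mem[64];     /* toy memory, just large enough for the demo */
} Machine;

void store64(Machine *m, uint64_t addr, uint64_t value)
{
    memcpy(&m->mem[addr], &value, sizeof value);
}

uint64_t load64(const Machine *m, uint64_t addr)
{
    uint64_t value;
    memcpy(&value, &m->mem[addr], sizeof value);
    return value;
}

/* pushq %rsp: the value stored is the one from before the decrement. */
void pushq_rsp(Machine *m)
{
    uint64_t old = m->rsp;
    m->rsp -= 8;
    store64(m, m->rsp, old);
}

/* popq %rsp: behaves like mrmovq (%rsp), %rsp; the loaded value wins
 * over the usual increment by 8. */
void popq_rsp(Machine *m)
{
    m->rsp = load64(m, m->rsp);
}
```

<p>Starting from <code>rsp = 32</code>, <code>pushq_rsp</code> leaves <code>rsp = 24</code> with the value 32 stored at address 24, and a following <code>popq_rsp</code> restores <code>rsp = 32</code>.</p>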
<h2 id="logic-design-and-the-hardware-control-language-hcl" class="heading"><a href="#logic-design-and-the-hardware-control-language-hcl" class="heading-anchor" aria-label="Section: Logic Design and the Hardware Control Language HCL" tabindex="-1"></a><span>Logic Design and the Hardware Control Language HCL</span></h2>
<p>This chapter uses a toy language, HCL (hardware control language), to describe the logic design of Y86-64 processors. (Similar languages that are not toys, such as VHDL and Verilog, are called “<a href="https://en.wikipedia.org/wiki/Hardware_description_language">hardware description languages (HDL)</a>”.)</p>
<h3 id="逻辑门" class="heading"><a href="#逻辑门" class="heading-anchor" aria-label="Section: Logic Gates" tabindex="-1"></a><span>Logic Gates</span></h3>
<p>CS:APP Figure 4.9:</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig4.9.15f29f6a.webp" loading="lazy" src="/assets/csapp-fig4.9.15f29f6a.webp" width="546" height="168" alt="AND, OR, and NOT logic gates"></picture></p>
<ul>
<li>The figure shows only two-input AND and OR gates, but gates can have more inputs</li>
<li>When an input changes, the output of a logic gate follows after a short delay</li>
</ul>
<h3 id="组合逻辑电路" class="heading"><a href="#组合逻辑电路" class="heading-anchor" aria-label="Section: Combinational Circuits" tabindex="-1"></a><span>Combinational Circuits</span></h3>
<p>A combinational circuit is a circuit built by combining a number of logic gates. It is stateless: its output depends only on its inputs, and when the inputs change, the output follows after a short delay.</p>
<p>In HCL, combinational circuits are written as logic expressions. For example, <code>bool eq = (a &#x26;&#x26; b) || (!a &#x26;&#x26; !b)</code> describes a circuit that tests whether <code>a</code> and <code>b</code> are equal. Because the statement describes a circuit rather than a one-off computation, the output <code>eq</code> changes whenever the value of <code>a</code> or <code>b</code> changes after this statement (similar to a computed property in Vue).</p>
<h3 id="以-word-为单位进行操作的电路" class="heading"><a href="#以-word-为单位进行操作的电路" class="heading-anchor" aria-label="Section: Word-Level Circuits" tabindex="-1"></a><span>Word-Level Circuits</span></h3>
<p>Processor designs often need to operate on a whole word rather than on a single bit.</p>
<p>In HCL, words are conventionally given uppercase names. For example, <code>bool Eq = (A == B)</code> describes a circuit that tests whether the words <code>A</code> and <code>B</code> are equal; it can be implemented by testing each pair of bits for equality and ANDing the results.</p>
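<p>The construction described above (bit-level equality, then an AND over all bit positions) can be checked with a small C model. C is used here only as a simulation language; <code>bit_eq</code> and <code>word_eq</code> are hypothetical names, and a 64-bit word is assumed.</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit-level circuit: bool eq = (a && b) || (!a && !b) */
bool bit_eq(bool a, bool b)
{
    return (a && b) || (!a && !b);
}

/* Word-level circuit: one bit_eq per bit position, all outputs ANDed. */
bool word_eq(uint64_t A, uint64_t B)
{
    bool all_equal = true;
    for (int i = 0; i < 64; i++) {
        bool a = (A >> i) & 1;
        bool b = (B >> i) & 1;
        all_equal = all_equal && bit_eq(a, b);
    }
    return all_equal;
}
```

<p>Unlike the circuit, the C function computes its result once per call; in hardware all 64 comparisons happen in parallel and the output tracks the inputs continuously.</p>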
<h3 id="multiplexor-mux" class="heading"><a href="#multiplexor-mux" class="heading-anchor" aria-label="Section: Multiplexor (MUX)" tabindex="-1"></a><span>Multiplexor (MUX)</span></h3>
<p>A multiplexor (MUX) selects one of its data inputs as its output according to the value of a control signal. A word-level MUX circuit is shown below (CS:APP Figure 4.13):</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig4.13.47e7678b.webp" loading="lazy" src="/assets/csapp-fig4.13.47e7678b.webp" width="886" height="759" alt="Word-level MUX circuit"></picture></p>
<p>In HCL, a MUX is written as a <i>case expression</i>, for example</p>
<section class="code-block relative my-6 shadow" itemprop="hasPart" itemscope itemtype="https://schema.org/SoftwareSourceCode" data-v-c675dba6><div class="h-6 items-center rd-t-1 bg-area px-4 dark:bg-#2A313A media-screen:important-flex" style="display:none;" data-v-c675dba6><h4 class="text-3 text-footer" itemprop="programmingLanguage" aria-label="HCL (CS:APP) 代码块" data-v-c675dba6>HCL (CS:APP)</h4><ile-root id="ile-8"><button title="复制到剪贴板" class="copy-button b-footer text-footer" data-v-63dfb2af><span class="i-mdi-content-copy" data-v-63dfb2af></span><span class="sr-only" role="status" data-v-63dfb2af></span></button></ile-root><!--ISLAND_HYDRATION_PLACEHOLDER_ile-8--></div><div class="dark:hidden" itemprop="text" data-v-c675dba6><pre class="shiki light" style="background-color: #FBFBFB" tabindex="0"><code><span><span style="color: #994CC3">word</span><span style="color: #403F53"> </span><span style="color: #4876D6">Mux</span><span style="color: #403F53"> = [</span></span>
<span><span style="color: #403F53">    !</span><span style="color: #4876D6">s1</span><span style="color: #403F53"> &amp;&amp; !</span><span style="color: #4876D6">s0</span><span style="color: #403F53">: </span><span style="color: #4876D6">A</span><span style="color: #403F53">;</span></span>
<span><span style="color: #403F53">    !</span><span style="color: #4876D6">s1</span><span style="color: #403F53">: </span><span style="color: #4876D6">B</span><span style="color: #403F53">;</span></span>
<span><span style="color: #403F53">    !</span><span style="color: #4876D6">s0</span><span style="color: #403F53">: </span><span style="color: #4876D6">C</span><span style="color: #403F53">;</span></span>
<span><span style="color: #403F53">    </span><span style="color: #AA0982">1</span><span style="color: #403F53">: </span><span style="color: #4876D6">D</span><span style="color: #403F53">;</span></span>
<span><span style="color: #403F53">];</span></span></code></pre></div><div class="dark:important-block" style="display:none;" data-v-c675dba6><pre class="shiki dark" style="background-color: #011627" tabindex="0"><code><span><span style="color: #C792EA">word</span><span style="color: #D6DEEB"> </span><span style="color: #C5E478">Mux</span><span style="color: #D6DEEB"> = [</span></span>
<span><span style="color: #D6DEEB">    !</span><span style="color: #C5E478">s1</span><span style="color: #D6DEEB"> &amp;&amp; !</span><span style="color: #C5E478">s0</span><span style="color: #D6DEEB">: </span><span style="color: #C5E478">A</span><span style="color: #D6DEEB">;</span></span>
<span><span style="color: #D6DEEB">    !</span><span style="color: #C5E478">s1</span><span style="color: #D6DEEB">: </span><span style="color: #C5E478">B</span><span style="color: #D6DEEB">;</span></span>
<span><span style="color: #D6DEEB">    !</span><span style="color: #C5E478">s0</span><span style="color: #D6DEEB">: </span><span style="color: #C5E478">C</span><span style="color: #D6DEEB">;</span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #F78C6C">1</span><span style="color: #D6DEEB">: </span><span style="color: #C5E478">D</span><span style="color: #D6DEEB">;</span></span>
<span><span style="color: #D6DEEB">];</span></span></code></pre></div></section>
<p>describes a MUX that is controlled by <code>s0</code> and <code>s1</code> and selects one of <code>A</code>, <code>B</code>, <code>C</code>, <code>D</code> as its output.</p>
<p>Logically, a case expression tests the conditions in order and outputs the value of the first case whose condition holds, similar to <code>match</code> in Rust.</p>
<p>下面的 HCL 代码表示计算 <code>A</code><span class="mojikumi-line-end">、</span><code>B</code><span class="mojikumi-line-end">、</span><code>C</code> 中的最小值<span class="mojikumi-line-end">：</span></p>
<section class="code-block relative my-6 shadow" itemprop="hasPart" itemscope itemtype="https://schema.org/SoftwareSourceCode" data-v-c675dba6><div class="h-6 items-center rd-t-1 bg-area px-4 dark:bg-#2A313A media-screen:important-flex" style="display:none;" data-v-c675dba6><h4 class="text-3 text-footer" itemprop="programmingLanguage" aria-label="HCL (CS:APP) 代码块" data-v-c675dba6>HCL (CS:APP)</h4><ile-root id="ile-9"><button title="复制到剪贴板" class="copy-button b-footer text-footer" data-v-63dfb2af><span class="i-mdi-content-copy" data-v-63dfb2af></span><span class="sr-only" role="status" data-v-63dfb2af></span></button></ile-root><!--ISLAND_HYDRATION_PLACEHOLDER_ile-9--></div><div class="dark:hidden" itemprop="text" data-v-c675dba6><pre class="shiki light" style="background-color: #FBFBFB" tabindex="0"><code><span><span style="color: #994CC3">word</span><span style="color: #403F53"> </span><span style="color: #4876D6">Min3</span><span style="color: #403F53"> = [</span></span>
<span><span style="color: #403F53">    </span><span style="color: #4876D6">A</span><span style="color: #403F53"> &lt;= </span><span style="color: #4876D6">B</span><span style="color: #403F53"> &amp;&amp; </span><span style="color: #4876D6">A</span><span style="color: #403F53"> &lt;= </span><span style="color: #4876D6">C</span><span style="color: #403F53">: </span><span style="color: #4876D6">A</span><span style="color: #403F53">;</span></span>
<span><span style="color: #403F53">    </span><span style="color: #4876D6">B</span><span style="color: #403F53"> &lt;= </span><span style="color: #4876D6">C</span><span style="color: #403F53">: </span><span style="color: #4876D6">B</span><span style="color: #403F53">;</span></span>
<span><span style="color: #403F53">    </span><span style="color: #AA0982">1</span><span style="color: #403F53">: </span><span style="color: #4876D6">C</span><span style="color: #403F53">;</span></span>
<span><span style="color: #403F53">];</span></span></code></pre></div><div class="dark:important-block" style="display:none;" data-v-c675dba6><pre class="shiki dark" style="background-color: #011627" tabindex="0"><code><span><span style="color: #C792EA">word</span><span style="color: #D6DEEB"> </span><span style="color: #C5E478">Min3</span><span style="color: #D6DEEB"> = [</span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #C5E478">A</span><span style="color: #D6DEEB"> &lt;= </span><span style="color: #C5E478">B</span><span style="color: #D6DEEB"> &amp;&amp; </span><span style="color: #C5E478">A</span><span style="color: #D6DEEB"> &lt;= </span><span style="color: #C5E478">C</span><span style="color: #D6DEEB">: </span><span style="color: #C5E478">A</span><span style="color: #D6DEEB">;</span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #C5E478">B</span><span style="color: #D6DEEB"> &lt;= </span><span style="color: #C5E478">C</span><span style="color: #D6DEEB">: </span><span style="color: #C5E478">B</span><span style="color: #D6DEEB">;</span></span>
<span><span style="color: #D6DEEB">    </span><span style="color: #F78C6C">1</span><span style="color: #D6DEEB">: </span><span style="color: #C5E478">C</span><span style="color: #D6DEEB">;</span></span>
<span><span style="color: #D6DEEB">];</span></span></code></pre></div></section>
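<p>The first-match semantics can be modeled in software. The following is a minimal Python sketch, not part of CS:APP: the helper <code>case</code> and the functions <code>mux4</code> and <code>min3</code> are hypothetical names introduced only for illustration. It mirrors the two HCL examples above; real hardware evaluates all conditions in parallel rather than one after another.</p>

```python
def case(*arms):
    """Return the value of the first arm whose condition is true.

    Each arm is a (condition, value) pair, mirroring `cond : value;` in HCL.
    """
    for condition, value in arms:
        if condition:
            return value
    raise ValueError("no arm matched; HCL case expressions usually end with a `1` arm")


def mux4(s1, s0, a, b, c, d):
    # Later arms can omit checks that earlier arms have already ruled out.
    return case(
        (not s1 and not s0, a),
        (not s1, b),
        (not s0, c),
        (True, d),
    )


def min3(a, b, c):
    # Same structure as the HCL Min3 above, written with >= only.
    return case(
        (b >= a and c >= a, a),
        (c >= b, b),
        (True, c),
    )
```

<p>For instance, <code>mux4(True, False, "A", "B", "C", "D")</code> falls through the first two arms and returns <code>"C"</code>, because <code>not s0</code> is the first condition that holds.</p>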
<h3 id="arithmeticlogic-unit-alu" class="heading"><a href="#arithmeticlogic-unit-alu" class="heading-anchor" aria-label="章节： Arithmetic/logic unit (ALU)" tabindex="-1"></a><span>Arithmetic/logic unit (ALU)</span></h3>
<p>The ALU is a combinational logic component that performs arithmetic and logic operations. It takes two data inputs plus a control input that selects which operation to perform, and it outputs the result.</p>
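<p>As a rough software model, the ALU is a pure function of its three inputs. The sketch below is an assumption-laden illustration rather than the book's design: the control encoding follows the Y86-64 <code>OPq</code> function codes (add 0, subtract 1, and 2, xor 3), the names <code>alu</code>, <code>ALU_OPS</code>, and <code>WORD</code> are hypothetical, and which physical input acts as the subtrahend is not modeled.</p>

```python
import operator

WORD = 2 ** 64  # Y86-64 words are 64 bits wide

# Assumed control encoding, following the OPq function codes.
ALU_OPS = {
    0: operator.add,   # addq
    1: operator.sub,   # subq
    2: operator.and_,  # andq
    3: operator.xor,   # xorq
}


def alu(control, x, y):
    """Combinational: the output depends only on the current inputs."""
    # The modulo keeps the result within 64 bits, so subtraction wraps around.
    return ALU_OPS[control](x, y) % WORD
```

<p>Calling <code>alu(1, 3, 5)</code> gives <code>2 ** 64 - 2</code>, the 64-bit two's-complement encoding of -2.</p>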
<h3 id="测试值是否属于集合" class="heading"><a href="#测试值是否属于集合" class="heading-anchor" aria-label="章节： 测试值是否属于集合" tabindex="-1"></a><span>Set membership tests</span></h3>
<p>In HCL, <code>in</code> expresses a circuit that tests whether a value belongs to a set, for example:</p>
<section class="code-block relative my-6 shadow" itemprop="hasPart" itemscope itemtype="https://schema.org/SoftwareSourceCode" data-v-c675dba6><div class="h-6 items-center rd-t-1 bg-area px-4 dark:bg-#2A313A media-screen:important-flex" style="display:none;" data-v-c675dba6><h4 class="text-3 text-footer" itemprop="programmingLanguage" aria-label="HCL (CS:APP) 代码块" data-v-c675dba6>HCL (CS:APP)</h4><ile-root id="ile-10"><button title="复制到剪贴板" class="copy-button b-footer text-footer" data-v-63dfb2af><span class="i-mdi-content-copy" data-v-63dfb2af></span><span class="sr-only" role="status" data-v-63dfb2af></span></button></ile-root><!--ISLAND_HYDRATION_PLACEHOLDER_ile-10--></div><div class="dark:hidden" itemprop="text" data-v-c675dba6><pre class="shiki light" style="background-color: #FBFBFB" tabindex="0"><code><span><span style="color: #994CC3">bool</span><span style="color: #403F53"> </span><span style="color: #4876D6">s1</span><span style="color: #403F53"> = </span><span style="color: #4876D6">code</span><span style="color: #403F53"> </span><span style="color: #994CC3">in</span><span style="color: #403F53"> { </span><span style="color: #AA0982">2</span><span style="color: #403F53">, </span><span style="color: #AA0982">3</span><span style="color: #403F53"> };</span></span>
<span><span style="color: #994CC3">bool</span><span style="color: #403F53"> </span><span style="color: #4876D6">s0</span><span style="color: #403F53"> = </span><span style="color: #4876D6">code</span><span style="color: #403F53"> </span><span style="color: #994CC3">in</span><span style="color: #403F53"> { </span><span style="color: #AA0982">1</span><span style="color: #403F53">, </span><span style="color: #AA0982">3</span><span style="color: #403F53"> };</span></span></code></pre></div><div class="dark:important-block" style="display:none;" data-v-c675dba6><pre class="shiki dark" style="background-color: #011627" tabindex="0"><code><span><span style="color: #C792EA">bool</span><span style="color: #D6DEEB"> </span><span style="color: #C5E478">s1</span><span style="color: #D6DEEB"> = </span><span style="color: #C5E478">code</span><span style="color: #D6DEEB"> </span><span style="color: #C792EA">in</span><span style="color: #D6DEEB"> { </span><span style="color: #F78C6C">2</span><span style="color: #D6DEEB">, </span><span style="color: #F78C6C">3</span><span style="color: #D6DEEB"> };</span></span>
<span><span style="color: #C792EA">bool</span><span style="color: #D6DEEB"> </span><span style="color: #C5E478">s0</span><span style="color: #D6DEEB"> = </span><span style="color: #C5E478">code</span><span style="color: #D6DEEB"> </span><span style="color: #C792EA">in</span><span style="color: #D6DEEB"> { </span><span style="color: #F78C6C">1</span><span style="color: #D6DEEB">, </span><span style="color: #F78C6C">3</span><span style="color: #D6DEEB"> };</span></span></code></pre></div></section>
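<p>The two membership tests above extract the high and low bits of a 2-bit <code>code</code>. A small Python sketch makes this visible; the function name <code>select_signals</code> is hypothetical, and <code>code</code> is assumed to range over 0 to 3.</p>

```python
def select_signals(code):
    """Mirror the HCL definitions of s1 and s0 using Python set membership."""
    s1 = code in {2, 3}  # high bit of a 2-bit code
    s0 = code in {1, 3}  # low bit of a 2-bit code
    return s1, s0


def recombine(code):
    # Reassemble the code from the two signals to show they are its bits.
    s1, s0 = select_signals(code)
    return 2 * int(s1) + int(s0)
```

<p>For every <code>code</code> from 0 to 3, <code>recombine(code)</code> returns <code>code</code> itself, so feeding <code>s1</code> and <code>s0</code> into the earlier MUX selects <code>A</code>, <code>B</code>, <code>C</code>, <code>D</code> for codes 0, 1, 2, 3 respectively.</p>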
<h3 id="memory-and-clocking" class="heading"><a href="#memory-and-clocking" class="heading-anchor" aria-label="章节： Memory and Clocking" tabindex="-1"></a><span>Memory and Clocking</span></h3>
<p>Combinational logic is stateless, and its outputs follow its inputs continuously. Memory, by contrast, stores state, but its updates are controlled by the clock.</p>
<p>This chapter uses two broad classes of memory, three kinds in all:</p>
<ul>
<li>clocked register: stores a single value and has one input and one output. The output is the stored value, and on every rising clock edge the stored value is replaced by the input.</li>
<li>random access memory:
<ul>
<li>register file: stores 15 values (in the Y86-64 processor) and has two read ports and one write port:
<ul>
<li>Each read port has an input <code>src</code> giving the register identifier and an output <code>val</code> giving the value stored in that register. When <code>src</code> changes, <code>val</code> changes immediately.</li>
<li>The write port has an input <code>dst</code> giving the register identifier and another input <code>val</code> carrying the value to write. On every rising clock edge, <code>val</code> is written to the corresponding register unless <code>dst</code> is <code>F</code> (the ID that denotes no register).</li>
</ul>
</li>
<li>data memory: stores many values, indexed by address.
<ul>
<li>It has an address input <code>address</code>.</li>
<li>It has a control input <code>write</code> that selects writing rather than reading.</li>
<li>It has a data output <code>data out</code>. If <code>write</code> is 0, <code>data out</code> immediately outputs the value stored at <code>address</code>.</li>
<li>It has a data input <code>data in</code>. If <code>write</code> is 1, <code>data in</code> is written to <code>address</code> on the rising clock edge.</li>
<li>It has a signal output <code>error</code>, which outputs 1 when <code>address</code> is not a valid address.</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>As this shows, what these kinds of memory have in common is that reads take effect immediately, while writes are controlled by the clock.</p>
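<p>This split between immediate reads and clocked writes can be sketched in Python. The class below is a simplified model of the register file described above, not the book's implementation: the names <code>RegisterFile</code>, <code>set_write_port</code>, and <code>clock_rise</code> are hypothetical, and <code>src</code> is assumed to be a valid ID from 0 to 14.</p>

```python
RNONE = 0xF  # register ID meaning "no register"


class RegisterFile:
    def __init__(self):
        self.regs = [0] * 15  # 15 program registers
        self.dst = RNONE      # write port: destination ID
        self.val = 0          # write port: value to write

    def read(self, src):
        # Reads are combinational: the output follows src immediately.
        return self.regs[src]

    def set_write_port(self, dst, val):
        # Driving the write port inputs does not change any stored value yet.
        self.dst, self.val = dst, val

    def clock_rise(self):
        # Stored state changes only here, and only if dst names a register.
        if self.dst != RNONE:
            self.regs[self.dst] = self.val
```

<p>After <code>set_write_port(4, 99)</code>, a call to <code>read(4)</code> still returns the old value until <code>clock_rise()</code> has been called.</p>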
<p>Within the Y86-64 program state, the registers live in the register file; the status flags, program counter, and status code live in clocked registers; and memory lives in the data memory.</p>
<p>The Y86-64 processor also has a separate read-only instruction memory for fetching instructions, whereas in a real processor this is unified with main memory.</p>
<a id="data-memory-的-read-信号" name="data-memory-的-read-信号" aria-hidden="true"></a>
<aside role="note" data-v-a2ab257f><div class="shadow-md rd-1 b-l-6 my-6 bg-purple-2 dark:bg-purple-9 b-purple-5" data-v-a2ab257f><div class="p-3 flex justify-between items-center" data-v-a2ab257f><h4 class="flex items-center gap-1 font-bold" data-v-a2ab257f><span class="text-5 i-mdi-help-circle-outline text-purple" data-v-a2ab257f></span><span class="sr-only" data-v-a2ab257f>Question: </span><span data-v-a2ab257f>The read signal of the data memory</span></h4><!--v-if--></div><div class="overflow-auto rd-br-1 bg-card px-6 dark:bg-bghover" data-v-a2ab257f><p>The figure on page 383 shows the data memory with an additional <code>read</code> signal, but the text never explains its role, and the description of the <code>write</code> signal seems to make <code>read</code> redundant 🤔</p></div></div></aside>
<h2 id="sequential-y86-64-implementations" class="heading"><a href="#sequential-y86-64-implementations" class="heading-anchor" aria-label="章节： Sequential Y86-64 Implementations" tabindex="-1"></a><span>Sequential Y86-64 Implementations</span></h2>
<p>This section implements a sequential processor called SEQ. It executes instructions strictly one after another, and every instruction completes within a single clock cycle. That requires a long clock cycle, which makes the processor slow; the next two sections optimize this.</p>
<h3 id="指令执行的阶段划分与具体操作" class="heading"><a href="#指令执行的阶段划分与具体操作" class="heading-anchor" aria-label="章节： 指令执行的阶段划分与具体操作" tabindex="-1"></a><span>Stages of instruction execution and their operations</span></h3>
<p>Dividing instruction execution into stages gives instructions with very different behavior a common structure, which simplifies the hardware implementation.</p>
<p>This section divides instruction execution into six stages:</p>
<ol>
<li>Fetch: read the values of the different fields of the instruction encoding</li>
<li>Decode: read register values (I feel the names fetch and decode would make more sense swapped 🤔)</li>
<li>Execute: perform the computation</li>
<li>Memory: write to or read from memory</li>
<li>Write back: write to registers</li>
<li>PC update: update the program counter</li>
</ol>
<p>The figures below show what each instruction does in each stage (CS:APP Figures 4.18 to 4.21 and Solution 4.17):</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig4.18.1d452c61.webp" loading="lazy" src="/assets/csapp-fig4.18.1d452c61.webp" width="1091" height="613" alt="OPq, rrmovq, irmovq"></picture></p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig4.19.9f043a86.webp" loading="lazy" src="/assets/csapp-fig4.19.9f043a86.webp" width="773" height="611" alt="rmmovq, mrmovq"></picture></p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig4.20.3d7b8ab1.webp" loading="lazy" src="/assets/csapp-fig4.20.3d7b8ab1.webp" width="782" height="619" alt="pushq, popq"></picture></p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig4.21.304f2868.webp" loading="lazy" src="/assets/csapp-fig4.21.304f2868.webp" width="1113" height="562" alt="jXX, call, ret"></picture></p>
<p><picture><img type="image/webp" srcset="/assets/csapp-sol4.17.4eb57d44.webp" loading="lazy" src="/assets/csapp-sol4.17.4eb57d44.webp" width="461" height="432" alt="cmovXX"></picture></p>
<h3 id="seq-的主体电路" class="heading"><a href="#seq-的主体电路" class="heading-anchor" aria-label="章节： SEQ 的主体电路" tabindex="-1"></a><span>The main SEQ circuit</span></h3>
<p>CS:APP Figure 4.23 gives an overview of the main SEQ circuit:</p>
<p><picture><img type="image/webp" srcset="/assets/csapp-fig4.23.ed95a333.webp" loading="lazy" src="/assets/csapp-fig4.23.ed95a333.webp" width="798" height="1065" alt="Main SEQ circuit"></picture></p>
<p>The blue components are treated as black boxes, the gray components are designed later, and some connections are omitted from the drawing.</p>
<p>A rough impression of this circuit is enough for now; the details are explained later.</p>
<h3 id="seq-的时序控制" class="heading"><a href="#seq-的时序控制" class="heading-anchor" aria-label="章节： SEQ 的时序控制" tabindex="-1"></a><span>SEQ timing</span></h3>
<p>SEQ executes one instruction per clock cycle, and the clock controls only the writes to the various memories. Memory reads and all computation are implemented in combinational logic: although they have a logical order, in the circuit they happen simultaneously, so the whole computation can be viewed as a function of the memory contents.</p>
<p>In other words, execution proceeds as follows: read memory and compute the values to be written, perform the writes on the rising clock edge, and then read the new memory contents to execute the next instruction.</p>
<p>For this design to work, an important principle is "No reading back": an instruction never updates a value and then reads that same value back. For example, <code>pushq</code> does not first update <code>R[%rsp]</code> and then write <code>M[R[%rsp]]</code>; instead it computes <code>valE</code>, writes <code>M[valE]</code>, and finally writes <code>valE</code> to <code>R[%rsp]</code>. Likewise, some instructions modify the status flags and some read them, but no instruction does both.</p>
<p>Because all computation happens simultaneously, the six stages of execution are really six parts of the circuit.</p>
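<p>The <code>pushq</code> case can be sketched as one SEQ cycle in Python: a pure function reads the current state and produces the pending writes, and a separate step commits them at the clock edge. This is a simplified illustration under stated assumptions, not the book's circuit: the names <code>pushq_compute</code> and <code>clock_rise</code> are hypothetical, memory is modeled as a dictionary keyed by address, and status handling is left out.</p>

```python
RSP = 4  # register ID of %rsp in Y86-64


def pushq_compute(regs, mem, pc, rA):
    """Combinational part: read the current state, return the pending writes."""
    valP = pc + 2       # Fetch: pushq is 2 bytes long
    valA = regs[rA]     # Decode: value to push
    valB = regs[RSP]    # Decode: current stack pointer
    valE = valB - 8     # Execute: new stack pointer
    # No state is modified here; valE is reused instead of reading R[%rsp] back.
    return {"mem": (valE, valA), "reg": (RSP, valE), "pc": valP}


def clock_rise(regs, mem, pending):
    """Clocked part: commit all pending writes together and return the new PC."""
    addr, data = pending["mem"]
    mem[addr] = data    # Memory: M[valE] = valA
    dst, val = pending["reg"]
    regs[dst] = val     # Write back: R[%rsp] = valE
    return pending["pc"]
```

<p>Because <code>valA</code> is read before any write is committed, <code>pushq %rsp</code> in this model pushes the old value of <code>%rsp</code>.</p>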
<h3 id="seq-的具体实现" class="heading"><a href="#seq-的具体实现" class="heading-anchor" aria-label="章节： SEQ 的具体实现" tabindex="-1"></a><span>SEQ implementation details</span></h3>]]></content:encoded>
            <category domain="https://ouuan.moe/tag/csapp">csapp</category>
            <category domain="https://ouuan.moe/tag/%E5%AD%A6%E4%B9%A0%E7%AC%94%E8%AE%B0">学习笔记</category>
            <category domain="https://ouuan.moe/tag/WIP">WIP</category>
        </item>
    </channel>
</rss>