Rust Compiler and Runtime Demystified

Chapter 13 FFI: The Bridge to the C World

By 杨艺韬 · 14,048 characters

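"An ABI is the handshake protocol between two languages: how arguments are passed, where the return value goes, who cleans up the stack, and who frees the memory." (a systems programming adage)

Key points

extern "C" tells the compiler to use the C ABI calling convention: arguments travel in specific registers and on the stack, following the platform standard
repr(C) makes a struct's memory layout match what a C compiler produces: field order, alignment, and padding are all predictable
Rust's String/&str is a (ptr, len) fat pointer, while a C string is a NUL-terminated char*, so CStr/CString conversions are needed
#[no_mangle] stops the compiler from mangling a function's name, so C code can link against the original name
Callbacks cross between Rust and C as a function pointer plus a void* context
Memory management at the FFI boundary follows the rule "whoever allocates, frees": do not mix allocators
The safe wrapper pattern is the standard way to wrap unsafe FFI calls in a safe Rust API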

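13.1 Why FFI Is Needed

Rust was born into a world built on C. Operating system kernels are written in C, and decades of high-quality libraries, from SQLite to OpenSSL, are exposed as C APIs. FFI (Foreign Function Interface) serves three core needs:

Calling C libraries. When you need SQLite, OpenSSL, or any other existing C library, FFI lets Rust call those functions directly, with no rewrite.

Calling system APIs. POSIX and Windows APIs are almost entirely exposed through a C-compatible ABI. The Rust standard library uses FFI heavily under library/std/src/sys/ to reach the underlying system functions for the file system, network I/O, threads, and random number generation, largely through extern blocks.

Embedding Rust in other languages. Python, C/C++, Go, and Ruby can all call a shared library written in Rust through the C ABI.

The C ABI is the lingua franca of systems programming. Nearly every language can produce or call code that follows the C ABI, and Rust uses it as its bridge to the outside world.

13.2 extern "C" and the ABI Specification: What the Compiler Does

An ABI (Application Binary Interface) defines the rules of interaction at the binary level: which registers hold the arguments, how the return value is delivered, who cleans up the stack, and which registers must be preserved.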

// Rust ABI: the compiler is free to optimize it; not guaranteed stable across versions
fn rust_abi(a: i32, b: i64) -> i32 { a + b as i32 }

// C ABI: follows the platform standard; callable from C/Python/Go
extern "C" fn c_abi(a: i32, b: i64) -> i32 { a + b as i32 }
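One consequence of the distinction above is that the ABI is part of a function's type. The following is a minimal sketch; `call_c_style` and `abi_demo` are hypothetical names introduced only for illustration.

```rust
// A Rust-ABI function and a C-ABI function with identical parameters.
fn rust_abi(a: i32, b: i64) -> i32 { a + b as i32 }
extern "C" fn c_abi(a: i32, b: i64) -> i32 { a + b as i32 }

// The ABI is part of the function pointer type: this parameter only
// accepts functions that use the C calling convention.
fn call_c_style(f: extern "C" fn(i32, i64) -> i32) -> i32 {
    f(40, 2)
}

// Hypothetical helper that exercises both functions.
pub fn abi_demo() -> (i32, i32) {
    // `call_c_style(rust_abi)` would be rejected by the type checker,
    // because `rust_abi` has the Rust ABI.
    (rust_abi(40, 2), call_c_style(c_abi))
}
```

Both calls compute the same value; only the calling convention used to get there differs.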

In rustc_target/src/spec/abi_map.rs, AbiMap maps the ABI annotation written in source code to a canonical CanonAbi:

// From rustc_target/src/spec/abi_map.rs (simplified)
pub fn canonize_abi(&self, extern_abi: ExternAbi, has_c_varargs: bool) -> AbiMapping {
    match (extern_abi, arch) {
        (ExternAbi::C { .. }, _) => CanonAbi::C,
        (ExternAbi::Rust | ExternAbi::RustCall, _) => CanonAbi::Rust,
        (ExternAbi::System { .. }, ArchKind::X86)
            if os == OsKind::Windows && !has_c_varargs =>
        {
            CanonAbi::X86(X86Call::Stdcall)  // on Windows x86, System = Stdcall
        }
        (ExternAbi::System { .. }, _) => CanonAbi::C, // on other platforms, System = C
        // ...
    }
}

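The FnAbi struct in rustc_target/src/callconv/mod.rs describes everything about how a function is called under a particular ABI. How each argument is passed is described by the PassMode enum:

// From rustc_target/src/callconv/mod.rs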
pub enum PassMode {
    Ignore,                                         // ZST, ignored
    Direct(ArgAttributes),                          // passed directly in a register
    Pair(ArgAttributes, ArgAttributes),              // ScalarPair, two registers
    Cast { pad_i32: bool, cast: Box<CastTarget> },  // passed after a type cast
    Indirect { attrs: ArgAttributes, meta_attrs: Option<ArgAttributes>, on_stack: bool },
}

flowchart TD
    subgraph "Compiler ABI processing pipeline"
        A["Source: extern &quot;C&quot; fn foo(a: i32, b: MyStruct)"]
        B["AbiMap::canonize_abi() → CanonAbi::C"]
        C["Build FnAbi: choose a PassMode for each argument"]
        D["adjust_for_foreign_abi(): platform-specific adjustments"]
        E["Result: args=[Direct, Indirect], ret=Direct"]
    end

    A --> B --> C --> D --> E

    style A fill:#3b82f6,color:#fff,stroke:none
    style B fill:#10b981,color:#fff,stroke:none
    style E fill:#f59e0b,color:#fff,stroke:none

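On x86-64, classify_arg follows the System V ABI rules and sorts arguments into two classes: Int (general-purpose registers rdi, rsi, rdx, rcx, r8, r9) and Sse (SSE registers xmm0-xmm7). Arguments that do not fit in the available registers are passed on the stack.

13.2.1 The three states of `AbiMapping` (including `Deprecated`) and the self-documenting "force listing" design

The canonize_abi pseudocode above suggests that the return value is a plain CanonAbi. The real return type is AbiMapping, a three-variant enum (rustc_target/src/spec/abi_map.rs:17):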

pub enum AbiMapping {
    /// this ABI is exactly mapped for this platform
    Direct(CanonAbi),
    /// we don't yet warn on this, but we will
    Deprecated(CanonAbi),
    /// ABI we do not map for this platform: it must not reach codegen
    Invalid,
}

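The Deprecated variant is the important one: it marks ABI annotations that still compile today but are slated to become errors. These are historically accepted but semantically murky combinations, typically an x86-specific calling convention such as extern "stdcall" written for a target where it has no distinct meaning, and rustc has decided to tighten them gradually:

Current version: Deprecated(CanonAbi) yields the canonical ABI; there is no warning or error, and compilation continues
Future versions: these cases are expected to be upgraded toward Invalid, first as a lint warning and then as a hard error

The source comment "we don't yet warn on this, but we will" is blunt: this is an explicit buffer zone for the evolving ABI ecosystem. It has practical value for maintainers of cross-language projects. If Invalid suddenly appears after a rustc upgrade, an extern "..." you depend on has been disabled and the code must change. The Deprecated stage is only a signal, leaving time to prepare.

The doc comment on the AbiMap struct itself (lines 7-8) shows how its authors regard it: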

/// A maybe-transitional structure circa 2025 for hosting future experiments in
/// encapsulating arch-specific ABI lowering details to make them more testable.

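"maybe-transitional", "circa 2025", and "future experiments" strung together: comments this modest are rare in the rustc source. The authors state plainly that this structure may still change substantially. Readers should take it as a sign that AbiMap is a work in progress and may be refactored, so external code should not depend on its exact shape. This is also a general rule for reading rustc: markers such as "transitional" or "experimental" help you tell which APIs are relatively stable and which are in a state where changing them does not count as a breaking change.

The self-documenting "force listing" design in from_target (line 49):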

pub fn from_target(target: &Target) -> Self {
    // the purpose of this little exercise is to force listing what affects these mappings
    let arch = match target.arch {
        Arch::AArch64 => ArchKind::Aarch64,
        Arch::AmdGpu => ArchKind::Amdgpu,
        Arch::Arm => ArchKind::Arm(...),
        Arch::Avr => ArchKind::Avr,
        Arch::LoongArch32 | Arch::LoongArch64 => ArchKind::LoongArch,
        Arch::Msp430 => ArchKind::Msp430,
        Arch::Nvptx64 => ArchKind::Nvptx,
        Arch::RiscV32 | Arch::RiscV64 => ArchKind::Riscv,
        Arch::X86 => ArchKind::X86,
        Arch::X86_64 => ArchKind::X86_64,
        _ => ArchKind::Other,
    };

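The source comment "the purpose of this little exercise is to force listing what affects these mappings" gives away the intent. The match is not a simple translation; it deliberately uses an explicit listing as documentation of which target architectures influence ABI decisions. If the `_ => ArchKind::Other` arm were ever replaced by strict matching, the compiler would report an error here whenever a contributor added a new architecture, forcing that contributor to think about whether the new architecture affects the ABI.

This is the same engineering instinct as the debug_assert_ne! in elaborate_drops that catches rustc bugs (Chapter 2) and the exhaustive enum in the two-mode CollectionMode that prevents future omissions (Chapter 6): use the type system and enforced coding conventions to move "don't forget this later" from human memory to the compiler. It also explains why rustc source can look verbose. The many match expressions and explicit lists act as a fuse that trips when someone makes a mistake later.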
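The "let the compiler remind you" idea can be sketched outside rustc as follows. The `Arch` and `StackCleanup` enums and the `stdcall_cleanup` function are hypothetical stand-ins, not rustc types; the point is only that the match has no wildcard arm.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Arch { X86, Amd64, Aarch64, Riscv64 }

#[allow(dead_code)]
#[derive(Debug, PartialEq)]
pub enum StackCleanup { Caller, Callee }

// Hypothetical query: who pops the arguments for a "stdcall" function?
// There is deliberately no `_ =>` arm. Adding a variant to `Arch` makes
// this function fail to compile until the author decides what to return.
pub fn stdcall_cleanup(arch: Arch) -> Option<StackCleanup> {
    match arch {
        // stdcall is a 32-bit x86 convention where the callee cleans up.
        Arch::X86 => Some(StackCleanup::Callee),
        // On the other architectures stdcall has no distinct meaning.
        Arch::Amd64 => None,
        Arch::Aarch64 => None,
        Arch::Riscv64 => None,
    }
}
```

With a wildcard arm the new architecture would silently fall into the default case, which is exactly the kind of omission the explicit listing is meant to surface.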

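13.3 Type Mapping: Rust Types and C Types

core/src/ffi/primitives.rs defines type aliases that correspond exactly to C types:

C type	Rust FFI type	Notes
`char`	`c_char`	Platform dependent: `u8` on ARM Linux, `i8` on x86 Linux
`int` / `unsigned int`	`c_int` / `c_uint`	Usually `i32` / `u32`
`long`	`c_long`	64-bit Linux: `i64`; Windows: `i32`
`double`	`c_double`	Usually `f64`
`void*`	`*mut c_void`	Generic pointer
`size_t`	`usize`	Pointer-sized unsigned integer

The signedness of c_char deserves attention: the library source carries a long comment citing the ABI documents of each platform. Apple platforms use signed char even on ARM, whereas ARM Linux defaults to unsigned char.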
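Because the signedness differs per platform, portable code usually avoids depending on it and converts through u8 explicitly. A minimal sketch, with hypothetical helper names:

```rust
use std::ffi::c_char;

// Reinterpret a C `char` as a raw byte. The `as` cast reinterprets the
// bits when c_char is i8 and is a no-op when c_char is u8, so the result
// is the same byte on every platform.
pub fn c_char_to_byte(c: c_char) -> u8 {
    c as u8
}

// The reverse direction, for filling a C buffer from Rust bytes.
pub fn byte_to_c_char(b: u8) -> c_char {
    b as c_char
}
```

Comparing a c_char against a literal such as `-1` or `0xFF` directly is what tends to break when the code moves between x86 and ARM Linux.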

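String conversion: CStr and CString

The standard library's library/std/src/ffi/mod.rs documents the differences between the two string systems in detail. A Rust string is guaranteed to be UTF-8, stores its length, and may contain interior \0 bytes. A C string has no encoding guarantee, ends with \0, and cannot contain an interior \0.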

graph TB
    subgraph "Rust → C"
        R1["&str / String"] -->|"CString::new()"| R2["CString"]
        R2 -->|".as_ptr()"| R3["*const c_char"]
    end

    subgraph "C → Rust"
        C1["*const c_char"] -->|"CStr::from_ptr()"| C2["&CStr"]
        C2 -->|".to_str()"| C3["&str"]
    end

    style R1 fill:#3b82f6,color:#fff,stroke:none
    style R2 fill:#10b981,color:#fff,stroke:none
    style R3 fill:#f59e0b,color:#fff,stroke:none
    style C1 fill:#f59e0b,color:#fff,stroke:none
    style C2 fill:#10b981,color:#fff,stroke:none
    style C3 fill:#3b82f6,color:#fff,stroke:none

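use std::ffi::{c_char, c_int, CStr, CString};

extern "C" {
    // Declared so the example is complete; provided by the C standard library.
    fn puts(s: *const c_char) -> c_int;
}

fn rust_to_c() {
    let c_string = CString::new("Hello, C!").expect("input contains an interior \\0");
    unsafe { puts(c_string.as_ptr()); }
}

unsafe fn c_to_rust(c_buf: *const c_char) {
    let c_str = unsafe { CStr::from_ptr(c_buf) };
    // to_str() validates UTF-8; to_string_lossy() converts lossily
    match c_str.to_str() {
        Ok(s) => println!("{}", s),
        Err(_) => println!("{}", c_str.to_string_lossy()),
    }
}

// C string literals (Rust 1.77+): the trailing \0 and the absence of an
// interior \0 are guaranteed at compile time
let greeting: &CStr = c"Hello, C world!";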

CString::new returns a Result because an interior \0 would make C truncate the string. This is a run-time constraint that the compiler cannot check statically.
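The error returned by CString::new also reports where the offending byte is. A small sketch (the function name is hypothetical):

```rust
use std::ffi::CString;

// Returns the index of the first interior NUL byte, or None when the
// string can be converted to a C string as-is.
pub fn interior_nul_position(s: &str) -> Option<usize> {
    match CString::new(s) {
        Ok(_) => None,
        // NulError records the position of the NUL that was found.
        Err(e) => Some(e.nul_position()),
    }
}
```

A caller can use the position to produce a precise error message, or to decide to truncate the input deliberately rather than letting C do it silently.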

13.3.1 Source check: the two-step strlen + from_raw_parts implementation of CStr::from_ptr

§13.3 covered string conversion conceptually. Here is the real code of CStr::from_ptr at library/core/src/ffi/c_str.rs:253:

#[inline] // inline is necessary for codegen to see strlen.
#[must_use]
#[stable(feature = "rust1", since = "1.0.0")]
#[rustc_const_stable(feature = "const_cstr_from_ptr", since = "1.81.0")]
pub const unsafe fn from_ptr<'a>(ptr: *const c_char) -> &'a CStr {
    // SAFETY: The caller has provided a pointer that points to a valid C
    // string with a NUL terminator less than `isize::MAX` from `ptr`.
    let len = unsafe { strlen(ptr) };

    // SAFETY: The caller has provided a valid pointer with length less than
    // `isize::MAX`, so `from_raw_parts` is safe. The content remains valid
    // and doesn't change for the lifetime of the returned `CStr`. This
    // means the call to `from_bytes_with_nul_unchecked` is correct.
    //
    // The cast from c_char to u8 is ok because a c_char is always one byte.
    unsafe { Self::from_bytes_with_nul_unchecked(slice::from_raw_parts(ptr.cast(), len + 1)) }
}

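Five details worth understanding:

1. `#[inline] // inline is necessary for codegen to see strlen.` The comment says this attribute is about the quality of the generated code. Once from_ptr is inlined, the strlen call becomes visible at the call site, so LLVM can recognize it and apply what it knows about strlen. Without inlining, the call sits behind an opaque function boundary and those optimizations may be lost. The #[inline] here is a codegen-quality decision rather than a compile-time trade-off, which is a Rust idiom worth remembering.

2. `pub const unsafe fn`: it is both a const fn and an unsafe fn. const lets it be evaluated in constant contexts (a compile-time strlen); unsafe is there because the caller must guarantee that ptr points to a valid C string. `#[rustc_const_stable(feature = "const_cstr_from_ptr", since = "1.81.0")]` shows that the const form was stabilized in 1.81, roughly nine years after the function itself (stable since 1.0). The gradual spread of const reflects how Rust's const evaluation engine has matured year by year.

3. The lifetime 'a is arbitrary and chosen by the caller. This is the most dangerous part of the signature: the lifetime of the returned &'a CStr is picked entirely by the caller, and Rust does not verify it. If you write 'static while ptr is only valid for a few seconds, the result is a use after free. This is a fundamental trap of unsafe FFI, and the documentation explicitly requires the caller to get the lifetime right.

4. Two steps, strlen(ptr) then from_raw_parts(ptr.cast(), len + 1): first an O(N) scan to find the NUL, then a byte slice of len + 1 bytes (including the NUL). N is the string length, so from_ptr is O(N), not O(1). If you already know the length from another source, from_bytes_with_nul_unchecked constructs the value in O(1) and skips the scan.

5. "The cast from c_char to u8 is ok because a c_char is always one byte": a small confirmation in a comment. c_char is i8 or u8 depending on the platform but is always one byte, so ptr.cast() is sound. Comments that look obvious but save a future contributor from a pitfall run throughout the standard library source; they are engineering discipline aimed at one's future self.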
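One common way to defuse the caller-chosen lifetime from point 3 is to wrap the call in a method whose signature ties the result to an owner. The sketch below uses a hypothetical `Config` type, and a CString stands in for memory that a C library would own:

```rust
use std::ffi::{CStr, CString};

// Hypothetical owner of a C string. In real FFI code this would hold a
// pointer returned by a C library; a CString stands in for it here.
pub struct Config {
    name: CString,
}

impl Config {
    pub fn new(name: &str) -> Option<Self> {
        CString::new(name).ok().map(|name| Config { name })
    }

    // Lifetime elision ties the returned &CStr to &self, so the borrow
    // checker rejects any use of the result after `self` is gone.
    pub fn name(&self) -> &CStr {
        let ptr = self.name.as_ptr();
        // SAFETY: `ptr` comes from a live CString owned by `self`, so it is
        // NUL-terminated and stays valid while `self` is borrowed.
        unsafe { CStr::from_ptr(ptr) }
    }
}
```

The unsafe block still exists, but the unconstrained 'a never escapes into user code.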

13.3.2 Source check: the memchr optimization in from_bytes_until_nul

CStr has another interesting API, from_bytes_until_nul (c_str.rs:298):

pub const fn from_bytes_until_nul(bytes: &[u8]) -> Result<&CStr, FromBytesUntilNulError> {
    let nul_pos = memchr::memchr(0, bytes);
    match nul_pos {
        Some(nul_pos) => {
            // FIXME(const-hack) replace with range index
            let subslice = unsafe { crate::slice::from_raw_parts(bytes.as_ptr(), nul_pos + 1) };
            Ok(unsafe { CStr::from_bytes_with_nul_unchecked(subslice) })
        }
        None => Err(FromBytesUntilNulError(())),
    }
}

A typical use: reading a C string out of a stack buffer:

let mut buf = [0u8; 256];
read_into(&mut buf);
let c = CStr::from_bytes_until_nul(&buf)?;

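This is safer than from_ptr (the result borrows from the buffer, so no caller-chosen lifetime is involved) and more flexible than from_bytes_with_nul (the NUL does not have to be the last byte of the buffer).

memchr::memchr(0, bytes) is not a naive byte-by-byte loop at run time. The memchr here is core's internal slice::memchr module, which appears to scan a machine word (usize) at a time using a zero-byte bit trick rather than explicit SIMD instructions. How much faster that is than a hand-written for b in bytes { if *b == 0 ... } loop depends on the input length and the target.

The FIXME(const-hack) comment hints that a range index such as bytes[..nul_pos+1] was not yet usable in a const fn when this was written (const fn restrictions keep being relaxed), so the code goes through from_raw_parts instead. In some future Rust version this FIXME can be removed and the code rewritten as the more direct &bytes[..nul_pos+1].

When you meet a FIXME(const-hack) tag while reading rustc or std, you know immediately that it is a temporary detour around a const fn limitation and is expected to disappear, rather than long-term technical debt.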
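A complete version of the stack-buffer use case might look like the following sketch (the function name is hypothetical; CStr::from_bytes_until_nul is stable since Rust 1.69):

```rust
use std::ffi::CStr;

// Interpret a fixed-size buffer that some C API has filled in. Everything
// up to the first NUL is the string; bytes after it are ignored.
pub fn str_from_c_buffer(buf: &[u8]) -> Option<&str> {
    // Fails if the buffer contains no NUL at all.
    let c = CStr::from_bytes_until_nul(buf).ok()?;
    // Fails if the bytes before the NUL are not valid UTF-8.
    c.to_str().ok()
}
```

The returned &str borrows from `buf`, so it cannot outlive the buffer.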

13.4 repr(C): Guaranteeing a C-Compatible Memory Layout

repr(C) forces C layout rules: fields are laid out in declaration order, and alignment follows the C ABI.

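#[repr(C)]
struct Misaligned {
    a: u8,     // offset 0
    // 3 bytes of padding
    b: u32,    // offset 4
    c: u8,     // offset 8
    // 3 bytes of padding
}
// Total size: 12 bytes (6 of them padding)

#[repr(C)]
enum Color { Red = 0, Green = 1, Blue = 2 }
// Equivalent to a C enum, usually 4 bytes

#[repr(u8)]
enum SmallColor { Red = 0, Green = 1, Blue = 2 }
// The underlying type is u8, so the size is 1 byte
// (rustc rejects repr(C, u8) on a fieldless enum as conflicting hints)

Note: repr(C) does not let the compiler reorder fields. To reduce padding you must reorder the fields by hand, placing the fields with larger alignment first.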
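The effect of manual reordering can be checked with size_of and align_of. The sketch below assumes a mainstream target where u32 has 4-byte alignment; the struct and function names are hypothetical.

```rust
use std::mem::{align_of, size_of};

// Declaration order from the text: u8, u32, u8.
#[allow(dead_code)]
#[repr(C)]
pub struct Declared { a: u8, b: u32, c: u8 }

// Same fields, largest alignment first.
#[allow(dead_code)]
#[repr(C)]
pub struct Reordered { b: u32, a: u8, c: u8 }

// Returns (size, alignment) for both layouts.
pub fn layout_report() -> [(usize, usize); 2] {
    [
        (size_of::<Declared>(), align_of::<Declared>()),
        (size_of::<Reordered>(), align_of::<Reordered>()),
    ]
}
```

On such a target the first layout occupies 12 bytes and the second 8, because the two u8 fields share the tail of one 4-byte unit. The matching C declaration has to be reordered in the same way.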

13.3.3 Source check: the ownership-transfer protocol of CString::into_raw and from_raw

CString::into_raw (library/alloc/src/ffi/c_str.rs:455) is only three lines long:

#[inline]
#[must_use = "`self` will be dropped if the result is not used"]
#[stable(feature = "cstr_memory", since = "1.4.0")]
pub fn into_raw(self) -> *mut c_char {
    Box::into_raw(self.into_inner()) as *mut c_char
}

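These three lines hold the most important contract in FFI memory management: ownership is transferred entirely to the raw pointer.

`#[must_use = "...will be dropped if not used"]` is a lifesaving lint. If a user writes cs.into_raw(); and ignores the return value, the CString's Box is leaked and the memory can never be freed. must_use makes rustc at least warn about it. Putting must_use on ownership-transfer functions is a standard idiom of Rust API design, and your own unsafe FFI bindings should do the same.

Box::into_raw is the real "letting go" operation: the Box no longer drops its contents, and ownership passes to the raw pointer. Only the matching CString::from_raw takes it back. Its documentation says:

"It should only be called with a pointer that was earlier obtained by calling CString::into_raw. Other usage (e.g., trying to take ownership of a string that was allocated by foreign code) is likely to lead to undefined behavior or allocator corruption."

Only a pointer that came out of into_raw may be passed to from_raw, and never a pointer that C code obtained from malloc. The reason is that Rust's Box uses the global GlobalAlloc (the system allocator by default, but users can replace it). If memory is allocated with C's malloc and then released by dropping a Rust Box (which calls dealloc), the allocator's state may be corrupted. This rule is the source-level evidence for "whoever allocates, frees" in §13.9: it is a hard requirement, not merely a best practice.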
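The full round trip can be sketched as follows. To keep the example self-contained, `foreign_strlen` is a stand-in for foreign code that is actually written in Rust; with a real C library the pointer would cross the language boundary at that call.

```rust
use std::ffi::{c_char, CStr, CString};

// Stand-in for foreign code: borrows the string, does not free it.
extern "C" fn foreign_strlen(s: *const c_char) -> usize {
    // SAFETY: callers in this sketch pass a valid NUL-terminated string.
    let c = unsafe { CStr::from_ptr(s) };
    c.to_bytes().len()
}

// Hand a string to "C" and take it back afterwards.
pub fn round_trip(text: &str) -> (usize, String) {
    let owned = CString::new(text).expect("no interior NUL");
    // Ownership moves into the raw pointer; nothing is dropped here.
    let raw: *mut c_char = owned.into_raw();
    let len = foreign_strlen(raw);
    // SAFETY: `raw` came from CString::into_raw above and the foreign side
    // neither freed it nor changed its length.
    let reclaimed = unsafe { CString::from_raw(raw) };
    (len, reclaimed.into_string().expect("valid UTF-8"))
}
```

Forgetting the from_raw call leaks the allocation; calling it twice on the same pointer is a double free.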

13.3.4 Source check: the failure fallback path of CString::into_string

CString::into_string (c_str.rs:478) shows a typical Rust API pattern: do not lose data on failure.

pub fn into_string(self) -> Result<String, IntoStringError> {
    String::from_utf8(self.into_bytes()).map_err(|e| IntoStringError {
        error: e.utf8_error(),
        inner: unsafe { Self::_from_vec_unchecked(e.into_bytes()) },
    })
}

If the CString's contents are not valid UTF-8, the function returns an IntoStringError whose inner field hands the original CString back. The user can then write:

match cstring.into_string() {
    Ok(s) => use_string(s),
    Err(err) => {
        let original_cstring = err.into_cstring();  // get the original object back
        // ... handle it some other way (for example, keep the raw bytes)
    }
}

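A Result only carries an Ok or an Err value, so when a function takes its input by value and fails, the input is normally gone. The pattern used by IntoStringError, an error object that carries the original data, lets the user recover from the failure, in contrast to the style of throwing IllegalArgumentException("...") where the rejected input is not handed back.

The pattern recurs throughout std: String::from_utf8 returns a FromUtf8Error that gives back the original Vec<u8> through into_bytes, CString::new returns a NulError that gives back the bytes through into_vec, and sending on a closed mpsc channel returns a SendError that contains the unsent value. "Never lose data on failure" is an explicit goal of Rust API design, and its consistency becomes apparent the more of std you read.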
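Two of the recovery paths named above can be exercised directly. The wrapper functions here are hypothetical; the std methods they call (FromUtf8Error::into_bytes and IntoStringError::into_cstring) are real.

```rust
use std::ffi::CString;

// On failure, hand the caller's bytes back instead of dropping them.
pub fn string_or_bytes(raw: Vec<u8>) -> Result<String, Vec<u8>> {
    String::from_utf8(raw).map_err(|e| e.into_bytes())
}

// Same idea for CString: the error returns the original CString.
pub fn string_or_cstring(c: CString) -> Result<String, CString> {
    c.into_string().map_err(|e| e.into_cstring())
}
```

In both cases the Err branch returns the very allocation that was passed in, so no copy is needed to retry with a different decoding strategy.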

repr(C) unions

FFI code often needs a union to represent a C union:

#[repr(C)]
union Value {
    i: i32,
    f: f32,
    b: bool,
}
// Equivalent to C's union Value { int i; float f; bool b; };
// Size = max(4, 4, 1) = 4 bytes

// Reading a union field is unsafe: the compiler cannot track which field is active
let v = Value { i: 42 };
let i = unsafe { v.i }; // OK, because we know i is the field that was set
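Since the compiler does not track the active field, C APIs usually pair a union with a tag, and a Rust wrapper can make the access safe by checking that tag. A minimal sketch with hypothetical type names:

```rust
#[repr(C)]
#[derive(Clone, Copy)]
pub union Payload { i: i32, f: f32 }

#[repr(C)]
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Tag { Int = 0, Float = 1 }

// Mirrors a C struct of the form { enum tag; union payload; }.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Tagged { tag: Tag, payload: Payload }

impl Tagged {
    pub fn int(i: i32) -> Self { Tagged { tag: Tag::Int, payload: Payload { i } } }
    pub fn float(f: f32) -> Self { Tagged { tag: Tag::Float, payload: Payload { f } } }

    // Safe accessor: reads the union only when the tag says it is an int.
    pub fn as_int(&self) -> Option<i32> {
        match self.tag {
            // SAFETY: the constructors keep tag and payload in sync.
            Tag::Int => Some(unsafe { self.payload.i }),
            Tag::Float => None,
        }
    }

    pub fn as_float(&self) -> Option<f32> {
        match self.tag {
            // SAFETY: as above.
            Tag::Float => Some(unsafe { self.payload.f }),
            Tag::Int => None,
        }
    }
}
```

One caveat: a `Tagged` value received from C with a tag outside 0..=1 would be an invalid Rust enum value, so real bindings often declare the tag as a plain integer and validate it first.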

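13.4.1 Source check: repr(C) turns off field reordering through `inhibit_struct_field_reordering`

§13.4 described the layout guarantee of repr(C) at a high level. The concrete implementation is visible at compiler/rustc_abi/src/layout.rs:1177: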

let optimize_field_order = !repr.inhibit_struct_field_reordering();

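inhibit_struct_field_reordering() returns true for repr(C), which makes optimize_field_order false: the fields keep their source order and the compiler does not reorder them.

An ordinary Rust struct (without a repr attribute) takes the optimize_field_order = true path of univariant_biased by default, and rustc reorders the fields by size and alignment to minimize padding. The reordering is done by rustc when it computes the layout, not by LLVM. For example: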

struct Foo {
    a: u8,    // 1 byte
    b: u64,   // 8 bytes
    c: u8,    // 1 byte
}

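Default Rust layout: the compiler reorders the fields to b, a, c plus 6 bytes of padding, for a total of 16 bytes.
repr(C): the order a, b, c is kept, with 7 + 7 = 14 bytes of padding, for a total of 24 bytes.

In this example, C compatibility costs 50% more memory. That is the real price of repr(C).

Note that repr(packed) on its own does not appear to be one of the conditions that inhibit reordering: the `pack` branch of alignment_group_key in §13.4.3 shows that packed structs still go through the field-ordering logic, using the packed alignment as the sort key. A packed struct whose field order matters should therefore also be marked repr(C).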
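The sizes quoted above can be observed with size_of. This sketch assumes a 64-bit target where u64 has 8-byte alignment. The default layout is not a stable guarantee, so the code only reports it and the reliable statement is about the repr(C) size.

```rust
use std::mem::size_of;

// Default representation: rustc may reorder the fields.
#[allow(dead_code)]
pub struct DefaultRepr { a: u8, b: u64, c: u8 }

// C representation: declaration order is preserved.
#[allow(dead_code)]
#[repr(C)]
pub struct CRepr { a: u8, b: u64, c: u8 }

// Returns (default size, repr(C) size) in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<DefaultRepr>(), size_of::<CRepr>())
}
```

With current compilers on x86-64 this reports 16 and 24, matching the figures in the text.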

13.4.2 Source check: RNG-driven field shuffling with -Z randomize-layout, a fuzzing tool for unsafe assumptions

layout.rs:1186-1202 contains a very interesting piece of code:

if optimize_field_order && fields.len() > 1 {
    // If `-Z randomize-layout` was enabled for the type definition we can shuffle
    // the field ordering to try and catch some code making assumptions about layouts
    // we don't guarantee.
    if repr.can_randomize_type_layout() && cfg!(feature = "randomize") {
        #[cfg(feature = "randomize")]
        {
            use rand::SeedableRng;
            use rand::seq::SliceRandom;
            let mut rng = rand_xoshiro::Xoshiro128StarStar::seed_from_u64(
                field_seed.wrapping_add(repr.field_shuffle_seed).as_u64(),
            );
            // Shuffle the ordering of the fields.
            optimizing.shuffle(&mut rng);
        }
    } else {
        // Otherwise we just leave things alone and actually optimize the type's fields
        ...
    }
}

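-Z randomize-layout is a nightly compiler flag. When it is enabled, the field order of ordinary Rust structs is shuffled randomly. The purpose is to test whether unsafe code wrongly assumes a particular field order.

If your unsafe code does something like *(ptr as *const u8).add(8) on the assumption that a certain field lives at offset 8, randomize-layout may put a different field at that offset and your test fails at once. This is a fuzzing-style tool that rustc offers to unsafe Rust users: it deliberately creates disorder so that you discover your own implicit assumptions.

Xoshiro128StarStar is a fast PRNG with good statistical properties. The rand_xoshiro crate provides the implementation; rustc uses it, and application code can too.

field_seed.wrapping_add(repr.field_shuffle_seed): the seed combines two parts, a hash derived from the fields and a per-type seed. As a result the same type gets the same shuffle across compilations, so you do not get "passed yesterday, fails today". Reproducibility is the foundation of any fuzzing tool.

randomize-layout is an active quality investment by the Rust team in the unsafe ecosystem. When you write unsafe code, it is worth running the tests once on nightly with -Z randomize-layout to see whether they still pass. It is a safety net for unsafe engineering that individual libraries would find hard to build for themselves.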

13.4.3 Source check: the field-ordering heuristic in alignment_group_key and its niche-aware optimization

Continuing in layout.rs:1219-1260, alignment_group_key is the sort key of the field-reordering algorithm:

let alignment_group_key = |layout: &F| {
    if let Some(pack) = pack {
        // Return the packed alignment in bytes.
        layout.align.abi.min(pack).bytes()
    } else {
        // Returns `log2(effective-align)`.
        let align = layout.align.bytes();
        let size = layout.size.bytes();
        let niche_size = layout.largest_niche.map(|n| n.available(dl)).unwrap_or(0);
        // Group [u8; 4] with align-4 or [u8; 6] with align-2 fields.
        let size_as_align = align.max(size).trailing_zeros();
        let size_as_align = if largest_niche_size > 0 {
            match niche_bias {
                NicheBias::Start => { ... }
                NicheBias::End => { ... }
            }
        };
        ...
    }
};

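Four clever design points:

1. Group by alignment: fields with the same alignment are placed together to avoid realignment padding. Reordering (u8, u64, u8) into (u64, u8, u8) puts the align-8 u64 first, so only 6 bytes of padding follow the two u8 fields (instead of 7 + 7 = 14).

2. size_as_align handles arrays such as [u8; N]: an array has the alignment of its element but a larger size, so grouping it by size is more sensible. The code uses align.max(size).trailing_zeros(), which takes the larger of align and size and counts its trailing zero bits, that is, the exponent of the largest power of two dividing it. That is why [u8; 4] lands in the align-4 group and [u8; 6] in the align-2 group, as the comment says.

3. Niche-aware ordering: a niche (an unused bit pattern) lets enums such as Option<T> be represented at zero cost. A niche usually sits at the start or the end of a field, and niche_bias controls whether niches are moved toward the front or the back. The bias influences the field order, because field positions have to cooperate if an enum discriminant is to reuse the niche.

4. The concrete example in the comment: "Given A(u8, [u8; 16]) and B(bool, [u8; 16]) we want to bump the array to the front in the first case … but keep the bool in front in the second case for its niches". A bool has a niche (any byte other than 0 or 1 is impossible and can hold an enum discriminant), while u8 has none. So B keeps its bool in front, and A can move its u8 to the back.

This field-ordering algorithm is the product of years of compiler optimization work, with niche optimization, size optimization, and alignment grouping all intertwined. It is part of what makes Rust enums compact: a typical C++ std::optional<bool> takes at least 2 bytes, whereas Rust's Option<bool> fits in 1 byte.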
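The effect of niches is easy to observe with size_of. In this sketch (the function name is hypothetical) only the Option<&u8> result is a documented guarantee; the others are what current compilers produce.

```rust
use std::mem::size_of;

// Sizes of bool, Option<bool>, Option<u8> and Option<&u8>, in bytes.
pub fn niche_sizes() -> [usize; 4] {
    [
        size_of::<bool>(),         // 1: only the bit patterns 0 and 1 are valid
        size_of::<Option<bool>>(), // None is stored in an unused bool pattern
        size_of::<Option<u8>>(),   // u8 has no niche, so a tag byte is added
        size_of::<Option<&u8>>(),  // None is the null pointer
    ]
}
```

Comparing Option<bool> with Option<u8> shows the difference a niche makes: the former needs no extra byte, the latter does.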

13.5 Calling C from Rust: extern Blocks, Linking, and bindgen

use std::ffi::{c_int, c_double, c_void};

extern "C" {
    fn abs(x: c_int) -> c_int;
    fn sqrt(x: c_double) -> c_double;
    fn malloc(size: usize) -> *mut c_void;
    fn free(ptr: *mut c_void);
}

fn main() {
    unsafe {
        assert_eq!(abs(-42), 42);
        println!("sqrt(2) = {}", sqrt(2.0));
    }
}

Linking is controlled from build.rs:

// build.rs
fn main() {
    println!("cargo:rustc-link-lib=static=foo");       // link libfoo.a statically
    println!("cargo:rustc-link-search=native=/path");
    println!("cargo:rustc-link-lib=dylib=bar");         // link libbar.so dynamically
}

bindgen generates Rust bindings from C header files. Once it is integrated into build.rs, the bindings are regenerated on every build:

// build.rs
fn main() {
    let bindings = bindgen::Builder::default()
        .header("wrapper.h")
        .generate().expect("failed to generate bindings");
    let out = std::path::PathBuf::from(std::env::var("OUT_DIR").unwrap());
    bindings.write_to_file(out.join("bindings.rs")).unwrap();
}

13.6 Calling Rust from C: #[no_mangle], extern "C" fn, and cdylib

#[no_mangle]
pub extern "C" fn rust_add(a: i32, b: i32) -> i32 { a + b }

The compiler source rustc_symbol_mangling/src/lib.rs contains the logic that handles #[no_mangle]:

// From rustc_symbol_mangling/src/lib.rs (simplified)
fn compute_symbol_name(tcx, instance, ...) -> String {
    let attrs = tcx.codegen_instance_attrs(instance.def);
    if let Some(name) = attrs.symbol_name { return name.to_string(); }
    if attrs.flags.contains(CodegenFnAttrFlags::NO_MANGLE) {
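        return tcx.item_name(def_id).to_string();  // keep the original name
    }
    // otherwise encode the crate path, generic parameters, etc. with the v0 mangling scheme
}

Without #[no_mangle], rust_add becomes a symbol such as _RNvCskwGfYPst2Cb_7my_crate8rust_add, which the C linker cannot find under the name rust_add.

To build a library usable from C, specify the crate type in Cargo.toml: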

[lib]
crate-type = ["cdylib"]    # .so (Linux) / .dylib (macOS) / .dll (Windows)
# or
crate-type = ["staticlib"]  # .a (Unix) / .lib (Windows)

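After compiling, the C code declares the function signature (normally in a header) and links against the library:

// Declare the Rust function
extern int rust_add(int a, int b);
int main() { return rust_add(3, 4); }  // link with: gcc main.c -L target/release -lmy_lib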

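13.7 Callbacks: Passing Rust Closures to C

C libraries often accept function pointers as callbacks: comparators for sorting, handlers for event loops, visitors for traversals. A well-designed C API follows the pattern of a function pointer plus a void* context. The void* lets the caller pass arbitrary data to the callback and avoids global variables.

Stateless callbacks

The simplest case is a callback that needs no external state: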

type CCallback = extern "C" fn(event: i32, user_data: *mut std::ffi::c_void);
extern "C" { fn register_callback(cb: CCallback, user_data: *mut std::ffi::c_void); }

extern "C" fn simple_callback(event: i32, _data: *mut std::ffi::c_void) {
    println!("received event: {}", event);
}

fn register_simple() {
    unsafe { register_callback(simple_callback, std::ptr::null_mut()); }
}

Stateful callbacks: passing context through void*

struct Context { name: String, count: u32 }

extern "C" fn stateful_callback(event: i32, user_data: *mut std::ffi::c_void) {
    let ctx = unsafe { &mut *(user_data as *mut Context) };
    ctx.count += 1;
    println!("[{}] event {} (call #{})", ctx.name, event, ctx.count);
}

fn register_stateful() {
    let ctx = Box::new(Context { name: "monitor".into(), count: 0 });
    let raw = Box::into_raw(ctx); // transfer ownership so the context is not dropped
    unsafe { register_callback(stateful_callback, raw as *mut std::ffi::c_void); }
    // Important: the memory must later be reclaimed with Box::from_raw(raw)!
}

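Passing closures: the trampoline pattern

A Rust closure captures variables from its environment, so it is not a bare function pointer. A generic trampoline function performs the conversion: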

extern "C" fn closure_trampoline<F: FnMut(i32)>(event: i32, data: *mut std::ffi::c_void) {
    let closure = unsafe { &mut *(data as *mut F) };
    closure(event);
}

fn register_closure<F: FnMut(i32) + 'static>(closure: F) {
    let boxed = Box::new(closure);
    let raw = Box::into_raw(boxed);
    unsafe {
        register_callback(closure_trampoline::<F>, raw as *mut std::ffi::c_void);
    }
}

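// Usage
let mut counter = 0;
register_closure(move |event| {
    counter += 1;
    println!("closure: event {}, count {}", event, counter);
});

The trampoline is generic, so Rust generates a specialized version for each closure type and monomorphization keeps the cast type-correct. The key caveat is that the context object passed through void* must stay alive for as long as the callback can be invoked, which is usually achieved by extending its life with Box::into_raw.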
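When the C side invokes the callback only during the call itself, the closure can stay on the caller's stack and no Box is needed. The sketch below shows that variant; `dispatch_events` is a hypothetical stand-in for a C function, written in Rust so the example is self-contained, and it assumes synchronous dispatch.

```rust
use std::ffi::c_void;

type Callback = extern "C" fn(event: i32, user_data: *mut c_void);

// Stand-in for a C library function: calls `cb` for events 1, 2, 3 and
// does not keep `cb` or `user_data` after it returns.
extern "C" fn dispatch_events(cb: Callback, user_data: *mut c_void) {
    for event in 1..=3 {
        cb(event, user_data);
    }
}

// One monomorphized copy of this function exists per closure type F.
extern "C" fn trampoline<F: FnMut(i32)>(event: i32, user_data: *mut c_void) {
    // SAFETY: `user_data` was created from `&mut F` in `for_each_event`
    // below and is still alive while `dispatch_events` is running.
    let closure = unsafe { &mut *(user_data as *mut F) };
    closure(event);
}

// Safe wrapper: the closure lives on this stack frame for the whole call.
pub fn for_each_event<F: FnMut(i32)>(mut closure: F) {
    let data = &mut closure as *mut F as *mut c_void;
    dispatch_events(trampoline::<F>, data);
}
```

Because the closure never outlives the call, it may borrow local variables, for example `for_each_event(|e| sum += e)`. If the C library stores the callback for later, the Box::into_raw approach from the text is required instead, together with a matching Box::from_raw when the callback is unregistered.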

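13.8 Error Handling Across the FFI Boundary

Rust's Result and C's error codes have to be bridged with care. The most important rule: a panic must not cross the FFI boundary.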

#[no_mangle]
pub extern "C" fn safe_parse(input: *const std::ffi::c_char, result: *mut i32) -> i32 {
    std::panic::catch_unwind(|| {
        if input.is_null() || result.is_null() { return -1; }
        let s = unsafe { std::ffi::CStr::from_ptr(input) };
        match s.to_str().ok().and_then(|s| s.parse::<i32>().ok()) {
            Some(n) => { unsafe { *result = n; } 0 }
            None => -2,
        }
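    }).unwrap_or(-99) // the panic is caught and turned into an error code
}

The recommended complete pattern: define a #[repr(C)] error-code enum and pair it with a thread-local error message buffer (similar to errno + strerror):

/// FFI error codes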
#[repr(C)]
pub enum ErrorCode { Success = 0, NullPointer = -1, InvalidUtf8 = -2, InternalError = -3 }

thread_local! {
    static LAST_ERROR: std::cell::RefCell<Option<String>> = std::cell::RefCell::new(None);
}

fn set_last_error(msg: String) {
    LAST_ERROR.with(|e| *e.borrow_mut() = Some(msg));
}

/// Called from C to fetch the error details
#[no_mangle]
pub extern "C" fn get_last_error(buf: *mut u8, buf_len: i32) -> i32 {
    LAST_ERROR.with(|e| {
        match e.borrow().as_ref() {
            None => 0,
            Some(msg) => {
                let bytes = msg.as_bytes();
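                // Reject a non-positive length: a negative buf_len cast to usize would pass the size check
                if buf.is_null() || buf_len <= 0 || bytes.len() + 1 > buf_len as usize { return -1; }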
                unsafe {
                    std::ptr::copy_nonoverlapping(bytes.as_ptr(), buf, bytes.len());
                    *buf.add(bytes.len()) = 0;
                }
                bytes.len() as i32
            }
        }
    })
}

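On the C side the usage is: call the business function and check its return value, and on failure call get_last_error for the details. This pattern is widely used by C binding libraries in the Rust ecosystem.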
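The "catch the panic, map everything to a code" step can be factored into a helper so that every exported function goes through it. The following sketch uses hypothetical names (`ffi_guard`, `checked_div`, and the FFI_* constants) and assumes the default panic = "unwind" setting, since catch_unwind cannot catch anything under panic = "abort".

```rust
use std::panic::{catch_unwind, UnwindSafe};

pub const FFI_OK: i32 = 0;
pub const FFI_ERROR: i32 = -1;
pub const FFI_PANIC: i32 = -99;

// Runs `body` and converts every outcome into a C-style status code.
pub fn ffi_guard<F>(body: F) -> i32
where
    F: FnOnce() -> Result<(), String> + UnwindSafe,
{
    match catch_unwind(body) {
        Ok(Ok(())) => FFI_OK,
        // A real binding would store the message for get_last_error here.
        Ok(Err(_msg)) => FFI_ERROR,
        // The panic stops here instead of unwinding into the C caller.
        Err(_panic) => FFI_PANIC,
    }
}

// Example export: writes a / b to *out.
pub extern "C" fn checked_div(a: i32, b: i32, out: *mut i32) -> i32 {
    ffi_guard(move || {
        if out.is_null() {
            return Err("null output pointer".to_string());
        }
        // Integer division panics when b == 0; the guard reports FFI_PANIC.
        let q = a / b;
        // SAFETY: `out` was checked for null; the caller guarantees it
        // points to writable memory for one i32.
        unsafe { *out = q };
        Ok(())
    })
}
```

A real library would add #[no_mangle] to the export and would normally check b explicitly rather than relying on the panic path, which exists only as the last line of defense.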

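13.9 Memory Management Across FFI: Whoever Allocates, Frees

Core principle: whoever allocates, frees. Rust and C may use different allocators, and releasing memory through an allocator that did not allocate it is undefined behavior.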

flowchart LR
    subgraph "Correct"
        A["Rust Box::new()"] -->|"Box::into_raw()"| B["used by C"]
        B -->|"passed back"| C["Rust Box::from_raw()"]

        D["C malloc()"] -->|"passed to Rust"| E["used by Rust"]
        E -->|"passed back"| F["C free()"]
    end

    subgraph "Wrong"
        G["Rust Box::new()"] -->|"passed to C"| H["C free() → UB"]
        I["C malloc()"] -->|"passed to Rust"| J["Rust drop → UB"]
    end

    style A fill:#3b82f6,color:#fff,stroke:none
    style C fill:#3b82f6,color:#fff,stroke:none
    style D fill:#f59e0b,color:#fff,stroke:none
    style F fill:#f59e0b,color:#fff,stroke:none
    style H fill:#ef4444,color:#fff,stroke:none
    style J fill:#ef4444,color:#fff,stroke:none

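Pattern 1: Rust allocates, Rust frees (opaque handle)

The safest pattern. C code holds an opaque pointer, and both creation and destruction are done by Rust:

pub struct Database { /* Rust-internal structure */ }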

#[no_mangle]
pub extern "C" fn db_open() -> *mut Database {
    Box::into_raw(Box::new(Database { /* ... */ }))
}

#[no_mangle]
pub unsafe extern "C" fn db_close(db: *mut Database) {
    if !db.is_null() { drop(unsafe { Box::from_raw(db) }); } // rebuild the Box so that it is dropped
}

Pattern 2: C allocates, C frees

Memory returned by a C library must be released with the function that the C library provides:

extern "C" {
    fn strdup(s: *const std::ffi::c_char) -> *mut std::ffi::c_char;
    fn free(ptr: *mut std::ffi::c_void);
}

unsafe {
    let copy = strdup(c"hello".as_ptr());
    // copy was allocated by C's malloc, so it must be released with free
    free(copy as *mut std::ffi::c_void);
}

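Pattern 3: the caller allocates the buffer

The caller allocates a buffer and the callee fills it in, which avoids any transfer of ownership across the boundary: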

#[no_mangle]
pub extern "C" fn format_number(val: i64, buf: *mut u8, buf_len: i32) -> i32 {
    if buf.is_null() || buf_len <= 0 { return -1; }
    let s = format!("{}", val);
    if s.len() + 1 > buf_len as usize { return -1; }
    unsafe {
        std::ptr::copy_nonoverlapping(s.as_ptr(), buf, s.len());
        *buf.add(s.len()) = 0;
    }
    s.len() as i32
}

13.10 Safe Wrappers: Wrapping a C API in a Safe Rust API

flowchart TB
    subgraph "User code (safe)"
        U["let mut store = KvStore::open(path)?;<br/>store.set(key, value)?;<br/>// dropped automatically"]
    end

    subgraph "Safe wrapper layer"
        S["struct KvStore(NonNull&lt;ffi::kv_store&gt;)<br/>impl Drop: calls ffi::kv_close<br/>all methods return Result"]
    end

    subgraph "Low-level FFI (unsafe)"
        F["extern &quot;C&quot; { fn kv_open/close/set/get... }"]
    end

    U --> S --> F

    style U fill:#10b981,color:#fff,stroke:none
    style S fill:#3b82f6,color:#fff,stroke:none
    style F fill:#f59e0b,color:#fff,stroke:none

The elements of a safe wrapper:

mod ffi {
    #[repr(C)]
    pub struct kv_store { _opaque: [u8; 0] }  // opaque type

    extern "C" {
        pub fn kv_open(path: *const std::ffi::c_char) -> *mut kv_store;
        pub fn kv_close(store: *mut kv_store);
        pub fn kv_set(s: *mut kv_store, k: *const std::ffi::c_char, v: *const std::ffi::c_char) -> i32;
        pub fn kv_get(s: *mut kv_store, k: *const std::ffi::c_char) -> *const std::ffi::c_char;
    }
}

pub struct KvStore { inner: std::ptr::NonNull<ffi::kv_store> }

impl KvStore {
    pub fn open(path: &str) -> Result<Self, KvError> {
        let c_path = std::ffi::CString::new(path).map_err(|_| KvError::InvalidPath)?;
        let ptr = unsafe { ffi::kv_open(c_path.as_ptr()) };
        std::ptr::NonNull::new(ptr).map(|p| KvStore { inner: p }).ok_or(KvError::OpenFailed)
    }

    pub fn get(&self, key: &str) -> Result<String, KvError> {
        let c_key = std::ffi::CString::new(key).map_err(|_| KvError::InvalidKey)?;
        let ptr = unsafe { ffi::kv_get(self.inner.as_ptr() as *mut _, c_key.as_ptr()) };
        if ptr.is_null() { return Err(KvError::NotFound); }
        // Important: copy right away, because the C pointer may become invalid after the next call
        let s = unsafe { std::ffi::CStr::from_ptr(ptr) };
        s.to_str().map(|s| s.to_string()).map_err(|_| KvError::InvalidUtf8)
    }
}

impl Drop for KvStore {
    fn drop(&mut self) { unsafe { ffi::kv_close(self.inner.as_ptr()); } }
}

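Checklist: (1) keep all FFI declarations in the ffi module; (2) manage C resources with RAII; (3) use NonNull instead of raw pointers; (4) copy borrowed C data immediately; (5) convert C error codes to Result; (6) handle \0 and UTF-8 in string conversions; (7) consider thread safety.

13.11 Platform-Specific Code: cfg(target_os) and Conditional Compilation

FFI code is inherently tied to a platform. Rust's conditional compilation lets you write different bindings for different platforms: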

#[cfg(target_os = "linux")]
extern "C" { fn epoll_create1(flags: i32) -> i32; }

#[cfg(target_os = "macos")]
extern "C" { fn kqueue() -> i32; }

#[cfg(target_os = "windows")]
extern "system" { fn CreateIoCompletionPort(/* ... */) -> *mut std::ffi::c_void; }

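Note that Windows uses extern "system" rather than extern "C". The compiler maps it to Stdcall on x86 and to the C ABI on x86-64.

Under library/std/src/sys/, the Rust standard library separates its FFI code by platform. The pal/ directory (PAL, Platform Abstraction Layer) holds the platform-specific foundations, and the upper layers are given a unified interface:

library/std/src/sys/
├── pal/           # platform abstraction layer
│   ├── unix/      # Unix (Linux, macOS, BSDs...)
│   ├── windows/   # Windows
│   └── ...
├── fs/            # file system (dispatches per platform internally)
├── net/           # networking
├── thread/        # threads
├── sync/          # synchronization primitives
└── random/        # random numbers

This architecture is worth studying: the lower layer separates FFI bindings by platform, and the upper layer hides the differences behind one interface. User code sees only a uniform Rust API and does not need to care whether epoll, kqueue, or IOCP sits underneath.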

Commonly used cfg conditions:

cfg condition	Meaning	Example values
`target_os`	Operating system	`"linux"`, `"macos"`, `"windows"`
`target_arch`	CPU architecture	`"x86_64"`, `"aarch64"`
`target_family`	Platform family	`"unix"`, `"windows"`
`target_env`	Toolchain environment	`"gnu"`, `"musl"`, `"msvc"`
`target_pointer_width`	Pointer width	`"32"`, `"64"`

13.12 Worked Example: Wrapping a C Crypto Library

Bringing together everything above, suppose we wrap a C encryption library:

// ffi layer
mod ffi {
    #[repr(C)]
    pub struct crypto_ctx { _opaque: [u8; 0], _marker: std::marker::PhantomData<*mut ()> }

    extern "C" {
        pub fn crypto_new() -> *mut crypto_ctx;
        pub fn crypto_free(ctx: *mut crypto_ctx);
        pub fn crypto_set_key(ctx: *mut crypto_ctx, key: *const u8, len: usize) -> i32;
        pub fn crypto_encrypt(ctx: *mut crypto_ctx, input: *const u8, in_len: usize,
                              out: *mut u8, out_len: *mut usize) -> i32;
        pub fn crypto_error_msg(code: i32) -> *const std::ffi::c_char;
    }
}

// Safe wrapper layer
pub struct CryptoContext { inner: std::ptr::NonNull<ffi::crypto_ctx> }

impl CryptoContext {
    pub fn new() -> Result<Self, CryptoError> {
        let ptr = unsafe { ffi::crypto_new() };
        std::ptr::NonNull::new(ptr).map(|p| Self { inner: p }).ok_or(CryptoError::InitFailed)
    }

    pub fn set_key(&mut self, key: &[u8]) -> Result<(), CryptoError> {
        let code = unsafe { ffi::crypto_set_key(self.inner.as_ptr(), key.as_ptr(), key.len()) };
        if code == 0 { Ok(()) } else { Err(CryptoError::from_code(code)) }
    }

    pub fn encrypt(&self, plaintext: &[u8]) -> Result<Vec<u8>, CryptoError> {
        let mut out = vec![0u8; plaintext.len() + 256];
        let mut out_len = out.len();
        let code = unsafe {
            ffi::crypto_encrypt(self.inner.as_ptr() as *mut _, plaintext.as_ptr(),
                                plaintext.len(), out.as_mut_ptr(), &mut out_len)
        };
        if code != 0 { return Err(CryptoError::from_code(code)); }
        out.truncate(out_len);
        Ok(out)
    }
}

impl Drop for CryptoContext {
    fn drop(&mut self) { unsafe { ffi::crypto_free(self.inner.as_ptr()); } }
}

// User code: zero unsafe
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut ctx = CryptoContext::new()?;
    ctx.set_key(b"my-32-byte-secret-key-here!!!!!!")?;
    let encrypted = ctx.encrypt(b"Hello, secure world!")?;
    println!("ciphertext: {} bytes", encrypted.len());
    Ok(())
}

13.12.1 Source check: the classify_arg algorithm of the System V x86_64 ABI

§13.2 mentioned that on x86_64 arguments go into Int or Sse registers. The real implementation is at compiler/rustc_target/src/callconv/x86_64.rs:29:

fn classify_arg<'a, Ty, C>(...) -> Result<[Option<Class>; MAX_EIGHTBYTES], Memory> {
    fn classify(cx, layout, cls, off) -> Result<(), Memory> {
        if !off.is_aligned(layout.align.abi) {
            if !layout.is_zst() {
                return Err(Memory);   // misaligned: pass in memory (on the stack)
            }
            return Ok(());
        }

        let mut c = match layout.backend_repr {
            BackendRepr::Scalar(scalar) => match scalar.primitive() {
                Primitive::Int(..) | Primitive::Pointer(_) => Class::Int,
                Primitive::Float(_) => Class::Sse,
            },
            BackendRepr::SimdVector { .. } => Class::Sse,
            BackendRepr::ScalarPair(..) | BackendRepr::Memory { .. } => {
                // Aggregate type: classify each field recursively
                for i in 0..layout.fields.count() {
                    classify(cx, layout.field(cx, i), cls, off + layout.fields.offset(i))?;
                }
                ...
                return Ok(());
            }
        };

        // Fill cls[first..=last] with the class
        let first = (off.bytes() / 8) as usize;
        let last = ((off.bytes() + layout.size.bytes() - 1) / 8) as usize;
        for cls in &mut cls[first..=last] {
            *cls = Some(cls.map_or(c, |old| old.min(c)));   // if a class is already set, keep the smaller one
            if c == Class::Sse {
                c = Class::SseUp;   // subsequent eightbytes are the upper half
            }
        }
        Ok(())
    }
    ...
}

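This algorithm implements the "eightbyte classification" of the System V AMD64 ABI: each argument is cut into 8-byte chunks, each chunk is assigned the class Int, Sse, or SseUp, and the classes decide which registers are used.

Four details worth understanding:

1. Unaligned → Memory (passed on the stack). The psABI puts any aggregate that contains an unaligned field into the MEMORY class, so a single misaligned field (inside a packed struct, for example) forces the whole argument out of registers. Zero-sized fields are exempt, which is why the code checks is_zst() first.

2. Pointer = Class::Int. In the x86_64 ABI a Rust pointer travels in an integer register (rdi/rsi/...), not in an SSE register. So a Box<T> argument (for sized T) occupies one integer register, exactly like a C pointer.

3. ScalarPair / Memory recurse into the fields. A ScalarPair is a value made of two scalars, such as a slice reference &[T] (pointer + length) or a two-field struct of two u64. (Option<&T> is not an example: thanks to the null-pointer niche it is a single pointer-sized scalar.) After the recursion each field is classified independently; if both fields are Int, the pair is passed in two integer registers (rdi + rsi when it is the first argument).

4. cls.map_or(c, |old| old.min(c)). When several sub-fields share one eightbyte slot, the smaller class wins. The ordering is Int < Sse < SseUp (it follows the enum definition). This min encodes an ABI merge rule: if any part of an eightbyte is Int, the whole eightbyte goes in an integer register. Caller and callee both apply the same rule, so they always agree on where each chunk lives.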
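To make the merge rule concrete, here is a minimal sketch of eightbyte classification for small structs made of scalar fields. It is not rustc's code: the Field and Kind types and the classify_small function are hypothetical names, every scalar's alignment is assumed to equal its size, SseUp is left out, and anything above 16 bytes is simply reported as memory.

```rust
// Simplified model of System V eightbyte classification (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Class {
    Int,
    Sse,
}

#[derive(Clone, Copy)]
enum Kind {
    Int,
    Float,
}

// Hypothetical description of one scalar field of a #[repr(C)] struct.
struct Field {
    offset: usize,
    size: usize,
    kind: Kind,
}

/// Returns one class per eightbyte, or None when the value goes in memory.
fn classify_small(fields: &[Field], total_size: usize) -> Option<Vec<Class>> {
    let n = total_size.div_ceil(8);
    if n == 0 || n > 2 {
        return None; // this sketch only models values of 1..=16 bytes
    }
    let mut cls: Vec<Option<Class>> = vec![None; n];
    for f in fields {
        // Assumption: a scalar's alignment equals its size.
        if f.size == 0 || f.offset % f.size != 0 {
            return None; // unaligned field: the whole value is passed in memory
        }
        let c = match f.kind {
            Kind::Int => Class::Int,
            Kind::Float => Class::Sse,
        };
        let first = f.offset / 8;
        let last = (f.offset + f.size - 1) / 8;
        for slot in &mut cls[first..=last] {
            // Same merge as the excerpt: keep the smaller class, so Int beats Sse.
            *slot = Some(slot.map_or(c, |old| old.min(c)));
        }
    }
    // An eightbyte that no field touches is not modeled; report it as memory.
    cls.into_iter().collect()
}

fn main() {
    // Models #[repr(C)] struct { a: u64, b: f32, c: u8 } (16 bytes).
    let mixed = [
        Field { offset: 0, size: 8, kind: Kind::Int },
        Field { offset: 8, size: 4, kind: Kind::Float },
        Field { offset: 12, size: 1, kind: Kind::Int },
    ];
    println!("{:?}", classify_small(&mixed, 16));
}
```

In the second eightbyte the f32 would be Sse on its own, but the u8 next to it is Int, so the merged class is Int and the struct travels in two integer registers.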

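MAX_EIGHTBYTES is 8 in the source examined here, so 64 bytes is the absolute ceiling for anything passed in registers, and only a single wide SIMD vector can get that far. For ordinary structs the practical limit is 16 bytes (two eightbytes), as §13.12.2 shows; anything larger is passed in memory. This is why passing a large #[repr(C)] struct by value through an extern "C" function costs a copy to the stack instead of a couple of register moves.

Once you understand this algorithm you can explain why some extern "C" signatures are cheaper or more expensive than they look: small, aligned arguments go in registers, large ones go through memory. API design should take this into account. Returning a Box<HugeStruct> hands back one 8-byte pointer in an integer register, while returning HugeStruct by value makes the callee write the whole struct into caller-provided memory through a hidden pointer. Which is faster depends on whether the Box's heap allocation costs more than the copy, so measure before choosing. (The Rust ABI is unspecified and free to make different choices; this section describes extern "C".)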

13.12.2 Source check: the MAX_EIGHTBYTES limit and the Memory fallback path

The main body of the algorithm at x86_64.rs:100-128 handles the case of more than two eightbytes:

let n = arg.layout.size.bytes().div_ceil(8) as usize;
if n > MAX_EIGHTBYTES {
    return Err(Memory);    // <- too large: pass in memory
}

let mut cls = [None; MAX_EIGHTBYTES];
classify(cx, arg.layout, &mut cls, Size::ZERO)?;
if n > 2 {
    // More than 16 bytes: only register-passable if it is one Sse followed by SseUp throughout (a single wide vector)
    if cls[0] != Some(Class::Sse) {
        return Err(Memory);
    }
    if cls[1..n].iter().any(|&c| c != Some(Class::SseUp)) {
        return Err(Memory);
    }
} else {
    // 1-16 bytes: normalize SseUp sequences
    let mut i = 0;
    while i < n {
        if cls[i] == Some(Class::SseUp) {
            cls[i] = Some(Class::Sse);   // an isolated SseUp is promoted to Sse
        } else if cls[i] == Some(Class::Sse) {
            i += 1;
            while i != n && cls[i] == Some(Class::SseUp) {
                i += 1;
            }
        } else {
            i += 1;
        }
    }
}

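Two rules:

1. An aggregate larger than 16 bytes that is not a pure SSE vector → Memory. Take a #[repr(C)] struct with fields (u64, f64, u8): it is 24 bytes, three eightbytes, and does not start with Sse, so it cannot travel in registers and is passed in memory. (By contrast, (u64, f32, u8) is exactly 16 bytes and still fits in two integer registers.) The same applies to return values: a large struct is returned through a hidden pointer into memory supplied by the caller.

2. Handling an isolated SseUp. SseUp means "the upper half of the preceding Sse". If cls contains an SseUp with no Sse in front of it, the algorithm promotes it to Sse. This covers certain unusual aggregate layouts.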
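The 16-byte boundary in rule 1 is easy to check with size_of. The sketch below assumes a typical 64-bit target where u64 and f64 are 8-byte aligned; Mixed16, Mixed24, and eightbytes are hypothetical names used only for illustration.

```rust
use std::mem::size_of;

// 8 + 4 + 1 bytes of fields, padded to 16: two eightbytes.
#[allow(dead_code)]
#[repr(C)]
struct Mixed16 {
    a: u64,
    b: f32,
    c: u8,
}

// 8 + 8 + 1 bytes of fields, padded to 24: three eightbytes.
#[allow(dead_code)]
#[repr(C)]
struct Mixed24 {
    a: u64,
    b: f64,
    c: u8,
}

/// Number of 8-byte chunks the classifier would look at for T.
fn eightbytes<T>() -> usize {
    size_of::<T>().div_ceil(8)
}

fn main() {
    println!("Mixed16: {} eightbytes", eightbytes::<Mixed16>());
    println!("Mixed24: {} eightbytes", eightbytes::<Mixed24>());
}
```

Mixed16 stays within the two-eightbyte limit and is eligible for registers; Mixed24 exceeds it and, not being a vector, falls back to Memory.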

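This algorithm is a faithful implementation of the System V AMD64 ABI specification: the psABI document describes the classification algorithm in detail, and rustc follows it closely. Rust has to use exactly the same ABI when it calls a C library; being off by a single byte produces the bizarre bugs of an ABI mismatch (shifted arguments, a corrupted stack).

Reading this code teaches a piece of engineering philosophy: a protocol-level implementation must follow the spec strictly, and "close enough" is not acceptable. A documented protocol like the x86_64 SysV ABI, written in the early 2000s, gave its authors the chance to pin everything down once, and rustc builds directly on that work.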

13.13 cbindgen: generating C headers from Rust code

bindgen turns C headers into Rust bindings; cbindgen does the opposite and generates C/C++ headers from Rust code.

cargo install cbindgen
cbindgen --lang c --output include/my_lib.h

Given the following Rust code:

#[repr(C)]
pub struct Point { pub x: f64, pub y: f64 }

#[no_mangle]
pub extern "C" fn point_distance(a: *const Point, b: *const Point) -> f64 {
    if a.is_null() || b.is_null() { return -1.0; }
    let (a, b) = unsafe { (&*a, &*b) };
    ((a.x - b.x).powi(2) + (a.y - b.y).powi(2)).sqrt()
}

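cbindgen generates a header along these lines (trimmed; the include guard is one of the options cbindgen.toml controls):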

#ifndef MY_LIB_H
#define MY_LIB_H
#include <stdint.h>

typedef struct Point { double x; double y; } Point;

double point_distance(const Point *a, const Point *b);

#endif

Once this is integrated into build.rs, the header is regenerated on every cargo build:

// build.rs
fn main() {
    cbindgen::Builder::new()
        .with_crate(std::env::var("CARGO_MANIFEST_DIR").unwrap())
        .with_language(cbindgen::Language::C)
        .generate().expect("cbindgen failed")
        .write_to_file("include/my_lib.h");
}

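The configuration file cbindgen.toml controls details such as naming style, include guards, and the autogenerated-file warning, keeping the Rust and C interfaces in sync.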

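13.12.4.5.3 Five common mental traps when writing Rust FFI

These are the mental corrections you should take away from this chapter, the cognitive traps FFI learners fall into most often:

1. "It seems to work, so it is correct." FFI code can run 100 times on your machine without a problem and then crash after four hours in production. The correctness of unsafe code is a matter of logical argument, not of testing. Every unsafe block should carry a detailed SAFETY comment explaining why its requirements hold.

2. "Add unsafe, it compiles, it is usable." unsafe is a contract, not a free pass. Every unsafe function has preconditions that its caller must uphold; state them explicitly in the documentation or users will trip over them.

3. "Performance work means reaching for transmute." transmute is among the most dangerous operations in Rust. Most "speed it up with transmute" cases have a safer alternative (pointer cast, from_raw, to_bits/from_bits, from_ne_bytes). Treat transmute as a last resort.

4. "The C library says it supports multithreading, so I can call it from any thread." Thread safety in many C libraries is conditional (for example, each thread may need its own init). Read the C library's documentation closely before implementing Send/Sync for a wrapper; getting it wrong means a data race.

5. "An FFI bug means SIGSEGV." Real unsafe bugs show up in many forms: silent data corruption, seemingly random logic errors, a panic hours later, memory leaks, performance degradation. When memory looks corrupted, suspect the unsafe code first, because safe Rust on its own cannot corrupt memory.

All five traps are distilled from hard-won experience; keeping them in mind will spare you a good share of FFI bugs.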
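Traps 1 and 2 come down to writing the contract next to the code. A minimal sketch of that habit, using a hypothetical sum_raw function that stands in for any pointer-taking FFI helper:

```rust
use std::slice;

/// Sums `len` values starting at `ptr`.
///
/// # Safety
/// `ptr` must be non-null, aligned for `i32`, and valid for reads of `len`
/// consecutive `i32` values for the duration of the call.
unsafe fn sum_raw(ptr: *const i32, len: usize) -> i64 {
    // SAFETY: the caller upholds the contract stated in the doc comment.
    let values = unsafe { slice::from_raw_parts(ptr, len) };
    values.iter().map(|&v| i64::from(v)).sum()
}

/// Safe wrapper: a slice already guarantees everything `sum_raw` requires.
fn sum(values: &[i32]) -> i64 {
    // SAFETY: `values.as_ptr()` is non-null, aligned, and valid for
    // `values.len()` reads because it comes from a live `&[i32]`.
    unsafe { sum_raw(values.as_ptr(), values.len()) }
}

fn main() {
    println!("{}", sum(&[1, 2, 3]));
}
```

The "# Safety" section tells callers what they owe; each SAFETY comment records why that debt is paid at this particular call site.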

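13.12.4.5.4 Async FFI: the particular difficulties of tokio + io_uring/IOCP

Once you are comfortable with the synchronous FFI in this chapter, async FFI brings challenges of its own:

1. An async runtime must not block on an FFI call. If the C function blocks (say, fopen on a file on a slow disk), the whole worker thread stalls and other tasks starve. Two remedies:

Wrap the synchronous C call in tokio::task::spawn_blocking so it runs on the dedicated blocking thread pool
Use a natively asynchronous low-level API (io_uring, IOCP) that does not block in the first place

2. Wrapping a callback-based async C API in Rust is hard. Take libuv's model of "register a callback, it is invoked when the I/O completes": the callback has to be turned into a Rust Future. A common pattern is a oneshot channel: send inside the callback, await the receiver in the Future.

3. Pin safety. An FFI function may store a pointer to self for later callbacks (for example, pointing a C library's on_complete hook at a Rust struct), which requires that the Rust struct never moves. Use Pin<Box<MyStruct>> and pin it at construction time.

4. Cancellation is hard to propagate. When a Rust task is dropped you would like to cancel the FFI call in flight, but many C APIs have no cancel interface. Best practice: prefer APIs that support cancellation (io_uring's IORING_OP_ASYNC_CANCEL); otherwise detach (the task is dropped, the FFI call keeps running, and its result is discarded).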
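The channel pattern from point 2 can be sketched with the standard library alone. Everything here is a stand-in: fake_c_start_read plays the role of a callback-based C function (it fires the callback from another thread), and std::sync::mpsc replaces the oneshot channel an async runtime would provide. In real async code the receiving side would be awaited instead of blocking in recv.

```rust
use std::ffi::c_void;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

// Shape of a typical C completion callback: a result plus an opaque context.
type Callback = extern "C" fn(result: i32, ctx: *mut c_void);

// Hypothetical stand-in for a C library call that completes later on
// another thread and then invokes the callback exactly once.
fn fake_c_start_read(cb: Callback, ctx: *mut c_void) {
    let ctx_addr = ctx as usize; // raw pointers are not Send; carry the address
    thread::spawn(move || cb(42, ctx_addr as *mut c_void));
}

// Trampoline: recovers the Sender from the context pointer and sends.
extern "C" fn on_complete(result: i32, ctx: *mut c_void) {
    // SAFETY: ctx was produced by Box::into_raw in start_read and the
    // callback runs exactly once, so ownership is reclaimed exactly once.
    let tx = unsafe { Box::from_raw(ctx as *mut Sender<i32>) };
    let _ = tx.send(result); // the receiver may already be gone; ignore that
}

fn start_read() -> Receiver<i32> {
    let (tx, rx) = channel();
    // Box the sender so it has a stable address to hand to "C".
    let ctx = Box::into_raw(Box::new(tx)) as *mut c_void;
    fake_c_start_read(on_complete, ctx);
    rx
}

fn main() {
    let rx = start_read();
    println!("completed with {}", rx.recv().unwrap());
}
```

If the receiver is dropped before completion, the callback still runs and frees the boxed sender, which is the "detach" behavior described in point 4.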

These patterns show up in projects such as tokio-uring, quinn, and deno_core. They are advanced async Rust + FFI skills, and engineers fluent in them are scarce and valued on any team.

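13.12.4.5.5 Large Rust FFI projects in the real ecosystem

If you want to read production-grade FFI code after this chapter, these five open-source projects are all worth the time:

1. tokio-uring: a Rust wrapper around io_uring. It shows syscall-level FFI end to end: shared memory for SQEs/CQEs, the ring buffers between kernel and user space, and the submit_and_wait syscall. Excellent material for learning Rust ↔ Linux kernel interop.

2. wgpu: a Rust implementation of WebGPU that wraps the Vulkan, Metal, and DX12 native graphics APIs internally. Each backend is thousands of lines of FFI and shows how to wrap a stateful C API (GPU command buffers) in a type-safe Rust API.

3. rustls: TLS implemented in Rust, with the low-level cryptography (hardware-accelerated AES and ChaCha20, for instance) delegated to a provider crate such as ring or aws-lc-rs, which in turn wrap C and assembly. It shows how Rust coexists with heavily optimized native crypto code.

4. polars: a Rust data-analysis framework exposed to Python through PyO3. It shows how to map a complex Rust API (DataFrame) onto a Python-friendly API.

5. deno_core: the core of the Deno runtime, built on Rust bindings to V8. A complete V8 ↔ Rust FFI layer with far more FFI surface than most Rust projects, and an industrial-grade example of how to organize unsafe code.

The five projects range from a few thousand to several hundred thousand lines; pick one or two that match your interests and go deep. Reading excellent FFI code builds intuition faster than writing your own, because many best practices only become visible in large projects.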

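13.12.4.6 How this chapter connects to the rest of the book

FFI looks self-contained, but this chapter meshes closely with the others:

Chapter 2 (HIR/MIR): the extern "C" of this chapter is recorded as FnSig::abi at the HIR stage and translated at the MIR stage into a Call terminator with the right calling convention. Without the IR design of Chapter 2, the ABI map of this chapter has nothing to apply to.

Chapter 6 (monomorphization): a generic FFI function (such as extern "C" fn foo<T>) is instantiated during monomorphization into extern "C" functions for concrete types. Monomorphization and ABI handling meet at this point and influence each other.

Chapter 8 (unsafe Rust): FFI is almost entirely unsafe. Calling a function declared in an extern block and calling CStr::from_ptr both need an unsafe context, and in the 2024 edition the attribute itself is spelled #[unsafe(no_mangle)] and extern blocks are written unsafe extern. Chapter 8 is the language foundation of this chapter.

Chapter 11 (linking): FFI symbols are ultimately resolved by the linker. #[no_mangle] affects the symbol name and #[link(...)] affects linker behavior. Chapter 11 explains how the linker works; this chapter shows how FFI hooks into it.

Chapter 14 (const fn): CStr::from_ptr, covered in §13.3.1, is a const fn, so its length scan must also be evaluable at compile time, where the const evaluator cannot call the C library's strlen. How far const fn reaches directly decides which FFI helpers can be made const.

Chapter 17 (standard library implementation): std uses FFI heavily to call OS APIs (read, write, socket, thread). The ABI knowledge from this chapter is a prerequisite for reading the unix/windows modules of std.

With this chapter plus these six interfaces you can join rustc + std + ABI + linker into one mental model: the full path of a Rust program from source code to OS syscall. That is the last mile from "can write Rust" to "understands Rust".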

13.12.4.7 Hands-on: your own cdylib called from Python in about 50 lines

If you want to try things right away, here is a runnable demo of about 50 lines:

Cargo.toml:

[package]
name = "mylib"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

src/lib.rs:

use std::ffi::{CStr, CString, c_char};

#[no_mangle]
pub extern "C" fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[no_mangle]
pub extern "C" fn greet(name: *const c_char) -> *mut c_char {
    let name = unsafe {
        if name.is_null() { return std::ptr::null_mut(); }
        match CStr::from_ptr(name).to_str() {
            Ok(s) => s,
            Err(_) => return std::ptr::null_mut(),
        }
    };
    let msg = format!("Hello, {}!", name);
    CString::new(msg).map(|c| c.into_raw()).unwrap_or(std::ptr::null_mut())
}

#[no_mangle]
pub extern "C" fn free_string(s: *mut c_char) {
    if s.is_null() { return; }
    unsafe { let _ = CString::from_raw(s); }
}

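Calling from Python:

import ctypes
# File name is platform-specific: libmylib.dylib (macOS), libmylib.so (Linux), mylib.dll (Windows)
lib = ctypes.CDLL("./target/release/libmylib.dylib")
lib.add.argtypes = [ctypes.c_int, ctypes.c_int]
lib.add.restype = ctypes.c_int
print(lib.add(2, 3))   # → 5

lib.greet.argtypes = [ctypes.c_char_p]
# Use c_void_p, not c_char_p: with c_char_p ctypes converts the result to bytes
# and discards the raw pointer, so it could never be handed back to free_string
lib.greet.restype = ctypes.c_void_p
lib.free_string.argtypes = [ctypes.c_void_p]
lib.free_string.restype = None
ptr = lib.greet(b"World")
print(ctypes.cast(ptr, ctypes.c_char_p).value)   # → b"Hello, World!"
lib.free_string(ptr)   # must be freed explicitly!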

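These 50 lines cover all the core patterns of this chapter: #[no_mangle] + extern "C" + CStr/CString + ownership transfer + a release function. Getting this demo to run yourself helps more than reading the chapter five times; many hidden details (such as ctypes converting a c_char_p return value to bytes and dropping the pointer you still need to free) only sink in when you do it by hand.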
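Before involving Python, the same ownership protocol can be exercised from Rust. The sketch below repeats the greet/free_string logic without #[no_mangle] (so it runs as an ordinary binary) and adds a hypothetical helper, greet_roundtrip, that plays the part of the foreign caller.

```rust
use std::ffi::{c_char, CStr, CString};

// Same logic as greet in the demo: returns an owned C string or null.
extern "C" fn greet(name: *const c_char) -> *mut c_char {
    if name.is_null() {
        return std::ptr::null_mut();
    }
    // SAFETY: name is non-null and the caller promises NUL termination.
    let name = match unsafe { CStr::from_ptr(name) }.to_str() {
        Ok(s) => s,
        Err(_) => return std::ptr::null_mut(),
    };
    CString::new(format!("Hello, {name}!"))
        .map_or(std::ptr::null_mut(), CString::into_raw)
}

// Reclaims a string previously returned by greet.
extern "C" fn free_string(s: *mut c_char) {
    if s.is_null() {
        return;
    }
    // SAFETY: s came from CString::into_raw in greet and is freed only once.
    drop(unsafe { CString::from_raw(s) });
}

// Plays the foreign caller: borrow the result, copy it, then give it back.
fn greet_roundtrip(name: &str) -> Option<String> {
    let c_name = CString::new(name).ok()?;
    let raw = greet(c_name.as_ptr());
    if raw.is_null() {
        return None;
    }
    // SAFETY: raw is a valid NUL-terminated string until free_string runs.
    let copy = unsafe { CStr::from_ptr(raw) }.to_string_lossy().into_owned();
    free_string(raw);
    Some(copy)
}

fn main() {
    println!("{:?}", greet_roundtrip("World"));
}
```

The caller never frees the pointer itself; it copies what it needs and returns the allocation to the side that created it, which is the rule the Python code follows with lib.free_string.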

Bonus exercise: have Python pass a callback to Rust, practicing the trampoline pattern from §13.7.

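13.12.4 A few counterintuitive but true facts about FFI

This chapter should leave you with several counterintuitive facts that ordinary FFI tutorials tend to skip:

1. #[repr(C)] does not mean "identical to whatever the C compiler does". It guarantees C's field order, alignment, and padding rules for plain structs, but C compilers also offer bitfields, packing pragmas, and compiler-specific attributes that Rust either lacks (bitfields) or expresses through separate attributes (repr(packed), repr(align)). For non-trivial headers, generate the struct definitions with bindgen instead of transcribing them by hand.

2. extern "C" means a different ABI on each platform. x86_64 Linux uses the SysV ABI, x86_64 Windows uses the MS ABI, ARM uses AAPCS, so the same Rust code compiles to different calling conventions on different platforms. This is the job of canonize_abi from §13.2: mapping the source-level "C" to the target-specific canonical ABI.

3. size_of::<&str>() == size_of::<*const u8>() * 2. A Rust &str is a fat pointer (ptr, len), 16 bytes on a 64-bit target, not a plain pointer. Across FFI it must be destructured into two arguments; extern "C" fn foo(s: &str) is not FFI-safe, and the compiler warns about it.

4. bool is 1 byte at the ABI level but may only hold 0 or 1; any other value is UB. If you transmute::<u8, bool>(2), the compiler still assumes a bool is always 0 or 1 and may generate code that behaves unpredictably. When C hands you a "boolean" stored in an integer type, receive it as that integer type and convert with value != 0, never with transmute.

5. The type extern "C" fn is not the type unsafe extern "C" fn. They are distinct types: a safe function pointer coerces to the unsafe one, but not the other way around. Make sure the callback signature you define matches exactly what the C library expects.

6. A #[no_mangle] function can still be inlined under LTO. #[no_mangle] only guarantees that the symbol name stays unchanged; it does not block other optimizations. To prevent inlining, add #[inline(never)], which is a separate attribute.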
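Facts 1, 3, 4, and 5 can be checked directly. The sketch below assumes a target where u32 is 4-byte aligned; Padded, Packed, bool_from_c, safe_cb, and as_unsafe are hypothetical names for illustration.

```rust
use std::mem::{align_of, size_of};

// Fact 1: repr(C) follows C padding rules: 1 + 3 pad + 4 + 1 + 3 pad = 12.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    a: u8,
    b: u32,
    c: u8,
}

// A C struct under `#pragma pack(1)` needs an extra attribute on the Rust side.
#[allow(dead_code)]
#[repr(C, packed)]
struct Packed {
    a: u8,
    b: u32,
    c: u8,
}

// Fact 4: convert an integer-typed C boolean by comparison, never transmute.
fn bool_from_c(raw: u8) -> bool {
    raw != 0
}

extern "C" fn safe_cb(x: i32) -> i32 {
    x + 1
}

// Fact 5: a safe fn pointer coerces to the unsafe pointer type.
fn as_unsafe(f: extern "C" fn(i32) -> i32) -> unsafe extern "C" fn(i32) -> i32 {
    f
}

fn main() {
    println!("Padded: size {} align {}", size_of::<Padded>(), align_of::<Padded>());
    println!("Packed: size {}", size_of::<Packed>());
    // Fact 3: &str is two machine words wide.
    println!("&str: {} bytes", size_of::<&str>());
    println!("bool_from_c(2) = {}", bool_from_c(2));
    let f = as_unsafe(safe_cb);
    // SAFETY: f is safe_cb, which has no preconditions.
    println!("f(1) = {}", unsafe { f(1) });
}
```

Going the other way, from unsafe extern "C" fn to extern "C" fn, does not compile, which is the asymmetry fact 5 describes.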

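You are likely to run into several of these six facts as soon as you write real FFI code; understanding them in advance spares you some of the most painful debugging nights.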

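13.12.4.5 Four milestones in the evolution of ABIs

After the ABI material in this chapter you should see that an ABI is not a static specification; it has a clear history:

Around 1990, the C ABI takes shape: the System V ABI for i386 fixed how C functions are called on Unix-like systems (arguments on the stack, integer return value in eax). Rules of this kind became the common ground for all later language interop.

2000s, x86-64 arrives with two ABIs: the System V AMD64 ABI passes the first six integer arguments in rdi/rsi/rdx/rcx/r8/r9. Microsoft defined its own Windows x64 ABI: the first four arguments in rcx/rdx/r8/r9 (not six), plus 32 bytes of shadow space reserved by the caller (SysV has none). The coexistence of the two is a root cause of the pain in cross-platform low-level code.

2010s, Rust chooses not to fix its ABI: the "Rust ABI" (as opposed to extern "C") has not been stabilized to this day, so rustc is free to optimize layout and register usage. This is one source of Rust's performance (the compiler has full freedom) and also the reason dynamically linking Rust libraries is hard.

2020s, the WebAssembly Component Model appears: it moves the "language-neutral ABI" up to the wasm level, so that components compiled from different languages share one canonical ABI. Rust, Swift, and Go can in principle call one another through it, although values are converted at the boundary, so it is not free. This may be the most important ABI development since Rust, and it could make dynamic composition practical again.

These four milestones show that ABIs did not fall from the sky; they are the accumulated engineering compromises of several generations. The rustc team treats ABIs conservatively and precisely: Rust must stay compatible with the SysV and Windows ABIs so it can call C libraries, while keeping optimization freedom for itself. That is the underlying motivation for the three-state AbiMapping design described in §13.2.1.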

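13.12.5 Root causes of some real production FFI bugs

With the source-level details of this chapter in hand, each of the following five typical FFI bugs can be pinned down precisely:

Bug 1: a C library returns a corrupt struct (missing repr(C))

Symptom: the fields of a struct returned by a C function hold wrong values, as if memory had been overwritten
Root cause: the matching struct on the Rust side lacks #[repr(C)], so the compiler reordered its fields
Where in this chapter: inhibit_struct_field_reordering in §13.4.1; adding #[repr(C)] fixes it immediately

Bug 2: the C side reads garbage from a string passed to it

Symptom: you pass "hello" and the C side prints trailing garbage or reads past the end of the buffer
Root cause: the pointer came straight from &str.as_ptr(), and a Rust string has no NUL terminator
Where in this chapter: §13.3.1; use CString or a c"hello" literal

Bug 3: segfault when a C library invokes a closure callback

Symptom: a Rust closure is registered as a callback with a C library, and the callback crashes with SIGSEGV when it fires
Root cause: a closure is not a plain fn pointer; it needs the trampoline pattern plus a void* context
Where in this chapter: §13.7; use an unsafe extern "C" fn and box the closure behind the void*

Bug 4: a Rust string is corrupt after crossing the FFI boundary

Symptom: the string printed by println! on the Rust side differs from what printf shows on the C side
Root cause: String is a three-word (ptr, len, cap) struct that cannot cross FFI as is; convert it with CString::new (then into_raw if ownership moves) or pass an explicit (ptr, len) pair
Where in this chapter: §13.3; the memory layout of String is not ABI-stable

Bug 5: memory allocated by C malloc is released with Box::from_raw

Symptom: the Rust program crashes at random after running for tens of minutes, with free(): invalid pointer or heap corruption
Root cause: allocator mismatch. Box::from_raw is only valid for pointers that came from Box::into_raw with the same allocator; Rust's global allocator defaults to the system one but can be replaced, and memory from C's malloc must go back to C's free in any case
Where in this chapter: §13.3.3 + §13.9; transfer ownership completely and never mix allocators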
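Bugs 2 and 4 share one fix: go through CString before handing a pointer to C. A minimal sketch, where c_style_len is a hypothetical stand-in for any C function that scans for the NUL terminator:

```rust
use std::ffi::{c_char, CStr, CString};

/// Stand-in for a C function that reads a NUL-terminated string.
///
/// # Safety
/// `p` must point to a valid NUL-terminated string.
unsafe fn c_style_len(p: *const c_char) -> usize {
    // SAFETY: guaranteed by the caller, see the contract above.
    unsafe { CStr::from_ptr(p) }.to_bytes().len()
}

/// Converts to a C string first; fails if `s` has an interior NUL byte.
fn pass_to_c(s: &str) -> Option<usize> {
    let owned = CString::new(s).ok()?;
    // SAFETY: `owned` is alive for the whole call and is NUL-terminated.
    Some(unsafe { c_style_len(owned.as_ptr()) })
}

fn main() {
    println!("{:?}", pass_to_c("hello"));
    // CString::new appends the terminator that &str does not have.
    println!("{:?}", CString::new("hi").unwrap().as_bytes_with_nul());
}
```

Keeping the CString in a named variable for the duration of the call matters: a temporary would be dropped before C reads the pointer.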

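What these five bugs have in common is that they are hard to locate without having read the ABI source. The symptoms (segfaults, corrupt data) are vague, yet every root cause sits in the source covered in this chapter. Once you understand it, you can go straight from such a symptom to the relevant piece of code.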

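13.12.6 Next steps for FFI learners

If you want to go deeper after this chapter, here are five paths worth taking:

1. Read the SQLite Rust binding (rusqlite): rusqlite is a safe Rust wrapper over the sqlite3 C API that turns a large set of C functions into a safe Rust API. Reading it teaches some distinctive tricks, such as lifetime management for prepared statements and zero-copy blob reads.

2. Read mio, the I/O layer under Tokio: mio is a cross-platform abstraction over epoll/kqueue/IOCP. It is FFI-heavy underneath, yet wrapped so well that users never notice. It shows how to unify the I/O multiplexing APIs of different operating systems behind one interface.

3. Write a Rust binding for a C library yourself: pick a C library you know (libcurl or libpng, say), generate the raw bindings with bindgen, then hand-write a safe wrapper layer on top. Practice is the real test of your understanding of FFI.

4. Read wasmtime's WASI implementation: wasmtime is a wasm runtime and has to map the ABI-level calls of wasm programs onto host OS calls. It works as an extended ABI tutorial that spans from the wasm type system down to native calling conventions.

5. Read RFC 0079 (undefined struct layout): a key RFC in the history of Rust's layout and ABI story, the one that left default struct layout unspecified and proposed #[repr(C)] as the opt-in for C-compatible layout. For how ABI strings grew from the original "C" and "Rust" into today's long list, follow the ABI-related RFCs and the Reference; that history explains the naming of many enum variants in the rustc source.

The five paths need different amounts of time; choose according to how deep your interest goes. FFI is a watershed among Rust engineers' advanced skills: people who can build safe APIs on top of unsafe code are few on any team, and valued.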

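13.12.3 Concrete numbers for the cost of FFI

After the ABI material, here is a set of rough performance figures to give a feel for what an FFI call really costs. They are order-of-magnitude estimates that vary with CPU, compiler version, and allocator; treat them as intuition, not as benchmarks.

Plain Rust call vs extern "C" call:

Ordinary Rust call: ~0.5-1ns (0 when inlined)
extern "C" call across FFI: ~2-5ns (no inlining)
Indirect call through a function pointer: ~5-10ns (slower on a cache miss)

String conversion cost:

CStr::from_ptr(short_str) (10 bytes): ~10ns (strlen + slice construction)
CStr::from_ptr(long_str) (10KB): ~1us (strlen is O(N))
CString::new(rust_str): ~50ns + one heap allocation

Memory allocation across FFI:

Rust allocates + Rust frees: ~30ns (jemalloc/system malloc)
Rust allocates + reclaimed with from_raw: ~30ns (same allocator)
Rust allocation wrongly released with C free: UB, may crash immediately or hours later

Compared with Go cgo:

Go cgo calling C: ~200-500ns (saving/restoring Go runtime state)
Rust extern "C": ~2-5ns (no runtime switch)

Why the gap in these estimates is about two orders of magnitude: Rust has no GC and no goroutine scheduler, so extern "C" is an ordinary call instruction. Go's cgo has to save the goroutine's state, switch to a C stack, make the call, and switch back, which adds considerable overhead. Newer Go releases have reduced this cost, so the real gap may be smaller than the figures above suggest.

These figures let you estimate whether a hot loop that calls an FFI function a million times is acceptable. At 5ns per call that is 1,000,000 × 5ns = 5ms in Rust, which is usually fine; at 500ns per call it is 500ms in Go, which usually is not. In FFI-heavy workloads Rust's advantage is structural, not something optimization tricks can make up for.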
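To get numbers for your own machine instead of relying on the estimates above, a rough timing loop is enough. This is a sketch, not a rigorous benchmark: add_c is a hypothetical extern "C" function defined in Rust (so no real language boundary is crossed), and black_box is used only to keep the compiler from inlining or folding the calls away.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Hypothetical callee with the C calling convention.
extern "C" fn add_c(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}

/// Calls `f` through a function pointer `iters` times.
/// Returns the accumulated result and the elapsed wall-clock time.
fn time_calls(f: extern "C" fn(u64, u64) -> u64, iters: u64) -> (u64, Duration) {
    let start = Instant::now();
    let mut acc = 0u64;
    for i in 0..iters {
        // black_box hides the pointer and operands from the optimizer.
        acc = black_box(f)(black_box(acc), black_box(i));
    }
    (acc, start.elapsed())
}

fn main() {
    let iters = 1_000_000;
    let (acc, elapsed) = time_calls(add_c, iters);
    let per_call = elapsed.as_nanos() as f64 / iters as f64;
    println!("acc = {acc}, total = {elapsed:?}, ~{per_call:.2} ns per call");
}
```

Build with --release before reading anything into the result; for publishable numbers use a proper benchmarking harness.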

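13.12.4 FFI across different ecosystems

Looking at how other languages solve the same problem makes Rust's design trade-offs visible:

Python ctypes: purely at runtime. It calls C functions through dlopen plus function signatures declared at runtime. Slow but flexible: any C library can be called dynamically without recompiling. The per-call overhead is on the order of microseconds, far above a direct Rust call.

Go cgo: C calls are detected at build time and wrappers are generated. The wrapper has to save goroutine state and switch to a C stack, which is expensive. The Go team's standing advice is to use cgo sparingly: if pure Go can solve the problem, do not reach for C.

Java JNI: headers are generated with javac -h (javah in older JDKs), and C functions follow a naming convention (Java_ClassName_methodName). Like Rust it needs build-time glue, and its runtime overhead is in the same rough range as Go cgo.

Node.js N-API: a stable ABI that frees native modules from depending on V8 internals. Rust's napi-rs makes Rust + Node interop straightforward by compiling Rust into a .node file that Node loads.

WebAssembly: the Component Model provides a "language-neutral ABI", so that Rust, C++, and Swift code compiled to wasm can call one another. WebAssembly is a candidate for "the C ABI of the future", and rustc already ships wasm32 as officially supported targets.

Comparing the five ecosystems reveals a pattern: Rust comes close to the best available combination of FFI performance, safety, and ergonomics, which is one of the reasons it is attractive for cloud-native settings (Cloudflare Workers, Fastly Compute@Edge, Lambda, and similar).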

13.13.1 Source location index for this chapter

To make lookups easier (based on rust nightly, master branch, 2024-2025):

Topic	Source file	Line numbers / location
AbiMapping enum	compiler/rustc_target/src/spec/abi_map.rs	17
AbiMap::from_target	same file	49+
canonize_abi	same file	core mapping logic
FnAbi + PassMode enum	compiler/rustc_target/src/callconv/mod.rs	main body
classify_arg x86_64	compiler/rustc_target/src/callconv/x86_64.rs	29-128
CStr::from_ptr	library/core/src/ffi/c_str.rs	253-265
CStr::from_bytes_until_nul	same file	298-311
CString::into_raw	library/alloc/src/ffi/c_str.rs	455-457
CString::into_string	same file	478-483
LayoutCalculator::univariant_biased	compiler/rustc_abi/src/layout.rs	1160+
inhibit_struct_field_reordering	compiler/rustc_abi/src/lib.rs	ReprOptions impl
-Z randomize-layout	compiler/rustc_abi/src/layout.rs	1186-1202
alignment_group_key	same file	1219-1260

Line numbers drift as the compiler evolves; search by name if they no longer match.

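13.13.2 Questions you should be able to answer after this chapter

Use these as a self-test:

What is AbiMapping::Deprecated, and how does it differ from Invalid? (§13.2.1: it still compiles but will become Invalid in the future, an explicit buffer zone)
What is the intent of the "force listing" comment in the rustc source? (§13.2.1: an exhaustive match forces contributors to think about the ABI impact of a new arch)
Why does CStr::from_ptr need #[inline]? (§13.3.1: so LLVM can see the strlen call and apply SIMD optimization)
What is the essential difference between CStr::from_ptr and from_bytes_until_nul? (§13.3.1-2: the former relies on an external lifetime and is unsafe; the latter constructs safely from a byte slice)
After CString::into_raw, which function reclaims the memory? (§13.3.3: only from_raw, never C's free)
Is the data lost when CString::into_string fails? (§13.3.4: no, IntoStringError carries the original object)
How is the field-order guarantee of repr(C) implemented? (§13.4.1: inhibit_struct_field_reordering sets optimize_field_order=false)
What is -Z randomize-layout for? (§13.4.2: it deliberately shuffles field order to help unsafe users uncover implicit assumptions)
Do Option<bool> and Option<u8> have the same size? (§13.4.3: bool has a niche, so Option<bool> is 1 byte; u8 has none, so Option<u8> is 2 bytes)
On x86_64, does a Box<HugeStruct> travel in a register or on the stack? (§13.12.1: Box is an 8-byte pointer and goes in an Int register)
How large must an aggregate be before it is forced into memory? (§13.12.2: aggregates over 16 bytes that are not pure SSE → Memory)
In System V eightbyte classification, what happens when Int and Sse mix? (§13.12.1: min picks Int)

If you can answer eight or more, your grasp of how Rust FFI is implemented at the compiler level already exceeds that of most application-level Rust developers. That depth is what lets you debug ABI-level bugs such as "arguments shifted when Rust calls a C library" or "segfault in a callback Rust handed to C".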

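13.13.3 Eight engineering lessons for authors of FFI code

1. Always write the safe wrapper first, then expose it: when a user hits a bug through your unsafe FFI, that is your responsibility, not the user's. Every unsafe FFI function should have a safe wrapper.

2. Put #[must_use] on every function where "forgot to use the result" means a memory leak: CString::into_raw is the template.

3. Do not lose data on error: when CString::into_string fails it carries the original object back, and your API should do the same.

4. Put the memory-ownership contract in the first paragraph of the docs: the "only call with a pointer from into_raw" rule of from_raw is mandatory reading. Vague ownership documentation guarantees that users get hurt.

5. Keep the ABI boundary to repr(C) + simple types: do not pass Vec or String across FFI, since their layout is unspecified; destructure them into a (ptr, len) pair first. Box<T> for a sized T is pointer-compatible, but who frees it must still be spelled out.

6. Use the fn pointer + void* pattern for callbacks: the trampoline pattern from §13.7 is the textbook approach.

7. Treat bindgen output as a starting point, not as the final API: bindgen is conservative by default and exposes everything as raw unsafe declarations. Read through the generated code and trim what you do not need.

8. Pair cbindgen with cdylib to let C consume a Rust library: introduced in §13.13, and far easier to maintain than a hand-written .h file.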
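Lesson 3 is easy to see in action. In the sketch below, to_string_or_bytes is a hypothetical helper; into_string and IntoStringError::into_cstring are the standard library calls that make the recovery possible.

```rust
use std::ffi::CString;

/// Tries to turn a CString into a String.
/// On invalid UTF-8 the original bytes are handed back instead of lost.
fn to_string_or_bytes(c: CString) -> Result<String, Vec<u8>> {
    c.into_string()
        // The error owns the original CString; recover it and its bytes.
        .map_err(|err| err.into_cstring().into_bytes())
}

fn main() {
    let ok = CString::new("ok").unwrap();
    println!("{:?}", to_string_or_bytes(ok));

    // 0xff 0xfe is not valid UTF-8, but nothing is thrown away.
    let bad = CString::new(vec![0xff, 0xfe]).unwrap();
    println!("{:?}", to_string_or_bytes(bad));
}
```

An API designed this way lets the caller decide what to do with the rejected input (log it, retry with a lossy conversion, pass it on) instead of forcing a re-read from the source.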

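These eight lessons, together with the twelve questions above, let you work safely in FFI, the most dangerous area of Rust. Most FFI bugs come from breaking one of these rules; master them and you have the core skill of cross-language development in Rust.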

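13.14 FFI safety summary

The FFI boundary is the fault line in Rust's safety guarantees. The following invariants have to be upheld by hand:

Invariant	Common mistake
Type sizes and alignment match	Using i32 for C's long (8 bytes on 64-bit Linux)
Memory ownership is clear	Releasing C malloc memory with Rust's drop
Strings are converted correctly	Casting &str directly to *const c_char (missing \0)
Null pointers are checked	Not checking whether a C return value is NULL
Panics do not cross FFI	Not catching a panic inside an extern "C" fn
Pointers are valid	Using a pointer that C has already freed
bool/enum values are valid	C passing an invalid enum discriminant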
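The "panics do not cross FFI" row can be handled with catch_unwind, turning a panic into an error code. In this sketch checked_div and its return codes (0, -1, -2) are hypothetical; the point is only the shape of the boundary. Note that since Rust 1.81 an uncaught panic in an extern "C" fn aborts the process, so catching it is what keeps the error recoverable.

```rust
use std::panic::catch_unwind;

/// Divides a by b and writes the result to *out.
/// Returns 0 on success, -1 if out is null, -2 if the division panicked.
extern "C" fn checked_div(a: i32, b: i32, out: *mut i32) -> i32 {
    if out.is_null() {
        return -1;
    }
    // a / b panics on division by zero; keep that panic on the Rust side.
    match catch_unwind(move || a / b) {
        Ok(value) => {
            // SAFETY: out is non-null and the caller promises it is writable.
            unsafe { *out = value };
            0
        }
        Err(_) => -2,
    }
}

fn main() {
    let mut out = 0;
    println!("code {} out {}", checked_div(6, 3, &mut out), out);
    println!("code {}", checked_div(1, 0, &mut out));
}
```

The default panic hook still prints the panic message to stderr; only the unwinding is stopped at the boundary.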

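Having covered the full machinery of FFI, we can see where the compiler gives ground at the language boundary: it retreats from complete verification to "trust the programmer". An unsafe block is the compiler telling the programmer: "I cannot check this any more; you guarantee the correctness." The point of a safe wrapper layer is to compress that trust into the smallest possible amount of code.

In the next chapter we return to territory where the compiler shines: macro expansion. At which stage of the compilation pipeline are declarative and procedural macros processed? What exactly is a token tree? How does the compiler ensure that macro-generated code does not accidentally clash with names in the user's code?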
