AFL Source Code Reading: afl-gcc & afl-as

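Basic Concepts

Basic block, edge, code coverage

An edge represents a jump between two basic blocks. Once the execution count of every basic block and every jump is known, the execution count of every statement and branch in the program follows, which gives finer-grained coverage information than recording basic blocks (BBs) alone.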
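As a minimal sketch of how an edge can be told apart from the blocks it connects, the scheme described in AFL's technical notes XORs the current block ID with the previous one shifted right by one bit. The names trace_map, record_edge, and edge_index below are illustrative assumptions, not AFL identifiers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAP_SIZE (1u << 16)           /* same size as AFL's default bitmap */

static uint8_t  trace_map[MAP_SIZE];  /* stands in for the shared memory region */
static uint32_t prev_loc;             /* previous block ID, shifted right by 1 */

/* Record the edge (previous block -> cur_loc) and return the map index used. */
static uint32_t record_edge(uint32_t cur_loc) {
  uint32_t idx = (cur_loc ^ prev_loc) % MAP_SIZE;
  assert(idx < MAP_SIZE);
  trace_map[idx]++;
  prev_loc = cur_loc >> 1;            /* the shift makes A->B differ from B->A */
  return idx;
}

/* Reset all state and return the map index produced by the edge from -> to. */
static uint32_t edge_index(uint32_t from, uint32_t to) {
  memset(trace_map, 0, sizeof(trace_map));
  prev_loc = 0;
  record_edge(from);
  return record_edge(to);
}
```

Because of the shift, the edge A->B and the edge B->A land in different map slots, which a per-block counter could not distinguish.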

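Code coverage measures how much of the code has been exercised, that is, whether a given line of source code has been executed. For binaries, the same notion applies to whether a given instruction in the assembly has been executed.

Ways to obtain coverage (instrumentation approaches):

Hijacking the assembler (afl-gcc, afl-clang, afl-g++): the wrapper identifies jump instructions and inserts a small assembly stub at those points that communicates with AFL (__afl_maybe_log).

Built into clang: LLVM ships a simple coverage instrumentation, SanitizerCoverage, which inserts calls to user-defined functions at the function, basic-block, and edge level. Default implementations of these callbacks are provided, and they implement simple coverage reporting and visualization.

With -fsanitize-coverage=trace-pc-guard, the compiler inserts the following call on every edge:

__sanitizer_cov_trace_pc_guard(uint32_t* guard_variable)

Every edge has its own guard_variable. Whenever the corresponding edge is executed, the counter list[*guard_variable] in shared memory is incremented.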

// This function lives in AFL/llvm_mode/afl-llvm-rt.o.c
/* The following stuff deals with supporting -fsanitize-coverage=trace-pc-guard.
   It remains non-operational in the traditional, plugin-backed LLVM mode.
   For more info about 'trace-pc-guard', see README.llvm.

   The first function (__sanitizer_cov_trace_pc_guard) is called back on every
   edge (as opposed to every basic block). */

void __sanitizer_cov_trace_pc_guard(uint32_t* guard) {
  __afl_area_ptr[*guard]++;  // __afl_area_ptr points to the shared memory
}

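__sanitizer_cov_trace_pc_guard_init(uint32_t* start, uint32_t* stop);

This function is called once when the executable starts running. Every directed edge has its own *guard value. The start parameter is the first guard pointer (the first edge) and stop marks the end of the guard array, so iterating from start to stop lets the runtime initialize the value behind each guard pointer with a random number (in practice the user is free to choose these values).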

//AFL/llvm_mode/afl-llvm-rt.o.c

void __sanitizer_cov_trace_pc_guard(uint32_t* guard) {
  __afl_area_ptr[*guard]++;
}

/* Init callback. Populates instrumentation IDs. Note that we're using
   ID of 0 as a special value to indicate non-instrumented bits. That may
   still touch the bitmap, but in a fairly harmless way. */

void __sanitizer_cov_trace_pc_guard_init(uint32_t* start, uint32_t* stop) {

  u32 inst_ratio = 100;
  u8* x;

  if (start == stop || *start) return;

  x = getenv("AFL_INST_RATIO");
  if (x) inst_ratio = atoi(x);

  if (!inst_ratio || inst_ratio > 100) {
    fprintf(stderr, "[-] ERROR: Invalid AFL_INST_RATIO (must be 1-100).\n");
    abort();
  }

  /* Make sure that the first element in the range is always set - we use that
     to avoid duplicate calls (which can happen as an artifact of the underlying
     implementation in LLVM). */

  *(start++) = R(MAP_SIZE - 1) + 1;                        // the first guard always gets a non-zero ID

  while (start < stop) {                                   // assign an ID to every remaining *guard

    if (R(100) < inst_ratio) *start = R(MAP_SIZE - 1) + 1; // #define R(x) (random() % (x))
    else *start = 0;

    start++;                                               // advance to the next uint32_t guard

  }

}

With random IDs, however, collisions are unavoidable, so Sakura replaced the function with the following:

void __sanitizer_cov_trace_pc_guard_init(uint32_t* start, uint32_t* stop) {
    unsigned int N = 0;
    printf("@sakura in __sanitizer_cov_trace_pc_guard_init\n");
    if (start == stop || *start) return;
    for (uint32_t* x = start; x < stop; x++) {
        *x = (++N) % MAP_SIZE;  // mod MAP_SIZE (1 << 16 == 65536)
    }
    printf("@sakura there are %u guards in total\n", N);
}
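To make the collision argument concrete, here is a small sketch that assigns IDs both ways and counts how many guards end up sharing an ID. The helper names are hypothetical, and rand() is used in place of AFL's random() purely for portability.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAP_SIZE (1u << 16)

/* Count guards whose ID was already taken by an earlier guard. */
static uint32_t count_collisions(const uint32_t *ids, uint32_t n) {
  static uint8_t seen[MAP_SIZE];
  uint32_t i, collisions = 0;
  memset(seen, 0, sizeof(seen));
  for (i = 0; i < n; i++) {
    assert(ids[i] < MAP_SIZE);
    if (seen[ids[i]]) collisions++;
    seen[ids[i]] = 1;
  }
  return collisions;
}

/* Random IDs in [1, MAP_SIZE - 1], in the spirit of R(MAP_SIZE - 1) + 1. */
static void assign_random(uint32_t *ids, uint32_t n) {
  uint32_t i;
  for (i = 0; i < n; i++) ids[i] = (uint32_t)rand() % (MAP_SIZE - 1) + 1;
}

/* Sequential IDs, as in the replacement function shown above. */
static void assign_sequential(uint32_t *ids, uint32_t n) {
  uint32_t i, next = 0;
  for (i = 0; i < n; i++) ids[i] = (++next) % MAP_SIZE;
}
```

Sequential numbering stays collision-free only while the number of guards is below MAP_SIZE; beyond that the modulo wraps around, IDs repeat, and the value 0 appears.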

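To instrument every edge this way, first build AFL with AFL_TRACE_PC=1 (make AFL_TRACE_PC=1), which defines the USE_TRACE_PC macro (-DUSE_TRACE_PC), and then pass -fsanitize-coverage=trace-pc-guard when running afl-clang-fast. A plain make enables the afl-llvm-pass instrumentation rather than these afl-llvm-rt callbacks.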

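GCOV and LCOV

Besides the two approaches above, coverage can also be recorded and visualized with GCOV and LCOV, but these tools cannot be used as feedback for a fuzzer.

Some concepts around coverage feedback and guided mutation:

Fuzzers are either generation-based or mutation-based. A generation-based fuzzer starts without any seeds and produces its inputs purely from the fuzzer's own logic; there is no feedback during fuzzing, and interesting cases are not mutated based on the coverage observed at a given moment in order to discover new paths. AFL is a mutation-based fuzzer.

During fuzzing, samples are mutated, and whenever a mutated input reaches a new path it is kept as a new sample and mutated further. Generation after generation of mutation, crossover, and selection continues until a local optimum is reached (the number of discovered paths stops growing). This is the so-called **genetic algorithm**.

Because the traits a genetic algorithm can produce always come from the initial seeds and the mutation strategy, improvements focus on those two areas. Seeds can also be code statements:

select * from cyberangel;

If the mutator never introduces new tokens (traits), it will only ever produce select statements and never the insert, delete, and other statements found in SQL. The mutation strategy therefore matters a great deal and strongly affects the code coverage a fuzzing run achieves.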

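Beyond mutating seeds, we also need to understand how AFL feeds automatically or semi-automatically generated data into the program under test and watches for anomalies such as crashes and failed assertions, that is, how it delivers input and catches crashes. AFL uses a fork server for this:

The fuzzer mutates a sample and writes the result to the hidden file .cur_input in the output directory.
afl-fuzz forks a child process that acts as the fork server, then sends 4 bytes over a pipe to tell the fork server to fork a process for the test.
Whatever happens to the process under test, the fork server reports the child's exit status back to afl-fuzz over a pipe.
afl-fuzz uses the coverage information produced by the run to guide subsequent tests.

Process relationship: fuzzer process -> fork server -> program under test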
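The handshake above can be sketched with two pipes. This is a simplified stand-alone model under stated assumptions: the function names are hypothetical, the target is a plain C function rather than an exec'd binary, and ordinary pipe descriptors are used instead of the fixed descriptors AFL reserves.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork-server side: wait for a 4-byte request, fork a child to run the
   target, then report the child's pid followed by its wait status. */
static void fork_server_loop(int ctl_fd, int st_fd, int (*target)(void)) {
  uint32_t msg;
  while (read(ctl_fd, &msg, 4) == 4) {
    int wstatus;
    int32_t pid32, status;
    pid_t child = fork();
    if (child < 0) _exit(1);
    if (!child) _exit(target());            /* child runs one test case */
    pid32 = (int32_t)child;
    if (write(st_fd, &pid32, 4) != 4) _exit(1);
    if (waitpid(child, &wstatus, 0) < 0) _exit(1);
    status = (int32_t)wstatus;
    if (write(st_fd, &status, 4) != 4) _exit(1);
  }
  _exit(0);                                 /* control pipe closed: shut down */
}

/* Fuzzer side: start a fork server, request `runs` executions and return how
   many were killed by a signal (crashed). Returns -1 on setup failure. */
static int count_crashes(int (*target)(void), int runs) {
  int ctl[2], st[2], crashes = 0, i;
  pid_t srv;
  if (pipe(ctl) || pipe(st)) return -1;
  srv = fork();
  if (srv < 0) return -1;
  if (!srv) {
    close(ctl[1]); close(st[0]);
    fork_server_loop(ctl[0], st[1], target);
  }
  close(ctl[0]); close(st[1]);
  for (i = 0; i < runs; i++) {
    uint32_t go = 0;
    int32_t pid32, status;
    if (write(ctl[1], &go, 4) != 4) break;  /* "please fork one run" */
    if (read(st[0], &pid32, 4) != 4) break; /* pid of the new child */
    if (read(st[0], &status, 4) != 4) break;/* its wait status */
    if (WIFSIGNALED(status)) crashes++;
  }
  close(ctl[1]);
  close(st[0]);
  waitpid(srv, NULL, 0);
  return crashes;
}

static int target_ok(void)    { return 0; }
static int target_crash(void) { abort(); }
```

Calling count_crashes(target_crash, 3) exercises the full round trip three times; each run is reported as a crash because abort() terminates the child with a signal.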

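What are the common ways to modify AFL?

Because AFL already provides coverage feedback and crash detection, its mutator can be replaced with your own code; in other words, you borrow AFL as a shell.
Before a test case is written out, its contents can be wrapped and mapped. The relevant code in afl-fuzz originally looks like this: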

//afl-fuzz.c

/* Write modified data to file for testing. If out_file is set, the old file
   is unlinked and a new one is created. Otherwise, out_fd is rewound and
   truncated. */

static void write_to_testcase(void* mem, u32 len) {

  s32 fd = out_fd;

  if (out_file) {

    unlink(out_file); /* Ignore errors. */

    fd = open(out_file, O_WRONLY | O_CREAT | O_EXCL, 0600);

    if (fd < 0) PFATAL("Unable to create '%s'", out_file);

  } else lseek(fd, 0, SEEK_SET);

  ck_write(fd, mem, len, out_file);

  if (!out_file) {

    if (ftruncate(fd, len)) PFATAL("ftruncate() failed");
    lseek(fd, 0, SEEK_SET);

  } else close(fd);

}

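write_to_testcase writes each newly mutated sample to the hidden file .cur_input, so wrapping the data in an extra function just before the write lets AFL's raw byte-level mutations serve more scenarios. afl-fuzz suits byte-granular fuzzing (for example, a simple input/output program), but not every target can be fuzzed at byte granularity directly: sometimes the file-parsing code cannot be isolated, and sometimes parsing is only the beginning of the logic. Adding a layer on top of the mutator is the simplest way to extend afl-fuzz in this direction; WebAssembly is the classic example.

To fuzz structured inputs like these (SQL, for instance) the fuzzer needs to be structure-aware, that is, aware of the grammar of the specific input type.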
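A minimal sketch of such a mapping layer, assuming it is applied to the mutated buffer before it is written out: the raw bytes are not the test case themselves but select entries from token tables. The tables and the function name bytes_to_sql are hypothetical; a real grammar would be far richer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical token tables for a SQL-like input language. */
static const char *const VERBS[]  = { "select * from", "delete from", "drop table" };
static const char *const TABLES[] = { "users", "orders", "logs" };

/* Map raw fuzzer bytes to a statement: byte 0 picks the verb, byte 1 picks
   the table. Returns the length written (excluding the NUL), or 0 if the
   input is too short or the output buffer too small. */
static size_t bytes_to_sql(const unsigned char *in, size_t in_len,
                           char *out, size_t out_cap) {
  int n;
  if (in_len < 2) return 0;
  n = snprintf(out, out_cap, "%s %s;", VERBS[in[0] % 3], TABLES[in[1] % 3]);
  if (n < 0 || (size_t)n >= out_cap) return 0;
  return (size_t)n;
}
```

With this layer, any byte flip made by the mutator still yields a syntactically valid statement, so the target's parser is passed and the logic behind it gets exercised.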

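How can AFL fuzz a client-server program or a network protocol?

The client -> server flow in a client-server program is roughly:

client ---> server(){
    receive_packet();
    process_packet();
    send_response();
}

This article is worth reading: https://www.fastly.com/blog/how-fuzz-server-american-fuzzy-lop

The article introduces a concept called persistent mode. To integrate it with AFL, our own code can follow this flow:

while (go)                  // main loop
    put_request(read(file)) // write the request the client would send into a file
    req = get_request()     // the server fetches the request from that file
    process(req)            // the server processes the request
    notify_fuzzer()         // notify the fuzzer (AFL)

With this approach AFL launches the server program directly. The server's packet-receiving code is patched to read from standard input instead, the server exits after handling enough requests, and AFL then starts a fresh server and fuzzes again.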
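The loop can be sketched with the __AFL_LOOP macro that afl-clang-fast provides for persistent mode. The handler process_request and the wrapper serve_from_fd are hypothetical stand-ins for the patched server code, and the fallback definition (a single iteration) is only an assumption so that the sketch also builds with an ordinary compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* When not built with afl-clang-fast, fall back to a single iteration. */
#ifndef __AFL_LOOP
static int fallback_loop_once(void) { static int done; return !done++; }
#define __AFL_LOOP(n) fallback_loop_once()
#endif

/* Hypothetical request handler standing in for the server's parsing logic.
   Returns 1 if the request was recognized. */
static int process_request(const char *req, size_t len) {
  return len >= 4 && memcmp(req, "PING", 4) == 0;
}

/* Patched receive loop: read each request from fd (stdin under AFL) instead
   of a socket, and exit after a bounded number of iterations. */
static int serve_from_fd(int fd) {
  char buf[4096];
  int handled = 0;
  while (__AFL_LOOP(1000)) {
    ssize_t n = read(fd, buf, sizeof(buf));
    if (n <= 0) break;
    handled += process_request(buf, (size_t)n);
  }
  return handled;
}
```

Under AFL the server's main would call serve_from_fd(0); the bound of 1000 iterations matches the "exit after enough requests, then restart" behavior described above.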

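What fuzzing really has to solve, and some advice from Sakura

To fuzz a project, first work out how to get it running at all, and then how to get it running once instrumentation and ASan are enabled. That means learning Makefile and CMake first and, where necessary, using emulation tools such as unicorn and qiling to run firmware.

AddressSanitizer (ASan) is a fast memory error detector for C/C++. It can detect use after free, heap buffer overflow, stack buffer overflow, global buffer overflow, use after return, use after scope, initialization order bugs, and memory leaks. ASan can be loosely understood as hooking malloc, free, and memory access instructions, so that an out-of-bounds read or write is reported when the index being accessed exceeds the size of the allocated memory.

Understand the code under test. Running a fuzzer against an arbitrary program is not automatically meaningful; pick out components such as the parser for incoming data and, when necessary, reverse-engineer the target to extract part of the code and test it in isolation. Take Windows Media Player: it is a GUI application, so its flow and input cannot be driven from a command line, but the functionality must live in some DLL (dynamic-link library). We can therefore write a harness that loads the DLL (with LoadLibrary on Windows, the counterpart of dlopen):

What the harness does: a function inside a DLL (a loaded library) has no entry point of its own, so to fuzz its input we write a test tool that passes input from the command line to the DLL function.
Adapt the fuzzer to the scenario: AFL targets file-format fuzzing, but adding a layer or a mapping extends it to more scenarios and lets you implement your own custom mutation.
Learn what each target requires, then find the mainstream fuzzer for it and improve on it:
- JS fuzzing requires compiler theory and an understanding of grammars.
- Kernel fuzzing requires kernel knowledge and driver development, plus an understanding of how to construct and generate sets of related system calls.
Custom detection tools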

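ASan hooks malloc, free, and similar functions. But what about a target like a JS engine, which mmaps one large region and manages that memory itself? How can latent out-of-bounds reads and writes be detected there? In that case you need to understand the code and implement a tailored, custom ASan of your own.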

Reading the afl-gcc source

/* Main entry point */

int main(int argc, char** argv) {

  if (isatty(2) && !getenv("AFL_QUIET")) { // check whether stderr is attached to a terminal

    SAYF(cCYA "afl-cc " cBRI VERSION cRST " by <lcamtuf@google.com>\n");

  } else be_quiet = 1;

  if (argc < 2) {

    SAYF("\n"
         "This is a helper application for afl-fuzz. It serves as a drop-in replacement\n"
         "for gcc or clang, letting you recompile third-party code with the required\n"
         "runtime instrumentation. A common use pattern would be one of the following:\n\n"

         "  CC=%s/afl-gcc ./configure\n"
         "  CXX=%s/afl-g++ ./configure\n\n"

         "You can specify custom next-stage toolchain via AFL_CC, AFL_CXX, and AFL_AS.\n"
         "Setting AFL_HARDEN enables hardening optimizations in the compiled code.\n\n",
         BIN_PATH, BIN_PATH);

    exit(1);

  }

  find_as(argv[0]);

  edit_params(argc, argv);

  execvp(cc_params[0], (char**)cc_params);

  FATAL("Oops, failed to execute '%s' - check your PATH", cc_params[0]);

  return 0;

}

find_as(argv[0]);

main calls find_as; argv[0] is the path used to invoke afl-gcc.

find_as

/* Try to find our "fake" GNU assembler in AFL_PATH or at the location derived
   from argv[0]. If that fails, abort. */

static void find_as(u8* argv0) {

  u8 *afl_path = getenv("AFL_PATH");
  u8 *slash, *tmp;

  if (afl_path) {

    tmp = alloc_printf("%s/as", afl_path);

    if (!access(tmp, X_OK)) {
      as_path = afl_path;
      ck_free(tmp);
      return;
    }

    ck_free(tmp);

  }

  slash = strrchr(argv0, '/');

  if (slash) {

    u8 *dir;

    *slash = 0;
    dir = ck_strdup(argv0);
    *slash = '/';

    tmp = alloc_printf("%s/afl-as", dir);

    if (!access(tmp, X_OK)) {
      as_path = dir;
      ck_free(tmp);
      return;
    }

    ck_free(tmp);
    ck_free(dir);

  }

  if (!access(AFL_PATH "/as", X_OK)) {
    as_path = AFL_PATH;
    return;
  }

  FATAL("Unable to find AFL wrapper binary for 'as'. Please set AFL_PATH");
 
}

u8 *afl_path = getenv("AFL_PATH");
u8 *slash, *tmp;

if (afl_path) {

  tmp = alloc_printf("%s/as", afl_path); //@@@@

  if (!access(tmp, X_OK)) {
    as_path = afl_path;
    ck_free(tmp);
    return;
  }

  ck_free(tmp);

}

#define alloc_printf(_str...) ({ \
    u8* _tmp; \
    s32 _len = snprintf(NULL, 0, _str); \
    if (_len < 0) FATAL("Whoa, snprintf() fails?!"); \
    _tmp = ck_alloc(_len + 1); \
    snprintf((char*)_tmp, _len + 1, _str); \
    _tmp; \
  })

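This macro implements a formatting helper with dynamic allocation. It behaves like sprintf but automatically allocates enough memory for the formatted string, calling ck_alloc to do so.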
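The same measure-then-allocate idea can be written as an ordinary variadic function. This is only a sketch: alloc_printf_fn is a hypothetical name, and it uses plain malloc and returns NULL on failure, whereas AFL's macro goes through ck_alloc and aborts.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Function-style take on alloc_printf: measure, allocate, then format.
   Returns a malloc()ed string the caller must free(), or NULL on failure. */
static char *alloc_printf_fn(const char *fmt, ...) {
  va_list ap, ap2;
  int len;
  char *buf;

  va_start(ap, fmt);
  va_copy(ap2, ap);                      /* the argument list is walked twice */
  len = vsnprintf(NULL, 0, fmt, ap);     /* pass 1: compute the required length */
  va_end(ap);

  if (len < 0) { va_end(ap2); return NULL; }

  buf = malloc((size_t)len + 1);         /* +1 for the terminating NUL */
  if (buf) vsnprintf(buf, (size_t)len + 1, fmt, ap2);  /* pass 2: format */
  va_end(ap2);
  return buf;
}
```

For example, alloc_printf_fn("%s/as", "/usr/local/lib/afl") builds the same kind of path string that find_as probes.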

#ifndef DEBUG_BUILD

/* In non-debug mode, we just do straightforward aliasing of the above functions
   to user-visible names such as ck_alloc(). */

#define ck_alloc          DFL_ck_alloc
#define ck_alloc_nozero   DFL_ck_alloc_nozero
#define ck_realloc        DFL_ck_realloc
#define ck_realloc_block  DFL_ck_realloc_block
#define ck_strdup         DFL_ck_strdup
#define ck_memdup         DFL_ck_memdup
#define ck_memdup_str     DFL_ck_memdup_str
#define ck_free           DFL_ck_free

#define alloc_report()

#else

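The aliases above apply when DEBUG_BUILD is not defined. In the #else branch (debug builds), ck_alloc is defined as follows instead:

#define ck_alloc(_p1) \
  TRK_ck_alloc(_p1, __FILE__, __FUNCTION__, __LINE__)

This gives the following.

Execution flow in non-debug mode:

ck_alloc(size) 
  → DFL_ck_alloc(size)
    → DFL_ck_alloc_nozero(size)  // actual allocation
    → memset(mem, 0, size)       // zero the memory
    → return the pointer

Execution flow in debug mode:

ck_alloc(size) 
  → TRK_ck_alloc(size, __FILE__, __FUNCTION__, __LINE__)
    → DFL_ck_alloc(size)         // actual allocation
    → TRK_alloc_buf(ret, file, func, line)  // record the allocation
    → return the pointer

The function to focus on is DFL_ck_alloc.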

DFL_ck_alloc

/* Allocate a buffer, returning zeroed memory. */

static inline void* DFL_ck_alloc(u32 size) {

  void* mem;

  if (!size) return NULL;
  mem = DFL_ck_alloc_nozero(size);

  return memset(mem, 0, size);

}

DFL_ck_alloc_nozero

static inline void* DFL_ck_alloc_nozero(u32 size) {

  void* ret;

  if (!size) return NULL;

  ALLOC_CHECK_SIZE(size);
  ret = malloc(size + ALLOC_OFF_TOTAL);
  ALLOC_CHECK_RESULT(ret, size);

  ret += ALLOC_OFF_HEAD;

  ALLOC_C1(ret) = ALLOC_MAGIC_C1;
  ALLOC_S(ret)  = size;
  ALLOC_C2(ret) = ALLOC_MAGIC_C2;

  return ret;

}
---------------------------------------------------------------------------------------------------
/* Magic tokens used to mark used / freed chunks. */

#define ALLOC_MAGIC_C1  0xFF00FF00 /* Used head (dword)  */
#define ALLOC_MAGIC_F   0xFE00FE00 /* Freed head (dword) */
#define ALLOC_MAGIC_C2  0xF0       /* Used tail (byte)   */

/* Positions of guard tokens in relation to the user-visible pointer. */

#define ALLOC_C1(_ptr)  (((u32*)(_ptr))[-2])
#define ALLOC_S(_ptr)   (((u32*)(_ptr))[-1])
#define ALLOC_C2(_ptr)  (((u8*)(_ptr))[ALLOC_S(_ptr)])

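((u32*)(_ptr))[-2] = 0xFF00FF00 ;            // [1] head canary
((u32*)(_ptr))[-1] = size ;                  // [2] requested size (here the formatted string plus its NUL)
((u8*)(_ptr))[((u32*)(_ptr))[-1]] =  0xF0 ;  // [3] tail canary

The magic tokens and the size are written at the corresponding positions: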

// before ------------------------------------------------------------------------
(gdb) x/16gx ret-0x10
0x555555758270:	0xf0006e69622f6c61	0x0000000000000031
0x555555758280:	0x0000000000000000	0x0000000000000000
                                    # ret
0x555555758290:	0x0000000000000000	0x0000000000000000
0x5555557582a0:	0x0000000000000000	0x0000000000020d61
// after -------------------------------------------------------------------------
(gdb) x/16gx ret-0x10
0x555555758270:	0xf0006e69622f6c61	0x0000000000000031
0x555555758280:	0x00000016ff00ff00	0x0000000000000000
                # [2], [1]          # ret
0x555555758290:	0x0000000000000000	0x00f0000000000000
                                    # [3]
0x5555557582a0:	0x0000000000000000	0x0000000000020d61

After DFL_ck_alloc_nozero returns, memset zeroes the heap memory. In short, DFL_ck_alloc is just DFL_ck_alloc_nozero wrapped with a memset.

TRK_ck_alloc

/* Simple wrappers for non-debugging functions: */

static inline void* TRK_ck_alloc(u32 size, const char* file, const char* func,
                                 u32 line) {

  void* ret = DFL_ck_alloc(size);
  TRK_alloc_buf(ret, file, func, line);
  return ret;

}

The real work is still done by DFL_ck_alloc.

TRK_alloc_buf is then called:

/* Add a new entry to the list of allocated objects. */

static inline void TRK_alloc_buf(void* ptr, const char* file, const char* func,
                                 u32 line) {

  u32 i, bucket;

  if (!ptr) return;

  bucket = TRKH(ptr);                        // compute the hash

  /* Find a free slot in the list of entries for that bucket. */

  for (i = 0; i < TRK_cnt[bucket]; i++)

    if (!TRK[bucket][i].ptr) {

      TRK[bucket][i].ptr  = ptr;
      TRK[bucket][i].file = (char*)file;
      TRK[bucket][i].func = (char*)func;
      TRK[bucket][i].line = line;
      return;

    }

  /* No space available - allocate more. */

  TRK[bucket] = DFL_ck_realloc_block(TRK[bucket],
    (TRK_cnt[bucket] + 1) * sizeof(struct TRK_obj));

  TRK[bucket][i].ptr  = ptr;
  TRK[bucket][i].file = (char*)file;
  TRK[bucket][i].func = (char*)func;
  TRK[bucket][i].line = line;

  TRK_cnt[bucket]++;

}

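First, the data structures involved.

The TRK_obj struct:

struct TRK_obj {
  void *ptr;      // pointer to the allocated memory
  char *file;     // source file of the allocation
  char *func;     // function that made the allocation
  u32  line;      // line number of the allocation
};

The hash buckets:

#define ALLOC_BUCKETS 4096
struct TRK_obj* TRK[ALLOC_BUCKETS];    // array of hash buckets
u32 TRK_cnt[ALLOC_BUCKETS];            // number of entries in each bucket

The hash function:

#define TRKH(_ptr) (((((u32)(_ptr)) >> 16) ^ ((u32)(_ptr))) % ALLOC_BUCKETS)

if (!ptr) return;      // null pointer check
bucket = TRKH(ptr);    // compute the bucket index

Finding a free slot: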

for (i = 0; i < TRK_cnt[bucket]; i++)
    if (!TRK[bucket][i].ptr) {
        // free slot found: record the allocation here
        TRK[bucket][i].ptr  = ptr;
        TRK[bucket][i].file = (char*)file;
        TRK[bucket][i].func = (char*)func;
        TRK[bucket][i].line = line;
        return;
    }

Growing the bucket (when no free slot is left):

// reallocate a larger bucket
TRK[bucket] = DFL_ck_realloc_block(TRK[bucket],
    (TRK_cnt[bucket] + 1) * sizeof(struct TRK_obj));

// record the allocation in the newly added slot
TRK[bucket][i].ptr  = ptr;
TRK[bucket][i].file = (char*)file;
TRK[bucket][i].func = (char*)func;
TRK[bucket][i].line = line;

TRK_cnt[bucket]++;          // bump the bucket's entry count
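As a small illustration of the bucket selection, the sketch below reproduces the TRKH mixing step as a function. The name bucket_of is hypothetical, and the cast through uintptr_t is an assumption of this sketch to keep the pointer-to-integer conversion explicit on 64-bit targets.

```c
#include <assert.h>
#include <stdint.h>

#define ALLOC_BUCKETS 4096

/* Same mixing as TRKH: fold the upper 16 bits of the (truncated) address
   into the lower ones, then reduce to a bucket index. */
static uint32_t bucket_of(const void *ptr) {
  uint32_t p = (uint32_t)(uintptr_t)ptr;
  uint32_t bucket = ((p >> 16) ^ p) % ALLOC_BUCKETS;
  assert(bucket < ALLOC_BUCKETS);
  return bucket;
}
```

For the address 0x12345678 the fold gives 0x1234444C, so the allocation is tracked in bucket 0x44C.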

find_as

slash = strrchr(argv0, '/');  // points at the "/afl-gcc" suffix

if (slash) {

  u8 *dir;

  *slash = 0;                  // NUL-terminate argv0 here so it reads /usr/local/bin
  dir = ck_strdup(argv0);
  *slash = '/';

  tmp = alloc_printf("%s/afl-as", dir);

  if (!access(tmp, X_OK)) {
    as_path = dir;
    ck_free(tmp);
    return;
  }

  ck_free(tmp);
  ck_free(dir);

}

if (!access(AFL_PATH "/as", X_OK)) {
  as_path = AFL_PATH;
  return;
}

FATAL("Unable to find AFL wrapper binary for 'as'. Please set AFL_PATH");

DFL_ck_strdup

static inline u8* DFL_ck_strdup(u8* str) {

  void* ret;
  u32   size;

  if (!str) return NULL;

  size = strlen((char*)str) + 1;

  ALLOC_CHECK_SIZE(size);
    
  ret = malloc(size + ALLOC_OFF_TOTAL);
  ALLOC_CHECK_RESULT(ret, size);

  ret += ALLOC_OFF_HEAD;

  ALLOC_C1(ret) = ALLOC_MAGIC_C1;
  ALLOC_S(ret)  = size;
  ALLOC_C2(ret) = ALLOC_MAGIC_C2;

  return memcpy(ret, str, size);

}

ALLOC_CHECK_SIZE(size);
// expands to:
if (size > MAX_ALLOC)  // MAX_ALLOC = 0x40000000 (1 GB)
    ABORT("Bad alloc request: %u bytes", size);

ALLOC_C1(ret) = ALLOC_MAGIC_C1;  // ((u32*)ret)[-2] = 0xFF00FF00
ALLOC_S(ret)  = size;            // ((u32*)ret)[-1] = size
ALLOC_C2(ret) = ALLOC_MAGIC_C2;  // ((u8*)ret)[size] = 0xF0

The memory layout is roughly:

Offset:  -8          -4      0              size
Field:   [C1]        [SIZE]  [user data...] [C2]
Value:   0xFF00FF00  size    string bytes   0xF0

ck_free

static inline void DFL_ck_free(void* mem) {

  if (!mem) return;

  CHECK_PTR(mem);

#ifdef DEBUG_BUILD

  /* Catch pointer issues sooner. */
  memset(mem, 0xFF, ALLOC_S(mem));

#endif /* DEBUG_BUILD */

  ALLOC_C1(mem) = ALLOC_MAGIC_F;  // mark the chunk as freed

  free(mem - ALLOC_OFF_HEAD);

}

// CHECK_PTR(mem) expands to:
if (mem) {
    // check the head magic number
    if (ALLOC_C1(mem) ^ ALLOC_MAGIC_C1) {
        if (ALLOC_C1(mem) == ALLOC_MAGIC_F)
            ABORT("Use after free.");              // chunk was already freed
        else 
            ABORT("Corrupted head alloc canary."); // head canary damaged
    }
    // check the tail magic number
    if (ALLOC_C2(mem) ^ ALLOC_MAGIC_C2)
        ABORT("Corrupted tail alloc canary.");     // tail canary damaged
}

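Memory layout before the free:

The block actually allocated:
[Guard1: 4 bytes] [Size: 4 bytes] [user data: size bytes] [Guard2: 1 byte]
0xFF00FF00        size            string bytes            0xF0
↑                                 ↑
address returned by malloc        where mem points

Memory layout after the free:

[Guard1: 4 bytes] [Size: 4 bytes] [poisoned data: size bytes] [Guard2: 1 byte]
0xFE00FE00        size            0xFF...0xFF                 0xF0
(freed marker)                    (debug builds only)

In short, it is a wrapper around free.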
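The canary scheme can be condensed into a stand-alone sketch. The names guarded_alloc, guarded_check, and guarded_free are hypothetical, and unlike AFL (which calls ABORT) the check here returns a status code so that the behavior can be observed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAGIC_HEAD  0xFF00FF00u  /* head canary of a live chunk  */
#define MAGIC_FREED 0xFE00FE00u  /* head canary of a freed chunk */
#define MAGIC_TAIL  0xF0u        /* one-byte tail canary         */

enum chunk_state { CHUNK_OK, CHUNK_FREED, CHUNK_BAD_HEAD, CHUNK_BAD_TAIL };

/* Layout: [head canary: 4][size: 4][user data: size][tail canary: 1] */
static void *guarded_alloc(uint32_t size) {
  uint8_t *raw = malloc((size_t)size + 9);
  uint32_t head = MAGIC_HEAD;
  if (!raw) return NULL;
  memcpy(raw, &head, 4);
  memcpy(raw + 4, &size, 4);
  memset(raw + 8, 0, size);      /* zeroed, like ck_alloc */
  raw[8 + size] = MAGIC_TAIL;
  return raw + 8;                /* user-visible pointer */
}

/* Inspect the canaries around a user pointer. */
static enum chunk_state guarded_check(const void *mem) {
  const uint8_t *user = mem;
  uint32_t head, size;
  memcpy(&head, user - 8, 4);
  memcpy(&size, user - 4, 4);
  if (head == MAGIC_FREED) return CHUNK_FREED;
  if (head != MAGIC_HEAD)  return CHUNK_BAD_HEAD;
  if (user[size] != MAGIC_TAIL) return CHUNK_BAD_TAIL;
  return CHUNK_OK;
}

/* Check, mark as freed, release. Damaged chunks are reported, not freed. */
static enum chunk_state guarded_free(void *mem) {
  enum chunk_state st = guarded_check(mem);
  uint32_t freed = MAGIC_FREED;
  if (st != CHUNK_OK) return st;
  memcpy((uint8_t *)mem - 8, &freed, 4);
  free((uint8_t *)mem - 8);
  return CHUNK_OK;
}
```

A one-byte overflow past the end of the user data overwrites the tail canary, which is exactly the condition the "Corrupted tail alloc canary." check reports.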

find_as

if (!access(AFL_PATH "/as", X_OK)) {
  as_path = AFL_PATH;
  return;
}

FATAL("Unable to find AFL wrapper binary for 'as'. Please set AFL_PATH");

The AFL_PATH macro (not the AFL_PATH environment variable) is fixed at build time by the Makefile and defaults to /usr/local/lib/afl, so the concatenated path is /usr/local/lib/afl/as.

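find_as summary

The function looks for the afl-as assembler wrapper in the following order:

If the AFL_PATH environment variable is set, it calls access to check whether $AFL_PATH/as is executable; if so, it sets the global as_path and returns.
Otherwise, it checks for afl-as in the directory taken from argv[0] (/usr/local/bin/afl-as in this example).
If neither path works, it finally tries /usr/local/lib/afl/as (which is afl-as); if that also fails, afl-gcc aborts.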
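The search order can be sketched as a function that only builds the candidate paths in the order they would be probed. The name as_candidates and the fixed 512-byte slots are assumptions of this sketch; the real find_as calls access(path, X_OK) on each candidate and stops at the first hit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fill out[] with the candidate paths in probing order and return how many
   there are. afl_path_env may be NULL when $AFL_PATH is not set. */
static int as_candidates(const char *afl_path_env, const char *argv0,
                         const char *compiled_default, char out[3][512]) {
  int n = 0;
  const char *slash;
  out[0][0] = out[1][0] = out[2][0] = '\0';

  if (afl_path_env)                                    /* 1. $AFL_PATH/as */
    snprintf(out[n++], 512, "%s/as", afl_path_env);

  slash = strrchr(argv0, '/');                         /* 2. dirname(argv0)/afl-as */
  if (slash)
    snprintf(out[n++], 512, "%.*s/afl-as", (int)(slash - argv0), argv0);

  snprintf(out[n++], 512, "%s/as", compiled_default);  /* 3. compiled-in AFL_PATH */
  return n;
}
```

When afl-gcc is found through PATH and invoked without a directory component, argv[0] contains no slash and the second candidate is skipped entirely.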

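edit_params

edit_params(argc, argv);

edit_params is the core of the AFL compiler wrapper. It processes and rewrites the compiler arguments, turning the user's compile command into one that enables AFL instrumentation.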

/* Copy argv to cc_params, making the necessary edits. */

static void edit_params(u32 argc, char** argv) {

  u8 fortify_set = 0, asan_set = 0;
  u8 *name;

#if defined(__FreeBSD__) && defined(__x86_64__)
  u8 m32_set = 0;
#endif

  cc_params = ck_alloc((argc + 128) * sizeof(u8*));

  name = strrchr(argv[0], '/');
  if (!name) name = argv[0]; else name++;

  if (!strncmp(name, "afl-clang", 9)) {

    clang_mode = 1;

    setenv(CLANG_ENV_VAR, "1", 1);

    if (!strcmp(name, "afl-clang++")) {
      u8* alt_cxx = getenv("AFL_CXX");
      cc_params[0] = alt_cxx ? alt_cxx : (u8*)"clang++";
    } else {
      u8* alt_cc = getenv("AFL_CC");
      cc_params[0] = alt_cc ? alt_cc : (u8*)"clang";
    }

  } else {

    /* With GCJ and Eclipse installed, you can actually compile Java! The
       instrumentation will work (amazingly). Alas, unhandled exceptions do
       not call abort(), so afl-fuzz would need to be modified to equate
       non-zero exit codes with crash conditions when working with Java
       binaries. Meh. */

#ifdef __APPLE__

    if (!strcmp(name, "afl-g++")) cc_params[0] = getenv("AFL_CXX");
    else if (!strcmp(name, "afl-gcj")) cc_params[0] = getenv("AFL_GCJ");
    else cc_params[0] = getenv("AFL_CC");

    if (!cc_params[0]) {

      SAYF("\n" cLRD "[-] " cRST
           "On Apple systems, 'gcc' is usually just a wrapper for clang. Please use the\n"
           "    'afl-clang' utility instead of 'afl-gcc'. If you really have GCC installed,\n"
           "    set AFL_CC or AFL_CXX to specify the correct path to that compiler.\n");

      FATAL("AFL_CC or AFL_CXX required on MacOS X");

    }

#else

    if (!strcmp(name, "afl-g++")) {
      u8* alt_cxx = getenv("AFL_CXX");
      cc_params[0] = alt_cxx ? alt_cxx : (u8*)"g++";
    } else if (!strcmp(name, "afl-gcj")) {
      u8* alt_cc = getenv("AFL_GCJ");
      cc_params[0] = alt_cc ? alt_cc : (u8*)"gcj";
    } else {
      u8* alt_cc = getenv("AFL_CC");
      cc_params[0] = alt_cc ? alt_cc : (u8*)"gcc";
    }

#endif /* __APPLE__ */

  }

  while (--argc) {
    u8* cur = *(++argv);

    if (!strncmp(cur, "-B", 2)) {

      if (!be_quiet) WARNF("-B is already set, overriding");

      if (!cur[2] && argc > 1) { argc--; argv++; }
      continue;

    }

    if (!strcmp(cur, "-integrated-as")) continue;

    if (!strcmp(cur, "-pipe")) continue;

#if defined(__FreeBSD__) && defined(__x86_64__)
    if (!strcmp(cur, "-m32")) m32_set = 1;
#endif

    if (!strcmp(cur, "-fsanitize=address") ||
        !strcmp(cur, "-fsanitize=memory")) asan_set = 1;

    if (strstr(cur, "FORTIFY_SOURCE")) fortify_set = 1;

    cc_params[cc_par_cnt++] = cur;

  }

  cc_params[cc_par_cnt++] = "-B";
  cc_params[cc_par_cnt++] = as_path;

  if (clang_mode)
    cc_params[cc_par_cnt++] = "-no-integrated-as";

  if (getenv("AFL_HARDEN")) {

    cc_params[cc_par_cnt++] = "-fstack-protector-all";

    if (!fortify_set)
      cc_params[cc_par_cnt++] = "-D_FORTIFY_SOURCE=2";

  }

  if (asan_set) {

    /* Pass this on to afl-as to adjust map density. */

    setenv("AFL_USE_ASAN", "1", 1);

  } else if (getenv("AFL_USE_ASAN")) {

    if (getenv("AFL_USE_MSAN"))
      FATAL("ASAN and MSAN are mutually exclusive");

    if (getenv("AFL_HARDEN"))
      FATAL("ASAN and AFL_HARDEN are mutually exclusive");

    cc_params[cc_par_cnt++] = "-U_FORTIFY_SOURCE";
    cc_params[cc_par_cnt++] = "-fsanitize=address";

  } else if (getenv("AFL_USE_MSAN")) {

    if (getenv("AFL_USE_ASAN"))
      FATAL("ASAN and MSAN are mutually exclusive");

    if (getenv("AFL_HARDEN"))
      FATAL("MSAN and AFL_HARDEN are mutually exclusive");

    cc_params[cc_par_cnt++] = "-U_FORTIFY_SOURCE";
    cc_params[cc_par_cnt++] = "-fsanitize=memory";


  }

  if (!getenv("AFL_DONT_OPTIMIZE")) {

#if defined(__FreeBSD__) && defined(__x86_64__)

    /* On 64-bit FreeBSD systems, clang -g -m32 is broken, but -m32 itself
       works OK. This has nothing to do with us, but let's avoid triggering
       that bug. */

    if (!clang_mode || !m32_set)
      cc_params[cc_par_cnt++] = "-g";

#else

      cc_params[cc_par_cnt++] = "-g";

#endif

    cc_params[cc_par_cnt++] = "-O3";
    cc_params[cc_par_cnt++] = "-funroll-loops";

    /* Two indicators that you're building for fuzzing; one of them is
       AFL-specific, the other is shared with libfuzzer. */

    cc_params[cc_par_cnt++] = "-D__AFL_COMPILER=1";
    cc_params[cc_par_cnt++] = "-DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION=1";

  }

  if (getenv("AFL_NO_BUILTIN")) {

    cc_params[cc_par_cnt++] = "-fno-builtin-strcmp";
    cc_params[cc_par_cnt++] = "-fno-builtin-strncmp";
    cc_params[cc_par_cnt++] = "-fno-builtin-strcasecmp";
    cc_params[cc_par_cnt++] = "-fno-builtin-strncasecmp";
    cc_params[cc_par_cnt++] = "-fno-builtin-memcmp";
    cc_params[cc_par_cnt++] = "-fno-builtin-strstr";
    cc_params[cc_par_cnt++] = "-fno-builtin-strcasestr";

  }

  cc_params[cc_par_cnt] = NULL;

}

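1. Initialization and memory allocation

What it does:

Initializes the state flags
Allocates the argument array (original argument count + 128 extra slots)
Uses AFL's checked allocation function

2. Determining the compiler type and path

Compiler selection logic:

Invoked as	Default compiler	Environment override
afl-clang++	clang++	$AFL_CXX
afl-clang	clang	$AFL_CC
afl-g++	g++	$AFL_CXX
afl-gcj	gcj	$AFL_GCJ
afl-gcc	gcc	$AFL_CC

3. Special handling on macOS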

#ifdef __APPLE__
if (!cc_params[0]) {
    SAYF("\n" cLRD "[-] " cRST
         "On Apple systems, 'gcc' is usually just a wrapper for clang. Please use the\n"
         "    'afl-clang' utility instead of 'afl-gcc'. If you really have GCC installed,\n"
         "    set AFL_CC or AFL_CXX to specify the correct path to that compiler.\n");
    FATAL("AFL_CC or AFL_CXX required on MacOS X");
}
#endif

Reason: on macOS, gcc is usually an alias for clang, so the compiler path has to be specified explicitly.

4. Processing the original compiler arguments

while (--argc) {
    u8* cur = *(++argv);
    
    if (!strncmp(cur, "-B", 2)) {
        if (!be_quiet) WARNF("-B is already set, overriding");
        if (!cur[2] && argc > 1) { argc--; argv++; }
        continue;  // drop -B; AFL sets its own -B
    }
    
    if (!strcmp(cur, "-integrated-as")) continue;  // dropped
    if (!strcmp(cur, "-pipe")) continue;           // dropped
    
#if defined(__FreeBSD__) && defined(__x86_64__)
    if (!strcmp(cur, "-m32")) m32_set = 1;
#endif
    
    // detect ASAN/MSAN and FORTIFY_SOURCE
    if (!strcmp(cur, "-fsanitize=address") || 
        !strcmp(cur, "-fsanitize=memory")) asan_set = 1;
    if (strstr(cur, "FORTIFY_SOURCE")) fortify_set = 1;
    
    cc_params[cc_par_cnt++] = cur;  // keep everything else
}

How arguments are handled:

Dropped: -B, -integrated-as, -pipe
Detected (and kept): -fsanitize=address, -fsanitize=memory, FORTIFY_SOURCE
Flagged (and kept): -m32 on 64-bit FreeBSD
Kept: all other arguments
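The filtering rules can be tried out in isolation with the sketch below. The struct scan_result and the function filter_args are hypothetical names, and the FreeBSD -m32 handling is left out for brevity.

```c
#include <assert.h>
#include <string.h>

struct scan_result { int asan_set, fortify_set, kept; };

/* Copy args[1..argc-1] into out[], dropping the options afl-gcc strips and
   noting the ones it only detects. out[] must have room for argc entries. */
static struct scan_result filter_args(int argc, char **args, const char **out) {
  struct scan_result r = { 0, 0, 0 };
  int i;
  for (i = 1; i < argc; i++) {
    const char *cur = args[i];
    if (!strncmp(cur, "-B", 2)) {
      if (!cur[2] && i + 1 < argc) i++;   /* "-B dir": skip the directory too */
      continue;
    }
    if (!strcmp(cur, "-integrated-as") || !strcmp(cur, "-pipe")) continue;
    if (!strcmp(cur, "-fsanitize=address") ||
        !strcmp(cur, "-fsanitize=memory")) r.asan_set = 1;
    if (strstr(cur, "FORTIFY_SOURCE")) r.fortify_set = 1;
    out[r.kept++] = cur;                  /* detected options are still kept */
  }
  return r;
}
```

Note that a detected option such as -fsanitize=address is not removed; it is recorded in a flag and passed through to the real compiler.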

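5. Adding AFL-specific arguments

cc_params[cc_par_cnt++] = "-B";
cc_params[cc_par_cnt++] = as_path;  // points at AFL's assembler wrapper

if (clang_mode)
    cc_params[cc_par_cnt++] = "-no-integrated-as";

What it does:

Forces the compiler to use AFL's assembler wrapper
Disables the integrated assembler in clang mode

Hardening (when AFL_HARDEN is set):

Enables stack protection: -fstack-protector-all
Enables FORTIFY_SOURCE: -D_FORTIFY_SOURCE=2 (if not already set)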

6. Memory checker support

if (asan_set) {
    setenv("AFL_USE_ASAN", "1", 1);  // tell afl-as to adjust map density
} else if (getenv("AFL_USE_ASAN")) {
    if (getenv("AFL_USE_MSAN"))
        FATAL("ASAN and MSAN are mutually exclusive");
    if (getenv("AFL_HARDEN"))
        FATAL("ASAN and AFL_HARDEN are mutually exclusive");
        
    cc_params[cc_par_cnt++] = "-U_FORTIFY_SOURCE";
    cc_params[cc_par_cnt++] = "-fsanitize=address";
} else if (getenv("AFL_USE_MSAN")) {
    // MSAN is handled analogously
    cc_params[cc_par_cnt++] = "-U_FORTIFY_SOURCE";
    cc_params[cc_par_cnt++] = "-fsanitize=memory";
}

Mutual exclusion checks:

ASAN and MSAN cannot be used together
ASAN/MSAN cannot be combined with AFL_HARDEN

7. Optimization and debugging options

if (!getenv("AFL_DONT_OPTIMIZE")) {
#if defined(__FreeBSD__) && defined(__x86_64__)
    if (!clang_mode || !m32_set)
        cc_params[cc_par_cnt++] = "-g";
#else
    cc_params[cc_par_cnt++] = "-g";
#endif

    cc_params[cc_par_cnt++] = "-O3";
    cc_params[cc_par_cnt++] = "-funroll-loops";
    
    // macros that mark a fuzzing build
    cc_params[cc_par_cnt++] = "-D__AFL_COMPILER=1";
    cc_params[cc_par_cnt++] = "-DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION=1";
}

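Default settings:

Debug information: -g
Aggressive optimization: -O3
Loop unrolling: -funroll-loops
Fuzzing marker macros are defined

FreeBSD special case: on 64-bit FreeBSD, clang -g -m32 is broken, so -g is omitted for that combination.

8. Disabling built-in function optimizations (AFL_NO_BUILTIN)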

if (getenv("AFL_NO_BUILTIN")) {
    cc_params[cc_par_cnt++] = "-fno-builtin-strcmp";
    cc_params[cc_par_cnt++] = "-fno-builtin-strncmp";
    cc_params[cc_par_cnt++] = "-fno-builtin-strcasecmp";
    cc_params[cc_par_cnt++] = "-fno-builtin-strncasecmp";
    cc_params[cc_par_cnt++] = "-fno-builtin-memcmp";
    cc_params[cc_par_cnt++] = "-fno-builtin-strstr";
    cc_params[cc_par_cnt++] = "-fno-builtin-strcasestr";
}

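Purpose:

Disables the compiler's built-in versions of the string and memory comparison functions
Keeps these as real library calls so they remain visible to AFL's tooling (for example, interception by libtokencap)

cc_params[cc_par_cnt] = NULL;

Supported environment variables

Variable	Purpose	Default
AFL_CC	Path of the C compiler	gcc/clang
AFL_CXX	Path of the C++ compiler	g++/clang++
AFL_GCJ	Path of the GCJ compiler	gcj
AFL_HARDEN	Enable hardening options	unset
AFL_USE_ASAN	Enable AddressSanitizer	unset
AFL_USE_MSAN	Enable MemorySanitizer	unset
AFL_DONT_OPTIMIZE	Disable the default optimizations	unset
AFL_NO_BUILTIN	Disable built-in function optimizations	unset

execvp

execvp(cc_params[0], (char**)cc_params);

This runs the real compiler on the source files with the edited arguments.

That completes the analysis of afl-gcc.

Compilation in practice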

gcc -g -fno-stack-protector -z execstack -no-pie -z norelro /home/cyberangel/Desktop/test/test.c \
  -o  /home/cyberangel/Desktop/test/test_fuzz_gcc_source \
  -B  /usr/local/lib/afl -g -O3 \
  -funroll-loops -D__AFL_COMPILER=1 \
  -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION=1

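The command above comes from an example provided by cyberangel and is recorded here.

This is the command that execvp finally runs; the -B option is what lets us point the compiler at our own as (assembler).

A gcc build is usually described as four steps:

Preprocessing, producing the preprocessed file (**.i**): gcc -E main.c -o main.i
- File inclusion (#include), adding line number and file name markers, macro expansion (#define), conditional compilation (#ifdef), comment removal, special directives (#pragma/#error)
Compilation, producing assembly (**.s**): gcc -S main.i -o main.s
- Lexical analysis, syntax analysis, semantic analysis, code optimization
Assembly, producing the object file (**.o**): gcc -c main.s -o main.o
- Assembly -> machine code
Linking, producing the executable: gcc main.o -o main

Note that the gcc command above only specifies a special as (the assembler). We can therefore guess that the instrumentation is inserted at chosen locations by hijacking the assembler, so next we look at the source of AFL's as.

Reading the afl-as source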

Core data structures and global variables:

static u8** as_params;          /* parameters passed to the real 'as' */
static u8*  input_file;         /* original input file */
static u8*  modified_file;      /* instrumented file */

static u8   be_quiet,           /* quiet mode */
            clang_mode,         /* running in clang mode? */
            pass_thru,          /* just pass data through? */
            just_version,       /* just show version? */
            sanitizer;          /* using ASAN/MSAN? */

static u32  inst_ratio = 100,   /* instrumentation probability (%) */
            as_par_cnt = 1;     /* number of params to 'as' */

/* Main entry point */

int main(int argc, char** argv) {

  s32 pid;
  u32 rand_seed;
  int status;
  u8* inst_ratio_str = getenv("AFL_INST_RATIO");

  struct timeval tv;
  struct timezone tz;

  clang_mode = !!getenv(CLANG_ENV_VAR);

  if (isatty(2) && !getenv("AFL_QUIET")) {

    SAYF(cCYA "afl-as " cBRI VERSION cRST " by <lcamtuf@google.com>\n");
 
  } else be_quiet = 1;

  if (argc < 2) {

    SAYF("\n"
         "This is a helper application for afl-fuzz. It is a wrapper around GNU 'as',\n"
         "executed by the toolchain whenever using afl-gcc or afl-clang. You probably\n"
         "don't want to run this program directly.\n\n"

         "Rarely, when dealing with extremely complex projects, it may be advisable to\n"
         "set AFL_INST_RATIO to a value less than 100 in order to reduce the odds of\n"
         "instrumenting every discovered branch.\n\n");

    exit(1);

  }

  gettimeofday(&tv, &tz);  // get the current system time

  rand_seed = tv.tv_sec ^ tv.tv_usec ^ getpid();  // seed derived from time and pid

  srandom(rand_seed);  // seed the PRNG

  edit_params(argc, argv); // prepare the arguments for the real 'as'

  if (inst_ratio_str) {

    if (sscanf(inst_ratio_str, "%u", &inst_ratio) != 1 || inst_ratio > 100) 
      FATAL("Bad value of AFL_INST_RATIO (must be between 0 and 100)");

  }

  if (getenv(AS_LOOP_ENV_VAR))
    FATAL("Endless loop when calling 'as' (remove '.' from your PATH)");

  setenv(AS_LOOP_ENV_VAR, "1", 1);

  /* When compiling with ASAN, we don't have a particularly elegant way to skip
     ASAN-specific branches. But we can probabilistically compensate for
     that... */

  if (getenv("AFL_USE_ASAN") || getenv("AFL_USE_MSAN")) {
    sanitizer = 1;
    inst_ratio /= 3;
  }

  if (!just_version) add_instrumentation();

  if (!(pid = fork())) {

    execvp(as_params[0], (char**)as_params);
    FATAL("Oops, failed to execute '%s' - check your PATH", as_params[0]);

  }

  if (pid < 0) PFATAL("fork() failed");

  if (waitpid(pid, &status, 0) <= 0) PFATAL("waitpid() failed");

  if (!getenv("AFL_KEEP_ASSEMBLY")) unlink(modified_file);

  exit(WEXITSTATUS(status));

}

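edit_params

edit_params(argc, argv);

What it does:

Parses the command line and prepares the arguments passed to the real assembler
Detects the target architecture (32-bit/64-bit)
Handles the macOS special case (uses clang instead of as)
Determines the temporary file path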

static void edit_params(int argc, char** argv) {

  u8 *tmp_dir = getenv("TMPDIR"), *afl_as = getenv("AFL_AS");
  u32 i;

#ifdef __APPLE__
//...
#endif /* __APPLE__ */

  /* Although this is not documented, GCC also uses TEMP and TMP when TMPDIR
     is not set. We need to check these non-standard variables to properly
     handle the pass_thru logic later on. */

  if (!tmp_dir) tmp_dir = getenv("TEMP");
  if (!tmp_dir) tmp_dir = getenv("TMP");
  if (!tmp_dir) tmp_dir = "/tmp";

  as_params = ck_alloc((argc + 32) * sizeof(u8*));

  as_params[0] = afl_as ? afl_as : (u8*)"as";

  as_params[argc] = 0;

  for (i = 1; i < argc - 1; i++) {

    if (!strcmp(argv[i], "--64")) use_64bit = 1;
    else if (!strcmp(argv[i], "--32")) use_64bit = 0;

#ifdef __APPLE__
//...
#endif /* __APPLE__ */

    as_params[as_par_cnt++] = argv[i];

  }

#ifdef __APPLE__
//...
#endif /* __APPLE__ */

  input_file = argv[argc - 1];

  if (input_file[0] == '-') {

    if (!strcmp(input_file + 1, "-version")) {
      just_version = 1;
      modified_file = input_file;
      goto wrap_things_up;
    }

    if (input_file[1]) FATAL("Incorrect use (not called through afl-gcc?)");
      else input_file = NULL;

  } else {

    /* Check if this looks like a standard invocation as a part of an attempt
       to compile a program, rather than using gcc on an ad-hoc .s file in
       a format we may not understand. This works around an issue compiling
       NSS. */

    if (strncmp(input_file, tmp_dir, strlen(tmp_dir)) &&
        strncmp(input_file, "/var/tmp/", 9) &&
        strncmp(input_file, "/tmp/", 5)) pass_thru = 1;

  }

  modified_file = alloc_printf("%s/.afl-%u-%u.s", tmp_dir, getpid(),
                               (u32)time(NULL));

wrap_things_up:

  as_params[as_par_cnt++] = modified_file;
  as_params[as_par_cnt]   = NULL;

}

环境变量TEMP和TMP的使用均需要用户在执行afl-as前手动设置，如果均不存在则默认设置tmp_dir变量为/tmp目录

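The temporary file name is generated as <tmp_dir>/.afl-<pid>-<timestamp>.s, for example /tmp/.afl-<pid>-<timestamp>.s when tmp_dir is /tmp.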
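As a rough illustration of how that name is assembled, here is a sketch that uses snprintf in place of AFL's alloc_printf. The function make_modified_path is a hypothetical stand-in, not part of AFL.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for alloc_printf("%s/.afl-%u-%u.s", ...): formats the
   path into a freshly malloc'ed buffer that the caller must free. */
static char* make_modified_path(const char* tmp_dir, unsigned pid,
                                unsigned timestamp) {
  /* First pass: measure the formatted length without writing anything. */
  int len = snprintf(NULL, 0, "%s/.afl-%u-%u.s", tmp_dir, pid, timestamp);
  if (len < 0) return NULL;

  char* buf = malloc((size_t)len + 1);
  if (!buf) return NULL;

  /* Second pass: write the path, including the terminating NUL. */
  snprintf(buf, (size_t)len + 1, "%s/.afl-%u-%u.s", tmp_dir, pid, timestamp);
  return buf;
}
```

In afl-as the last two arguments are getpid() and (u32)time(NULL), so two invocations in the same second from different processes still get distinct names.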

add_instrumentation

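The core local state is:

static u8 line[MAX_LINE];           // buffer for each line of assembly read
FILE* inf;                          // input file stream
FILE* outf;                         // output file stream
s32 outfd;                          // output file descriptor
u32 ins_lines = 0;                  // number of instrumentation sites emitted

// Control flags
u8  instr_ok = 0,                   // inside an instrumentable region (.text)
    skip_csect = 0,                 // skip code section (width mismatch)
    skip_next_label = 0,            // skip the next label
    skip_intel = 0,                 // skip Intel-syntax block
    skip_app = 0,                   // skip inline assembly block
    instrument_next = 0;            // instrument before the next instruction

Instrumentation injection is done by the core function of the program: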

/* Process input file, generate modified_file. Insert instrumentation in all
   the appropriate places. */

static void add_instrumentation(void) {

  static u8 line[MAX_LINE];

  FILE* inf;
  FILE* outf;
  s32 outfd;
  u32 ins_lines = 0;

  u8  instr_ok = 0, skip_csect = 0, skip_next_label = 0,
      skip_intel = 0, skip_app = 0, instrument_next = 0;

  if (input_file) {

    inf = fopen(input_file, "r");
    if (!inf) PFATAL("Unable to read '%s'", input_file);

  } else inf = stdin;

  outfd = open(modified_file, O_WRONLY | O_EXCL | O_CREAT, 0600);

  if (outfd < 0) PFATAL("Unable to write to '%s'", modified_file);

  outf = fdopen(outfd, "w");

  if (!outf) PFATAL("fdopen() failed");  

  while (fgets(line, MAX_LINE, inf)) {

    /* In some cases, we want to defer writing the instrumentation trampoline
       until after all the labels, macros, comments, etc. If we're in this
       mode, and if the line starts with a tab followed by a character, dump
       the trampoline now. */

    if (!pass_thru && !skip_intel && !skip_app && !skip_csect && instr_ok &&
        instrument_next && line[0] == '\t' && isalpha(line[1])) {

      fprintf(outf, use_64bit ? trampoline_fmt_64 : trampoline_fmt_32,
              R(MAP_SIZE));

      instrument_next = 0;
      ins_lines++;

    }

    /* Output the actual line, call it a day in pass-thru mode. */

    fputs(line, outf);

    if (pass_thru) continue;

    /* All right, this is where the actual fun begins. For one, we only want to
       instrument the .text section. So, let's keep track of that in processed
       files - and let's set instr_ok accordingly. */

    if (line[0] == '\t' && line[1] == '.') {

      /* OpenBSD puts jump tables directly inline with the code, which is
         a bit annoying. They use a specific format of p2align directives
         around them, so we use that as a signal. */

      if (!clang_mode && instr_ok && !strncmp(line + 2, "p2align ", 8) &&
          isdigit(line[10]) && line[11] == '\n') skip_next_label = 1;

      if (!strncmp(line + 2, "text\n", 5) ||
          !strncmp(line + 2, "section\t.text", 13) ||
          !strncmp(line + 2, "section\t__TEXT,__text", 21) ||
          !strncmp(line + 2, "section __TEXT,__text", 21)) {
        instr_ok = 1;
        continue; 
      }

      if (!strncmp(line + 2, "section\t", 8) ||
          !strncmp(line + 2, "section ", 8) ||
          !strncmp(line + 2, "bss\n", 4) ||
          !strncmp(line + 2, "data\n", 5)) {
        instr_ok = 0;
        continue;
      }

    }

    /* Detect off-flavor assembly (rare, happens in gdb). When this is
       encountered, we set skip_csect until the opposite directive is
       seen, and we do not instrument. */

    if (strstr(line, ".code")) {

      if (strstr(line, ".code32")) skip_csect = use_64bit;
      if (strstr(line, ".code64")) skip_csect = !use_64bit;

    }

    /* Detect syntax changes, as could happen with hand-written assembly.
       Skip Intel blocks, resume instrumentation when back to AT&T. */

    if (strstr(line, ".intel_syntax")) skip_intel = 1;
    if (strstr(line, ".att_syntax")) skip_intel = 0;

    /* Detect and skip ad-hoc __asm__ blocks, likewise skipping them. */

    if (line[0] == '#' || line[1] == '#') {

      if (strstr(line, "#APP")) skip_app = 1;
      if (strstr(line, "#NO_APP")) skip_app = 0;

    }

    /* If we're in the right mood for instrumenting, check for function
       names or conditional labels. This is a bit messy, but in essence,
       we want to catch:

         ^main:      - function entry point (always instrumented)
         ^.L0:       - GCC branch label
         ^.LBB0_0:   - clang branch label (but only in clang mode)
         ^\tjnz foo  - conditional branches

       ...but not:

         ^# BB#0:    - clang comments
         ^ # BB#0:   - ditto
         ^.Ltmp0:    - clang non-branch labels
         ^.LC0       - GCC non-branch labels
         ^.LBB0_0:   - ditto (when in GCC mode)
         ^\tjmp foo  - non-conditional jumps

       Additionally, clang and GCC on MacOS X follow a different convention
       with no leading dots on labels, hence the weird maze of #ifdefs
       later on.

     */

    if (skip_intel || skip_app || skip_csect || !instr_ok ||
        line[0] == '#' || line[0] == ' ') continue;

    /* Conditional branch instruction (jnz, etc). We append the instrumentation
       right after the branch (to instrument the not-taken path) and at the
       branch destination label (handled later on). */

    if (line[0] == '\t') {

      if (line[1] == 'j' && line[2] != 'm' && R(100) < inst_ratio) {

        fprintf(outf, use_64bit ? trampoline_fmt_64 : trampoline_fmt_32,
                R(MAP_SIZE));

        ins_lines++;

      }

      continue;

    }

    /* Label of some sort. This may be a branch destination, but we need to
       tread carefully and account for several different formatting
       conventions. */

    /* Everybody else: .L<whatever>: */

    if (strstr(line, ":")) {

      if (line[0] == '.') {

        /* .L0: or LBB0_0: style jump destination */

        /* Apple: .L<num> / .LBB<num> */

        if ((isdigit(line[2]) || (clang_mode && !strncmp(line + 1, "LBB", 3)))
            && R(100) < inst_ratio) {


          /* An optimization is possible here by adding the code only if the
             label is mentioned in the code in contexts other than call / jmp.
             That said, this complicates the code by requiring two-pass
             processing (messy with stdin), and results in a speed gain
             typically under 10%, because compilers are generally pretty good
             about not generating spurious intra-function jumps.

             We use deferred output chiefly to avoid disrupting
             .Lfunc_begin0-style exception handling calculations (a problem on
             MacOS X). */

          if (!skip_next_label) instrument_next = 1; else skip_next_label = 0;

        }

      } else {

        /* Function label (always instrumented, deferred mode). */

        instrument_next = 1;
    
      }

    }

  }

  if (ins_lines)
    fputs(use_64bit ? main_payload_64 : main_payload_32, outf);

  if (input_file) fclose(inf);
  fclose(outf);

  if (!be_quiet) {

    if (!ins_lines) WARNF("No instrumentation targets found%s.",
                          pass_thru ? " (pass-thru mode)" : "");
    else OKF("Instrumented %u locations (%s-bit, %s mode, ratio %u%%).",
             ins_lines, use_64bit ? "64" : "32",
             getenv("AFL_HARDEN") ? "hardened" : 
             (sanitizer ? "ASAN/MSAN" : "non-hardened"),
             inst_ratio);
 
  }

}

if (input_file) {
    inf = fopen(input_file, "r");
    if (!inf) PFATAL("Unable to read '%s'", input_file);
} else inf = stdin;

outfd = open(modified_file, O_WRONLY | O_EXCL | O_CREAT, 0600);
if (outfd < 0) PFATAL("Unable to write to '%s'", modified_file);
outf = fdopen(outfd, "w");

This part opens the input and output files. Because of O_EXCL, open fails if modified_file already exists.

while (fgets(line, MAX_LINE, inf)) {

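This is the deferred-instrumentation mechanism. The first argument to fgets, line, is declared as static u8 line[MAX_LINE]; and the macro MAX_LINE is defined as 8192 in config.h. Expanded, the declaration is static uint8_t line[8192], so each fgets call reads one line of input_file (or stdin) into line, storing at most MAX_LINE - 1 characters plus the terminating NUL: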

if (!pass_thru && !skip_intel && !skip_app && !skip_csect && instr_ok &&
    instrument_next && line[0] == '\t' && isalpha(line[1])) {

    fprintf(outf, use_64bit ? trampoline_fmt_64 : trampoline_fmt_32,
            R(MAP_SIZE));

    instrument_next = 0;
    ins_lines++;
}

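Trigger conditions:

Not in pass-through mode (!pass_thru)
Not inside an Intel-syntax block (!skip_intel)
Not inside inline assembly (!skip_app)
Not inside a mismatched code section (!skip_csect)
Inside an instrumentable region (instr_ok)
Instrumentation is pending (instrument_next)
The current line is an instruction (line[0] == '\t' && isalpha(line[1]))

If all of these conditions hold, fprintf writes the trampoline to the output stream outf first, and only afterwards does fputs write the current line itself, so the trampoline lands immediately before the instruction.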
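The trigger conditions can be read as a single predicate. The sketch below bundles the flags into a hypothetical struct instr_state (afl-as keeps them as separate locals and globals) and exposes the test as deferred_trampoline_due, a name invented here for illustration.

```c
#include <ctype.h>

/* Hypothetical bundle of the flags; field names follow afl-as.c. */
struct instr_state {
  int pass_thru, skip_intel, skip_app, skip_csect, instr_ok, instrument_next;
};

/* Returns 1 when the pending trampoline should be written before `line`:
   no skip flag is set, instrumentation is allowed and pending, and the line
   looks like an instruction (a tab followed by a letter). */
static int deferred_trampoline_due(const struct instr_state* s,
                                   const char* line) {
  return !s->pass_thru && !s->skip_intel && !s->skip_app && !s->skip_csect &&
         s->instr_ok && s->instrument_next &&
         line[0] == '\t' && isalpha((unsigned char)line[1]);
}
```

Directive lines such as a tab followed by .cfi_startproc fail the isalpha test, which is why the trampoline slides past labels and directives and is emitted in front of the first real instruction.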

In pass-through mode the line is written out unchanged and the loop moves straight on to the next line:

fputs(line, outf);
if (pass_thru) continue;

if (line[0] == '\t' && line[1] == '.') {
    // Detect the .text section
    if (!strncmp(line + 2, "text\n", 5) ||
        !strncmp(line + 2, "section\t.text", 13) ||
        !strncmp(line + 2, "section\t__TEXT,__text", 21) ||
        !strncmp(line + 2, "section __TEXT,__text", 21)) {
        instr_ok = 1;
        continue; 
    }

    // Detect any other section
    if (!strncmp(line + 2, "section\t", 8) ||
        !strncmp(line + 2, "section ", 8) ||
        !strncmp(line + 2, "bss\n", 4) ||
        !strncmp(line + 2, "data\n", 5)) {
        instr_ok = 0;
        continue;
    }
}

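Section handling:

Section	Instrumentation	Notes
.text	enabled	code section, the main instrumentation target
__TEXT,__text	enabled	macOS code section
.bss	disabled	uninitialized data section
.data	disabled	initialized data section
other .section	disabled	other special sections

If the directive selects .text, instr_ok is set to 1, marking the section as instrumentable code. After the continue, the lines that follow belong to the .text (code) section and can be instrumented.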
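To make the prefix matching concrete, here is a small sketch that returns the effect a directive line has on instr_ok. The function classify_section and its return convention are assumptions made for this example; the string comparisons are the ones from the listing.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the line enables instrumentation, 0 if it
   disables it, and -1 if it leaves instr_ok unchanged. */
static int classify_section(const char* line) {
  if (line[0] != '\t' || line[1] != '.') return -1;

  const char* d = line + 2; /* directive name without the leading ".", */

  if (!strncmp(d, "text\n", 5) ||
      !strncmp(d, "section\t.text", 13) ||
      !strncmp(d, "section\t__TEXT,__text", 21) ||
      !strncmp(d, "section __TEXT,__text", 21)) return 1;

  if (!strncmp(d, "section\t", 8) ||
      !strncmp(d, "section ", 8) ||
      !strncmp(d, "bss\n", 4) ||
      !strncmp(d, "data\n", 5)) return 0;

  return -1;
}
```

Because the .text comparison is a 13-character prefix match, a directive such as .section followed by a tab and .text.startup also counts as an instrumentable section. The order of the two tests matters: the generic .section check would otherwise disable instrumentation for .text as well.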

Next comes the handling of special cases:

if (!clang_mode && instr_ok && !strncmp(line + 2, "p2align ", 8) &&
    isdigit(line[10]) && line[11] == '\n') skip_next_label = 1;

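OpenBSD specifics:

OpenBSD places jump tables directly inline with the code
A jump table is a compiler-generated data structure used to optimize switch statements
These tables live in the .text section but must not be instrumented as ordinary code

For example (the check requires a leading tab, a single digit and nothing else on the line, so the annotations are placed on their own lines):

	.text
# matched: tab, ".p2align ", one digit, newline
	.p2align 4
# the next .L<digit> label is not marked for instrumentation
.L4:
	.quad .L1
	.quad .L2
	.quad .L3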

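Without the special case, something like the following would happen:

# Incorrect instrumentation (simplified sketch, without the special case)
	.p2align 4
# AFL trampoline would be associated with this label
.L4:
	.quad .L1

This would disturb the alignment established by p2align; a jump table is data, not code, and should not be instrumented; and misplaced instrumentation would corrupt jump-table addressing. The sketch is simplified: since label instrumentation is deferred, the trampoline is really emitted before the next tab-plus-letter instruction line after the label, not directly after the .p2align directive.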
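The exact shape of the directive that afl-as looks for is easy to miss, so here is the character test on its own. The helper is_p2align_marker is hypothetical, and it leaves out the !clang_mode and instr_ok parts of the real condition.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper mirroring the character test in add_instrumentation:
   a tab, ".p2align ", exactly one digit, then a newline. */
static int is_p2align_marker(const char* line) {
  return line[0] == '\t' && line[1] == '.' &&
         !strncmp(line + 2, "p2align ", 8) &&
         isdigit((unsigned char)line[10]) && line[11] == '\n';
}
```

Only the bare single-digit form matches. A directive with extra operands, such as .p2align 4,,15, or one followed by a trailing comment does not set skip_next_label.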

if (strstr(line, ".code")) {
    if (strstr(line, ".code32")) skip_csect = use_64bit;
    if (strstr(line, ".code64")) skip_csect = !use_64bit;
}

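Mixed-width code:

A single assembly file can contain both 32-bit and 64-bit code, switched with the x86 .code32 and .code64 directives
This is rare; the comment in the AFL source notes that it happens in gdb
Code that does not match the current target width (use_64bit) is skipped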
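The flag update can be isolated as a pure function to see how the two directives interact with use_64bit. The name update_skip_csect is made up for this sketch; the body follows the snippet above.

```c
#include <string.h>

/* Hypothetical helper: returns the new skip_csect value after `line` has been
   seen, given the current value and whether the target is 64-bit. */
static int update_skip_csect(int skip_csect, int use_64bit, const char* line) {
  if (strstr(line, ".code")) {
    /* .code32 is a mismatch only when assembling for 64-bit. */
    if (strstr(line, ".code32")) skip_csect = use_64bit;
    /* .code64 is a mismatch only when assembling for 32-bit. */
    if (strstr(line, ".code64")) skip_csect = !use_64bit;
  }
  return skip_csect;
}
```

With use_64bit set, a .code32 line turns skipping on and the next .code64 line turns it off again; lines without either directive leave the flag as it was.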

if (strstr(line, ".intel_syntax")) skip_intel = 1;
if (strstr(line, ".att_syntax")) skip_intel = 0;

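GCC emits AT&T syntax by default, and AFL's instrumentation code is also written in AT&T syntax, so Intel-syntax blocks are skipped rather than instrumented.

Maybe I could write a patch to support this part.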

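Example:

.text
    movl %eax, %ebx     # AT&T syntax, eligible for instrumentation

.intel_syntax noprefix  # ← Intel syntax detected
    mov ebx, eax        # Intel syntax, not instrumented
    jnz label1          # not instrumented

.att_syntax prefix      # ← AT&T syntax detected
    movl %eax, %ebx     # instrumentation resumes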
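To trace how skip_intel evolves over a listing like the one above, the two strstr checks can be run over an array of lines. The function mark_intel_blocks and its output array are assumptions for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: for each of the n lines, stores in skipped[i] the
   value of skip_intel after that line has been processed. */
static void mark_intel_blocks(const char* const* lines, size_t n,
                              int* skipped) {
  int skip_intel = 0;
  for (size_t i = 0; i < n; i++) {
    if (strstr(lines[i], ".intel_syntax")) skip_intel = 1;
    if (strstr(lines[i], ".att_syntax")) skip_intel = 0;
    skipped[i] = skip_intel;
  }
}
```

Since strstr matches anywhere in the line, the directives are recognized regardless of indentation or trailing operands such as noprefix.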

if (line[0] == '#' || line[1] == '#') {
    if (strstr(line, "#APP")) skip_app = 1;
    if (strstr(line, "#NO_APP")) skip_app = 0;
}

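Inline assembly typically looks like this in the compiler output:

main:
    # assembly generated from ordinary C code
    pushq %rbp
    
#APP                    # ← start marker of inline assembly
    movl $42, %eax      # hand-written assembly from the user
#NO_APP                 # ← end marker of inline assembly
    
    # ordinary assembly continues
    popq %rbp
    ret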

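Inline assembly is usually hand-written and does not follow the patterns of compiler-generated code, so it is skipped. The same '#' test is also what recognizes comment lines.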
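A minimal sketch of the #APP / #NO_APP toggle follows. The helper update_skip_app is hypothetical, and unlike the original it guards the line[1] access so that it is safe on an empty string.

```c
#include <string.h>

/* Hypothetical helper: returns the new skip_app value after `line`. Only
   lines with '#' in the first or second column are inspected. */
static int update_skip_app(int skip_app, const char* line) {
  if (line[0] == '#' || (line[0] && line[1] == '#')) {
    if (strstr(line, "#APP")) skip_app = 1;
    if (strstr(line, "#NO_APP")) skip_app = 0;
  }
  return skip_app;
}
```

The two markers do not interfere with each other: "#NO_APP" does not contain the substring "#APP", because its '#' is followed by 'N' rather than 'A'.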

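The whole function can be summarized as a flowchart:

Start
  ↓
Open input/output files
  ↓
Read the assembly line by line
  ↓
Deferred trampoline due? → yes → emit trampoline
  ↓                                ↓
Write the current line             update counter
  ↓                                ↓
Pass-through mode? → yes → next line
  ↓
Detect section type → .text → enable instrumentation
  ↓                   other → disable instrumentation
Detect special cases
  ↓
Skip condition met? → yes → next line
  ↓
Instruction line? → yes → conditional branch? → yes → instrument with probability inst_ratio
  ↓                        ↓
Label line? → yes → branch label? → yes → mark deferred instrumentation
  ↓                 function label? → yes → mark deferred instrumentation
Next line
  ↓
End of file? → no → back to reading lines
  ↓
Append the main payload (if anything was instrumented)
  ↓
Close files, print statistics
  ↓
End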
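The last two decision rows of the flowchart can be sketched as a line classifier. This is an approximation under stated assumptions: it covers only the non-Apple branch shown in the listing, it presumes the line has already passed the section and skip checks, and it leaves out the random inst_ratio draw. The enum and the function classify_line are names invented for this example.

```c
#include <ctype.h>
#include <string.h>

enum line_kind {
  LINE_OTHER,        /* nothing to instrument */
  LINE_COND_BRANCH,  /* trampoline right after the branch */
  LINE_BRANCH_LABEL, /* deferred trampoline, unless skip_next_label is set */
  LINE_FUNC_LABEL    /* deferred trampoline, always */
};

/* Hypothetical classifier; clang_mode mirrors the afl-as global. */
static enum line_kind classify_line(const char* line, int clang_mode) {
  if (line[0] == '#' || line[0] == ' ') return LINE_OTHER;

  /* Instruction or directive: any j* except jmp is a conditional branch. */
  if (line[0] == '\t')
    return (line[1] == 'j' && line[2] != 'm') ? LINE_COND_BRANCH : LINE_OTHER;

  if (!strchr(line, ':')) return LINE_OTHER;

  if (line[0] == '.') {
    /* .L<digit>: for GCC, .LBB<...>: only in clang mode. */
    if (isdigit((unsigned char)line[2]) ||
        (clang_mode && !strncmp(line + 1, "LBB", 3)))
      return LINE_BRANCH_LABEL;
    return LINE_OTHER;
  }

  return LINE_FUNC_LABEL;
}
```

Under these assumptions .LC0: and .Ltmp0: fall into LINE_OTHER, matching the "but not" list in the source comment, while .LBB0_1: is a branch label only when clang_mode is set.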

Control then returns to main:

if (!(pid = fork())) {

  execvp(as_params[0], (char**)as_params);
  FATAL("Oops, failed to execute '%s' - check your PATH", as_params[0]);

}

if (pid < 0) PFATAL("fork() failed");

if (waitpid(pid, &status, 0) <= 0) PFATAL("waitpid() failed");

if (!getenv("AFL_KEEP_ASSEMBLY")) unlink(modified_file);

exit(WEXITSTATUS(status));

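main forks a child process that runs a command such as as --64 -o /home/cyberangel/Desktop/test/exec_obj.o /tmp/.afl-27115-1673249415.s. The parent waits for it, deletes the temporary file unless AFL_KEEP_ASSEMBLY is set, and exits with the assembler's exit status.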
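The same fork / execvp / waitpid pattern can be tried outside AFL. The helper run_and_wait below is a hypothetical wrapper written for this note; it reports failures through its return value instead of AFL's FATAL and PFATAL macros.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] (looked up in PATH) with the given
   NULL-terminated argument vector in a child process. Returns the child's
   exit status, or -1 if fork/waitpid failed or the child did not exit
   normally. */
static int run_and_wait(char* const argv[]) {
  int status;
  pid_t pid = fork();

  if (pid < 0) return -1;

  if (pid == 0) {
    execvp(argv[0], argv);
    _exit(127); /* reached only if execvp itself failed */
  }

  if (waitpid(pid, &status, 0) <= 0) return -1;

  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with the vector {"sh", "-c", "exit 3", NULL}, this returns 3, which is the same way afl-as forwards the real assembler's status through exit(WEXITSTATUS(status)).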

References

Some of the examples and the references consulted while reading come from https://www.yuque.com/cyberangel