LLVM Backend Optimization Passes: Alias Analysis

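AliasAnalysis.html gives a basic introduction to alias analysis:

Alias Analysis (also known as Pointer Analysis) determines whether two pointers can point to the same object in memory. There are many alias analysis algorithms, and they can be classified along several axes: flow-sensitive vs. flow-insensitive, context-sensitive vs. context-insensitive, field-sensitive vs. field-insensitive, and unification-based vs. subset-based.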

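A traditional alias analysis answers each query with Must, May, or No:

Must means the two pointers always point to the same object;
May means they might point to the same object;
No means they never point to the same object.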
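To make the three answers concrete, here is a minimal sketch of a toy alias query in C. It is my own illustration, not LLVM code: the names `AliasAnswer`, `Access`, and `query_alias` are hypothetical, and it assumes each access is described by a base object id (negative when unknown), a byte offset, and a size.

```c
#include <assert.h>

/* Hypothetical names for this sketch; this is not LLVM's API. */
typedef enum { NO_ALIAS, MAY_ALIAS, MUST_ALIAS } AliasAnswer;

/* One memory access: which object it is based on (negative if unknown),
   the byte offset from that base, and the access size in bytes. */
typedef struct {
    int base_id;
    long offset;
    long size;
} Access;

AliasAnswer query_alias(Access a, Access b) {
    assert(a.size > 0 && b.size > 0);

    /* Unknown base: nothing can be proven, so answer conservatively. */
    if (a.base_id < 0 || b.base_id < 0)
        return MAY_ALIAS;

    /* Two distinct known objects never overlap. */
    if (a.base_id != b.base_id)
        return NO_ALIAS;

    /* Same object, identical byte range. */
    if (a.offset == b.offset && a.size == b.size)
        return MUST_ALIAS;

    /* Same object, disjoint byte ranges. */
    if (a.offset + a.size <= b.offset || b.offset + b.size <= a.offset)
        return NO_ALIAS;

    /* Overlapping but not identical: fall back to the conservative answer
       in this three-valued scheme. */
    return MAY_ALIAS;
}
```

The point of the sketch is that May is the fallback: whenever the analysis cannot prove Must or No, it has to answer May, and the optimizer has to assume the worst.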

The Zhihu article "编译器优化：何为别名分析" (zhihu.com) describes several types of alias analysis algorithms.

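The LLVM AliasAnalysis class is the base class for alias analysis implementations. It provides simple alias information and also Mod/Ref information, which supports more sophisticated analyses. The two alias analysis optimizations enabled in SkyEye are both derived classes of AliasAnalysis. (Mod: the access may modify the memory; Ref: the access may reference, i.e. read, the memory.)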
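As I understand it, Mod and Ref can be treated as two independent bits, so the effect of several operations on one location is just the bitwise OR of their individual effects. The sketch below illustrates that idea only; the names (`ModRef`, `Op`, `block_mod_ref`) are hypothetical, and comparing integer location ids is an assumption that stands in for a real alias query.

```c
#include <assert.h>

/* Two-bit encoding: bit 0 = Ref (reads), bit 1 = Mod (writes).
   Names are hypothetical and chosen for this sketch. */
typedef enum { MR_NONE = 0, MR_REF = 1, MR_MOD = 2, MR_MODREF = 3 } ModRef;

typedef enum { OP_LOAD, OP_STORE, OP_OTHER } OpKind;

/* A simplified instruction: its kind and the location it touches. */
typedef struct {
    OpKind kind;
    int location_id;
} Op;

/* Effect of a single operation on one location. Comparing ids is an
   assumption of this sketch; a real analysis would ask an alias query. */
ModRef op_mod_ref(Op op, int location_id) {
    if (op.location_id != location_id)
        return MR_NONE;
    if (op.kind == OP_LOAD)
        return MR_REF;
    if (op.kind == OP_STORE)
        return MR_MOD;
    return MR_NONE;
}

/* Effect of a sequence of operations: OR the per-operation results. */
ModRef block_mod_ref(const Op *ops, int count, int location_id) {
    assert(count >= 0);
    int result = MR_NONE;
    for (int i = 0; i < count; ++i)
        result |= op_mod_ref(ops[i], location_id);
    return (ModRef)result;
}
```

A client such as dead store elimination or loop-invariant code motion would then only need to check whether the Mod bit is set for the location it cares about.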

The simplest application of alias analysis:

Take the following C code as an example:

int foo (int __attribute__((address_space(0))) *a,
         int __attribute__((address_space(1))) *b) {
    *a = 42;
    *b = 20;
    return *a;
}

Translated to LLVM IR, it looks like this:

define i32 @foo(i32 addrspace(0)* %a, i32 addrspace(1)* %b) #0 {
entry:
    store i32 42, i32 addrspace(0)* %a, align 4
    store i32 20, i32 addrspace(1)* %b, align 4
    %0 = load i32, i32* %a, align 4
    ret i32 %0
}

Now we want to optimize foo by removing the unnecessary load:

define i32 @foo(i32 addrspace(0)* %a, i32 addrspace(1)* %b) #0 {
entry:
    store i32 42, i32 addrspace(0)* %a, align 4
    store i32 20, i32 addrspace(1)* %b, align 4
    ret i32 42
}

This optimization is only valid if a and b do not alias. Otherwise it produces a wrong result, as in this call:

    int i = 0;
    int result = foo(&i, &i);

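This call makes a and b alias. The function should return 20, but the optimized version returns 42, which is wrong. So the compiler may perform this optimization only when it is certain that the two pointers do not alias.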
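The miscompile can be reproduced in plain C by writing both versions by hand. This is only a sketch: `foo_reference` and `foo_folded` are names I made up, and I dropped the clang-specific address_space attributes so that both parameters are ordinary `int *` and are allowed to alias.

```c
/* Same body as foo, without the address_space attributes. */
int foo_reference(int *a, int *b) {
    *a = 42;
    *b = 20;
    return *a;   /* reloads *a, so it sees the store through b when a == b */
}

/* Hand-written equivalent of the "optimized" IR: the load has been
   replaced by the constant 42, which assumes a and b never alias. */
int foo_folded(int *a, int *b) {
    *a = 42;
    *b = 20;
    return 42;
}
```

With two distinct variables both functions return 42. With `foo_reference(&i, &i)` the result is 20, while `foo_folded(&i, &i)` still returns 42, which is exactly the discrepancy described above.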

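As I understand it, alias analysis mainly serves as a prerequisite pass for other optimizations. Its job is to tell those optimizations which memory accesses can safely be optimized and which cannot, so that redundant IR is eliminated without breaking correctness.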

So I searched the LLVM 3.0 source code:

In the backend compilation pipeline, some stages do require alias analysis as a prerequisite pass.

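I then picked the TypeBasedAliasAnalysis class for a closer look at the source:

TypeBasedAliasAnalysis inherits from ImmutablePass and AliasAnalysis. It implements type-based alias analysis driven by metadata, deciding whether two types may alias by examining the relationship between their nodes in the metadata type tree.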

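Because it inherits from ImmutablePass, TypeBasedAliasAnalysis is a pass whose state does not change once it has been created. Pass objects like this are typically used to share information across the whole compilation. In our case we create one such pass for each JIT function, which explains why it is set up at the very beginning.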

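The other keyword is metadata. LLVM uses metadata to carry additional information that does not directly affect program execution but is very useful for optimization and analysis. Here, metadata describes the type information of pointers and memory objects so that alias analysis can be more precise. The excerpt below describes the metadata format (see also the CSDN blog post "LLVM Metadata 介绍"):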

// This file defines the TypeBasedAliasAnalysis pass, which implements
// metadata-based TBAA.
//
// In LLVM IR, memory does not have types, so LLVM's own type system is not
// suitable for doing TBAA. Instead, metadata is added to the IR to describe
// a type system of a higher level language. This can be used to implement
// typical C/C++ TBAA, but it can also be used to implement custom alias
// analysis behavior for other languages.
//
// The current metadata format is very simple. TBAA MDNodes have up to
// three fields, e.g.:
//   !0 = metadata !{ metadata !"an example type tree" }
//   !1 = metadata !{ metadata !"int", metadata !0 }
//   !2 = metadata !{ metadata !"float", metadata !0 }
//   !3 = metadata !{ metadata !"const float", metadata !2, i64 1 }
//
// The first field is an identity field. It can be any value, usually
// an MDString, which uniquely identifies the type. The most important
// name in the tree is the name of the root node. Two trees with
// different root node names are entirely disjoint, even if they
// have leaves with common names.
//
// The second field identifies the type's parent node in the tree, or
// is null or omitted for a root node. A type is considered to alias
// all of its descendants and all of its ancestors in the tree. Also,
// a type is considered to alias all types in other trees, so that
// bitcode produced from multiple front-ends is handled conservatively.
//
// If the third field is present, it's an integer which if equal to 1
// indicates that the type is "constant" (meaning pointsToConstantMemory
// should return true; see
// http://llvm.org/docs/AliasAnalysis.html#OtherItfs).
//
// TODO: The current metadata format doesn't support struct
// fields. For example:
//   struct X {
//     double d;
//     int i;
//   };
//   void foo(struct X *x, struct X *y, double *p) {
//     *x = *y;
//     *p = 0.0;
//   }
// Struct X has a double member, so the store to *x can alias the store to *p.
// Currently it's not possible to precisely describe all the things struct X
// aliases, so struct assignments must use conservative TBAA nodes. There's
// no scheme for attaching metadata to @llvm.memcpy yet either.
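The aliasing rule in that comment (a type aliases its ancestors and descendants, and everything in other trees) can be sketched as a small tree walk. This is my own illustration and not the code in TypeBasedAliasAnalysis.cpp: `TypeNode`, `tbaa_may_alias`, and the helper functions are hypothetical names, and the sketch assumes that trees are distinguished by comparing root node pointers rather than root names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory form of a TBAA node with its three fields. */
typedef struct TypeNode {
    const char *name;               /* field 1: identity */
    const struct TypeNode *parent;  /* field 2: parent, NULL for a root */
    bool is_constant;               /* field 3: "constant" flag */
} TypeNode;

/* Walk up the parent links until the root is reached. */
static const TypeNode *root_of(const TypeNode *n) {
    while (n->parent != NULL)
        n = n->parent;
    return n;
}

/* True if anc is n itself or appears on n's path to the root. */
static bool is_ancestor_or_self(const TypeNode *anc, const TypeNode *n) {
    for (; n != NULL; n = n->parent)
        if (n == anc)
            return true;
    return false;
}

bool tbaa_may_alias(const TypeNode *a, const TypeNode *b) {
    assert(a != NULL && b != NULL);
    /* Nodes from different trees are handled conservatively. */
    if (root_of(a) != root_of(b))
        return true;
    /* Within one tree, only ancestor/descendant pairs may alias. */
    return is_ancestor_or_self(a, b) || is_ancestor_or_self(b, a);
}

/* The example tree from the comment: !0, !1, !2, !3. */
static const TypeNode tree_root        = {"an example type tree", NULL, false};
static const TypeNode type_int         = {"int", &tree_root, false};
static const TypeNode type_float       = {"float", &tree_root, false};
static const TypeNode type_const_float = {"const float", &type_float, true};
```

In this tree, "int" and "float" are siblings, so the sketch reports that they do not alias, while "float" and "const float" are parent and child, so they may alias. The `is_constant` flag is only stored here; it corresponds to the third metadata field that the comment ties to pointsToConstantMemory.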

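I also looked into how much memory alias analysis consumes and whether invoking these passes is expensive in time, but I could not find relevant material. I will come back to this next time.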

Main reference: llvm-3.0.src/docs/AliasAnalysis.html

Source code studied: lib\Analysis\TypeBasedAliasAnalysis.cpp