mace 记录 -- 思维误区

在 BwTree 中，随着delta chain的长度不断增加，对读性能的影响也会越来越来严重，因此需要进行 consolidate 操作，简单来讲：就是将 delta chain 合并到一个 page 中

要完成 consolidate 操作，首先需要计算delta chain 的长度，在mace中，通过在 page header中的8位来记录，在首次插入 delta 时设置为 0（这个在mace首次运行时走另外的逻辑完成），后续插入的 delta 的长度就是前一个的 + 1

当检查发现delta chain的长度超过配置的值时，就需要分配一个新的 page，将 delta chain 合并进去，去掉已经标记删除的delta记录，随后将这个 page 通过 CAS 更新到 page table 中

当多个线程同时进行 consolidate 时，会出现如下的情况


page_table[0] --> [del 5] -> [ins 3] -> [ins 4] -> [ins 5]
                    ^            ^                     ^
                    |            |                     |
                    |             +---- thread 1-------+
                    |                                  |
                    +------ thread 2 ------------------+

thread 1 和 thread 2 同时插入数据，但 thread 1 在插入的 CAS 中竞争胜出，因此它先发现可以进行consolidate 而 thread 2 随后也发现可以进行consolidate，结果也有两个

thread 1 先于 thread 2 完成

timestamp
 |                page_table[0] --> [ins 3 4 5]   thread 1 CAS
 |                page_table[0] --> [ins 3 4]      thread 2 CAS
 ⬇️

thread 2 先于 thread 2 完成

timestamp
 |                page_table[0] --> [ins 3 4]      thread 2 CAS
 |                page_table[0] --> [ins 3 4 5]   thread 1 CAS
 ⬇️

很明显，后者是错误的，因为已经删除的记录del 5会被扫描到，那么问题来了：怎么解决呢？

以上的分析陷入了一个误区：CAS 谁先竞争胜出，谁就先consolidate

这在插入 delta 是是成立的，因此我们拿着前一个delta的 addr，如果当前原子变量的值不是这个 addr 说明有其他线程先于我们插入了，我们只需要更新 addr 为最新即可(CAS会返回这个addr）

而对于consolidate来说不存在重试，多个线程同时进行consolidate只有一个能成功，其他的全都会失败，拿上面分类的情况举例

timestamp
  |                    thread 1 [ins 3] -> addr 114
  |                    thread 2 [del 5] -> addr 514
  |
  |   page_table[0] -> 514
  |
  |                   thread 1 [ins 3 4 5] -> addr 233 -> CAS => ❌  514 != 114
  |                   thread 2 [ins 3 4]    -> addr 666 -> CAS => ✔   514 = 514
  |   
  |   page_table[0] -> 666
  |
  ⬇️

如果还有其他线程在 thread 2 执行 consolidate 的CAS前进行了插入，那么 thread 2 也会失败，但最终一定有一个线程成功完成 consolidate，这个结论的一个副作用是：delta chain 的长度可能会很长，8位也可能不够。因此在设置长度时需要确保不回绕，在Rust中可以使用 saturating_add