    Align compaction output file boundaries to the next level ones (#10655) · f3cc6663
    Jay Zhuang authored
    Summary:
    Try to align the compaction output file boundaries to the next level ones
    (grandparent level), to reduce the level compaction write-amplification.
    
    In level compaction, there is "wasted" data at the beginning and end of the
    output level files. Aligning the file boundaries avoids such "wasted" compaction.
    With this PR, it tries to align the non-bottommost level file boundaries to the
    next level ones. It may cut a file when the file size is large enough (at least
    50% of target_file_size) and not too large (2x target_file_size).
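
The cut condition can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: the helper name `should_cut_output_file` and its parameters are hypothetical, and treating 2x target_file_size as a hard cap (with files allowed to grow past 1x while waiting for a boundary) is an assumption about how the size limits are meant.

```python
def should_cut_output_file(current_size: int,
                           target_file_size: int,
                           at_grandparent_boundary: bool) -> bool:
    """Hypothetical helper: decide whether to close the current output file."""
    # Assumption: a file is never allowed to grow past 2x the target size,
    # whether or not a next-level boundary has been reached.
    if current_size >= 2 * target_file_size:
        return True
    # Cut on a next-level (grandparent) file boundary only once the file is
    # at least 50% of the target size, so alignment does not create tiny files.
    return at_grandparent_boundary and 2 * current_size >= target_file_size
```

With a 32 MB target, a 16 MB file would be cut at a grandparent boundary under this sketch, while an 8 MB file would keep growing.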
    
    db_bench shows about 12.56% compaction reduction:
    ```
    TEST_TMPDIR=/data/dbbench2 ./db_bench --benchmarks=fillrandom,readrandom -max_background_jobs=12 -num=400000000 -target_file_size_base=33554432
    
    # baseline:
    Flush(GB): cumulative 25.882, interval 7.216
    Cumulative compaction: 285.90 GB write, 162.36 MB/s write, 269.68 GB read, 153.15 MB/s read, 2926.7 seconds
    
    # with this change:
    Flush(GB): cumulative 25...