Persistent block store with linear history
The cemented block store is a store where blocks are stored linearly (by level) in chunks. Blocks in this store should no longer be reorganized and are thus *cemented*. As these blocks, and especially their optionally stored metadata, should not be accessed regularly, the metadata is compressed using a zip format to save disk space. A dedicated file is used for each chunk of blocks. Moreover, to enable easy access and to limit on-disk reads, two indexed maps are used to retrieve a block's hash from its level and its level from its hash.
The cemented block store contains a set of files that is updated each time a new chunk is added to the store. Each of these files indicates which interval of blocks (w.r.t. their levels) it stores.
Invariants
This store is expected to respect the following invariants:
A key/value present in an index is present as well in the other as value/key.
Every block stored is correctly referenced through its associated indexes.
A cemented chunk of blocks that is represented by the interval i ; j (with i <= j) contains j - i + 1 blocks, ordered from i to j in the file.
The set F of cemented chunks is always ordered by block level.
The cemented store does not contain holes: let F be the cemented chunks, if |F| > 1 then:
∀f_x=(i,j) ∈ F ∧ x < |F|, ∃f_y=(i',j') ∈ F, x = y - 1 ∧ j + 1 = i'
meaning the concatenation of every chunk must be continuous.
A metadata zip file is indexed by the same interval as its chunk. When it is the lowest chunk of metadata stored, it is not guaranteed to contain the metadata of every block of the chunk.
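The interval invariants above can be sketched as a small check over a list of chunk intervals. This is an illustrative model only: the chunk record and the function names are hypothetical and not part of the store's API, and levels are modelled as plain int values.

```ocaml
(* Hypothetical model: a chunk is the inclusive level interval i ; j. *)
type chunk = {first : int; last : int}

(* Number of blocks held by a chunk i ; j, that is j - i + 1. *)
let chunk_length c = c.last - c.first + 1

(* A chunk interval is well formed when i <= j. *)
let well_formed c = c.first <= c.last

(* No holes: each chunk must start right after the previous one ends. *)
let rec contiguous = function
  | [] | [_] -> true
  | c :: (next :: _ as rest) -> c.last + 1 = next.first && contiguous rest

(* Well-formed and contiguous chunks are necessarily ordered by level. *)
let valid_layout chunks = List.for_all well_formed chunks && contiguous chunks
```

For instance, the intervals 0 ; 9 and 10 ; 19 form a valid layout, whereas 0 ; 9 followed by 11 ; 19 leaves a hole at level 10.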
Files format
The cemented block store is composed of the following files:
file: /<i_j>, a chunk of blocks from level i to level j. The format of this file is:
| <n> × <offset> | <n> × <block> |
where n is (j - i + 1), <offset> is a 4-byte integer such that the k-th offset (with 0 <= k < n) is the absolute offset of the k-th block in the file, and <block> is a Block_repr.t value encoded using Block_repr.encoding (thus prefixed by its size).
dir: /cemented_block_index_level, the Hash -> Level key/value index.
dir: /cemented_block_index_hash, the Level -> Hash key/value index.
dir: /metadata, the directory containing chunks of compressed metadata (present if relevant).
files: /metadata/<i_j>.zip, the compressed metadata of a chunk, where each block's metadata entry is indexed by the block's level encoded as a string (present if relevant).
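The chunk file layout described above (an offset table followed by the blocks) can be illustrated with a minimal sketch. Several details are assumptions made for the example and are not confirmed by this page: integers are written big-endian, the size prefix of a block is 4 bytes, and plain strings stand in for values encoded with Block_repr.encoding. The names encode_chunk and read_block are hypothetical.

```ocaml
(* [payloads] stand in for already-encoded blocks. Big-endian integers and
   a 4-byte size prefix are assumptions of this sketch. *)
let encode_chunk (payloads : string list) : Bytes.t =
  let n = List.length payloads in
  let table_size = 4 * n in
  let total =
    List.fold_left (fun acc p -> acc + 4 + String.length p) table_size payloads
  in
  let buf = Bytes.create total in
  let _ =
    List.fold_left
      (fun (k, pos) p ->
        (* k-th table entry: absolute offset of the k-th block. *)
        Bytes.set_int32_be buf (4 * k) (Int32.of_int pos) ;
        (* Block: size prefix followed by the payload. *)
        Bytes.set_int32_be buf pos (Int32.of_int (String.length p)) ;
        Bytes.blit_string p 0 buf (pos + 4) (String.length p) ;
        (k + 1, pos + 4 + String.length p))
      (0, table_size)
      payloads
  in
  buf

(* Read the k-th block with one table lookup and one jump. *)
let read_block (buf : Bytes.t) (k : int) : string =
  let pos = Int32.to_int (Bytes.get_int32_be buf (4 * k)) in
  let len = Int32.to_int (Bytes.get_int32_be buf pos) in
  Bytes.sub_string buf (pos + 4) len
```

Because the offsets are absolute, reading the block at level l from the chunk i ; j only requires the table entry at index l - i, without scanning the preceding blocks.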
init ?log_size ~cemented_blocks_dir ~readonly creates a cemented block store at path cemented_blocks_dir, or loads it if it already exists. cemented_blocks_dir will be created if it does not exist. If readonly is true, cementing blocks will result in an error. log_size determines the index cache size.
find_block_file cemented_store block_level looks up the cemented_store to find the cemented block chunk file that contains the block at level block_level. Returns None if the block cannot be found.
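Since chunks are ordered by level and contain no holes, such a lookup can be done with a binary search over the chunk intervals. The following is a sketch under that assumption, not the store's actual implementation; the chunk_file record and the find_chunk function are hypothetical names.

```ocaml
(* Hypothetical description of a chunk file covering an inclusive interval. *)
type chunk_file = {start_level : int; end_level : int; file : string}

(* Binary search over chunk files sorted by level. Returns the file whose
   interval contains [level], or None if no chunk covers it. *)
let find_chunk (files : chunk_file array) (level : int) : chunk_file option =
  let rec search lo hi =
    if lo > hi then None
    else
      let mid = lo + ((hi - lo) / 2) in
      let f = files.(mid) in
      if level < f.start_level then search lo (mid - 1)
      else if level > f.end_level then search (mid + 1) hi
      else Some f
  in
  search 0 (Array.length files - 1)
```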
cement_blocks_metadata cemented_store chunk compresses and stores the metadata of blocks present in chunk. If no block of the given chunk contains metadata, nothing is done. Otherwise, for every block containing metadata, an entry is written in the dedicated .zip metadata file.
We assume that the blocks containing metadata are contiguous: if at least one block has metadata, then every block from the first block with metadata to the last block of chunk must have metadata. However, the validity of this assumption is not checked.
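A caller that wants to verify this assumption before cementing could use a check along the following lines. This is a hypothetical helper, not something the store provides; the meta_block record is a simplified stand-in for a block with optional metadata.

```ocaml
(* Simplified stand-in for a block with optional metadata. *)
type meta_block = {level : int; metadata : string option}

(* Holds when the blocks carrying metadata form a suffix of the chunk:
   after the first block with metadata, every block has metadata. *)
let metadata_suffix_ok (chunk : meta_block list) : bool =
  (* Skip the leading blocks that have no metadata. *)
  let rec drop = function
    | {metadata = None; _} :: rest -> drop rest
    | l -> l
  in
  List.for_all (fun b -> b.metadata <> None) (drop chunk)
```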
get_cemented_block_by_level cemented_store ~read_metadata level reads the cemented block at level in cemented_store, if it exists. It also retrieves the metadata depending on read_metadata.
get_cemented_block_by_hash ~read_metadata cemented_store hash reads the cemented block with the given hash in cemented_store, if it exists. It also retrieves the metadata depending on read_metadata.
cemented_blocks ?check_consistency cemented_store ~write_metadata chunk stores the chunk of blocks and writes their metadata if the flag write_metadata is set. check_consistency (default is true) ensures that the blocks in the given chunk are contiguous.
trigger_gc cemented_store history_mode garbage collects metadata chunks and/or chunks from the cemented_store depending on the History_mode.t:
in Archive mode, nothing is done;
in Full offset mode, only offset chunks of metadata are kept;
in Rolling offset mode, only offset chunks of metadata and chunks are kept.
Important: when purging chunks of blocks, it is necessary to rewrite the index to remove garbage collected blocks. Therefore, the higher the offset is, the longer the GC phase will last.
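The retention rule per history mode can be sketched as a pure function over the list of chunk intervals. The history_mode type below is a simplified, hypothetical stand-in for History_mode.t, and the sketch assumes that "offset chunks" means the offset most recent chunks; neither point is confirmed by this page.

```ocaml
(* Simplified stand-in for History_mode.t: the offset is a chunk count. *)
type history_mode = Archive | Full of int | Rolling of int

(* Keep the last [n] elements of a list (all of them if n >= length). *)
let keep_last n l =
  let len = List.length l in
  List.filteri (fun idx _ -> idx >= len - n) l

(* Returns (block chunks kept, metadata chunks kept) for chunks sorted by
   increasing level. Assumes every chunk currently has metadata. *)
let plan_gc mode chunks =
  match mode with
  | Archive -> (chunks, chunks)
  | Full offset -> (chunks, keep_last offset chunks)
  | Rolling offset ->
      let kept = keep_last offset chunks in
      (kept, kept)
```

Only the Rolling case drops chunks of blocks, which is the case where the indexes have to be rewritten.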
check_indexes_consistency ?post_step ?genesis_hash cemented_store history_mode iterates over a partially initialized cemented_store that contains both chunks of blocks and indexes, then checks the consistency of each block (hashes, predecessors and levels). The hash is not checked for genesis_hash, and post_step is called after each treated chunk. This is used for snapshot imports.
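The per-block checks (hashes, predecessors and levels) can be illustrated on a simplified model. Everything in this sketch is hypothetical: the header record, the check_chunk function and the hash_of parameter are illustration names, and the real check operates on Block_repr.t values and on the store's indexes.

```ocaml
(* Simplified stand-in for a stored block. *)
type header = {
  hash : string;
  predecessor : string;
  level : int;
  payload : string;
}

(* Checks a chunk of blocks ordered by level:
   - each stored hash matches the hash recomputed with [hash_of], except
     for the block whose hash is [genesis_hash];
   - each block points to the previous one and sits one level above it. *)
let check_chunk ~hash_of ?genesis_hash (blocks : header list) : bool =
  let hash_ok b =
    match genesis_hash with
    | Some g when g = b.hash -> true
    | _ -> hash_of b.payload = b.hash
  in
  let rec links_ok = function
    | a :: (b :: _ as rest) ->
        b.predecessor = a.hash && b.level = a.level + 1 && links_ok rest
    | _ -> true
  in
  List.for_all hash_ok blocks && links_ok blocks
```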