std.db.lmx: Read-Optimized Key/Value Pages

std.db.lmx is Janus’ native read-optimized key/value storage substrate. It is used by code-graph storage today and reserved for future read-heavy field maps and catalogs.

LMX is deliberately small in Phase 1:

  • one 64-byte file header
  • 4096-byte pages
  • one root leaf for key descriptors
  • inline small values
  • overflow pages for values that do not fit in the root leaf

It is not yet a full B+ tree with branch splits.

An LMX file starts with a 64-byte header followed by 4096-byte pages:

[64-byte header]
[4 KiB page 1]
[4 KiB page 2]
...
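
The layout above implies a simple offset calculation. The following Python sketch illustrates it; it assumes pages are numbered from 1, as the diagram suggests, and the names `page_offset` and `file_size` are hypothetical, not part of std.db.lmx.

```python
HEADER_SIZE = 64    # fixed file header, per the layout above
PAGE_SIZE = 4096    # every page is 4 KiB


def page_offset(page_no: int) -> int:
    """Return the byte offset of a page, assuming 1-based page numbers."""
    if page_no < 1:
        raise ValueError("page numbers are assumed to start at 1")
    return HEADER_SIZE + (page_no - 1) * PAGE_SIZE


def file_size(page_count: int) -> int:
    """Return the expected file size for a given number of pages."""
    return HEADER_SIZE + page_count * PAGE_SIZE
```

Under that assumption, page 1 starts at byte 64 and page 2 at byte 4160.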

The header stores a magic value, the format version, the root page number, the head of the free-page list, and the page count.

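Leaf pages store entries sequentially, starting at byte 16 of the page. Each entry is laid out as follows, with offsets relative to the start of the entry: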

[0..4) key_len u32 little-endian
[4..8) value_len u32 little-endian
[8..) key bytes, then value bytes

Small values remain inline. This preserves the original simple fast path.
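
As a rough illustration of the entry layout, the Python sketch below encodes and decodes inline entries. The source does not say how a reader learns the number of entries in a leaf, so this sketch assumes the caller already knows it; `encode_entry` and `decode_entries` are hypothetical names.

```python
import struct

ENTRY_START = 16                      # entries begin at byte 16 of the leaf page
ENTRY_HEADER = struct.Struct("<II")   # key_len, value_len: u32 little-endian


def encode_entry(key: bytes, value: bytes) -> bytes:
    """Serialize one entry: two u32 lengths, then key bytes, then value bytes."""
    return ENTRY_HEADER.pack(len(key), len(value)) + key + value


def decode_entries(page: bytes, count: int) -> list:
    """Decode `count` entries from a leaf page (count is an assumed input)."""
    entries = []
    pos = ENTRY_START
    for _ in range(count):
        key_len, value_len = ENTRY_HEADER.unpack_from(page, pos)
        pos += ENTRY_HEADER.size
        key = page[pos:pos + key_len]
        pos += key_len
        value = page[pos:pos + value_len]
        pos += value_len
        entries.append((key, value))
    return entries
```

Because entries are packed back to back, a lookup in this sketch is a linear scan of the root leaf.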

When a value cannot fit inline, put appends overflow pages and stores a 24-byte descriptor in the leaf value slot:

[0..4) "OVF1"
[4..8) reserved zeros
[8..16) original value length
[16..24) first overflow page
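
The descriptor can be sketched with Python's `struct` module. The source gives the field positions but not the integer encoding of the last two fields; this sketch assumes unsigned 64-bit little-endian, matching the leaf entry lengths. `make_descriptor` and `parse_descriptor` are hypothetical helpers.

```python
import struct

OVERFLOW_MAGIC = b"OVF1"
# 4-byte magic, 4 reserved zero bytes, u64 value length, u64 first page.
# Little-endian u64 fields are an assumption of this sketch.
DESCRIPTOR = struct.Struct("<4s4xQQ")


def make_descriptor(value_len: int, first_page: int) -> bytes:
    """Build the 24-byte descriptor stored in the leaf value slot."""
    return DESCRIPTOR.pack(OVERFLOW_MAGIC, value_len, first_page)


def parse_descriptor(slot: bytes):
    """Return (value_len, first_page), or None if the slot is not a descriptor."""
    if len(slot) != DESCRIPTOR.size or slot[:4] != OVERFLOW_MAGIC:
        return None
    _magic, value_len, first_page = DESCRIPTOR.unpack(slot)
    return value_len, first_page
```

Note that this sketch cannot tell a descriptor apart from an inline value that happens to be 24 bytes long and begin with "OVF1"; the source does not say how LMX handles that case.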

Each overflow page stores:

[0..2) page type = 3
[2..4) payload length
[4..12) next overflow page, or 0
[16..4096) payload bytes
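
To make the overflow page format concrete, here is a Python sketch that splits a value into a chain of pages. It assumes little-endian integers, that overflow pages are appended consecutively starting at `first_page`, and that bytes 12 to 16 (which the layout above does not describe) are left as zeros. `build_overflow_pages` is a hypothetical name.

```python
import struct

PAGE_SIZE = 4096
PAYLOAD_START = 16
PAYLOAD_CAPACITY = PAGE_SIZE - PAYLOAD_START   # 4080 payload bytes per page
OVERFLOW_PAGE_TYPE = 3
PAGE_HEADER = struct.Struct("<HHQ")            # page type, payload length, next page


def build_overflow_pages(value: bytes, first_page: int) -> list:
    """Split `value` into 4096-byte overflow pages linked by page number."""
    chunks = [value[i:i + PAYLOAD_CAPACITY]
              for i in range(0, len(value), PAYLOAD_CAPACITY)]
    pages = []
    for index, chunk in enumerate(chunks):
        is_last = index == len(chunks) - 1
        # Assumption: pages are consecutive, so the next page is first_page + index + 1.
        next_page = 0 if is_last else first_page + index + 1
        header = PAGE_HEADER.pack(OVERFLOW_PAGE_TYPE, len(chunk), next_page)
        page = header.ljust(PAYLOAD_START, b"\x00") + chunk
        pages.append(page.ljust(PAGE_SIZE, b"\x00"))
    return pages
```

With 4080 payload bytes per page, a 5000-byte value occupies two overflow pages in this sketch: 4080 bytes in the first and 920 in the second.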

get follows the descriptor and reconstructs the original value into the caller buffer. If the caller buffer is too small, get returns -1.
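
The read path can be sketched in Python as follows. This is not the Janus implementation: it assumes 1-based page numbers, little-endian fields, and that a successful read returns the number of bytes written (the source only specifies the -1 case). `read_overflow_value` is a hypothetical name, and `data` stands for the whole file image.

```python
import struct

HEADER_SIZE = 64
PAGE_SIZE = 4096
PAYLOAD_START = 16
OVERFLOW_PAGE_TYPE = 3
PAGE_HEADER = struct.Struct("<HHQ")   # page type, payload length, next page


def read_overflow_value(data: bytes, value_len: int, first_page: int,
                        out: bytearray) -> int:
    """Copy an overflow-backed value into `out`; return -1 if `out` is too small."""
    if len(out) < value_len:
        return -1
    written = 0
    page_no = first_page
    while page_no != 0 and written < value_len:
        base = HEADER_SIZE + (page_no - 1) * PAGE_SIZE   # assumes 1-based pages
        page_type, payload_len, next_page = PAGE_HEADER.unpack_from(data, base)
        start = base + PAYLOAD_START
        if (page_type != OVERFLOW_PAGE_TYPE
                or payload_len > value_len - written
                or start + payload_len > len(data)):
            raise ValueError("corrupt overflow chain")
        out[written:written + payload_len] = data[start:start + payload_len]
        written += payload_len
        page_no = next_page
    if written != value_len:
        raise ValueError("overflow chain shorter than recorded value length")
    return written
```

Checking the buffer size before touching the chain means a too-small buffer is rejected without any partial copy, which is one reasonable reading of the -1 contract.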

LMX now supports values larger than one page, but Phase 1 still has these intentional boundaries:

  • the root leaf must still have enough free space for the key and the 24-byte overflow descriptor
  • branch pages and root-leaf splits are not implemented yet
  • deleting an overflow-backed key marks the root entry but does not reclaim overflow pages
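
The first boundary can be illustrated with a small Python sketch of the placement decision. It assumes the choice between inline and overflow depends only on the free space left in the root leaf, and that `used` counts every byte consumed so far, including the 16 bytes before the first entry. `plan_put` and its return labels are hypothetical.

```python
PAGE_SIZE = 4096
ENTRY_HEADER_SIZE = 8     # key_len + value_len, two u32 fields
DESCRIPTOR_SIZE = 24      # overflow descriptor stored in the value slot


def slot_size(key_len: int, value_len: int) -> int:
    """Bytes one entry occupies in the root leaf."""
    return ENTRY_HEADER_SIZE + key_len + value_len


def plan_put(used: int, key_len: int, value_len: int) -> str:
    """Classify a put as 'inline', 'overflow', or 'full' for the root leaf."""
    free = PAGE_SIZE - used
    if slot_size(key_len, value_len) <= free:
        return "inline"
    if slot_size(key_len, DESCRIPTOR_SIZE) <= free:
        return "overflow"
    # Without root-leaf splits, there is nowhere left to record the key.
    return "full"
```

In this model, an empty root leaf (`used = 16`) takes a 5000-byte value through the overflow path, while a nearly full leaf rejects the same put because even the key and descriptor no longer fit.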

Use LSM/STL for append-heavy replay and audit trails. Use LMX when the workload wants compact key lookup and read-oriented metadata.

Focused gates:

cd janus
./scripts/zb test-lmx-smoke
./scripts/zb test-lmx-large-value
./scripts/zb test-storage-gap-drift-closures

test-lmx-large-value writes a database with a mix of small and large values, round-trips a 5000-byte value, closes and reopens the file, and verifies the large value again.