LSM-backed STLStore
std.stl.lsm_store is the persistent implementation of the Sovereign Transparency Ledger store. It wraps a caller-owned std.db.lsm.GrainStoreBytes and stores each event in two keyspaces:
- primary key: [0x00] ++ EventId.bytes -> canonical_event ++ u64_be(rank)
- rank sidecar: [0x01] ++ u64_be(rank) -> EventId.bytes
The primary keyspace preserves content-addressed lookup by event id. The sidecar keyspace gives deterministic append-order lookup for auditors, witnesses, replay validators, and proof generators.
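The two layouts can be modeled outside Janus to see why the sidecar yields append order. The following is a minimal Python sketch of the byte layout only, not the lsm_store implementation; the helper names are hypothetical, and the event id length used in any call is the caller's choice because the source does not state it.

```python
import struct
from typing import Tuple

PRIMARY_PREFIX = b"\x00"  # keyspace tag for id -> event entries
SIDECAR_PREFIX = b"\x01"  # keyspace tag for rank -> id entries


def primary_entry(event_id: bytes, canonical_event: bytes, rank: int) -> Tuple[bytes, bytes]:
    """Key is prefix + event id; value is the event followed by u64_be(rank)."""
    return PRIMARY_PREFIX + event_id, canonical_event + struct.pack(">Q", rank)


def sidecar_entry(event_id: bytes, rank: int) -> Tuple[bytes, bytes]:
    """Key is prefix + u64_be(rank); value is the event id."""
    return SIDECAR_PREFIX + struct.pack(">Q", rank), event_id


def split_primary_value(value: bytes) -> Tuple[bytes, int]:
    """Recover the event bytes and the trailing 8-byte rank suffix."""
    (rank,) = struct.unpack(">Q", value[-8:])
    return value[:-8], rank


# Big-endian ranks sort bytewise in numeric order, so an ordered scan over
# the 0x01 keyspace visits events in append order.
sidecar_keys = [sidecar_entry(b"id", r)[0] for r in (256, 1, 65536, 2)]
ranks_in_scan_order = [struct.unpack(">Q", k[1:])[0] for k in sorted(sidecar_keys)]
```

A little-endian rank would break this property: 256 would sort before 1. That is the reason the sketch, like the layout above, uses u64_be.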
Public Surface
    use std.db.lsm
    use std.stl.lsm_store
    use std.stl.store

    var gs = lsm.gs_open_bytes("/tmp/stl.wal", 0x6503)
    var store = lsm_store.make_store(&gs)

    // append / append_and_derive
    // get_by_id
    // get_by_insertion_rank
    // kind_by_insertion_rank
    // iter_in_append_order / iter_next / iter_done / iter_err
    // flush / flush_with_manifest
    // make_store_recovered / manifest_status
    // rebuild_rank_sidecar

    _ = lsm.gs_close_bytes(&gs)

The LSMStore borrows the GrainStoreBytes; callers still own open, path selection, and close. For manifest-backed stores, use flush_with_manifest and make_store_recovered so the STL facade applies the save/load policy around the borrowed substrate.
make_store opens in one of three modes:
- MODE_HEALTHY: primary entries and rank sidecars agree.
- MODE_DEGRADED: the primary ledger is still readable by id, but positional APIs refuse until the sidecar is repaired.
- MODE_RECOVERY_DEGRADED: a present L0 manifest was refused during recovered open.
Degraded sidecar mode is triggered by primary/sidecar count divergence or deterministic probe failure. Recovery-degraded mode is triggered by make_store_recovered when the manifest exists but fails validation or attachment. get_by_id remains available for primary entries that are still reachable. get_by_insertion_rank and iterators require MODE_HEALTHY.
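The mode rules above can be summarized as a small decision function. This is a Python model under stated assumptions, not the Janus code: classify_open and positional_allowed are hypothetical names, and checking the manifest result before the sidecar checks is an assumption based on the recovered-open order described later in this page.

```python
from enum import Enum


class Mode(Enum):
    HEALTHY = "MODE_HEALTHY"
    DEGRADED = "MODE_DEGRADED"
    RECOVERY_DEGRADED = "MODE_RECOVERY_DEGRADED"


def classify_open(primary_count: int, sidecar_count: int, probe_ok: bool,
                  manifest_refused: bool = False) -> Mode:
    """Pick the open mode from the checks described above."""
    if manifest_refused:
        # A present manifest failed validation or attachment.
        return Mode.RECOVERY_DEGRADED
    if primary_count != sidecar_count or not probe_ok:
        # The sidecar cannot be trusted for positional lookups.
        return Mode.DEGRADED
    return Mode.HEALTHY


def positional_allowed(mode: Mode) -> bool:
    """Rank lookups and iterators require a healthy store."""
    return mode is Mode.HEALTHY
```

In this model a caller would gate get_by_insertion_rank and the iterators on positional_allowed, while id lookups stay available in every mode.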
Rebuild
rebuild_rank_sidecar(&store) scans primary entries, rewrites the rank sidecar, updates event_count, and flips the store back to MODE_HEALTHY on success.
Existing v0.2 entries keep their embedded rank suffix. Legacy primary entries without a trusted suffix are assigned tail ranks after the highest explicit rank. The repair writes through the GrainStoreBytes WAL, so a clean close/open replays the rebuilt sidecar.
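The rank assignment rule can be illustrated with a short Python model. It is a sketch only: the function mirrors the name rebuild_rank_sidecar but operates on plain dictionaries, ranks are assumed to start at 0, and the order in which legacy entries receive tail ranks (sorted by event id here) is an assumption, since the source does not specify it.

```python
import struct
from typing import Dict, Optional, Tuple


def rebuild_rank_sidecar(primary: Dict[bytes, Tuple[bytes, Optional[int]]]) -> Dict[bytes, bytes]:
    """Return a fresh sidecar mapping u64_be(rank) -> event id.

    `primary` maps event id -> (canonical_event, rank), where rank is None
    for a legacy entry without a trusted suffix.
    """
    explicit = {eid: rank for eid, (_, rank) in primary.items() if rank is not None}
    legacy = sorted(eid for eid, (_, rank) in primary.items() if rank is None)

    # Tail ranks start just past the highest explicit rank.
    next_rank = max(explicit.values(), default=-1) + 1
    assigned = dict(explicit)
    for eid in legacy:  # assumption: legacy entries are ordered by event id
        assigned[eid] = next_rank
        next_rank += 1

    return {struct.pack(">Q", rank): eid for eid, rank in assigned.items()}
```

The model does not detect two entries claiming the same explicit rank; a real repair would have to decide how to handle that case.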
Flush and Pool Behavior
gs_put_bytes_owned copies keys and values into a GrainStore-owned pool so per-call stack buffers can safely be appended. gs_flush_bytes drains the MemTable into an SSTable and resets that pool. The pool now bounds one in-memory batch, not the lifetime of the store.
Callers should flush periodically:
    if lsm_store.flush(&store, "/tmp/stl-l0.sst") != store.STORE_OK do
        return 1
    end

L0 Manifest Recovery
GrainStoreBytes can persist its attached L0 SSTable list as a level manifest. This is the recovery path for stores that have flushed data out of the WAL and into SSTables. The STL facade exposes the normal policy as flush_with_manifest:

    if lsm_store.flush_with_manifest(
        &store,
        "/var/lib/janus/stl-l0.sst",
        "/var/lib/janus/stl.manifest",
        "/var/lib/janus/stl.manifest.tmp",
        "/var/lib/janus",
    ) != store.STORE_OK do
        return 1
    end

flush_with_manifest flushes and attaches the SSTable first, then writes a complete manifest to the temp path, fsyncs it, closes it, renames it over the final manifest, and fsyncs the directory. Each L0 entry records:
- level number, currently 0
- slot order, oldest to newest
- SSTable path
- SSTable byte length
- SSTable footer entry_count
- SSTable image fingerprint
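The temp-write, fsync, rename, directory-fsync sequence is a general POSIX pattern for publishing a file atomically. The Python sketch below shows that pattern with standard os calls; it is not the Janus implementation, write_manifest_atomically is a hypothetical name, and the manifest body is treated as opaque bytes. Syncing a directory descriptor assumes a POSIX system.

```python
import os
import tempfile


def write_manifest_atomically(body: bytes, final_path: str, tmp_path: str, dir_path: str) -> None:
    """Publish `body` at `final_path` so readers see the old or the new file, never a mix."""
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(body)
        while view:                      # os.write may write fewer bytes than asked
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)                     # temp file contents are durable
    finally:
        os.close(fd)

    os.replace(tmp_path, final_path)     # atomic rename over the final manifest

    dir_fd = os.open(dir_path, os.O_RDONLY)
    try:
        os.fsync(dir_fd)                 # make the rename itself durable
    finally:
        os.close(dir_fd)


# Usage: publish twice and confirm the second body replaced the first.
with tempfile.TemporaryDirectory() as d:
    final = os.path.join(d, "stl.manifest")
    write_manifest_atomically(b"v1", final, final + ".tmp", d)
    write_manifest_atomically(b"v2", final, final + ".tmp", d)
    with open(final, "rb") as f:
        published = f.read()
    leftover_tmp = os.path.exists(final + ".tmp")
```

The temp file and the final manifest must live on the same filesystem, otherwise the rename is not atomic. Keeping both under one directory, as the example paths above do, satisfies that.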
Reopen through make_store_recovered after opening the WAL:
    var reopened = lsm.gs_open_bytes("/var/lib/janus/stl.wal", 0x6503)
    var recovered = lsm_store.make_store_recovered(&reopened, "/var/lib/janus/stl.manifest")
    if lsm_store.manifest_status(&recovered) != store.STORE_OK do
        _ = lsm.gs_close_bytes(&reopened)
        return 2
    end

Recovered open validates the manifest magic, version, body length, CRC, slot order, table file size, SSTable footer entry count, and SSTable image fingerprint before running the normal sidecar probe. Torn manifests, missing or truncated SSTables, and stale same-shape SSTables are refused. Attach-time failures reset the in-memory L0 count to zero and mark the store MODE_RECOVERY_DEGRADED so callers do not observe a partially loaded level list as healthy.
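The first four checks (magic, version, body length, CRC) follow a common envelope-validation shape. The Python sketch below illustrates that shape with an entirely hypothetical layout: the magic value, version number, header field widths, and the use of CRC-32 are assumptions made for the example, not the real manifest format. The per-table checks (slot order, file size, footer entry count, image fingerprint) are left out.

```python
import struct
import zlib

MAGIC = b"L0MF"                      # hypothetical magic, not the real value
VERSION = 1                          # hypothetical version
HEADER = struct.Struct(">4sHII")     # magic, version, body length, CRC-32 of body


def encode_manifest(body: bytes) -> bytes:
    """Wrap `body` in the hypothetical header."""
    return HEADER.pack(MAGIC, VERSION, len(body), zlib.crc32(body)) + body


def validate_manifest(image: bytes) -> bytes:
    """Return the body, or raise ValueError on the first failed check."""
    if len(image) < HEADER.size:
        raise ValueError("torn manifest: short header")
    magic, version, body_len, crc = HEADER.unpack_from(image)
    if magic != MAGIC:
        raise ValueError("bad magic")
    if version != VERSION:
        raise ValueError("unsupported version")
    body = image[HEADER.size:]
    if len(body) != body_len:
        raise ValueError("torn manifest: body length mismatch")
    if zlib.crc32(body) != crc:
        raise ValueError("CRC mismatch")
    return body
```

Refusing on the first failed check, rather than attaching whatever parsed, matches the stated goal that callers never observe a partially loaded level list as healthy.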
flush_with_manifest is the preferred STL path when the L0 list must represent all flushed writes. The manifest records attached SSTables, not the current MemTable. The lower-level lsm.gs_save_l0_manifest_bytes and lsm.gs_load_l0_manifest_bytes remain available for substrate tests and custom policies.
Cluster Tombstones
Section titled “Cluster Tombstones”std.cluster.tombstones is the adapter from local actor supervision
tombstones into STL events. It encodes a 64-byte inline effect payload
with:
- magic bytes AT
- payload version
- stable stop-reason code
- tombstone sequence
- child slot
- code version, input digest, replay token, and state epoch
- attempt count
- timestamp seconds
The adapter exposes ActorTombstone, zero, make_event,
append_lsm, and small event readers such as is_tombstone_event,
event_reason, event_sequence, event_child, and
event_attempt_count.
Sink callbacks normally receive the runtime record through
std.cluster.local.tombstone_* accessors, copy the fields into
ActorTombstone, and call:
    if tombstones.append_lsm(&store, &t) != store.STORE_OK do
        return 0
    end

The LSM store still follows the same borrowed-handle rule. A callback may
construct a short-lived LSMStore over the caller-owned
GrainStoreBytes; the durable primary and rank sidecar entries live in
the shared LSM substrate, not in the wrapper value.
Current Limits
- Single writer per store.
- Caller-owned GrainStoreBytes lifecycle.
- No automatic discovery of manifest paths beyond the explicit recovered-open helper.
- No per-append fsync in the facade; call substrate sync/flush according to the durability policy of the consumer.
- Commit proof production is owned by future SPEC-088 integration.