:cluster — The Sanctum
“Fault tolerance by design.”
:cluster is where Janus becomes a system for building software that endures.
Everything from :service plus the actor model, supervised lifecycle, local
grain activation shells, the local grain activation registry, local grain
namespace lookup, and future distribution layers belongs here.
The current compiler/runtime slice is local supervised actors plus the first grain source-contract shell, local single-writer activation registry, local namespace lookup layer, and explicit GrainStore-backed lifecycle callbacks. Compiler-generated state serializers, remote placement, first-class supervisor declarations, migration, durable namespace persistence, and distributed registries are roadmap work, shown below as future sketches where noted.
What :cluster Gives You
Actors — Concurrent Entities with Mailboxes
The canonical shape today is actor X do var ... receive do match __msg { ... } end end.
Each actor compiles to a setup/handler/destroy triple that the
generated X_start_supervised(system, slot, policy) wrapper threads
into the local supervisor. Messages are i64 and dispatch is over
their value via an explicit match __msg.
actor Counter do
    var count: i64 = 0

    receive do
        match __msg {
            0 => do count = count + 1 end,
            1 => do return 0 end,
            _ => do count = count end,
        }
    end
end

- Isolated state — No shared memory; each var is a private slot.
- Auto-supervised — Counter_setup/Counter_handler/Counter_destroy and Counter_start_supervised are auto-emitted alongside the spawn-form __Counter_loop.
- No locks — Message passing is the only concurrency.
Walk through it hands-on in the Stateful Actors tutorial.
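The setup/handler/destroy triple can be pictured outside Janus as three plain functions driven by a mailbox loop. The following is a minimal Python sketch of that shape, not the code the compiler emits. The function names, the dictionary used for state, and the reading of return 0 as "stop the actor" are all assumptions made for illustration.

```python
import queue
import threading


def counter_setup():
    """Create the actor's private state (stands in for the `var count` slot)."""
    return {"count": 0}


def counter_handler(state, msg):
    """Dispatch on the integer message value. Returns False to stop the loop."""
    if msg == 0:
        state["count"] += 1
        return True
    if msg == 1:
        return False  # assumption: mirrors `return 0`, which ends the actor
    return True  # any other value leaves the state unchanged


def counter_destroy(state, results):
    """Teardown hook; here it only publishes the final count for inspection."""
    results.append(state["count"])


def run_actor(mailbox, results):
    """Drive setup, then the handler loop, then destroy, all on one thread."""
    state = counter_setup()
    try:
        while counter_handler(state, mailbox.get()):
            pass
    finally:
        counter_destroy(state, results)


def demo():
    """Send three increments, one ignored value, then the stop message."""
    mailbox = queue.Queue()
    results = []
    worker = threading.Thread(target=run_actor, args=(mailbox, results))
    worker.start()
    for msg in (0, 0, 7, 0, 1):
        mailbox.put(msg)
    worker.join()
    return results[0]
```

Only the worker thread ever touches the state dictionary, which is the point of the "no shared memory" rule: other threads can influence the count only by putting messages in the mailbox.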
Typed message protocols are now local actor syntax, not just a sketch:
message declarations may include payload variants, ActorRef[Msg]
checks the send protocol, and receive arms can destructure local boxed
payload messages:
message Cmd {
    Tick,
    Set { value: u64 },
    Stop,
}

actor Counter(msg: Cmd) do
    var count: u64 = 0

    receive do
        Cmd.Tick => do count += 1 end,
        Cmd.Set { value } when value >= 0 as u64 => do count += value end,
        Cmd.Stop => do return 0 end,
        after 30_000 => do count = count end,
    end
end

This is still the node-local actor path. Guards and receive-loop
timeouts are live; supervised actors register after arms as local mailbox
timeouts. Distributed payload wire formats remain future
:cluster work.
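To make the interaction between payload destructuring, guards, and timeout arms concrete, here is a small Python sketch of an equivalent receive loop. It illustrates the control flow only and is not the Janus runtime. The class names, the run_counter helper, the short timeout, and the choice to drop a message whose guard fails are assumptions for this example.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Tick:
    pass


@dataclass(frozen=True)
class Set:
    value: int


@dataclass(frozen=True)
class Stop:
    pass


def run_counter(mailbox, timeout_s=0.02):
    """Receive loop with typed arms, a guard, and a timeout arm.

    Returns (count, timeouts) once a Stop message arrives.
    """
    count = 0
    timeouts = 0
    while True:
        try:
            msg = mailbox.get(timeout=timeout_s)
        except queue.Empty:
            # Counterpart of the `after` arm: nothing arrived in time.
            timeouts += 1
            continue
        if isinstance(msg, Tick):
            count += 1
        elif isinstance(msg, Set) and msg.value >= 0:  # guard on the payload
            count += msg.value
        elif isinstance(msg, Stop):
            return count, timeouts
        # Assumption: a message that matches no arm (for example a Set
        # whose guard fails) is dropped.


def demo_idle_then_stop():
    """Leave the mailbox idle for a while, then deliver Stop."""
    mailbox = queue.Queue()
    timer = threading.Timer(0.2, mailbox.put, args=(Stop(),))
    timer.start()
    return run_counter(mailbox)
```

With a preloaded mailbox of Tick, Set(5), Set(-3), Stop the loop returns a count of 6 and no timeouts, because the guard rejects the negative payload. In demo_idle_then_stop the timeout arm fires at least once before Stop arrives.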
Local Grain Shell — Virtual Identity Shape
message UserMsg {
    Ping,
    Stop,
}

@persist(via: GrainStoreBytes)
@lifecycle(activation: .lazy)
grain User(id: u64, msg: UserMsg) do
    var count: u64 = 0

    receive do
        UserMsg.Ping => do count += 1 end,
        UserMsg.Stop => do return 0 end,
    end
end

- Live now — the parser accepts grain Name(id: Id, msg: Msg), @persist, @lifecycle, state slots, and receive arms, and emits a local supervised start wrapper.
- Live now — cluster.local_grain_lookup_or_start(...) maps a numeric (grain_type, grain_id) to one stable local activation ref while it is live.
- Live now — cluster.local_grain_lookup_or_start_namespace(...) maps a local (grain_type, namespace) key to an internal durable id, then reuses the same single-writer activation registry.
- Live now — cluster.local_grain_lookup_or_start_persistent(...) invokes explicit load/store callbacks that can restore and commit state through GrainStoreBytes.
- Live now — local grain persistence exposes per-system load/store failure counters so operators can detect callback failures instead of inferring them from stopped activations.
- Not live yet — compiler-generated GrainStore serializers, durable namespace persistence, passivation, migration, and remote routing.
- Rule — a grain is virtual identity with owned state. The current shell proves the source shape; the local registry pins the single-writer identity invariant.
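The single-writer invariant behind the three lookup calls can be sketched as a keyed registry that never creates a second activation for a live key. The Python below is a hypothetical model, not the cluster API. The LocalGrainRegistry class, its method signatures, the commit helper, and the decision to return None when a load callback fails are assumptions for illustration.

```python
class LocalGrainRegistry:
    """Hypothetical single-writer registry: at most one live activation per key."""

    def __init__(self):
        self._live = {}           # (grain_type, grain_id) -> activation state
        self._namespace_ids = {}  # (grain_type, namespace) -> internal id
        self._next_internal_id = 1
        self.load_failures = 0
        self.store_failures = 0

    def lookup_or_start(self, grain_type, grain_id, saved_count=0):
        """Return the live activation for the key, starting it if absent."""
        key = (grain_type, grain_id)
        if key not in self._live:
            self._live[key] = {"id": grain_id, "count": saved_count}
        return self._live[key]

    def lookup_or_start_namespace(self, grain_type, namespace):
        """Map a namespace key to an internal id, then reuse the same registry.

        This sketch does not separate internal ids from caller-supplied ids.
        """
        ns_key = (grain_type, namespace)
        if ns_key not in self._namespace_ids:
            self._namespace_ids[ns_key] = self._next_internal_id
            self._next_internal_id += 1
        return self.lookup_or_start(grain_type, self._namespace_ids[ns_key])

    def lookup_or_start_persistent(self, grain_type, grain_id, load):
        """Restore state through a load callback and count failures explicitly."""
        key = (grain_type, grain_id)
        if key in self._live:
            return self._live[key]
        try:
            saved = load(grain_id)
        except Exception:
            self.load_failures += 1
            return None  # assumption: no activation starts when the load fails
        return self.lookup_or_start(grain_type, grain_id, saved_count=saved)

    def commit(self, grain_type, grain_id, store):
        """Persist the live state through a store callback and count failures."""
        activation = self._live[(grain_type, grain_id)]
        try:
            store(grain_id, activation["count"])
        except Exception:
            self.store_failures += 1
            return False
        return True
```

Repeated lookups for the same key return the same object, which is what "one stable local activation ref while it is live" amounts to in this model. The two counters let a caller see callback failures directly instead of inferring them from a missing activation.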
Future Supervision Trees
supervisor GameServerSupervisor do
    strategy: one_for_one

    child LobbyManager      # Restart on crash
    child MatchMaker        # Restart on crash
    child MetricsCollector  # Restart on crash
end

- one_for_one — Restart crashed child only
- one_for_all — Restart all if any crashes
- rest_for_one — Restart crashed + subsequent children
- Exponential backoff — Prevent death spirals
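The three strategies differ only in which children are restarted after a crash, and exponential backoff only in how long the supervisor waits before each attempt. The Python sketch below captures both rules. It is an illustration under stated assumptions: the function names, the base delay of 100 ms, and the 30 s cap are hypothetical and not taken from the Janus runtime.

```python
def children_to_restart(children, crashed, strategy):
    """Return the children to restart, in declaration order.

    `children` is the ordered list of child names as declared in the
    supervisor; `crashed` is the name of the child that failed.
    """
    index = children.index(crashed)
    if strategy == "one_for_one":
        return [crashed]
    if strategy == "one_for_all":
        return list(children)
    if strategy == "rest_for_one":
        # The crashed child plus every child declared after it.
        return children[index:]
    raise ValueError(f"unknown strategy: {strategy}")


def backoff_delay_ms(attempt, base_ms=100, cap_ms=30_000):
    """Delay before restart number `attempt` (0-based), doubling up to a cap."""
    return min(cap_ms, base_ms * 2 ** attempt)
```

For the GameServerSupervisor children, a MatchMaker crash restarts only MatchMaker under one_for_one, all three children under one_for_all, and MatchMaker plus MetricsCollector under rest_for_one. The cap keeps a repeatedly crashing child from either restarting in a tight loop or waiting indefinitely.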
Additional Features
- Memory sovereignty tags — Local.Exclusive, Session.Replicated, Volatile.Ephemeral
- Typed message protocols — message declarations, ActorRef[Msg], local payload sends, guarded receive-arm payload destructuring, and direct receive-loop timeout arms are live for node-local actors
- Location transparency — Same syntax for local and remote (remote routing is not live yet)
What :cluster Excludes
| Excluded | Available In |
|---|---|
| Tensors and GPU | :compute |
| Raw pointers and unsafe | :sovereign |
When to Use :cluster
Perfect for:
- Game servers handling thousands of concurrent connections
- Chat systems and real-time messaging
- Distributed databases and key-value stores
- Metaverse infrastructure and virtual world backends
- Any system where a node crash should not take down the service
- Stateful services that need to persist across restarts
The rule: If it needs to stay up when hardware fails, :cluster is your home.
Future Code Sketches
The following examples show the intended destination for grains, remote message payloads, and first-class supervisor declarations. They are not the current local actor tracer bullet.
A Chat Server
message ChatMsg {
    Join { user_id: UserId, reply: Reply[void] },
    Send { user_id: UserId, text: String },
    Leave { user_id: UserId },
    History { count: i32, reply: Reply[[Message]] },
}

actor ChatRoom(room_id: RoomId) implements ChatMsg do
    var members: Set[UserId] := Set.new()
    var messages: [Message] := []

    receive do
        | Join { user_id, reply } => do
            members.insert(user_id)
            reply.send(void.ok())
        end

        | Send { user_id, text } => do
            # Send carries no reply channel, so messages from non-members are dropped.
            if not members.contains(user_id) do
                return
            end
            messages.push(Message{user_id, text, now()})
        end

        | Leave { user_id } => do
            members.remove(user_id)
        end

        # History is declared in ChatMsg but not handled in this sketch.
    end
end

Supervision with Recovery
supervisor DatabaseCluster do
    strategy: one_for_all

    child ConnectionPool(max: 10)
    child QueryProcessor
    child MetricsExporter

    # If ConnectionPool crashes, ALL children restart
    # This ensures consistent state across the cluster
end

Distributed Key-Value Store
message KVStoreMsg {
    Get { key: String, reply: Reply[Option[Bytes]] },
    Set { key: String, value: Bytes, reply: Reply[void] },
    Delete { key: String, reply: Reply[void] },
    Range { start: String, end: String, reply: Reply[[(String, Bytes)]] },
}

@requires(cap: [.storage_nvme, .network_infiniband])
grain KVNode(node_id: NodeId) implements KVStoreMsg do
    var data: HashMap[String, Bytes]

    receive do
        | Get { key, reply } => do
            reply.send(data.get(key))
        end

        | Set { key, value, reply } => do
            data.set(key, value)
            # Replicate to other nodes
            replicate(key, value)
            reply.send(void.ok())
        end

        # Delete and Range are declared in KVStoreMsg but not handled in this sketch.
    end
end

Why :cluster Wins
vs. Erlang/OTP:
- Types — Erlang’s dynamic types are a feature we left behind
- Generics — No more boilerplate for different message types
- Single language — Everything in Janus, not a separate DSL
vs. Akka (Scala/Java):
- Lighter — No JVM overhead
- Better interop — Native Zig bindings via graft
- Simpler — No implicit state machines
vs. Go + etcd:
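- Supervision built-in — etcd is an external dependency; in Janus, supervision is native
- Location transparency — Go needs separate service discovery; Janus uses the same syntax for local and remote actors (remote routing is roadmap work)
- Grain migration — Go services can’t move between nodes automatically; grain migration is on the :cluster roadmap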
Next Steps
- Move to :sovereign — When you need raw performance
- Move to :service — For simpler applications
- Architecture Docs — Deep dive into the actor model
Build systems that endure.