:cluster — The Sanctum

“Fault tolerance by design.”

:cluster is where Janus becomes a system for building software that endures. It includes everything from :service, plus the actor model, virtual actors (grains) that migrate between nodes, and supervision trees that recover from failure automatically. If your system can’t go down, :cluster is where you live.


Actors — Concurrent Entities with Mailboxes

actor Counter(msg: CounterMsg) do
    var count: i64 := 0

    receive do
        | CounterMsg.increment => do
            count = count + 1
        end
        | CounterMsg.get(reply) => do
            reply.send(count)
        end
    end
end
  • Isolated state — No shared memory
  • Typed mailbox — Each actor has a specific message type
  • No locks — Message passing is the only way actors coordinate
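
The three properties above can be modeled outside Janus. The following is a minimal Python sketch, not Janus code and not how the Janus runtime is implemented: a thread owns the counter, a queue plays the mailbox, and a second queue stands in for the reply channel. The names CounterActor, Increment, Get, Stop and demo are hypothetical and chosen only for this illustration.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class Increment:
    """Fire-and-forget message: bump the counter."""


@dataclass
class Get:
    """Request message: carries a reply channel for the current count."""
    reply: queue.Queue


@dataclass
class Stop:
    """Asks the actor to leave its receive loop (an addition for this sketch)."""


class CounterActor:
    def __init__(self) -> None:
        self._mailbox = queue.Queue()  # the only way to reach the actor
        self._count = 0                # private state, touched by one thread only
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, msg) -> None:
        self._mailbox.put(msg)

    def join(self) -> None:
        self._thread.join()

    def _run(self) -> None:
        # Messages are handled one at a time, so no lock guards _count.
        while True:
            msg = self._mailbox.get()
            if isinstance(msg, Increment):
                self._count += 1
            elif isinstance(msg, Get):
                msg.reply.put(self._count)
            elif isinstance(msg, Stop):
                return
            else:
                raise TypeError(f"unexpected message: {msg!r}")


def demo() -> int:
    counter = CounterActor()
    for _ in range(1000):
        counter.send(Increment())
    reply = queue.Queue()
    counter.send(Get(reply))
    value = reply.get(timeout=5)  # mailbox is FIFO, so all increments ran first
    counter.send(Stop())
    counter.join()
    return value
```

Because the mailbox is first-in, first-out, the Get request in demo is processed only after every Increment sent before it, which is why no synchronization beyond the queue is needed.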
@requires(cap: [.storage_nvme])
grain UserSession(session_id: SessionId) does SessionBehavior with
    Storage,
    Caching
do
    var user_data: UserData
    var last_access: Timestamp

    # This grain can move between nodes at will
    # State is automatically serialized and restored
end
  • Location transparent — Address local and remote grains the same way
  • Auto-migration — Grains move to balance load
  • Capability contracts — Declare hardware/software requirements
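
One way to picture how these three properties fit together is a directory that maps grain ids to nodes. The Python sketch below is an illustration under stated assumptions, not the Janus runtime: Node, GrainDirectory and their methods are hypothetical names, placement uses a simple least-loaded rule, and pickle stands in for whatever serialization Janus actually uses.

```python
import pickle
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capabilities: set
    grains: dict = field(default_factory=dict)  # grain_id -> serialized state


class GrainDirectory:
    """Resolves grain ids to hosting nodes so callers never name a node."""

    def __init__(self, nodes):
        self.nodes = {node.name: node for node in nodes}
        self.location = {}  # grain_id -> name of the hosting node
        self.required = {}  # grain_id -> capabilities the grain declared

    def activate(self, grain_id, required, initial_state):
        required = set(required)
        # Capability contract: only nodes offering every requirement qualify.
        eligible = [n for n in self.nodes.values() if required <= n.capabilities]
        if not eligible:
            raise RuntimeError(f"no node satisfies {sorted(required)}")
        target = min(eligible, key=lambda n: len(n.grains))  # least loaded
        target.grains[grain_id] = pickle.dumps(initial_state)
        self.location[grain_id] = target.name
        self.required[grain_id] = required

    def call(self, grain_id, update):
        # Location transparency: the same call works wherever the grain lives.
        node = self.nodes[self.location[grain_id]]
        state = update(pickle.loads(node.grains[grain_id]))
        node.grains[grain_id] = pickle.dumps(state)
        return state

    def migrate(self, grain_id, to_node):
        target = self.nodes[to_node]
        missing = self.required[grain_id] - target.capabilities
        if missing:
            raise RuntimeError(f"{to_node} lacks {sorted(missing)}")
        source = self.nodes[self.location[grain_id]]
        # Migration moves the serialized state and updates the directory entry.
        target.grains[grain_id] = source.grains.pop(grain_id)
        self.location[grain_id] = to_node
```

In this model a caller keeps using call with the same grain id before and after migrate, and a migration to a node without the declared capabilities is refused.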
supervisor GameServerSupervisor do
    strategy: one_for_one

    child LobbyManager      # Restart on crash
    child MatchMaker        # Restart on crash
    child MetricsCollector  # Restart on crash
end
  • one_for_one — Restart crashed child only
  • one_for_all — Restart all if any crashes
  • rest_for_one — Restart crashed + subsequent children
  • Exponential backoff — Prevent death spirals
  • Memory sovereignty tags — Local.Exclusive, Session.Replicated, Volatile.Ephemeral
  • Typed message protocols — Sealed algebraic types for exhaustive matching
  • Location transparency — Same syntax for local and remote
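
The restart strategies and the backoff rule listed above can be stated precisely in a few lines. This Python sketch is an illustration only: children_to_restart and backoff_delay are hypothetical helpers, and the base delay, growth factor and cap are assumed values, since the page does not say which parameters Janus uses.

```python
def children_to_restart(strategy, children, crashed):
    """Return the children a supervisor restarts, in declaration order."""
    index = children.index(crashed)  # raises ValueError for an unknown child
    if strategy == "one_for_one":
        return [crashed]
    if strategy == "one_for_all":
        return list(children)
    if strategy == "rest_for_one":
        # The crashed child plus every child declared after it.
        return list(children[index:])
    raise ValueError(f"unknown strategy: {strategy}")


def backoff_delay(attempt, base=1.0, factor=2.0, cap=30.0):
    """Seconds to wait before restart number `attempt` (0-based).

    The delay grows geometrically and is clamped at `cap`, so a child that
    crashes immediately on every start cannot spin the supervisor.
    """
    return min(cap, base * factor ** attempt)
```

With the children of GameServerSupervisor and a crash in MatchMaker, one_for_one selects only MatchMaker, rest_for_one selects MatchMaker and MetricsCollector, and one_for_all selects all three. Production backoff schemes often add random jitter, which this sketch leaves out.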

Excluded                    Available in
Tensors and GPU             :compute
Raw pointers and unsafe     :sovereign

Perfect for:

  • Game servers handling thousands of concurrent connections
  • Chat systems and real-time messaging
  • Distributed databases and key-value stores
  • Metaverse infrastructure and virtual world backends
  • Any system where a node crash should not take down the service
  • Stateful services that need to persist across restarts

The rule: If it needs to stay up when hardware fails, :cluster is your home.


message ChatMsg {
    Join { user_id: UserId, reply: Reply[void] },
    Send { user_id: UserId, text: String },
    Leave { user_id: UserId },
    History { count: i32, reply: Reply[[Message]] },
}

actor ChatRoom(room_id: RoomId) implements ChatMsg do
    var members: Set[UserId] := Set.new()
    var messages: [Message] := []

    receive do
        | Join { user_id, reply } => do
            members.insert(user_id)
            reply.send(void.ok())
        end
        | Send { user_id, text } => do
            # Send carries no reply channel, so messages from non-members are dropped silently
            if not members.contains(user_id) do
                return
            end
            messages.push(Message{user_id, text, now()})
        end
        | Leave { user_id } => do
            members.remove(user_id)
        end
        | History { count, reply } => do
            # Assumption: last(n) returns the n most recent list entries
            reply.send(messages.last(count))
        end
    end
end
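
The chat room protocol can also be read as a small state machine: each message variant maps to exactly one state change or reply. The Python sketch below models that reading and is not Janus code. The dataclasses mirror the ChatMsg variants, reply channels are replaced by return values, and the final TypeError approximates at run time what exhaustive matching over a sealed type would check at compile time. All names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass(frozen=True)
class Join:
    user_id: str


@dataclass(frozen=True)
class Send:
    user_id: str
    text: str


@dataclass(frozen=True)
class Leave:
    user_id: str


@dataclass(frozen=True)
class History:
    count: int


# The closed set of messages the room accepts.
ChatMsg = Union[Join, Send, Leave, History]


@dataclass
class ChatRoom:
    members: set = field(default_factory=set)
    messages: list = field(default_factory=list)

    def handle(self, msg: ChatMsg):
        if isinstance(msg, Join):
            self.members.add(msg.user_id)
            return None
        if isinstance(msg, Send):
            if msg.user_id not in self.members:
                return None  # non-members are ignored
            self.messages.append((msg.user_id, msg.text))
            return None
        if isinstance(msg, Leave):
            self.members.discard(msg.user_id)
            return None
        if isinstance(msg, History):
            # Guard count <= 0: messages[-0:] would return the whole list.
            return self.messages[-msg.count:] if msg.count > 0 else []
        # Reached only for a value outside ChatMsg.
        raise TypeError(f"unhandled message: {msg!r}")
```

Every variant in ChatMsg has a branch, so adding a fifth variant without a matching branch surfaces as a TypeError the first time it is sent.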
supervisor DatabaseCluster do
    strategy: one_for_all

    child ConnectionPool(max: 10)
    child QueryProcessor
    child MetricsExporter

    # If any child crashes (ConnectionPool, for example), ALL children restart
    # This keeps the children's state consistent with each other
end
message KVStoreMsg {
    Get { key: String, reply: Reply[Option[Bytes]] },
    Set { key: String, value: Bytes, reply: Reply[void] },
    Delete { key: String, reply: Reply[void] },
    Range { start: String, end: String, reply: Reply[[(String, Bytes)]] },
}

@requires(cap: [.storage_nvme, .network_infiniband])
grain KVNode(node_id: NodeId) implements KVStoreMsg do
    var data: HashMap[String, Bytes]

    receive do
        | Get { key, reply } => do
            reply.send(data.get(key))
        end
        | Set { key, value, reply } => do
            data.set(key, value)
            # Replicate to other nodes (replicate is not defined in this excerpt)
            replicate(key, value)
            reply.send(void.ok())
        end
        # Delete and Range arms are not shown in this excerpt
    end
end
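
The Range message implies ordered key lookup, which a plain hash map does not provide on its own. The Python sketch below shows one common way to support it: keep a sorted key list next to the map and slice it with binary search. It is an illustration, not the KVNode implementation. SortedKV is a hypothetical name, and treating the range as half-open, start inclusive and end exclusive, is an assumption because the page does not define the bounds.

```python
import bisect


class SortedKV:
    """In-memory key-value store with ordered range scans."""

    def __init__(self):
        self._keys = []  # all keys, kept in sorted order
        self._data = {}  # key -> value

    def set(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)  # insert while keeping order
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)  # None when the key is absent

    def delete(self, key):
        if key in self._data:
            del self._data[key]
            self._keys.pop(bisect.bisect_left(self._keys, key))

    def range(self, start, end):
        # Half-open interval [start, end): assumed, not specified by the source.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._data[k]) for k in self._keys[lo:hi]]
```

A range scan costs two binary searches plus the size of the result, at the price of linear-time inserts into the sorted list. A real store would more likely use a tree or a log-structured index.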

vs. Erlang/OTP:

  • Types — Erlang’s dynamic types are a feature we left behind
  • Generics — No more boilerplate for different message types
  • Single language — Everything in Janus, not a separate DSL

vs. Akka (Scala/Java):

  • Lighter — No JVM overhead
  • Better interop — Native Zig bindings via graft
  • Simpler — No implicit state machines

vs. Go + etcd:

  • Supervision built-in — Go's standard library has no supervision trees and etcd is an external service; in Janus supervision is part of the language
  • Location transparency — Go needs service discovery, Janus has it baked in
  • Grain migration — Go services can’t move between nodes automatically


Build systems that endure.