The Polar-Signal Paradigm
“The fastest code is not the code that hides the machine. It is the code that speaks the machine’s native tongue — honestly.”
Every mainstream ML framework stores embeddings as Cartesian vectors — flat arrays of f32. Every display system transports pixels — flat grids of color values. Both are lies. They encode information into lossy absolute representations and then burn compute to recover the structure they threw away.
Janus takes a different path.
The Problem With Cartesian
An embedding vector [0.3, 0.4, 0.5] stores three coordinates. But the two properties that actually matter are entangled:
- Signal strength (magnitude) — how confident is this data point?
- Semantic direction (angles) — where does it point in meaning-space?
Every cosine similarity computation redundantly recomputes ||v||. Every nearest-neighbor search wastes cycles normalizing vectors that should never have been Cartesian in the first place. The representation lies about the structure of the data.
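The redundancy is easy to see in miniature. Here is a short Python sketch (illustrative only, not Janus or any particular framework): the Cartesian path re-derives both norms on every similarity call, while a polar-style split computes the magnitude exactly once, at storage time.

```python
import math

def cosine_similarity(a, b):
    # Cartesian path: both norms re-derived from scratch on every call.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def polar_split(v):
    # Polar-style storage: magnitude computed once, direction kept unit-length.
    r = math.sqrt(sum(x * x for x in v))
    return r, [x / r for x in v]

def polar_similarity(pa, pb):
    # Unit directions: the dot product already IS the cosine. No norms, ever.
    (_, da), (_, db) = pa, pb
    return sum(x * y for x, y in zip(da, db))

a, b = [0.3, 0.4, 0.5], [0.1, 0.2, 0.9]
assert abs(cosine_similarity(a, b) - polar_similarity(polar_split(a), polar_split(b))) < 1e-12
```

Amortized over millions of nearest-neighbor comparisons, the two norm computations saved per call are exactly the waste the polar representation eliminates.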
Polar Embeddings as a Language Primitive
Janus’s :compute profile introduces polar[N] — a type that separates radius (confidence) from angles (direction) by construction:
```
profile :compute

// Cartesian: 768 × f32 = 3072 bytes, magnitude and direction entangled
let old = vec[768](0.3, 0.4, 0.5, ...)

// Polar: radius (f16) + 767 quantized angles (u8) = 769 bytes
// Magnitude and direction orthogonal. 4× compression. Zero information loss that matters.
let embedding: polar[768] = polar[768].from_cartesian(raw_vec)

// Signal strength: O(1) — just read the field
let confidence = embedding.radius

// Similarity: operates on angles directly — no norm computation
let sim = polar.cosine_similarity(embedding_a, embedding_b)
```

This isn’t a library. It’s a type. The compiler enforces the separation:

```
// Compile error E3010: arithmetic on polar types is meaningless
let bad = embedding_a + embedding_b

// You must explicitly cross back to Cartesian — the cost is visible
@allow_lossy
let sum = polar[768].from_cartesian(
  embedding_a.to_cartesian() + embedding_b.to_cartesian()
)
```

Syntactic Honesty applied to data representation. If it costs something, you see it.
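For readers who want the geometry behind from_cartesian, this is a hedged Python sketch of the standard hyperspherical transform that a polar[N] layout implies: one radius plus N−1 angles, with quantization omitted. The names to_polar and to_cartesian are illustrative, not Janus API.

```python
import math

def to_polar(v):
    # Radius = Euclidean norm; then n-1 hyperspherical angles.
    r = math.sqrt(sum(x * x for x in v))
    angles = []
    tail = r * r
    for k in range(len(v) - 2):
        tail -= v[k] * v[k]
        # Angle between component k and the norm of everything after it.
        angles.append(math.atan2(math.sqrt(max(tail, 0.0)), v[k]))
    angles.append(math.atan2(v[-1], v[-2]))  # the last angle keeps the sign
    return r, angles

def to_cartesian(r, angles):
    # Invert: x_k = r * sin(phi_1)...sin(phi_{k-1}) * cos(phi_k).
    out, s = [], r
    for phi in angles:
        out.append(s * math.cos(phi))
        s *= math.sin(phi)
    out.append(s)
    return out

r, angles = to_polar([0.3, 0.4, 0.5])
assert all(abs(a - b) < 1e-12 for a, b in zip(to_cartesian(r, angles), [0.3, 0.4, 0.5]))
```

The round trip is exact up to floating-point error; the lossy step in a real polar[N] is the u8 quantization of each angle, which this sketch deliberately leaves out.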
The Numbers
PolarQuant (Wu et al., NeurIPS 2025) validated this approach empirically:
| Configuration | Cosine Similarity | Compression | Token Match |
|---|---|---|---|
| Polar, 4-bit angles | 0.990 | 3.8× | 100% |
| Polar, 2-bit angles | 0.883 | 10.4× | ~97% |
| Polar + QJL correction | 0.995 | 3.9× | 100% |
For applications that search, rank, or compare embeddings — semantic search, recommendation engines, trust graph similarity — polar storage is strictly better. The accuracy loss is negligible; the memory and compute savings are not.
The Ternary Compound: Multiplication-Free Inference
Here is where it gets architecturally interesting.
Microsoft’s BitNet b1.58 makes model weights ternary: {-1, 0, +1}. The forward pass becomes integer add/subtract — no floating-point multiplication. But BitNet has a blind spot: it doesn’t compress the attention KV cache. During inference, the attention state still accumulates as full-precision vectors in RAM.
PolarQuant fixes exactly this gap. And the interaction is compound, not additive.
BitNet’s i8 activations pre-quantize the inputs that PolarQuant needs to convert. Integer-to-polar conversion is cheaper than float-to-polar conversion. The precision loss from quantization is already baked in — polar conversion adds negligible additional error.
Three independent optimizations stack:
| Pipeline Stage | Technique | Multiplication? |
|---|---|---|
| Forward pass | BitNet ternary matvec() | No — integer add/subtract |
| KV storage | PolarQuant polar[N] | No — magnitude + angle storage |
| Similarity | Polar angular distance | No — angle subtraction |
The entire inference + retrieval pipeline is multiplication-free. No FPU needed. ARM NEON integer SIMD handles everything.
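The multiplication-free claim for the forward pass is easy to verify in miniature. A hedged Python sketch follows (not BitNet's kernel; real implementations pack weights into 2-bit groups and use integer SIMD): with weights restricted to {-1, 0, +1}, a matrix-vector product degenerates to integer adds and subtracts.

```python
def ternary_matvec(W, x):
    # W entries are restricted to {-1, 0, +1}, so each output row is just
    # a running sum of added and subtracted activations. No multiply anywhere.
    out = []
    for row in W:
        acc = 0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi
            elif w == -1:
                acc -= xi
            # w == 0 contributes nothing
        out.append(acc)
    return out

W = [[1, -1, 0], [0, 1, 1]]  # a tiny ternary weight matrix
x = [3, 5, 2]                # i8 activations
assert ternary_matvec(W, x) == [-2, 7]
```

The same property holds for the other two stages: polar similarity reduces to angle subtraction, and the KV cache stores angles rather than coordinates, so nothing in the loop touches the FPU.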
This isn’t a 2× speedup. It’s a category change in what hardware can run local inference.
What This Enables
On a Raspberry Pi 4 with ~50 MB usable RAM:
```
Standard (fp16 model + fp16 KV):
  Model:             ~1.4 GB
  KV at 2K context:  ~80 MB   → impossible

BitNet alone (ternary model + i8 KV):
  Model:             ~44 MB
  KV at 2K context:  ~80 MB   → barely fits
  KV at 4K context:  ~160 MB  → impossible

BitNet + PolarQuant (ternary model + polar KV):
  Model:             ~44 MB
  KV at 2K context:  ~13 MB   → comfortable
  KV at 4K context:  ~27 MB   → comfortable
  KV at 8K context:  ~53 MB   → feasible with swap
```

A solar-powered phone in Mombasa with 4 hours daily connectivity can now maintain 4× the conversational context on the same silicon. The cloud has infinite RAM; it only cares about one compression axis at a time. Constrained hardware cares about both — and that’s where the compound shines.
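The per-vector arithmetic behind budgets like these can be checked directly. A small Python sketch using the layout quoted earlier (an f16 radius plus 767 u8 angles for a 768-dim vector); the kv_bytes helper and the layer count in the assertion are illustrative assumptions, not the exact configuration behind the figures above.

```python
DIM = 768
f32_bytes   = DIM * 4              # 3072 B per Cartesian f32 vector
fp16_bytes  = DIM * 2              # 1536 B per fp16 vector
polar_bytes = 2 + (DIM - 1) * 1    # f16 radius + u8 per angle = 769 B

def kv_bytes(tokens, layers, per_vec):
    # One K and one V vector per token per layer; grows linearly with context.
    return tokens * layers * 2 * per_vec

# The ~4x compression claimed for polar[768] vs. Cartesian f32:
assert 3.9 < f32_bytes / polar_bytes < 4.1
# Doubling context doubles the cache, whatever the per-vector format:
assert kv_bytes(4096, 24, polar_bytes) == 2 * kv_bytes(2048, 24, polar_bytes)
```

The coarser PolarQuant modes in the table above (4-bit and 2-bit angles) shrink the per-vector figure further, which is where the larger KV savings come from.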
QJL: Compiler-Inserted Error Correction
Quantization introduces systematic bias in similarity computations. The Quantized Johnson-Lindenstrauss (QJL) corrector fixes this with a 1-bit random projection sketch — 16 bytes per embedding, regardless of dimensionality.
Janus makes this a compiler attribute:
```
// The programmer writes:
type CorrectedEmbedding = @qjl_corrected polar[768]
let sim = polar.cosine_similarity(a, b)

// The compiler emits (visible via `janus desugar`):
let raw_sim = polar.__raw_cosine_similarity(a, b)
let bias = qjl.__estimate_bias(a.__qjl_sketch, b.__qjl_sketch)
let sim = raw_sim - bias
```

@qjl_corrected is a ⟁ Delta transformation — invisible to the programmer, but visible to the auditor. This is Syntactic Honesty meets ergonomics: you don’t manually thread correction logic, but the transformation is never hidden. janus desugar shows you exactly what the compiler inserted.
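QJL's exact estimator is beyond a few lines, but the core idea of recovering similarity from a tiny 1-bit sketch can be illustrated with a SimHash-style construction in Python. This is an illustrative stand-in, not the QJL algorithm or the Janus qjl module; the 128 hyperplanes below match the 16-byte (128-bit) sketch size quoted above.

```python
import math, random

def one_bit_sketch(v, planes):
    # Keep only the sign of each random projection: 1 bit per hyperplane.
    return [sum(p * x for p, x in zip(plane, v)) >= 0.0 for plane in planes]

def estimate_cosine(sa, sb):
    # The fraction of disagreeing bits estimates the angle between the vectors.
    disagree = sum(a != b for a, b in zip(sa, sb))
    return math.cos(math.pi * disagree / len(sa))

random.seed(7)
dim, bits = 64, 128  # 128 bits = 16 bytes of sketch per embedding
planes = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]

a = [random.gauss(0, 1) for _ in range(dim)]
b = [x + 0.4 * random.gauss(0, 1) for x in a]  # a correlated neighbor

true = sum(x * y for x, y in zip(a, b)) / (
    math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))
est = estimate_cosine(one_bit_sketch(a, planes), one_bit_sketch(b, planes))
assert abs(est - true) < 0.2  # coarse, but from only 16 bytes per side
```

A sketch this small cannot replace the similarity computation, which is why the compiler uses it only to estimate and subtract bias rather than to compute sim itself.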
Beyond Embeddings: The Signal Paradigm
Polar embeddings are one instance of a broader principle: store meaning, not artifacts.
| Layer | Artifact (old) | Signal (new) |
|---|---|---|
| Data | Cartesian [N]f32 | polar[N] (direction + magnitude) |
| Rendering | Pixel framebuffers | Scene graph descriptions |
| Coordinates | Absolute pixels | rel[T] (relative values) |
| Transport | H.265 pixel streams | Scene delta streams |
| Textures | Pixel grids (PNG) | Procedural / neural signals |
The pixel framebuffer was the right abstraction for 1973 hardware. Cartesian embeddings were the right abstraction for 2015 frameworks. Neither is the right abstraction for a language designed in 2026 with honesty as a doctrine.
Janus’s :compute profile doesn’t just support these types — it enforces the paradigm through the type system. You cannot accidentally add two polar embeddings (compile error). You cannot feed pixels back into a signal pipeline (compile error). You cannot mix relative and absolute coordinates (compile error). The compiler prevents you from lying about what your data represents.
The Roadmap
| Phase | Scope | Status |
|---|---|---|
| Phase 0 | PolarEmbedding struct, full-precision conversion, naive similarity | Spec complete |
| Phase 1 | Quantized precision modes (u8, u4, u2), Lloyd-Max codebooks | Spec complete |
| Phase 2 | @qjl_corrected compiler attribute, bias correction | Spec complete |
| Phase 3 | SIMD acceleration, codebook table lookup, device-qualified impls | Spec complete |
| Phase 4 | polar[N] as first-class parser/sema type, QTJIR opcodes | Spec complete |
The specifications are published. Implementation begins with Phase 0 — which doubles as a Janus stdlib advancement target (std.math.trig doesn’t exist yet; building it unblocks all future :compute work).
Further Reading
- SPEC-070: Polar Embedding Primitives (internal)
- SPEC-055: Signal-First Rendering Primitives (internal)
- PolarQuant: Wu et al., “Leveraging Polar Transformation for Key Cache Quantization” — NeurIPS 2025
- BitNet b1.58: Ma et al., “The Era of 1-bit LLMs” — Microsoft Research, 2024
A CPU that never multiplies. An attention cache that stores angles instead of coordinates. A model whose weights are {-1, 0, +1}. Three independent optimizations that compound into something the cloud can’t match on price-per-watt.