std.text.peg
Elegance belongs to PEG. Regex is the knife.
std.text.peg provides first-class, typed PEGs (Parsing Expression Grammars) as compile-time-validated language constructs. PEGs are the elegant alternative to regex: readable, composable, and semantically rich.
Quick Example
```
let ipv4 : Peg[(u8, u8, u8, u8)] := peg do
  octet := digit{1..3} -> u8.parse($$)?
  dotted := octet "." octet "." octet "." octet
  main := dotted
end

match ipv4.parse("192.168.1.1") do
| .some((a, b, c, d)) => print("IP: ${a}.${b}.${c}.${d}")
| .none => print("Invalid")
end
```

Sigil Doctrine

| Sigil | Context | Meaning |
|---|---|---|
| `$1`, `$2`, … | `:script` pipelines | Positional capture access |
| `$$` | PEG semantic actions | Current matched value |
| `@` | Reserved | Future: metaprogramming |

`$$` is only valid inside PEG semantic actions. It is not part of the `:script` `$`-family.
Why PEG?
The Janus philosophy:
- Use regex when the pattern is short and local
- Use PEG when the pattern has names, structure, or meaning
- If a regex needs more than a few captures, it has probably become PEG-shaped
PEG grammars are:
- First-class values: assignable, passable, composable
- Typed at compile time: every grammar has a result type `Peg[T]`
- Human-readable: grammar rules read like documentation
- Composable: interpolate grammars into larger grammars
Syntax Reference
Rule Definition

```
rule_name : Type := expression
rule_name := expression -> transform
```

| Expression | Syntax | Description |
|---|---|---|
| Sequence | a b c | Match in order |
| Choice | a / b | First match |
| Repetition | a*, a+, a?, a{n,m} | Repeat |
| Literal | "hello" | Match exact string |
| Class | [a-z], [\p{Letter}] | Character set |
| Reference | rule_name | Another rule |
| Semantic action | expr -> transform | Transform result |
| Lookahead | &a, !a | Assert without consuming |
| Memoization | peg memo do | Enable packrat |
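
The table's "first match" rule for choice and "assert without consuming" rule for lookahead are the two behaviors that most often surprise people coming from regex. The following is a minimal Python sketch of those semantics, not Janus code and not how std.text.peg is implemented. The helper names (`lit`, `seq`, `choice`, `ahead`, `not_ahead`) are hypothetical and exist only for this illustration.

```python
from typing import Callable, Optional

# A parser takes (text, position) and returns the new position, or None on failure.
Parser = Callable[[str, int], Optional[int]]


def lit(s: str) -> Parser:
    """Literal: match an exact string at the current position."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None


def seq(*parsers: Parser) -> Parser:
    """Sequence: every parser must match, in order."""
    def run(text: str, pos: int) -> Optional[int]:
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            pos = result
        return pos
    return run


def choice(*parsers: Parser) -> Parser:
    """Ordered choice: the first alternative that matches wins; later ones are never tried."""
    def run(text: str, pos: int) -> Optional[int]:
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return run


def ahead(p: Parser) -> Parser:
    """Positive lookahead (&a): succeed only if p matches, but consume nothing."""
    return lambda text, pos: pos if p(text, pos) is not None else None


def not_ahead(p: Parser) -> Parser:
    """Negative lookahead (!a): succeed only if p fails, and consume nothing."""
    return lambda text, pos: pos if p(text, pos) is None else None


# Ordered choice commits to the first success, so a short literal listed first
# can hide a longer one that shares its prefix.
short_first = seq(choice(lit("http"), lit("https")), lit("://"))
long_first = seq(choice(lit("https"), lit("http")), lit("://"))

short_first_result = short_first("https://example", 0)  # "http" wins, then "://" fails
long_first_result = long_first("https://example", 0)    # "https" wins, then "://" matches
```

The practical consequence is that alternatives sharing a prefix are usually listed longest first, because a sequence that fails after a choice has succeeded does not go back and try the remaining alternatives.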
Semantic Value $$

Inside a semantic action, `$$` refers to the matched substring:

```
year := digit{4} -> u64.parse($$)?
```

Peg[T]
A grammar parameterized by its result type:

```
# Tuple result
let point : Peg[(i64, i64)] := peg do
  x := "-"? digit+ -> i64.parse($$)?
  y := "-"? digit+ -> i64.parse($$)?
  main := "(" x "," y ")"
end
```

Grammar Composition
Interpolate grammars with `${grammar}`:

```
let ipv4 : Peg[(u8, u8, u8, u8)] := peg do
  octet := digit{1..3} -> u8.parse($$)?
  main := octet "." octet "." octet "." octet
end

let url : Peg[Url] := peg do
  # Choice is ordered, so the longer literal comes first
  scheme := "https" / "http"
  host := ${ipv4} / hostname
  main := scheme "://" host
end
```

A grammar value exposes four entry points:

```
# Parse entire input
grammar.parse(input) -> T

# Partial match
grammar.partial_parse(input) -> ?T

# Find all matches
grammar.find_all(input) -> Iterator[T]

# Check if matches
grammar.is_match(input) -> bool
```

Typed String Literals
PEG enables user-extensible typed string literals:

```
# Define typed literal
let sql : Peg[SqlQuery] := peg do
  main := "SELECT" ...
end

# Use it: validated at COMPILE TIME
let query := sql"""
  SELECT name, email
  FROM users
  WHERE age > ${min_age}
"""
```

Built-in typed literals:

| Literal | Type |
|---|---|
| `sql"""..."""` | `Peg[SqlQuery]` |
| `json"""..."""` | `Peg[JsonValue]` |
| `kdl"""..."""` | `Peg[KdlValue]` |
| `toml"""..."""` | `Peg[TomlValue]` |
Features
Memoization (Opt-in)

```
# Enable packrat for an ambiguous grammar
let ambiguous : Peg[T] := peg memo do
  expr := term / expr "+" term
end
```

Unicode Categories

```
letter := [\p{Letter}]
number := [\p{Number}]
```

Position Tracking

Enabled by default. All parsed results include line, column, and byte offset.
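
To make the relationship between the three position fields concrete, here is a small Python sketch that derives line and column from a byte offset. It is an illustration only, not the std.text.peg implementation. The `Position` type and `position_at` function are hypothetical, and the conventions used (1-based lines and columns, columns counted in code points, offsets into the UTF-8 encoding) are assumptions that the page does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    line: int         # 1-based (assumed convention)
    column: int       # 1-based, counted in code points (assumed convention)
    byte_offset: int  # 0-based offset into the UTF-8 encoded input


def position_at(text: str, byte_offset: int) -> Position:
    """Derive line and column for a byte offset into the UTF-8 encoding of text."""
    data = text.encode("utf-8")
    if not 0 <= byte_offset <= len(data):
        raise ValueError("byte_offset out of range")
    # Decoding raises UnicodeDecodeError if the offset splits a multi-byte character.
    prefix = data[:byte_offset].decode("utf-8")
    line = prefix.count("\n") + 1
    # Characters since the last newline, plus one for the 1-based column.
    column = len(prefix) - (prefix.rfind("\n") + 1) + 1
    return Position(line, column, byte_offset)


start = position_at("ab\ncd", 0)
second_line = position_at("ab\ncd", 4)
after_multibyte = position_at("é\nx", 3)
```

Byte offsets and columns diverge as soon as the input contains multi-byte characters, which is why a parser that reports all three values has to track them separately.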
Examples
IPv4 Address Parser

```
let ipv4 : Peg[(u8, u8, u8, u8)] := peg do
  octet := [0-9]{1,3} -> u8.parse($$)?
  main := octet "." octet "." octet "." octet
end
```

CSV Parser

```
let csv : Peg[[String]] := peg do
  field := [^*,\n]+ -> String.new($$)
  line := field ("," field)*
  main := line ("\n" line)*
end
```

Composed URL Grammar
Section titled “Composed URL Grammar”let ipv4 := peg do octet := digit{1,3} -> u8.parse($$)? main := octet "." octet "." octet "." octetend
let url : Peg[Url] := peg do scheme := "http" / "https" host := ${ipv4} / hostname main := scheme "://" hostendComparison
Section titled “Comparison”| Feature | PEG | Regex |
|---|---|---|
| Readability | ✅ Grammar reads like docs | ❌ Line noise |
| Named captures | ✅ Natural syntax | ❌ (?<name>...) |
| Composability | ✅ Grammar interpolation | ❌ No |
| Typed results | ✅ -> T annotations | ❌ String only |
| Unicode | ✅ [\p{Letter}] | ⚠️ Limited |
| Memoization | ✅ Opt-in packrat | ❌ No |
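
The "opt-in packrat" row refers to caching each rule's result per input position so that backtracking never re-parses the same span. The Python sketch below illustrates that idea on a toy grammar. It is not Janus code, and the `ExprRecognizer` class and its grammar are hypothetical, chosen only because the second alternative of `expr` re-applies `term` at the same position.

```python
from typing import Dict, Optional, Tuple


class ExprRecognizer:
    """Recognizer for the toy grammar:
         expr := term "+" expr / term
         term := "(" expr ")" / [0-9]
    """

    def __init__(self, text: str, memo: bool) -> None:
        self.text = text
        # Packrat table keyed by (rule name, position); None disables caching.
        self.table: Optional[Dict[Tuple[str, int], Optional[int]]] = {} if memo else None
        self.calls = 0  # number of rule bodies actually evaluated

    def _apply(self, name: str, pos: int) -> Optional[int]:
        key = (name, pos)
        if self.table is not None and key in self.table:
            return self.table[key]
        self.calls += 1
        result = getattr(self, "_" + name)(pos)
        if self.table is not None:
            self.table[key] = result
        return result

    def _expr(self, pos: int) -> Optional[int]:
        # First alternative: term "+" expr
        end = self._apply("term", pos)
        if end is not None and self.text.startswith("+", end):
            rest = self._apply("expr", end + 1)
            if rest is not None:
                return rest
        # Second alternative: term, applied again at the same position
        return self._apply("term", pos)

    def _term(self, pos: int) -> Optional[int]:
        # First alternative: "(" expr ")"
        if self.text.startswith("(", pos):
            inner = self._apply("expr", pos + 1)
            if inner is not None and self.text.startswith(")", inner):
                return inner + 1
        # Second alternative: a single ASCII digit
        if pos < len(self.text) and self.text[pos] in "0123456789":
            return pos + 1
        return None

    def parse(self) -> bool:
        return self._apply("expr", 0) == len(self.text)


nested = "(" * 12 + "1" + ")" * 12
plain = ExprRecognizer(nested, memo=False)
cached = ExprRecognizer(nested, memo=True)
plain_ok, cached_ok = plain.parse(), cached.parse()
```

On deeply nested input the uncached recognizer roughly doubles its work with each nesting level, while the cached one evaluates each (rule, position) pair once. The cost is memory proportional to the number of rules times the input length, which is a common reason to make packrat opt-in.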
Next Steps
- Regex (SPEC-048): for terse patterns
- TextStream: pipeline algebra
- Script Profile: using PEG in pipelines
`$$` is to PEG what `$1` is to pipelines: semantic value, isolated domain.