Skip to content

std.text.peg

Elegance belongs to PEG. Regex is the knife.

std.text.peg provides first-class typed PEG (Parsing Expression Grammars) as compile-time validated language constructs. PEGs are the elegant alternative to regex — readable, composable, and semantically rich.


let ipv4 : Peg[(u8, u8, u8, u8)] := peg do
octet := digit{1..3} -> u8.parse($$)?
dotted := octet "." octet "." octet "." octet
main := dotted
end
match ipv4.parse("192.168.1.1") do
| .some((a, b, c, d)) => print("IP: ${a}.${b}.${c}.${d}")
| .none => print("Invalid")
end

SigilContextMeaning
$1, $2, …:script pipelinesPositional capture access
valuePEG semantic actionsCurrent matched value
@ReservedFuture: metaprogramming

$$ is only valid inside PEG semantic actions. It is not part of the :script $-family.


The Janus philosophy:

  • Use regex when the pattern is short and local
  • Use PEG when the pattern has names, structure, or meaning
  • If a regex needs more than a few captures, it has probably become PEG-shaped

PEG grammars are:

  • First-class values — Assignable, passable, composable
  • Typed at compile time — Every grammar has a result type Peg[T]
  • Human-readable — Grammar rules read like documentation
  • Composables — Interpolate grammars into larger grammars

rule_name : Type := expression
rule_name := expression -> transform
ExpressionSyntaxDescription
Sequencea b cMatch in order
Choicea / bFirst match
Repetitiona*, a+, a?, a{n,m}Repeat
Literal"hello"Match exact string
Class[a-z], [\p{Letter}]Character set
Referencerule_nameAnother rule
Semantic actionexpr -> transformTransform result
Lookahead&a, !aAssert without consuming
Memoizationpeg memo doEnable packrat

Inside a semantic action, value refers to the matched substring:

year := digit{4} -> u64.parse(value)?

A grammar parameterized by its result type:

# Tuple result
let point: Peg[(i64, i64)] := peg do
x := "-"? digit+ -> i64.parse($$)?
y := "-"? digit+ -> i64.parse($$)?
main := "(" x "," y ")"
end

Interpolate grammars with ${grammar}:

let ipv4 : Peg[(u8, u8, u8, u8)] := peg do
octet := digit{1..3} -> u8.parse($$)?
main := octet "." octet "." octet "." octet
end
let url : Peg[Url] := peg do
scheme := "http" / "https"
host := ${ipv4} / hostname
main := scheme "://" host
end

# Parse entire input
grammar.parse(input) -> T
# Partial match
grammar.partial_parse(input) -> ?T
# Find all matches
grammar.find_all(input) -> Iterator[T]
# Check if matches
grammar.is_match(input) -> bool

PEG enables user-extensible typed string literals:

# Define typed literal
let sql : Peg[SqlQuery] := peg do
main := "SELECT" ...
end
# Use itvalidates at COMPILE TIME
let query := sql"""
SELECT name, email
FROM users
WHERE age > ${min_age}
"""

Built-in typed literals:

LiteralType
sql"""..."""Peg[SqlQuery]
json"""..."""Peg[JsonValue]
kdl"""..."""Peg[KdlValue]
toml"""..."""Peg[TomlValue]

# Enable packrat for ambiguous grammar
let ambiguous : Peg[T] := peg memo do
expr := term / expr "+" term
end
letter := [\p{Letter}]
number := [\p{Number}]

Enabled by default. All parsed results include line, column, and byte offset.


let ipv4 : Peg[(u8, u8, u8, u8)] := peg do
octet := [0-9]{1,3} -> u8.parse($$)?
main := octet "." octet "." octet "." octet
end
let csv : Peg[[String]] := peg do
field := [^*,\n]+ -> String.new($$)
line := field ("," field)*
main := line ("\n" line)*
end
let ipv4 := peg do
octet := digit{1,3} -> u8.parse($$)?
main := octet "." octet "." octet "." octet
end
let url : Peg[Url] := peg do
scheme := "http" / "https"
host := ${ipv4} / hostname
main := scheme "://" host
end

FeaturePEGRegex
Readability✅ Grammar reads like docs❌ Line noise
Named captures✅ Natural syntax(?<name>...)
Composability✅ Grammar interpolation❌ No
Typed results-> T annotations❌ String only
Unicode[\p{Letter}]⚠️ Limited
Memoization✅ Opt-in packrat❌ No


Elegance belongs to PEG. Regex is the knife. $$ is to PEG what $1 is to pipelines — semantic value, isolated domain.