Match Text with std.text.rex
This tutorial walks through the shipped std.text.rex v1 surface. You will
compile a pattern, match input text, and reject unsupported syntax.
Time: 7 minutes
Level: Intermediate
Prerequisites: use imports, string slices, and basic i32 return codes.
Compile a Pattern
Import the module with an alias and compile a short pattern.

    use std.text.rex as rex

    pub func main() -> i32 do
        const digits = "\\d+"
        const h = rex.compile(digits)

        if h == 0 do
            return 1
        end

        return 0
    end

compile returns an opaque handle. A zero handle means the pattern is malformed
or outside the v1 grammar.
Match Input
Use isMatch when you already have a handle.

    if rex.isMatch(h, "invoice 123") == false do
        return 2
    end

    if rex.isMatch(h, "invoice") do
        return 3
    end

The matcher searches for the pattern anywhere in the input unless the pattern
uses anchors such as ^ and $.
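To make the search-versus-anchor behavior concrete, here is a minimal sketch that anchors the digit pattern at both ends. It uses only compile and isMatch as introduced above. The pattern ^\d+$ and the return codes 20 through 22 are chosen for this illustration and are not part of the repository smoke.

```janus
use std.text.rex as rex

pub func main() -> i32 do
    const whole = rex.compile("^\\d+$")
    if whole == 0 do return 20; end
    if rex.isMatch(whole, "123") == false do return 21; end
    if rex.isMatch(whole, "invoice 123") do return 22; end
    return 0
end
```

If the matcher behaves as described, the anchored handle should accept "123" and reject "invoice 123", while the unanchored \d+ handle from the previous step accepts both.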
Use Anchors and Classes
The v1 grammar supports literal characters, ., *, +, ?, ^, $,
character classes, negated classes, and the \d, \w, and \s escapes.
    const anchored = rex.compile("^h.*o$")
    if anchored == 0 do return 4; end
    if rex.isMatch(anchored, "hello") == false do return 5; end
    if rex.isMatch(anchored, "hello!") do return 6; end

    const letters = rex.compile("[a-c]+")
    if letters == 0 do return 7; end
    if rex.isMatch(letters, "xxbxx") == false do return 8; end

Reject Unsupported Syntax
std.text.rex is deliberately smaller than PCRE. Grouping, alternation,
backreferences, and uppercase shorthand escapes are not part of v1.
compile returns a zero handle for these, just as it does for a malformed
pattern such as an unclosed class.

    if rex.compile("[") != 0 do
        return 9
    end

    if rex.compile("(abc)") != 0 do
        return 10
    end

    if rex.compile("\\D") != 0 do
        return 11
    end

Use std.text.peg when the pattern needs structure, names, or
semantic meaning.
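For the uppercase shorthand case, a negated class can often stand in without leaving the v1 grammar. The sketch below is an illustration under two assumptions that the text above does not spell out: that negated classes use the conventional [^...] spelling, and that \d covers the ASCII digits 0 through 9. The return codes 30 through 32 are arbitrary.

```janus
use std.text.rex as rex

pub func main() -> i32 do
    const non_digits = rex.compile("^[^0-9]+$")
    if non_digits == 0 do return 30; end
    if rex.isMatch(non_digits, "invoice") == false do return 31; end
    if rex.isMatch(non_digits, "invoice 123") do return 32; end
    return 0
end
```

Under those assumptions the handle should accept input that contains no digits and reject input that contains any, which is roughly what ^\D+$ would express in a larger regex dialect.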
Use Typed Queries
If a command recipe is too small for your code path, switch to the typed stdlib layer. Raw bounded rex and exact literals are different query values:

    const raw = rex.raw("\\d+")
    const exact = rex.literal("call(foo)")

    const a = rex.evaluate(raw, "invoice 123")
    if a.valid == false do return 12; end
    if a.matched == false do return 13; end

    const b = rex.evaluate(exact, "x call(foo) y")
    if b.matched == false do return 14; end

Use require when invalid patterns and misses should become control flow:

    func has_invoice() -> !bool do
        const q = rex.raw("\\d+")
        const result = try rex.require(q, "invoice 123")
        return result.matched
    end

Use select[T] when the caller owns the output type:

    const status: i32 = rex.select[i32](raw, "invoice", 1, 0, -1)

Matching remains pure and capability-free. File walking and process execution stay in the capability-bearing layers.
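Returning to evaluate, the valid field is where an unsupported pattern would be expected to surface in the typed layer. The following sketch assumes that rex.raw accepts any pattern text and that evaluate reports a pattern outside the v1 grammar through valid rather than failing outright; neither point is confirmed by the text above, so treat it as a probe to run, not as documented behavior. The return code 40 is arbitrary.

```janus
use std.text.rex as rex

pub func main() -> i32 do
    const grouped = rex.raw("(abc)")
    const r = rex.evaluate(grouped, "abc")
    if r.valid do return 40; end
    return 0
end
```

If the assumption holds, this program returns 0, mirroring the zero handle that compile returns for the same grouped pattern.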
Search Files with rex
The rex command is the tool facet over the same bounded engine. It
starts with natural search phrases:

    rex "contains TODO" std
    rex "digits after invoice " logs
    rex "contains email address" src
    rex "between start and end" app.log
    rex "starts with error and ends with 500" app.log

Ask for the generated rex pattern when you want to inspect the lowering:

    rex explain "digits after invoice "
    rex --explain "digits after invoice "

Ask for the recipe list when you want the next layer:

    rex syntax

Raw regex and literal modes are explicit:

    rex regex "\\d+" std
    rex literal "call(foo)" src

Use --json when a shell pipeline, another tool, or a Janus std.command
caller should consume the results.
Use -l or --files-with-matches when a pipeline only needs matching paths.
The test-text-rex gate verifies this by capturing the built rex executable
from a Janus smoke through std.command.
Complete Program
    use std.text.rex as rex

    pub func main() -> i32 do
        const digits = "\\d+"
        const h = rex.compile(digits)
        if h == 0 do return 1; end
        if rex.isMatch(h, "invoice 123") == false do return 2; end
        if rex.isMatch(h, "invoice") do return 3; end

        const anchored = rex.compile("^h.*o$")
        if anchored == 0 do return 4; end
        if rex.isMatch(anchored, "hello") == false do return 5; end
        if rex.isMatch(anchored, "hello!") do return 6; end

        const letters = rex.compile("[a-c]+")
        if letters == 0 do return 7; end
        if rex.isMatch(letters, "xxbxx") == false do return 8; end

        if rex.compile("[") != 0 do return 9; end
        if rex.compile("(abc)") != 0 do return 10; end
        if rex.compile("\\D") != 0 do return 11; end

        return 0
    end

Run the repository smoke for the canonical version:

    cd janus
    ./scripts/zb test-text-rex
    ./scripts/zb test-rex

test-text-rex covers both the std.text.rex facade and std.command capture
of the standalone rex tool.
Boundary
Do not describe v1 as typed regex literals or capture extraction. Captures,
named fields, $1, $2, $*, and parser-literal validation for r/.../
belong in future SPEC-048 amendments.
Do not teach rex as PCRE hidden behind natural language. Natural phrases lower
only to the bounded v1 rex grammar, and rex explain is the audit path for that
lowering.