Module lexer


Two-stage lexer for LOGOS natural language input.

The lexer transforms natural language text into a token stream suitable for parsing. It operates in two stages:

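The LineLexer (Stage 1) handles structural concerns: lines, indentation, and the structural tokens Indent, Dedent, and Newline. All other text is passed through as opaque content.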

The Lexer (Stage 2) performs word-level tokenization, classifying the words inside that content against the lexicon.

When a word matches multiple lexicon entries, a priority order determines which token is emitted.

Input:  "Every cat sleeps."
Output: [Quantifier("every"), Noun("cat"), Verb("sleeps"), Period]
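
The word-level stage can be illustrated with a minimal sketch. Everything here is hypothetical: the `Token` enum, the toy lexicon, and in particular the priority order (quantifier, then verb, then noun) are assumptions made for illustration, not the crate's actual types or rules.

```rust
// Hypothetical token type; the crate's real token enum is not shown on this page.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Quantifier(String),
    Noun(String),
    Verb(String),
    Period,
    Unknown(String),
}

// Toy lexicon lookup. The checks run in an ASSUMED priority order
// (quantifier > verb > noun); the first category that matches wins.
fn classify(word: &str) -> Token {
    const QUANTIFIERS: &[&str] = &["every", "some", "no"];
    const VERBS: &[&str] = &["sleeps", "runs"];
    // "runs" is listed as both verb and noun to exercise the priority rule.
    const NOUNS: &[&str] = &["cat", "dog", "runs"];

    let w = word.to_lowercase();
    if QUANTIFIERS.contains(&w.as_str()) {
        Token::Quantifier(w)
    } else if VERBS.contains(&w.as_str()) {
        Token::Verb(w)
    } else if NOUNS.contains(&w.as_str()) {
        Token::Noun(w)
    } else {
        Token::Unknown(w)
    }
}

// Splits on whitespace, peels a trailing period off each word,
// and classifies what remains.
fn tokenize(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    for raw in input.split_whitespace() {
        let (word, has_period) = match raw.strip_suffix('.') {
            Some(stem) => (stem, true),
            None => (raw, false),
        };
        if !word.is_empty() {
            tokens.push(classify(word));
        }
        if has_period {
            tokens.push(Token::Period);
        }
    }
    tokens
}
```

Calling `tokenize("Every cat sleeps.")` on this sketch yields the same four tokens as the example above. The ambiguous word "runs" resolves to `Verb` only because of the order chosen here; the real lexer may rank categories differently.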

Structs

Lexer
LineLexer: Stage 1 Lexer: Handles only lines, indentation, and structural tokens. Treats all other text as opaque Content for the Stage 2 WordLexer.

LexerMode
LineToken: Tokens emitted by the LineLexer (Stage 1). Handles structural tokens (Indent, Dedent, Newline) while treating all other content as opaque for Stage 2 word classification.
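
To make the Stage 1 behavior concrete, here is a minimal sketch of an indentation-aware line lexer. It is not the crate's implementation: `LineTok`, its `Content` variant, and `lex_lines` are hypothetical names, and the sketch assumes indentation is measured in leading spaces and that blank lines are skipped.

```rust
// Hypothetical stand-in for LineToken. Indent, Dedent, and Newline come from
// the description above; the Content variant's shape is an assumption.
#[derive(Debug, Clone, PartialEq)]
enum LineTok {
    Indent,
    Dedent,
    Newline,
    Content(String),
}

// Emits structural tokens from leading-space indentation and passes the rest
// of each line through untouched as opaque Content.
fn lex_lines(source: &str) -> Vec<LineTok> {
    let mut tokens = Vec::new();
    // Stack of active indentation widths; the base level 0 is never popped.
    let mut stack: Vec<usize> = vec![0];

    for line in source.lines() {
        // Only spaces count as indentation in this sketch.
        let text = line.trim_start_matches(' ');
        if text.trim().is_empty() {
            continue; // assumption: blank lines carry no structure
        }
        let width = line.len() - text.len();

        if width > *stack.last().unwrap() {
            stack.push(width);
            tokens.push(LineTok::Indent);
        } else {
            // Close every block that is deeper than the current line.
            while width < *stack.last().unwrap() {
                stack.pop();
                tokens.push(LineTok::Dedent);
            }
        }

        tokens.push(LineTok::Content(text.trim_end().to_string()));
        tokens.push(LineTok::Newline);
    }

    // Close any blocks still open at end of input.
    while stack.len() > 1 {
        stack.pop();
        tokens.push(LineTok::Dedent);
    }
    tokens
}
```

In this sketch each `Content` payload would then be handed to the word-level stage, which keeps indentation handling and word classification independent of each other.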