[antlr-interest] lexical modes
Terence Parr
parrt at cs.usfca.edu
Wed Jun 7 14:01:39 PDT 2006
Hi, consider matching strings in the lexer. That is pretty easy in
ANTLR because lexer rules can reference other rules:
STRING : '"' (ESC | .)* '"' ;
ESC : ... ;
What if you want the lexer, though, to return a stream of tokens
chosen from a different set between square brackets, such as when
recognizing regular expressions? Inside [...], '(' is just a
character, not a grouping symbol. Rather than creating and switching
to a new lexer every time you see a '[', perhaps good old lexical
modes from lex are the right idea.
grammar regex;
expr : atom | range | ebnf | ... ;
range : LBRACK (CHAR | CHAR DASH CHAR)+ RBRACK ;
LBRACK : '[' {pushMode(inside_brackets);} ;
mode inside_brackets;
CHAR : ... ;
DASH : '-' ;
RBRACK : ']' {popMode();} ;
Something like that. Does it make sense to add? ANTLR could simply
switch on the current mode when it enters nextToken() and jump to the
appropriate set of lexical rules.
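
To make the switch-on-mode idea concrete, here is a minimal sketch in Python. It is not ANTLR code and not a proposed implementation. The names ModalLexer, RULES, next_token and tokenize are all hypothetical, and the token rules are simplified stand-ins for the elided CHAR rule above. It only shows a mode stack selecting which rule set next_token() consults.

```python
import re

DEFAULT = "default"
INSIDE_BRACKETS = "inside_brackets"

# Hypothetical rule tables, one per mode. Each entry is
# (token type, compiled pattern, optional mode action).
RULES = {
    DEFAULT: [
        ("LBRACK", re.compile(r"\["), ("push", INSIDE_BRACKETS)),
        ("LPAREN", re.compile(r"\("), None),
        ("RPAREN", re.compile(r"\)"), None),
        ("CHAR", re.compile(r"[^\[\]()]"), None),
    ],
    INSIDE_BRACKETS: [
        ("RBRACK", re.compile(r"\]"), ("pop",)),
        ("DASH", re.compile(r"-"), None),
        # Inside brackets, '(' is an ordinary character.
        ("CHAR", re.compile(r"[^\]]"), None),
    ],
}


class ModalLexer:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.modes = [DEFAULT]  # mode stack; the top entry is the active mode

    def next_token(self):
        if self.pos >= len(self.text):
            return ("EOF", "")
        # "Switch on mode": only the active mode's rules are tried.
        for token_type, pattern, action in RULES[self.modes[-1]]:
            match = pattern.match(self.text, self.pos)
            if match is None:
                continue
            self.pos = match.end()
            if action is not None and action[0] == "push":
                self.modes.append(action[1])
            elif action is not None and action[0] == "pop":
                self.modes.pop()
            return (token_type, match.group())
        raise SyntaxError(
            "no rule matches %r in mode %s"
            % (self.text[self.pos], self.modes[-1])
        )


def tokenize(text):
    """Collect all tokens up to (not including) EOF."""
    lexer = ModalLexer(text)
    tokens = []
    while True:
        token = lexer.next_token()
        if token[0] == "EOF":
            return tokens
        tokens.append(token)
```

With these rules, tokenize("(a)[(a-z]") yields LPAREN for the first '(' but CHAR for the '(' inside the brackets, because the second one is matched while inside_brackets is on top of the mode stack.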
Ter