[antlr-interest] lexical modes

Loring Craymer craymer at warpiv.com
Wed Jun 7 16:59:16 PDT 2006


Another +1.  That's a PCCTS feature that I've had occasion to miss, and it
would be good to have it back.

I would suggest using something similar to the PCCTS approach for assigning
rules to modes:  in PCCTS, "#lexclass" <mode> made the lexer rules that
followed it part of <mode>.  Let <mode> be a list of modes, find a good
name to replace "lexclass", and I think that the syntax would be
convenient.

--Loring

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Terence Parr
> Sent: Wednesday, June 07, 2006 2:02 PM
> To: ANTLR Interest
> Subject: [antlr-interest] lexical modes
> 
> Hi, consider matching strings in the lexer.  It's pretty easy in
> ANTLR as you can make rule references:
> 
> STRING : '"' (ESC | .)* '"' ;
> ESC : ... ;
> 
> What if, though, you want the lexer to return a stream of tokens
> chosen from a different set in between square brackets, such as when
> recognizing regular expressions?  Inside [...] you can refer to '('
> as just a char, not a grouping symbol.  Rather than creating and
> switching to a new lexer every time you see a '[', perhaps good old
> lexical modes from lex are the right idea.
> 
> grammar regex;
> 
> expr : atom | range | ebnf | ... ;
> 
> range : LBRACK (CHAR | CHAR DASH CHAR)+ RBRACK ;
> 
> LBRACK : '[' {pushMode(inside_brackets);} ;
> 
> mode inside_brackets;
> 
> CHAR : ... ;
> DASH : '-' ;
> RBRACK : ']' {popMode();} ;
> 
> Something like that... does it make sense to add?  ANTLR can just
> switch on the current mode when it enters nextToken() to jump to the
> appropriate set of lexical rules.
> 
> Ter
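
For anyone who wants to see the mode-stack idea run, here is a minimal
Python sketch. It is not ANTLR code and not how ANTLR would generate a
lexer. The RULES table, the ModalLexer class, and the extra LPAREN and
RPAREN tokens are all made up for illustration. It only shows
next_token() picking its rule set from the mode on top of a stack, with
'[' pushing a mode and ']' popping it, as in the regex example above.

```python
import re

# Hypothetical rule tables (not ANTLR syntax): mode -> ordered rules.
# Each rule is (token type, regex, action); action is None,
# ("push", mode_name), or ("pop",).
RULES = {
    "default": [
        ("LBRACK", r"\[", ("push", "inside_brackets")),
        ("LPAREN", r"\(", None),
        ("RPAREN", r"\)", None),
        ("CHAR", r"[^\[\]()]", None),
    ],
    "inside_brackets": [
        ("RBRACK", r"\]", ("pop",)),
        ("DASH", r"-", None),
        ("CHAR", r"[^\]]", None),  # '(' is just a char in this mode
    ],
}


class ModalLexer:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.modes = ["default"]  # mode stack; the top is the active mode

    def push_mode(self, mode):
        self.modes.append(mode)

    def pop_mode(self):
        self.modes.pop()

    def next_token(self):
        if self.pos >= len(self.text):
            return ("EOF", "")
        # Switch on the current mode to choose the set of lexical rules.
        for ttype, pattern, action in RULES[self.modes[-1]]:
            m = re.compile(pattern).match(self.text, self.pos)
            if m:
                self.pos = m.end()
                if action is not None:
                    if action[0] == "push":
                        self.push_mode(action[1])
                    else:
                        self.pop_mode()
                return (ttype, m.group())
        raise SyntaxError(
            "no rule in mode %r matches at position %d"
            % (self.modes[-1], self.pos)
        )


def tokenize(text):
    """Collect all tokens up to (not including) EOF."""
    lexer = ModalLexer(text)
    tokens = []
    while True:
        tok = lexer.next_token()
        if tok[0] == "EOF":
            return tokens
        tokens.append(tok)


if __name__ == "__main__":
    for tok in tokenize("(a)[(-z]"):
        print(tok)
```

The same '(' comes back as LPAREN outside the brackets and as CHAR
inside them, with no second lexer object involved. Using a stack rather
than a single current-mode variable is what lets popMode() return to
whatever mode was active before the push.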



