[antlr-interest] Lexer nondeterminism

Gunnar Wagenknecht kreismeister at yahoo.de
Sat Mar 8 05:10:05 PST 2003


Hi!

I have a (maybe simple) problem with my lexer. I created a simple 
grammar which compiles with no warnings. Now I want to enhance the 
lexer to recognize special "macro" text, but I currently have no 
idea how to do this without introducing nondeterminism.

Currently I have the following rules, which might affect the macro 
rules. The language is case insensitive and the lookahead is k=2.

IDENTIFIER   : ('a'..'z'|'_')('a'..'z'|'_'|'0'..'9')*;
COMMENT_LINE : ("//" | "&&") (~('\n'))*;
DOT          : '.'
               ( "and." {$setType(AND);}
               | "or." {$setType(OR);}
               | "not." {$setType(NOT);}
               | ("t."|"y.") {$setType(TRUE);}
               | ("f."|"n.") {$setType(FALSE);}
               )?;
MACROOP      : '&';

I need to recognize the following tokens:
MACROVAR  : '&' IDENTIFIER ('.')?;
MACROTEXT : ( '&' 
              IDENTIFIER 
              '.' 
              ('a'..'z'|'_'|'0'..'9')+ 
            )
          | ( IDENTIFIER 
              ( '&' 
                IDENTIFIER 
                ( '.' 
                  ('a'..'z'|'_'|'0'..'9')+
                )?
              )+
            )
          ;
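
To make the intended token shapes concrete, here is a minimal sketch that transcribes the MACROVAR and MACROTEXT rules above into Python regular expressions. It is not ANTLR code and not a solution for the lexer; the names IDENT, SUFFIX and classify are made up for illustration. It only shows which inputs should end up as which token, and that "&abc." and "&abc.def" share a prefix longer than k=2 characters, which is presumably where a fixed-lookahead lexer runs into trouble.

```python
import re

# Character classes copied from the rules above. The language is case
# insensitive, so the patterns are compiled with re.IGNORECASE.
IDENT = r"[a-z_][a-z_0-9]*"   # IDENTIFIER
SUFFIX = r"[a-z_0-9]+"        # ('a'..'z'|'_'|'0'..'9')+

MACROVAR = re.compile(rf"&{IDENT}\.?", re.IGNORECASE)
MACROTEXT = re.compile(
    rf"&{IDENT}\.{SUFFIX}"                      # first alternative
    rf"|{IDENT}(?:&{IDENT}(?:\.{SUFFIX})?)+",   # second alternative
    re.IGNORECASE,
)

def classify(text):
    """Return the token name whose pattern matches all of `text`, or None."""
    # MACROTEXT is tried first because it is the longer, more specific shape.
    if MACROTEXT.fullmatch(text):
        return "MACROTEXT"
    if MACROVAR.fullmatch(text):
        return "MACROVAR"
    return None
```

For example, classify("&abc.") gives "MACROVAR" and classify("&abc.def") gives "MACROTEXT", while classify("abc &def") gives None because of the embedded space.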

The MACROVAR is easy because I can extend the MACROOP rule.

MACROOP : '&' ( IDENTIFIER ('.'!)? {$setType(MACROVAR);} )?;
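
As a rough illustration of what this extended rule does (a Python sketch, not ANTLR; scan_macroop and MACROOP_RULE are hypothetical names, and the "&&" comment rule is ignored here), a bare '&' stays a MACROOP, while '&' followed by an identifier becomes a MACROVAR whose text omits the optional trailing dot, as the '!' suffix requests:

```python
import re

IDENT = r"[a-z_][a-z_0-9]*"   # IDENTIFIER, matched case-insensitively

# '&' optionally followed by an identifier and an optional trailing dot.
MACROOP_RULE = re.compile(rf"&(?:({IDENT})(\.)?)?", re.IGNORECASE)

def scan_macroop(source, pos=0):
    """Return (token_type, token_text, next_pos) for a token at `pos`, or None."""
    m = MACROOP_RULE.match(source, pos)
    if m is None:
        return None                      # no '&' at this position
    if m.group(1) is None:
        return ("MACROOP", "&", m.end())
    # The dot is consumed but kept out of the token text (the '!' suffix).
    return ("MACROVAR", "&" + m.group(1), m.end())
```

With this sketch, scanning "&abc.def" yields a MACROVAR with text "&abc" and stops at position 5, leaving "def" behind, which is exactly the case the MACROTEXT rule is meant to cover.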

But how do I compose the MACROTEXT rule? I can't recognize it in the 
parser because whitespace is not allowed inside the macro tokens.

Does anyone have a good idea?

Thanks, KM



 
