[antlr-interest] antlr-interest Digest, Vol 27, Issue 48
Martin d'Anjou
martin.danjou at neterion.com
Mon Feb 26 06:46:03 PST 2007
>> lexer grammar DUMMY_Lexer;
>> options { filter=true; }
>>
>> INT : 'int' ;
>> SEMI : ';' ;
>> WS : ( ' '| '\t'| '\r' | '\n' )+ {$channel=HIDDEN;} ;
>> IDENTIFIER : ('a'..'z'|'A'..'Z'|'_')+;
>
>Why are you using the filter option? This option causes ANTLR to try the
>tokens one-by-one. It continues at the next token if the current token
>does not match. So on the input 'intt' it will match an INT token first,
>followed by the IDENTIFIER 't'. When you remove the filter option, it
>should match a single IDENTIFIER token.
I guess the real reason is that I am lazy. I did not want to tokenize
everything contained in the input (I could have used the skip feature,
but I was too lazy for that too!).
I still don't understand why the lexer would break a token in the middle
of text that a single rule (IDENTIFIER here) can match in full, or what
that has to do with filter=true. Perhaps an example would help me get that.
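For what it's worth, here is my rough guess at what the explanation above means, written as a toy Python model rather than anything ANTLR actually runs. The RULES table and the tokenize function are names I made up for illustration. The assumption is that with filter=true the rules are tried in grammar order and the first one that matches wins, while without it the lexer behaves roughly like a longest-match lexer:

```python
import re

# Toy model of the rules quoted above, in grammar order.
# RULES and tokenize are hypothetical names, not part of ANTLR.
RULES = [
    ("INT", re.compile(r"int")),
    ("SEMI", re.compile(r";")),
    ("WS", re.compile(r"[ \t\r\n]+")),
    ("IDENTIFIER", re.compile(r"[a-zA-Z_]+")),
]

def tokenize(text, first_match_wins):
    """Return (rule_name, lexeme) pairs for text.

    first_match_wins=True  : assumed filter=true behaviour (rule order decides).
    first_match_wins=False : assumed default behaviour (longest match decides).
    """
    tokens = []
    pos = 0
    while pos < len(text):
        # Collect every rule that matches at the current position.
        candidates = []
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            if m:
                candidates.append((name, m.group()))
        if not candidates:
            # Simplification: skip the character in both modes.
            # A real non-filter lexer would report an error here.
            pos += 1
            continue
        if first_match_wins:
            name, lexeme = candidates[0]
        else:
            # On a tie, max() keeps the earlier rule in grammar order.
            name, lexeme = max(candidates, key=lambda c: len(c[1]))
        tokens.append((name, lexeme))
        pos += len(lexeme)
    return tokens

print(tokenize("intt", first_match_wins=True))
print(tokenize("intt", first_match_wins=False))
```

If this guess is right, the first call splits the input into INT 'int' followed by IDENTIFIER 't', because INT is listed before IDENTIFIER and gets the first try. The second call yields a single IDENTIFIER 'intt', because the longer match is preferred.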
Cheers,
Martin