[antlr-interest] how to define these characters(java)

Martin Probst mail at martin-probst.com
Tue Jan 18 13:29:03 PST 2005


Hi,

On Monday, 17.01.2005, at 15:04 -0500, Nan Zhang wrote:

> class L extends Lexer;
> PLUS : '+' ;
> NEWLINE : '\r''\n'|'\n' ;
> NON : ~('+'|'\r''\n'|'\n') {$setType(Token.SKIP); }//and other actions
>  ;
> ------------
> but I got an error message: This subrule cannot be inverted. Only
> subrules of the form:
>     (T1|T2|T3...) or
>     ('c1'|'c2'|'c3'...)
> may be inverted (ranges are also allowed).
> Exiting due to errors.

The simple solution is to write:
NON : ~ ( '+' | '\r' | '\n' ) { $set... }

You can't invert a group in which any alternative is longer than one
character (this is a limitation of the underlying ANTLR algorithm); here
'\r''\n' is a two-character sequence. With k=1 in the lexer you will get
ambiguities because the \r may be matched either by NEWLINE or by NON.
To fix that, remove the \r from the NEWLINE rule or increase k to 2.
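
Put together, the lexer with the k=2 variant might look roughly like the
sketch below. It is untested and only illustrates the points above. The
charVocabulary line is an assumption that is not part of the original
question: in ANTLR 2 the ~ operator complements against the lexer's
character vocabulary, so a range is usually declared explicitly.

```antlr
class L extends Lexer;
options {
    k = 2;                            // two characters of lookahead, as suggested above
    charVocabulary = '\3'..'\377';    // assumption: gives ~( ... ) a set to complement against
}

PLUS    : '+' ;
NEWLINE : '\r' '\n' | '\n' ;

// Only single characters inside the inverted group; the skip action
// is the one from the original question.
NON     : ~( '+' | '\r' | '\n' ) { $setType(Token.SKIP); }
        ;
```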

Regards,
Martin


