[antlr-interest] White spaces not allowed
Gavin Lambert
antlr at mirality.co.nz
Mon Jan 12 12:19:45 PST 2009
At 08:53 13/01/2009, Dominic Tardif wrote:
>Hello everyone! I've been working on this grammar for quite some
>time now, and it works quite well except for one little detail:
>white spaces are not allowed.
[...]
>stmt: ID ' ' function_id STMT_END -> ^(STMT ID function_id)
> | ID '=' expr STMT_END -> ^('=' ID expr)
> | NEWLINE ->
> ;
Your grammar is expecting to see NEWLINE tokens...
[...]
>NEWLINE: ('\r'? '\n')+;
>WS: (' '|'\t'|'\r'|'\n')+ {skip();};
... but your NEWLINE and WS tokens overlap, such that if there is
any WS before (or possibly even after) a newline then the newline
will be consumed and skipped without generating a NEWLINE token.
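The overlap can be illustrated with a toy model. This is only a sketch, not ANTLR's actual lexer algorithm: it assumes a simple "longest match wins, ties go to the earliest rule" policy, and the WORD rule and the `tokenize` helper are made up for the illustration.

```python
import re

def tokenize(text, rules):
    """Toy lexer: longest match wins; ties go to the earliest rule.

    An assumed simplification, not ANTLR's real prediction logic.
    """
    tokens, pos = [], 0
    while pos < len(text):
        best = None  # (name, skip, length)
        for name, pattern, skip in rules:
            m = pattern.match(text, pos)
            if m and m.end() > pos and (best is None or m.end() - pos > best[2]):
                best = (name, skip, m.end() - pos)
        if best is None:
            raise ValueError(f"no rule matches at position {pos}")
        if not best[1]:  # skipped tokens never reach the parser
            tokens.append(best[0])
        pos += best[2]
    return tokens

# Rules as in the quoted grammar: WS also matches '\r' and '\n'.
OVERLAPPING = [
    ("NEWLINE", re.compile(r"(\r?\n)+"), False),
    ("WS", re.compile(r"[ \t\r\n]+"), True),
    ("WORD", re.compile(r"[^ \t\r\n]+"), False),  # hypothetical catch-all
]

# One possible fix (an assumption, not from the original post):
# WS no longer matches newline characters.
DISJOINT = [
    ("NEWLINE", re.compile(r"(\r?\n)+"), False),
    ("WS", re.compile(r"[ \t]+"), True),
    ("WORD", re.compile(r"[^ \t\r\n]+"), False),
]

text = "x;\ny; \nz;"
print(tokenize(text, OVERLAPPING))  # the newline after "y; " is swallowed by WS
print(tokenize(text, DISJOINT))     # every newline produces a NEWLINE token
```

In this model the trailing space before the second newline is enough for WS to consume the newline too, so the parser never sees a NEWLINE token there.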
Having said that, I'm not entirely sure why you are using NEWLINE
tokens in your parser; in most cases it looks like it's optional
anyway, so it seems like it could just be removed (though you
might need to change some 'stmt+'s to 'stmt*'s as well).
That's not the real problem, though. The real problem is that
quoted space you have in the stmt rule above. Whenever you use a
quoted literal in a parser rule, it effectively creates a new
lexer rule -- so you then have two lexer rules representing
spaces; one that represents exactly one space and one that
represents multiple spaces, tabs, and newlines. The two are going
to fight. Just remove this space (it shouldn't be necessary
anyway) and it should behave.
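A rough sketch of why the two space rules conflict, again as a toy model rather than ANTLR's real behaviour. It assumes "longest match wins, ties go to the earliest rule" and that the implicit literal rule comes before WS. T_SPACE is a hypothetical name for the lexer rule generated from the quoted ' ', and WORD is a made-up catch-all.

```python
import re

# (name, pattern, skip) in assumed rule order.
RULES = [
    ("T_SPACE", re.compile(r" "), False),         # implicit rule from the ' ' literal
    ("WS", re.compile(r"[ \t\r\n]+"), True),      # explicit rule, skipped
    ("WORD", re.compile(r"[^ \t\r\n]+"), False),  # hypothetical catch-all
]

def tokenize(text, rules=RULES):
    """Toy lexer: longest match wins; ties go to the earliest rule."""
    tokens, pos = [], 0
    while pos < len(text):
        best = None  # (name, skip, length)
        for name, pattern, skip in rules:
            m = pattern.match(text, pos)
            if m and m.end() > pos and (best is None or m.end() - pos > best[2]):
                best = (name, skip, m.end() - pos)
        if best is None:
            raise ValueError(f"no rule matches at position {pos}")
        if not best[1]:  # skipped tokens never reach the parser
            tokens.append(best[0])
        pos += best[2]
    return tokens

print(tokenize("a = 1;"))    # single spaces become T_SPACE tokens
print(tokenize("a  =  1;"))  # runs of two spaces are skipped as WS
```

Under these assumptions a single space reaches the parser as a token, which an alternative like `ID '=' expr` does not expect, while a longer run of whitespace is silently skipped. With the quoted space removed from stmt, only WS is left and all whitespace is skipped uniformly.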