[antlr-interest] White spaces not allowed
Dominic Tardif
Dominic.Tardif at USherbrooke.ca
Mon Jan 12 12:59:12 PST 2009
Quoting Gavin Lambert <antlr at mirality.co.nz>:
> At 08:53 13/01/2009, Dominic Tardif wrote:
> >Hello everyone! I've been working on this grammar for quite some
> >time now, and it works quite well except for one little detail:
> >white spaces are not allowed.
> [...]
> >stmt: ID ' ' function_id STMT_END -> ^(STMT ID function_id)
> > | ID '=' expr STMT_END -> ^('=' ID expr)
> > | NEWLINE ->
> > ;
>
> Your grammar is expecting to see NEWLINE tokens...
>
> [...]
> >NEWLINE: ('\r'? '\n')+;
> >WS: (' '|'\t'|'\r'|'\n')+ {skip();};
>
> ... but your NEWLINE and WS tokens overlap, such that if there is
> any WS before (or possibly even after) a newline then the newline
> will be consumed and skipped without generating a NEWLINE token.
>
> Having said that, I'm not entirely sure why you are using NEWLINE
> tokens in your parser; in most cases it looks like it's optional
> anyway, so it seems like it could just be removed (though you
> might need to change some 'stmt+'s to 'stmt*'s as well).
>
> That's not the real problem, though. The real problem is that
> quoted space you have in the stmt rule above. Whenever you use a
> quoted literal in a parser rule, it effectively creates a new
> lexer rule -- so you then have two lexer rules representing
> spaces; one that represents exactly one space and one that
> represents multiple spaces, tabs, and newlines. The two are going
> to fight. Just remove this space (it shouldn't be necessary
> anyway) and it should behave.
OK, I've removed the NEWLINE token and changed stmt+ to stmt*, and it works just
fine, except that I want to be able to support the ' ' operator, which acts
like a '*'. If I didn't support it, I don't think I would have had any
problems. Is there a way to have white space act as an operator as well? If
not, I'll just have to remove it. Thanks again for all your help! ^_^
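
For reference, here is a rough sketch of the grammar after the changes
discussed above (NEWLINE removed, the quoted ' ' dropped from stmt, stmt+
relaxed to stmt*), assuming ANTLR 3 syntax. The grammar name Sketch, the prog
rule, and the bodies of function_id, expr, STMT_END and ID are placeholders
made up for illustration; the real definitions were elided from the quoted
message and are not shown here.

```
grammar Sketch;                 // hypothetical grammar name

options { output = AST; }

tokens { STMT; }                // imaginary token used as an AST root

// Assumed entry rule; the original start rule is not shown in the thread.
prog : stmt* ;

// The quoted ' ' between ID and function_id is gone, so the only lexer
// rule that matches spaces is WS below.
stmt : ID function_id STMT_END -> ^(STMT ID function_id)
     | ID '=' expr STMT_END    -> ^('=' ID expr)
     ;

// Placeholder bodies, for illustration only.
function_id : ID ;
expr        : ID ;

STMT_END : ';' ;                // assumed terminator
ID       : ('a'..'z' | 'A'..'Z')+ ;

// A single whitespace rule, taken from the original post. Its tokens are
// skipped, so they never reach the parser.
WS : (' ' | '\t' | '\r' | '\n')+ { skip(); } ;
```

Because WS is skipped, the parser never sees a space token in this
arrangement, so white space cannot serve as an explicit operator here; the
sketch only shows the working state described above.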