[antlr-interest] distinction between newline and ws

Clifford Heath clifford.heath at gmail.com
Sat Oct 20 17:45:09 PDT 2007


Sven Busse wrote:
> something. The grammar looks like this:
> grammar Expr;
...
> NEWLINE     :     '\r'? '\n';
> WS    :     (' '|'\t'|'\n'|'\r')+ {skip();};
> My question now is, how does ANTLR know that “\n” should match a
> NEWLINE instead

I have some understanding of this, but I'm also struggling with
a similar case.

Your grammar parses the string "a=3\n" just fine, but the string
"a=3 \n" (with a space before the newline) doesn't parse... why?

If I redefine NEWLINE as (' '|'\t')* '\r'? '\n'
then neither form parses... why?

> I would have thought this grammar is ambiguous, but apparently it isn’t. Why not?

Lexer rules are often ambiguous, and every lexer rule is a candidate
at every point in the input, regardless of which grammar rule the
parser is currently in (this is the answer to Simon West's question).
The lexer is supposed to choose the longest token that matches the
current input... but I don't see that principle applying here.
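
To make the longest-match principle concrete, here is a minimal Python
sketch of an idealized lexer that always takes the longest match and
gives ties to the rule listed first. This is only a model of the
principle, not ANTLR's actual lexer algorithm. NEWLINE and WS come from
the quoted grammar; ID, INT and EQ are assumed stand-ins for the elided
rules, and tokenize is a hypothetical helper.

```python
import re

# Rules in grammar order; on equal length the earlier rule wins.
# ID, INT and EQ are assumptions standing in for the elided grammar rules.
RULES = [
    ("ID", re.compile(r"[a-zA-Z]+")),
    ("INT", re.compile(r"[0-9]+")),
    ("EQ", re.compile(r"=")),
    ("NEWLINE", re.compile(r"\r?\n")),
    ("WS", re.compile(r"[ \t\n\r]+")),
]

# Same rules, with NEWLINE redefined as (' '|'\t')* '\r'? '\n'.
ALT_RULES = [
    (name, re.compile(r"[ \t]*\r?\n") if name == "NEWLINE" else pattern)
    for name, pattern in RULES
]

SKIPPED = {"WS"}  # mirrors the skip() action on WS


def tokenize(text, rules=RULES):
    """Idealized longest-match lexer; ties go to the first-listed rule."""
    tokens = []
    pos = 0
    while pos < len(text):
        best = None  # (rule name, end offset) of the best match so far
        for name, pattern in rules:
            m = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when lengths are equal.
            if m and m.end() > pos and (best is None or m.end() > best[1]):
                best = (name, m.end())
        if best is None:
            raise ValueError(f"no rule matches at offset {pos}")
        name, end = best
        if name not in SKIPPED:
            tokens.append((name, text[pos:end]))
        pos = end
    return tokens


print(tokenize("a=3\n"))              # ends with a NEWLINE token
print(tokenize("a=3 \n"))             # WS swallows " \n"; no NEWLINE token
print(tokenize("a=3 \n", ALT_RULES))  # NEWLINE ties with WS and wins
```

Under this model, "a=3\n" ends in a NEWLINE token because NEWLINE and
WS both match one character and NEWLINE is listed first. In "a=3 \n",
WS matches the two characters " \n" while NEWLINE matches nothing at
the space, so the newline is skipped along with the space and the
parser never sees a NEWLINE. That would account for the failure with
the space. The redefined NEWLINE, however, would tie with WS and win in
both forms, so this model does not account for that second failure.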

Clifford Heath.


