[antlr-interest] distinction between newline and ws
Peizhao Hu
peizhao at itee.uq.edu.au
Sat Oct 20 18:48:19 PDT 2007
Not sure what you guys are trying to do, but try the following:
grammar T;
options {
k=3;
}
test : (TEXT | NEWLINE | WS)* ;
TEXT : 'x'+ ;
NEWLINE : '\r'? '\n' ;
WS : (' '|'\t')+ {$channel=HIDDEN;} ;
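
For what it's worth, the behaviour Joseph guesses at below can be mimicked with a small Python toy. It is only a sketch under a loud assumption: rules are tried in the order listed, the first one that matches at the current position wins, and each rule matches greedily. That is a model of the guess in the quoted message, not of ANTLR's actual lexer prediction, and the names RULES, NARROW_RULES and tokenize are made up for the illustration.

```python
import re

# Toy model only: rules are tried in listed order at each position and the
# first rule that matches wins; every rule matches greedily. This mirrors the
# hypothesis in the quoted message, NOT ANTLR's real prediction algorithm.
RULES = [
    ("TEXT", re.compile(r"x+")),
    ("NEWLINE", re.compile(r"\r?\n")),
    ("WS", re.compile(r"[ \t\n\r]+")),   # WS overlaps with NEWLINE
]

# Same rules, but WS is limited to spaces and tabs, so nothing overlaps.
NARROW_RULES = RULES[:2] + [("WS", re.compile(r"[ \t]+"))]


def tokenize(text, rules=RULES):
    """Return the list of token names produced for `text` under the toy model."""
    names = []
    pos = 0
    while pos < len(text):
        for name, pattern in rules:
            match = pattern.match(text, pos)
            if match:
                names.append(name)
                pos = match.end()
                break
        else:
            raise ValueError(f"no rule matches at offset {pos}")
    return names


print(tokenize("xx\n"))                 # ['TEXT', 'NEWLINE']
print(tokenize("xx \n"))                # ['TEXT', 'WS']
print(tokenize("xx \n", NARROW_RULES))  # ['TEXT', 'WS', 'NEWLINE']
```

In this model the newline after trailing whitespace is swallowed by the greedy WS rule, and it only comes out as its own NEWLINE token once WS no longer contains '\n' and '\r'. Whether the generated ANTLR lexer behaves the same way is best checked by dumping its real token stream.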
Regards,
Peizhao
Joseph Gentle wrote:
> [forgot to reply all]
>
> I can't find the documentation for it, but ANTLR does seem to have token
> matching precedence rules.
>
> Have a play with it - write a tokeniser like this:
>
> test : ( TEXT | NEWLINE | WS )*;
> TEXT : 'x'+;
>
> NEWLINE : '\r'? '\n';
>
> WS : (' '|'\t'|'\n'|'\r')+;
>
>
> and pass it some strings with newlines and whitespace and whatnot. Have
> a look at the token stream generated. I've got a feeling that ANTLR
> prefers rules defined earlier over rules defined later. Using your rules, I
> expect that a line of text followed immediately by a newline will become
> TEXT NEWLINE, whereas a line of text followed by whitespace and then a
> newline will be TEXT WS. This is because the + in the WS rule is greedy
> by default and will consume the newline as well, if it can.
>
> Have a play!
>
> -J
>
>
> Sven Busse wrote:
>>
>> Hello,
>>
>> I am very new to ANTLR and language recognition. So I bought the book
>> by Terence Parr, and I am currently working through the first
>> example, the calculator. Unfortunately, there is already something I
>> don’t understand. The grammar looks like this:
>>
>> grammar Expr;
>>
>> prog : stat+ ;
>>
>> stat : expr NEWLINE
>>      | ID '=' expr NEWLINE
>>      | NEWLINE
>>      ;
>>
>> expr : multExpr (('+'|'-') multExpr)* ;
>>
>> multExpr: atom ('*' atom)* ;
>>
>> atom : INT
>>      | ID
>>      | '(' expr ')'
>>      ;
>>
>> ID : ('a'..'z'|'A'..'Z')+;
>> INT : '0'..'9'+;
>> NEWLINE : '\r'? '\n';
>> WS : (' '|'\t'|'\n'|'\r')+ {skip();};
>>
>> My question now is: how does ANTLR know that “\n” should match
>> NEWLINE instead of WS (which would mean it would be skipped)? I would
>> have thought this grammar is ambiguous, but apparently it isn’t. Why not?
>>
>> Thank you
>> Sven
>>
>
>