[antlr-interest] ANTLR4 synpred combination with (..)+ too greedy?
cd.barth at t-online.de
Tue Oct 30 02:16:20 PDT 2012
Using the following grammar
lexer grammar MyLexer;
WORD1 : ID1+;
WORD2 : ID2+;
fragment ID1 : {getCharPositionInLine()<2}? [a-zA-Z];
fragment ID2 : {getCharPositionInLine()>=2}? [a-zA-Z];
WS : [ \t\r\n]+ -> skip ;
and looking at lexer tokens with
// Print each token's symbolic name, followed by the token's own toString().
for (Token token : lexer.getAllTokens()) {
    int idx = token.getType();
    String tokenName = lexer.getTokenNames()[idx];
    System.out.format(" %-12s", tokenName);
    System.out.println(token);
}
for these two input lines
a cde
abcde
prints the following results
WORD1 [@-1,0:0='a',<1>,1:0]
WORD2 [@-1,2:4='cde',<2>,1:2]
WORD1 [@-1,7:9='abc',<1>,2:0]
WORD2 [@-1,10:11='de',<2>,2:3]
And now my question:
Why is the letter c part of WORD2 in the first line "a cde",
but part of WORD1 in the next line "abcde"?
My sneaking suspicion is that for the second line the ()+ construct in
ID1+ is too greedy and consumes one character too many.
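To make that suspicion concrete, here is a small toy model in plain Java. It is not ANTLR code and is not taken from the ANTLR sources. It assumes, purely as a hypothesis, that inside the ()+ loop the predicate sees a column value that has not yet advanced past the previously matched character, while the first character of a token sees the correct column. The class name StaleColumnModel and the method tokenize are made up for this sketch. Under that assumption the model yields the same split as the output above.

```java
import java.util.ArrayList;
import java.util.List;

public class StaleColumnModel {

    // Toy tokenizer for a single input line (hypothetical, not ANTLR).
    // Assumption: for every character after the first one in a token, the
    // column predicate is checked against the column of the PREVIOUS character.
    static List<String> tokenize(String line) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < line.length()) {
            // Skip anything that is not a letter (stands in for the WS rule).
            if (!Character.isLetter(line.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            // The first letter of a token sees the accurate column.
            boolean word1 = start < 2;
            i++;
            while (i < line.length() && Character.isLetter(line.charAt(i))) {
                int seenColumn = i - 1; // assumed stale column value
                boolean ok = word1 ? seenColumn < 2 : seenColumn >= 2;
                if (!ok) {
                    break;
                }
                i++;
            }
            String name = word1 ? "WORD1" : "WORD2";
            out.add(name + " '" + line.substring(start, i) + "'");
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a cde")); // [WORD1 'a', WORD2 'cde']
        System.out.println(tokenize("abcde")); // [WORD1 'abc', WORD2 'de']
    }
}
```

If the real lexer behaved like this model, the cause would be the moment at which the predicate reads the column rather than the greediness of ()+ as such. Whether ANTLR actually evaluates the predicate that way is exactly what I would like to have confirmed.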
Claus-Dieter