[antlr-interest] Error recovery in lexer (~ unclosed string)
Thibaut Colar
tcolar at colar.net
Tue Jun 30 17:53:47 PDT 2009
Hello there.
In my grammar file, I have a lexer rule to match a DSL string like this:
DSL :'<|' ( options {greedy=false;} : . )* '|>' ;
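For anyone less familiar with greedy=false: as far as I understand it,
the loop behaves roughly like a reluctant quantifier in java.util.regex.
The stand-alone snippet below is only an analogy (it is not how ANTLR
implements the rule, and the DslRegexAnalogy class and matchDsl method
are names I made up for illustration). It shows that a closed DSL string
stops at the first |>, while an unclosed one does not match at all:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DslRegexAnalogy {
    // The reluctant ".*?" plays the role of the non-greedy ( . )* loop.
    // DOTALL lets "." cross newlines, like the "." wildcard in the lexer rule.
    static final Pattern DSL = Pattern.compile("<\\|.*?\\|>", Pattern.DOTALL);

    /** Returns the DSL text matched at the start of the input, or null if there is none. */
    static String matchDsl(String input) {
        Matcher m = DSL.matcher(input);
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(matchDsl("<| a |> b |>"));    // stops at the first "|>"
        System.out.println(matchDsl("<| never closed")); // no closing "|>": no match
    }
}
```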
Now my lexer is hooked up into the NetBeans IDE to parse files (well,
lexing only for now), and that works well.
The problem is that if you type an opening <| (unclosed at this time),
the lexer errors out.
What happens is that it goes all the way to the end of the file trying
to find the closing |>, hits the EOF and throws an exception (null
nextToken / unmatched tokens left).
I'm not surprised it doesn't like this, but I want to add some error
recovery so it does not fail.
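To make the goal concrete, here is a minimal stand-alone sketch in plain
Java of the behavior I am after. It is not ANTLR code and not my actual
lexer: the DslRecoverySketch class, the Tok type and the TEXT token type
are hypothetical names used only for illustration. The idea is that when
no closing |> is found before EOF, the rest of the input becomes a single
INCOMPLETE_DSL token instead of an exception:

```java
import java.util.ArrayList;
import java.util.List;

public class DslRecoverySketch {
    enum Type { DSL, INCOMPLETE_DSL, TEXT }

    static final class Tok {
        final Type type;
        final String text;
        Tok(Type type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + "[" + text + "]"; }
    }

    /** Splits the input into TEXT, DSL and INCOMPLETE_DSL tokens without ever throwing. */
    static List<Tok> tokenize(String in) {
        List<Tok> out = new ArrayList<>();
        int pos = 0;
        while (pos < in.length()) {
            if (in.startsWith("<|", pos)) {
                int close = in.indexOf("|>", pos + 2);
                if (close >= 0) {
                    // Normal case: the token runs through the first closing "|>".
                    out.add(new Tok(Type.DSL, in.substring(pos, close + 2)));
                    pos = close + 2;
                } else {
                    // Recovery: no closing "|>" before EOF, so everything that is
                    // left becomes one INCOMPLETE_DSL token and scanning ends at EOF.
                    out.add(new Tok(Type.INCOMPLETE_DSL, in.substring(pos)));
                    pos = in.length();
                }
            } else {
                // Anything up to the next "<|" (or EOF) is plain text.
                int next = in.indexOf("<|", pos);
                int end = next < 0 ? in.length() : next;
                out.add(new Tok(Type.TEXT, in.substring(pos, end)));
                pos = end;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [TEXT[a ], DSL[<| b |>], TEXT[ c ], INCOMPLETE_DSL[<| d]]
        System.out.println(tokenize("a <| b |> c <| d"));
    }
}
```

The part that matters is that the incomplete token consumes everything up
to EOF, so the position is well defined for whatever comes next.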
I've tried many things in the last 24h but can't seem to find something
that works.
Latest try is basically this (using states):
Grammar:
DSL :'<|' {state=INCOMPLETE_DSL;} ( options {greedy=false;} : . )*
'|>' {state=NORMAL;} ;
Lexer: ------------------------
public Token<FanTokenID> nextToken()
{
    curToken = (CommonToken) lexer.nextToken();
    if (curToken.getType() == -1) // prob. EOF
    {
        int state = lexer.getState();
        switch (state)
        {
            case FanStates.INCOMPLETE_DSL:
                // set as incomplete token type
                curToken.setType(FanLexer.INCOMPLETE_DSL);
                break;
        }
        lexer.clearState();
    }
}
------------------------
However, that still does not work quite right: it does replace the token
correctly, but I guess after that it does not continue at the right
place (do I need a rewind() or consume() or something?).
Or maybe I'm doing this the wrong way: should I "emit" a fake closing
token (|>) instead?
I'm sure this is a pretty common issue, but I couldn't find any
documented way to do this in ANTLR 3 in the book or online.
Any help would be greatly appreciated, Thanks.