[antlr-interest] Continue parsing after an error

Luchesar Cekov luchesar.cekov at ontology-partners.com
Wed Jun 30 10:35:23 PDT 2010


Hi Gordon,

Thanks for the prompt response.
Adding OTHER as an alternative was what I tried to do in the beginning.
Unfortunately my use case is a bit more complex. I have worked out a
better example below.
In this example, the input string [ax][kx][ax] is wrong (k is not
allowed), but the parser still builds the full AST, so it recovers from
the error: it generates three expression nodes, the second of which
contains an ErrorCommonToken, as created by recoverFromMismatchedToken().
The string [ax]sax][ax], on the other hand, generates only the first part
of the tree, up to the error: it generates only one expression node.

I do not understand why I get this different behavior - the parser 
recovers if the error happens in the middle of a rule, but not if the 
error is at the beginning of a rule.

Is this a problem in my grammar, or is it just the way ANTLR works?

Thanks,
Luchesar

================
grammar StartOfARuleFailTest;

options {    output=AST;    ASTLabelType=CommonTree; }

tokens { ROOT_TOKEN;ERROR_TOKEN;EXPRESSIONS;EXPRESSION; }

@members {
    @Override
    protected Object recoverFromMismatchedToken(IntStream input, int ttype, BitSet follow)
            throws RecognitionException {
        MismatchedTokenException ex = new MismatchedTokenException(ttype, input);
        input.consume();
        return createErrorToken(ex, ttype);
    }

    public static ErrorCommonToken createErrorToken(RecognitionException ex, int ttype) {
        ErrorCommonToken errorCommonToken = new ErrorCommonToken(ex.token);
        errorCommonToken.setType(ttype);

        return errorCommonToken;
    }
}

root : expressions  EOF -> ^(ROOT_TOKEN expressions) ;
expressions  : expression* -> ^(EXPRESSIONS expression*) ;
expression : '[' 'a' 'x' ']' -> ^(EXPRESSION '[' 'a' 'x' ']');

OTHER   : . ;
================
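
For comparison, the two cases can be reproduced without ANTLR. The
following is only a minimal hand-written sketch, not the generated
parser: it assumes that the expression* loop looks at the next token to
decide whether to start another expression, and that a mismatch inside
an expression consumes the offending token, as the overridden
recoverFromMismatchedToken() does. The names LoopExitSketch and parse
are made up for illustration, and characters stand in for tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the generated parser; this is not ANTLR code.
final class LoopExitSketch {

    // Mimics: expressions : expression* ;  expression : '[' 'a' 'x' ']' ;
    // Returns one label per expression node that gets built.
    static List<String> parse(String input) {
        final String expected = "[ax]";
        List<String> nodes = new ArrayList<>();
        int pos = 0;

        // The loop is entered only while the lookahead can start an expression.
        while (pos < input.length() && input.charAt(pos) == '[') {
            boolean sawError = false;
            int matched = 0;
            for (; matched < expected.length() && pos < input.length(); matched++) {
                // Mid-rule mismatch: note it, consume the character anyway
                // (like input.consume() in the override) and keep going.
                if (input.charAt(pos) != expected.charAt(matched)) {
                    sawError = true;
                }
                pos++;
            }
            if (matched < expected.length()) {
                sawError = true; // ran out of input in the middle of the rule
            }
            nodes.add(sawError ? "EXPRESSION(error)" : "EXPRESSION");
        }
        // Loop exited: whatever is left is never offered to expression again.
        return nodes;
    }

    public static void main(String[] args) {
        // Error in the middle of a rule: all three nodes are built.
        System.out.println(parse("[ax][kx][ax]")); // [EXPRESSION, EXPRESSION(error), EXPRESSION]
        // Error at the start of a rule: the loop exits after the first node.
        System.out.println(parse("[ax]sax][ax]")); // [EXPRESSION]
    }
}
```

In this sketch the "s" is never seen by expression at all, so the
mismatched-token recovery has no chance to run; the loop has already
ended by the time the error is noticed.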


Gordon Tyler wrote:
> The grammar you have defined says, roughly:
>
> Parse any number of '[' or ']' until you reach EOF.
>
> It does not describe what to do if something other than '[' or ']' is found before EOF is reached.
>
> You have defined a token, OTHER, to match the other stuff, but your parse rules do not reference OTHER. Perhaps something like this would work:
>
> root : (expressions | OTHER)* EOF -> ^(ROOT_TOKEN expressions) ;
>
>
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Luchesar Cekov
> Sent: June 30, 2010 10:10 AM
> To: antlr-interest at antlr.org
> Cc: Valerio Malenchino
> Subject: [antlr-interest] Continue parsing after an error
>
> Dear ANTLR enthusiasts,
>
> I am struggling with a problem. The parser jumps to the end of file from 
> the middle of the document.
>
> The setup is as follows:
>     * I have two alternatives followed by EOF
>     * at parse time, in the middle of the document, the next token cannot 
> match the start of either alternative
>
> This terminates parsing, because the parser jumps to the end of file.
>
> A simple grammar that illustrates the problem is
>
> ===============
> tokens {ROOT_TOKEN;}
> root
>     : expressions EOF -> ^(ROOT_TOKEN expressions) ;
> expressions : ('[' | ']')* ;
> OTHER   : . ;
> ===============
>
> If I then try parsing "[[][]]sdsdf[]][]][", the parsing stops at the 
> first "s" and tries to recover as if EOF were the next token.
> Looking at the generated parser, it seems that if there is no viable 
> alternative in the top rule (in this case "root"), the parser behaves 
> as if it had reached EOF and skips the rest of the tokens.
>
> The resulting AST contains only the children up to the first illegal 
> token "s".
>
> I cannot see where my mistake is. It looks like the parser should not do 
> that. Can you suggest a workaround for the problem?
>
> Thanks in advance,
> Luchesar
>   

-- 

Luchesar Cekov
Software Engineer
+44 (0) 207 239 4949
*Ontology Systems*
www.ontology.com <http://www.ontology.com/>

	

