[antlr-interest] Catching errors

Victor Giordano power_giordo at yahoo.com.ar
Thu Feb 3 20:11:51 PST 2011


Hello everyone. Two days ago I posted this problem, and I am still having
trouble with it, so please give me a hand if you can.
I need to handle parsing errors in order to display a friendly message
to the user.

What I see when I use the generated lexer and parser (generated from
the LinearMath grammar below) in a Java application is that they emit
a kind of warning in two situations:

1) extraneous input '<some_token>' expecting EOF *only when I append the
EOF token at the end of the rule*
2) required (...)+ loop did not match anything at input '<some_token>'
*only when I use the '+' quantifier*

where <some_token> is the actual offending token.

The question is, again, how do I catch those errors as exceptions? The
grammar is shown below, followed by an input example you can try yourself:

grammar LinearMath;

tokens
{
       PLUS     = '+';
       MINUS     = '-';
       MUL        = '*';
       DIV        = '/';
}

inecuation:    linexpr ((RELATIONSHIP) linexpr)+ EOF!;
catch [UnwantedTokenException ute]
{
     System.out.println("inecuation UnwantedTokenException  " + ute.toString());
     throw ute;
}

linexpr : (MINUS|PLUS)? linterm ((PLUS|MINUS) linterm)* EOF;

linterm : factor? ID;

expr returns [double value]
       : e=term {$value = $e.value;}
       (    PLUS e=term {$value += $e.value;}
       |    MINUS e=term {$value -= $e.value;}
       )*;

term returns [double value]
       : f=factor {$value = $f.value;}
       (    MUL f=factor {$value *= $f.value;}
       |    DIV f=factor {$value /= $f.value;}
       )*;

factor returns [double value]
       : DOUBLE {$value = Double.parseDouble($DOUBLE.text);}
       | '(' e=expr ')'{$value = $e.value;};

ID  :    ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;

DOUBLE
       :   ('0'..'9')+
       |    ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
         |   '.' ('0'..'9')+ EXPONENT?
         |   ('0'..'9')+ EXPONENT
         ;

fragment EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;

NEWLINE:'\r'? '\n' { $channel = HIDDEN; };

WS  :   (' '|'\t'|'\n'|'\r')+ { $channel = HIDDEN; };


RELATIONSHIP :    '<'|'<='|'='|'>'|'>=';

With the following input: "x<  y x"
which is not a valid inequality because "y x" must have a binary
arithmetic operator (PLUS or MINUS) between the two identifiers. The
parser does its job very well: it consumes "x", then "<", then "y", and
when it reaches the second "x" it emits an "UnwantedTokenException". The
problem is that I am not able to catch it and display an error to the
end user. Note that I am using the "inecuation" rule to parse that input.
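
In the ANTLR 3 Java runtime, a mismatched token inside a rule is normally
recovered from in place and reported through reportError() and
emitErrorMessage(String). That would explain why the message reaches
standard error while the catch clause in the grammar never fires. A minimal
sketch of one way around that, assuming emitErrorMessage(String) is
overridden in @members so that it forwards each message to a collector.
The SyntaxErrorCollector class below is a hypothetical helper, not part of
ANTLR, and it is plain Java so it can be tried on its own:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the ANTLR runtime: gathers the messages
// that the parser would otherwise print to standard error.
class SyntaxErrorCollector {
    private final List<String> messages = new ArrayList<String>();

    // Intended to be called from an overridden emitErrorMessage(String).
    void report(String message) {
        messages.add(message);
    }

    boolean hasErrors() {
        return !messages.isEmpty();
    }

    // Returns a copy so callers cannot modify the internal list.
    List<String> getMessages() {
        return new ArrayList<String>(messages);
    }

    // Turns the collected messages into one unchecked exception.
    void throwIfAny() {
        if (hasErrors()) {
            throw new IllegalStateException("Syntax errors: " + messages);
        }
    }
}
```

After invoking the inecuation rule, the application could call
throwIfAny() and show the exception message to the user instead of relying
on the grammar-level catch.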

--------------------------------------------------------------------------
2) The other thing is about invalid tokens. I managed to handle them by
overriding a member function of the lexer called nextToken(), like this:

@lexer::members
{
     @Override
     public Token nextToken()
     {
         while (true) {
             state.token = null;
             state.channel = Token.DEFAULT_CHANNEL;
             state.tokenStartCharIndex = input.index();
             state.tokenStartCharPositionInLine = input.getCharPositionInLine();
             state.tokenStartLine = input.getLine();
             state.text = null;
             if ( input.LA(1)==CharStream.EOF ) {
                 return Token.EOF_TOKEN;
             }
             try {
                 mTokens();
                 if ( state.token==null ) {
                     emit();
                 }
                 else if ( state.token==Token.SKIP_TOKEN ) {
                     continue;
                 }
                 return state.token;
             }
             catch (RecognitionException re) {
                 reportError(re);
                 throw new RuntimeException("Invalid Character  : " +
                         (char) (re.c));
                 // or throw Error
             }
         }
     }
}
Is that the correct way?
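
A bare RuntimeException is hard for the calling application to tell apart
from any other runtime failure. A minimal sketch of a dedicated exception
type that the catch block above could throw instead, assuming the public
fields re.c, re.line and re.charPositionInLine of RecognitionException
supply the values. InvalidCharacterException is a hypothetical name, not
an ANTLR class:

```java
// Hypothetical exception type (not part of the ANTLR runtime) that carries
// what the lexer knows about an unrecognised character.
class InvalidCharacterException extends RuntimeException {
    private final char character;
    private final int line;
    private final int column;

    InvalidCharacterException(char character, int line, int column) {
        super("Invalid character '" + character + "' at line " + line
                + ", column " + column);
        this.character = character;
        this.line = line;
        this.column = column;
    }

    char getCharacter() {
        return character;
    }

    int getLine() {
        return line;
    }

    int getColumn() {
        return column;
    }
}
```

The application can then catch InvalidCharacterException specifically and
build the user-facing message from the character and its position.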

OK, that is all for now.
Sorry for this third round of emails! Thanks in advance.
Víctor.


On 03/02/2011 12:36 a.m., Victor Giordano wrote:
> What I see when I use the generated lexer and parser (generated from
> the LinearMath grammar below) in a Java application is that they emit
> a kind of warning in two situations:
>
> 1) extraneous input '<some_token>' expecting EOF *only when I append the
> EOF token at the end of the rule*
> 2) required (...)+ loop did not match anything at input '<some_token>'
> *only when I use the '+' quantifier*
>
> where <some_token> is the actual offending token.
>
> In fact, the warnings are just strings sent to standard error.
>
> The question is, again, how do I handle those errors by interrupting
> the normal flow with a real exception and treating them as such?
> OK, that is all for now.
> Sorry for the flood of emails! Thanks in advance.
> Víctor.

