[antlr-interest] Ignoring syntax errors

Thu Nov 23 04:01:43 PST 2006

Hello David,

when your input does not conform syntactically to your grammar, your
parser will throw a RecognitionException anyway, so why not use this
mechanism? If you find semantic errors, you can simply throw your own
SemanticException.
> I would like to ignore syntax errors in my grammar (because if the
> syntax is wrong, I presume that some other parser in an expandable
> system might understand it), yet if the syntax is correct, I want to
> flag semantic errors that are detected within semantic actions.
> Therefore I have a start rule like this:
>
> real_starting_rule returns[AST u=null]:
>     (    (starting_rule)=> u=starting_rule
>     |   (.)*
>     );
Why consume the input and build a flat AST when you don't need it?

real_starting_rule returns[AST u=null]:
    ( u=starting_rule )
    ;
    exception // for rule
    catch [RecognitionException ex] {
        reportError(ex);
        return(null);
    }
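
The control flow of that handler can be sketched in plain Java, without
ANTLR. This is only a minimal illustration under assumptions: SyntaxError,
SemanticError and StartRuleSketch are hypothetical stand-ins, not ANTLR
classes, and the "grammar" is a toy (name=number, value at most 255). One
thing to keep in mind: ANTLR 2's own antlr.SemanticException extends
RecognitionException, so a handler for RecognitionException would catch
it too. The sketch therefore assumes a separate exception type for
semantic errors.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for antlr.RecognitionException.
class SyntaxError extends Exception {
    SyntaxError(String msg) { super(msg); }
}

// Hypothetical semantic error type. It deliberately does NOT extend
// SyntaxError, so the syntax handler below cannot swallow it.
class SemanticError extends RuntimeException {
    SemanticError(String msg) { super(msg); }
}

class StartRuleSketch {
    // Messages that the syntax handler reported instead of propagating.
    final List<String> reported = new ArrayList<>();

    // Toy "starting_rule": the syntax is <lowercase name>=<1 to 9 digits>.
    String startingRule(String input) throws SyntaxError {
        if (!input.matches("[a-z]+=\\d{1,9}")) {
            throw new SyntaxError("unexpected input: " + input);
        }
        int eq = input.indexOf('=');
        int value = Integer.parseInt(input.substring(eq + 1));
        // "Semantic action": only reached once the syntax has matched.
        if (value > 255) {
            throw new SemanticError("value out of range: " + value);
        }
        return input.substring(0, eq);
    }

    // Mirrors real_starting_rule: report syntax errors and return null,
    // but let semantic errors reach the caller.
    String realStartingRule(String input) {
        try {
            return startingRule(input);
        } catch (SyntaxError ex) {
            reported.add(ex.getMessage());
            return null;
        }
    }
}
```

With this shape, input that does not parse yields null (so another
parser can try it), while input that parses but is semantically wrong
still raises an error.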

> INVALID_CHARACTER: '\u0001'..'\uFFFE';
>
> This works, but it causes a boatload of nondeterminism exceptions.
Isn't the range of INVALID_CHARACTER too big? It covers almost every
16-bit Unicode character. AFAIK ASCII only ranges from '\u0000' to
'\u007F'. If INVALID_CHARACTER overlaps the alphabet of your other
tokens, that would explain the nondeterminism.
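
To make the overlap concrete, here is a small Java sketch. CharRange is a
hypothetical helper, not part of ANTLR, and the letter range 'a'..'z' is
only an assumed example of what another token rule might start with.

```java
// Hypothetical helper: an inclusive range of 16-bit characters.
class CharRange {
    final char lo;
    final char hi;

    CharRange(char lo, char hi) {
        this.lo = lo;
        this.hi = hi;
    }

    // Two inclusive ranges overlap when each one starts before the
    // other one ends.
    boolean overlaps(CharRange other) {
        return lo <= other.hi && other.lo <= hi;
    }
}
```

Used on the range from the grammar, new CharRange('\u0001', '\uFFFE')
overlaps new CharRange('a', 'z'), so the lexer cannot tell from the
first character which of the two rules to pick.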

Regards, Magnus.

-- 
Magnus Knuth
mai00cas at studserv.uni-leipzig.de
JABBER mgns at jabber.ccc.de