[antlr-interest] C target: unhelpful error messages from the default error handler in trivial cases

Jim Idle jimi at temporal-wave.com
Wed Jul 20 19:54:58 PDT 2011


Oh, BTW: the "random" start/stop indexes are addresses in memory rather
than offsets. The documentation states this, and about 20 past posts
explain it (and explain why).
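
For anyone who wants offsets in their own error output, here is a minimal sketch of the conversion, assuming the handler can reach the start of the input buffer. The names demo_token and demo_offset are hypothetical stand-ins for illustration and are not part of the ANTLR C runtime.

```c
#include <stddef.h>

/* Hypothetical stand-in for a token: start and stop hold addresses
 * inside the input buffer, not offsets from its beginning. */
typedef struct
{
    const char *start;  /* address of the first character of the token */
    const char *stop;   /* address of the last character of the token  */
} demo_token;

/* Convert an address inside the buffer to a zero-based offset by
 * subtracting the address of the first character of the buffer. */
static ptrdiff_t demo_offset(const char *buffer, const char *pos)
{
    return pos - buffer;
}
```

With the input "A B" held in a buffer, the token 'B' starts and stops at buffer + 2, so demo_offset() yields 2 for both fields, whatever address the buffer happens to occupy on a given run.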

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Vlad
> Sent: Wednesday, July 20, 2011 6:50 PM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] C target: unhelpful error messages from the
> default error handler in trivial cases
>
> Greetings,
>
> Like apparently many new ANTLR users, I've borrowed the implementation
> from the default displayRecognitionError() to implement my own version.
> Somewhat unfortunately, this version generates unhelpful/random errors
> in rather trivial cases. Here is a full example:
>
> grammar testerrors;
>
> options
> {
>     language='C';
> }
>
> NAME    :   ( 'a'..'z' | 'A'..'Z' | '0'..'9' )+ ;
> WS      :   ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN; } ;
>
> parse:
>     decl ( options { greedy = true; }: ',' decl )* ','? EOF
>     ;
>
> decl:
>     NAME ':' type
>     ;
>
> type:
>     'int' | 'float'
>     ;
>
> Feeding "A : badtype" into parse() results in:
>
> -memory-(1)  : error 10 : Unexpected token, at offset 3
>     near [Index: 0 (Start: 0-Stop: 0) ='<missing <invalid>>', type<0> Line: 1 LinePos:3]
>      : Missing <invalid>
>
> What puzzles me is where the <invalid> comes from. It would seem easy
> to compute that either an 'int' or a 'float' token was expected. In the
> stock error handler this comes from tokenNames[ex->expecting], evaluated
> with ex->expecting equal to 0. What change to the default implementation
> is necessary to make this work correctly?
>
> Similarly, attempting to parse "A :" results in:
>
> -unknown source-(1)  : error 10 : Unexpected token, at offset -1
>     near [Index: 0 (Start: 0-Stop: 0) ='<missing <invalid>>', type<0> Line: 1 LinePos:1]
>      : Missing <invalid>
>
> Note how the source became "unknown" and the offset became -1. In the
> default handler this is determined by "streamName" as follows:
>
> if (ex->streamName == NULL)
> {
>     if (((pANTLR3_COMMON_TOKEN)(ex->token))->type == ANTLR3_TOKEN_EOF) {
>         ANTLR3_FPRINTF(stderr, "-end of input-(");
>     } else {
>         ANTLR3_FPRINTF(stderr, "-unknown source-(");
>     }
> } else {
>     ftext = ex->streamName->to8(ex->streamName);
>     ANTLR3_FPRINTF(stderr, "%s(", ftext->chars);
> }
>
> and it is frankly unexpected that a slightly different match error type
> should have this effect, since the error type does not influence the
> branches taken here at all (that happens later in the function). Anyone
> taking this function as a blueprint for their own handler would conclude
> that ex->streamName is NULL in one case but not the other, and that it is
> set somewhere *outside* of displayRecognitionError(). The problem of
> fixing the default implementation begins to feel like it might snowball
> into patching the runtime itself.
>
> As the last example, trying to parse "A B" results in:
>
> -memory-(1)  : error 1 : Unexpected token, at offset 1
>     near [Index: 2 (Start: 15787098-Stop: 15787098) ='B', type<4> Line: 1 LinePos:1]
>      : syntax error...
>
> The start/stop indices are bogus, i.e. some uninitialized variables --
> on repeated parses they change randomly.
>
> My second question follows. Good error handling is a big selling point
> of ANTLR, but with all due respect it hardly seems so for the C target.
> Is there documentation available for all context relevant to handling
> the main mismatch error conditions? I have scanned everything in the
> available examples download and there are no examples of customizing
> the error handler that I can find. Alternatively, could someone share a
> workable version of their own that might be a good learning example?
>
> Thank you,
> Vlad
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
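
On the "<invalid>" text: the quoted message traces it to tokenNames[ex->expecting] with ex->expecting equal to 0. Below is a minimal sketch of a defensive lookup that a custom handler could use, assuming slot 0 of the name table is only a placeholder entry. demo_expected_name is a hypothetical helper, not a runtime function, and it does not recover the set of alternatives ('int' or 'float') that the parser was actually waiting for.

```c
#include <stddef.h>

/* Hypothetical helper: return a printable name for the expected token
 * type, or a fallback string when no single token type is recorded
 * (expecting == 0) or the type lies outside the name table. */
static const char *demo_expected_name(const char *const *names,
                                      size_t count,
                                      unsigned expecting)
{
    if (names == NULL || expecting == 0 || expecting >= count)
    {
        return "<unknown>";
    }
    return names[expecting];
}
```

A handler could then print "Missing %s" with the returned string and, when it gets the fallback, switch to a more generic message instead of reporting "<invalid>".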

