[antlr-interest] NoViableAltException - Am I trying to do too much with the Lexer?

Kenny MacDermid kenny at kmdconsulting.ca
Thu Aug 23 11:58:17 PDT 2007


Okay, I've got a solution that passes the tests, but I'm not a fan of it.

I've added to REALNUMBER:
    | ('0' '..')=> '0' { $type = NUMBER; }

and to NUMBER:
    | ('0' '0'..'9')=> DIGIT+ { $type = REALNUMBER; }

So now each rule can emit the other rule's token type, which makes the two
rules interdependent. Can anyone suggest a cleaner solution?
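
To pin down the behaviour I'm after independently of ANTLR, here is a rough
hand-written sketch in plain Java: scan the digits once, look ahead for a
decimal part or an exponent, and only then decide between NUMBER and
REALNUMBER. The class, enum and method names are hypothetical and none of this
is generated by ANTLR. It also assumes the exponent accepts any run of digits,
which is looser than the NUMBER reference in REALNUMBEREXP.

```java
// Hypothetical stand-alone sketch; not ANTLR output and not part of the grammar.
final class NumericTokenSketch {
    enum Type { NUMBER, REALNUMBER }

    static final class Token {
        final Type type;
        final String text;
        Token(Type type, String text) { this.type = type; this.text = text; }
    }

    private static boolean isDigit(String s, int i) {
        return i < s.length() && s.charAt(i) >= '0' && s.charAt(i) <= '9';
    }

    private static boolean isChar(String s, int i, char c) {
        return i < s.length() && s.charAt(i) == c;
    }

    /** Scans one numeric token at the start of input; returns null if input does not start with a digit. */
    static Token scan(String input) {
        if (!isDigit(input, 0)) return null;
        int i = 0;
        while (isDigit(input, i)) i++;
        boolean real = false;

        // A single '.' starts a decimal part; '..' is left alone for the range token.
        if (isChar(input, i, '.') && !isChar(input, i + 1, '.')) {
            i++;
            while (isDigit(input, i)) i++;
            real = true;
        }

        // Exponent: 'e' or 'E', optional '-', then at least one digit
        // (assumption: any digit run is accepted here).
        if (isChar(input, i, 'e') || isChar(input, i, 'E')) {
            int j = i + 1;
            if (isChar(input, j, '-')) j++;
            if (isDigit(input, j)) {
                while (isDigit(input, j)) j++;
                i = j;
                real = true;
            }
        }

        String text = input.substring(0, i);
        // Several digits with a leading zero, e.g. "0123", count as a real number.
        if (!real && text.length() > 1 && text.charAt(0) == '0') real = true;
        return new Token(real ? Type.REALNUMBER : Type.NUMBER, text);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0", "0123", "42", "0..5", "1.5e-3", "7..."}) {
            Token t = scan(s);
            System.out.println(s + " -> " + t.type + " \"" + t.text + "\"");
        }
    }
}
```

I suspect the grammar equivalent is a single lexer rule that matches the
digits once and sets $type at the end, rather than two rules guarding against
each other with predicates, but I haven't tried that in ANTLR yet.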

Kenny

On 8/21/07, Kenny MacDermid <kenny at kmdconsulting.ca> wrote:
>
> Hello all,
>
> I'm looking to develop (in ANTLR 3, using TDD) a lexer and parser for a
> grammar that contains (among others):
>
> Numbers - Start with '1'..'9', or are just the single digit '0'
> Real Numbers - Start with anything. May contain a decimal part. May
> contain an exponent part.
> Range - '..'
> Ellipsis - '...'
>
> I've been messing with everything I can think of for lexer rules, but
> always end up getting NoViableAltExceptions. This is what I currently have:
>
>
> NUMBER
>         : ( ('1' .. '9' DIGIT* '..')=> DIGIT*
>           | ('0' (~(DIGIT) | EOF))=> '0'
>           | ('1' .. '9' DIGIT*)
>           )
>         ;
>
> fragment
> REALNUMBEREXP
>         : ( ('e' | 'E') '-'? NUMBER )
>         ;
>
> fragment
> REALNUMBERDOTREM
>         : '.' DIGIT* REALNUMBEREXP?
>         ;
>
> REALNUMBER
>         : ( (DIGIT+ ('.' (~('.') | EOF)))=> DIGIT+ REALNUMBERDOTREM
>           | (DIGIT+ ('e' | 'E'))=> DIGIT+ REALNUMBEREXP
>           | ('0'+ '1'..'9')=> DIGIT+ REALNUMBERDOTREM?
>           )
>         ;
>
>
> This is resulting in:
>
> NoViableAltException(48!=[167:4: ( ( '1' .. '9' ( DIGIT )* '..' )=> (
> DIGIT )* | ( '0' (~ ( DIGIT ) | EOF ) )=> '0' | (
> '1' .. '9' ( DIGIT )* ) )])
>
> from the location of: Lexer.mNUMBER() on the attempted lexing of:
>
> testLexerToken("0123", "0123", REALNUMBER);
> (testLexerToken takes: input, expected output, expected type)
>
> So, am I expecting too much from my lexer to have it distinguish between a
> number and a real number? I could just have it return one token for either
> and work it out later at the parsing level, although this does sound kludgy
> to me. Does anyone have any tips on how to go about solving this?
>
> Thanks,
>
> Kenny
>