[antlr-interest] Examining characters in lexer
Dennis Brothers
brothers at bros.com
Fri Mar 13 08:36:26 PDT 2009
Aaargh. (Sound of hand hitting forehead)
It's always the dumb, simple things that are the hardest to see.
Efficiency isn't a major concern - I'm parsing one-line Lucene-like
query expressions. (But I'd still like to see your suggestion.)
Thanks -
- Dennis
On Mar 13, 2009, at 11:13 AM, Jim Idle wrote:
> Dennis Brothers wrote:
>> OK, I tried it, and I'm getting an error I don't know how to
>> interpret:
>>
>> [10:19:33] error(10): internal error:
>> org.antlr.analysis.NFAToDFAConverter.getPredicatesPerNonDeterministicAlt(Unknown Source):
>> no AST/token for nonepsilon target w/o predicate
>>
>> That is emitted three times when I try to generate code.
>>
>> Here's the lexer section:
>>
>> NEWLINE : '\r'? '\n' ;
>> WS : (' '|'\t'|NEWLINE)+ {$channel=HIDDEN;} ;
>> STRING : ( '0'..'9'|'_'|'\'' | LETTER )+ ;
>> LETTER : { Char.IsLetter( input.LA(1) ) }?=> . ;
>>
> lexer grammar f;
>
> NEWLINE : '\r'? '\n' ;
> WS : (' '|'\t'|NEWLINE)+ {$channel=HIDDEN;} ;
> STRING : ( '0'..'9'|'_'|'\'' | LETTER )+ ;
> fragment
> LETTER : { Char.IsLetter( input.LA(1)) }?=> . ;
>
> You left the fragment specifier off your LETTER rule. Without it, LETTER
> becomes a real token rule that clashes with the invocation of the very
> same rule in STRING, and causes all sorts of other problems ;-)
>
> If you are bothered about efficiency here, you might find that the
> following generates better code:
>
> Jim
>
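For anyone reading along who has not used gated predicates, the effect of the corrected STRING/LETTER pair can be approximated outside ANTLR. What follows is only a rough sketch of a hand-written scanner in Python, not ANTLR-generated code. The names `is_string_char` and `scan_string` are hypothetical, and `str.isalpha()` stands in for .NET's `Char.IsLetter` on the assumption that the two agree for typical query text.

```python
def is_string_char(ch: str) -> bool:
    """Mirror the alternatives of STRING: '0'..'9' | '_' | '\'' | LETTER."""
    # The grammar's '0'..'9' range is ASCII-only, so str.isdigit() is avoided.
    # str.isalpha() is an assumed stand-in for Char.IsLetter.
    return ("0" <= ch <= "9") or ch in "_'" or ch.isalpha()


def scan_string(text: str, start: int = 0) -> str:
    """Return the longest run of STRING characters beginning at `start`."""
    end = start
    # Test the lookahead character before consuming it, which is roughly
    # what the gated predicate { Char.IsLetter(input.LA(1)) }?=> does.
    while end < len(text) and is_string_char(text[end]):
        end += 1
    return text[start:end]
```

Calling `scan_string` on a query such as `don't stop` stops at the first space, because a space is neither a digit, an underscore, an apostrophe, nor a letter. In the grammar that space would then be picked up by WS.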