[antlr-interest] Understanding Lexer rules
Shawn Poulson
spoulson3 at yahoo.com
Wed Feb 20 04:44:58 PST 2008
> From: Gavin Lambert <antlr at mirality.co.nz>
>
> >So, for example, I'd put NUMERIC (the specific case) before
> >ALPHANUMERIC in the lexer rules.
>
> I'm not entirely sure whether input such as "42foo" will resolve
> to ALPHANUMERIC or NUMERIC ALPHANUMERIC (it probably depends on
> how the rules are defined). Either way, you need to word your
> parser rules carefully if NUMERIC is a complete subset of
> ALPHANUMERIC.
What do you need to be careful about, exactly, to avoid this ambiguity?
See, I'm getting hung up on how any change I make to a lexer rule
breaks a number of parser rules. Here's a quick example that parses a
string that defines a datetime span. It should take input like
"2008-01-30 5:16:27.677 lasting T30", which indicates that date with a
timespan of 30 seconds.
This portion works. Now, I want to implement a 'fetch' construct where
you can instead supply "@{some text}" to reference a stored value by
name. To enable this, I uncomment the commented-out code in the
sample. Once that is done, the fetch feature works, but the previous
syntax defined by once_p now fails on the 'lasting' keyword with a
"NoViableAltException". It seems the SYMBOL lexer rule is
gobbling up the text.
Can someone clarify what is going on?
--------->>
grammar T;
prog:
once_p /* | fetch_p */ ;
once_p: start=datetime_p ('lasting' duration=timespan_p);
//fetch_p: '@{' name=SYMBOL '}';
datetime_p:
(y=UINT '-' mo=UINT '-' d=UINT)?
h=UINT ':' m=UINT (':' s=UINT ('.' ms=UINT)? )?;
timespan_p:
'T' (((d=int_p '.')? h=int_p ':')? m=int_p ':')? s=int_p ('.' ms=int_p)?;
int_p: '-'? UINT;
//SYMBOL: LEADIDCHAR IDCHAR*;
//IDCHAR: LETTER | DIGIT | '_' | ' ' | '-';
//LEADIDCHAR: LETTER | '_';
UINT: DIGIT+;
fragment DIGIT: '0'..'9';
fragment LETTER: 'a'..'z'|'A'..'Z';
WS: (' '|'\t'|'\r'|'\n'|'\u000C')+ { $channel=HIDDEN; };
<<---------
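
To illustrate my suspicion about SYMBOL, here is a rough Python sketch of
a longest-match tokenizer. It is only a model: the rule tables and the
tokenize() helper are made up for illustration, and ANTLR's real lexer
prediction may well behave differently. The SYMBOL pattern assumes the
commented-out rules above, where IDCHAR allows ' ' and '-'.

```python
import re

# Hypothetical rule tables that mirror the grammar above.
# This is NOT ANTLR's algorithm, just a simple longest-match model.
BASE_RULES = [
    ("'lasting'", r"lasting"),      # keyword literal used in once_p
    ("'T'", r"T"),                  # literal used in timespan_p
    ("PUNCT", r"[-:.]"),            # stands in for the '-', ':' and '.' literals
    ("UINT", r"[0-9]+"),
    ("WS", r"[ \t\r\n\f]+"),
]

# SYMBOL: LEADIDCHAR IDCHAR*, where IDCHAR also allows ' ' and '-'.
SYMBOL_RULE = ("SYMBOL", r"[A-Za-z_][A-Za-z0-9_ \-]*")


def tokenize(text, rules):
    """Return (rule name, text) pairs, always taking the longest match.

    Ties go to the rule listed first. WS tokens are dropped, similar to
    sending them to the hidden channel.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        best = None
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            if m and (best is None or len(m.group()) > len(best[1])):
                best = (name, m.group())
        if best is None:
            raise ValueError("no rule matches at position %d" % pos)
        if best[0] != "WS":
            tokens.append(best)
        pos += len(best[1])
    return tokens


sample = "5:16 lasting T30"
print(tokenize(sample, BASE_RULES))
print(tokenize(sample, BASE_RULES + [SYMBOL_RULE]))
```

In this model, once SYMBOL is enabled, "lasting T30" comes out as one
SYMBOL token, because a space is a legal IDCHAR and the match keeps going
past the keyword. If ANTLR does something similar, the parser never sees
the 'lasting' token, which would fit the NoViableAltException.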
---
Shawn Poulson
spoulson at explodingcoder.com