[antlr-interest] antlr 3 lexer question
John B. Brodie
jbb at acm.org
Tue Nov 16 07:52:41 PST 2010
Greetings!
On Tue, 2010-11-16 at 11:37 +0100, Philippe Frankson wrote:
> Hi,
>
> I spent quite some time to find a solution to the following problem but
> I could not find a suitable solution so any help would be very much
> appreciated.
>
> When I have the following input:
> row1.subrow1.subsubrow1..row1.subrow1.subsubrow5
> I would like the lexer to return the following tokens: NAME RANGE NAME
> Where RANGE is '..', the first NAME would be 'row1.subrow1.subsubrow1'
> and the second one 'row1.subrow1.subsubrow5'.
> For info, the dot is not mandatory (we can have row1 alone, for
> example).
> Let's assume that we allow any alpha characters (apart from the dot) ->
> fragment ALPHA : ('a'..'z'|'A'..'Z');
>
> Rem.: it is important to me to have a solution in the lexer side (I know
> it is possible to solve this in the parser but I would like to avoid
> it).
>
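Sometimes syntactic predicates can be Good (but be careful!). The
predicate below lets NAME consume a '.' only when a letter follows it,
so the '..' is left over for RANGE. Try this: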
NAME : ID ( ('.' ALPHA)=> '.' ID )* ;
RANGE : '..' ;
fragment ID : ALPHA (ALPHA|DIGIT)* ;
fragment ALPHA : ('a'..'z')|('A'..'Z') ;
fragment DIGIT : '0'..'9' ;
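
To see what the predicate buys you, here is a rough hand-written sketch of the same decision in Python. It is not ANTLR-generated code, and the names tokenize and _scan_id are made up for illustration. It only mirrors the rules above: a dot is pulled into a NAME only when the character after it is a letter.

```python
import string

ALPHA = set(string.ascii_letters)   # fragment ALPHA : ('a'..'z')|('A'..'Z')
DIGIT = set(string.digits)          # fragment DIGIT : '0'..'9'


def _scan_id(text, i):
    """Consume ID : ALPHA (ALPHA|DIGIT)* starting at i; caller ensures text[i] is ALPHA."""
    i += 1
    while i < len(text) and (text[i] in ALPHA or text[i] in DIGIT):
        i += 1
    return i


def tokenize(text):
    """Return a list of (token_type, token_text) pairs (hypothetical helper)."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        if text[i] in ALPHA:
            start = i
            i = _scan_id(text, i)
            # Mirrors ('.' ALPHA)=> : take the dot only if a letter follows it.
            while i + 1 < n and text[i] == '.' and text[i + 1] in ALPHA:
                i = _scan_id(text, i + 1)
            tokens.append(("NAME", text[start:i]))
        elif text.startswith("..", i):
            tokens.append(("RANGE", ".."))
            i += 2
        else:
            raise ValueError(f"unexpected character {text[i]!r} at offset {i}")
    return tokens


if __name__ == "__main__":
    for tok in tokenize("row1.subrow1.subsubrow1..row1.subrow1.subsubrow5"):
        print(tok)
```

Without the one-character peek past the dot, the NAME loop would start consuming the first '.' of '..' and then fail to find an ID after it.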
hope this helps...
-jbb