[antlr-interest] ambiguous lexer tokens

Torsten Curdt tcurdt at vafer.org
Thu Jun 28 04:54:37 PDT 2007


>>> 2. Use predicates in the lexer to turn alternatives on and off  
>>> depending on which "column" you're in (ie. make a context- 
>>> sensitive lexer).
>>
>> Could you give an example of what that would look like?
>
> Well, here's one (untested) idea: one way might be to modify your  
> WS rule to increment a "column" counter whenever a run of spaces is  
> seen; you'd have to set up the column counter in your  
> @lexer::members section (exactly how you set up and initialize that  
> variable is dependent on your target language):
>
>   WS : ' '+ { column++; };
>
> And then modify your NEWLINE rule to reset the column counter:
>
>   NEWLINE : '\r'? '\n' { column = 0; };
>
> Now you can prefix your rules with gated semantic predicates,  
> effectively turning them on/off depending on the input column; for  
> example, you only want your INT rule to be applied in columns 4 and  
> 10:
>
>   INT : { column == 4 || column == 10 }?=> '0'..'9'+ ;
>
> And so on... Obviously if columns are whitespace delimited you need  
> to roll your "TYPE" and "MODS" rules into one, and also remember  
> that your final column (the file name) may actually contain  
> whitespace so to scan filenames you probably want a rule like:
>
>   FILENAME : { column > 8 }?=> ~('\n' | '\r')+ ;
>
> Or alternatively, make your WS rule only apply in the leftmost  
> columns and apply your FILENAME rule in column 9 only:
>
>   WS : { column < 9 }?=> ' '+ { column++; };
>   FILENAME : { column == 9 }?=> ~('\n' | '\r')+ ;
>
> So I think this could be made to work (although I'm not sure how
> you'd handle filenames with embedded newlines), but it starts to look
> pretty complex (look at the source code for the generated lexer),
> and in that case it seems simpler to just write a small parser by
> hand...
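
To make the column-gating idea concrete outside of ANTLR, here is a
rough hand-rolled sketch in Python. It is not what the generated lexer
does internally; it only mimics the effect of the gated predicates.
The rule table, the WORD rule, and the column numbers (three columns,
counted from 0) are assumptions made up for illustration, not taken
from the original grammar.

```python
import re

# Hypothetical rule table: (token name, regex, predicate on the column counter).
# The column numbers are illustrative only.
RULES = [
    ("NEWLINE", re.compile(r"\r?\n"), lambda col: True),
    ("WS", re.compile(r" +"), lambda col: col < 2),
    ("INT", re.compile(r"[0-9]+"), lambda col: col == 1),
    ("WORD", re.compile(r"[^ \r\n]+"), lambda col: col < 2),
    ("FILENAME", re.compile(r"[^\r\n]+"), lambda col: col == 2),
]


def tokenize(text):
    """Yield (name, lexeme) pairs, trying only rules whose predicate holds."""
    column = 0
    pos = 0
    while pos < len(text):
        for name, pattern, enabled in RULES:
            if not enabled(column):
                continue  # rule is gated off in this column
            match = pattern.match(text, pos)
            if match:
                break
        else:
            raise ValueError(f"no rule matches at offset {pos}")
        pos = match.end()
        if name == "WS":
            column += 1  # same effect as { column++; } in the WS rule
        elif name == "NEWLINE":
            column = 0   # same effect as { column = 0; } in the NEWLINE rule
        yield name, match.group()
```

Because WS is switched off once the last column is reached, the
FILENAME rule gets to consume the rest of the line, spaces included.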

I'll write a handcrafted parser, but thanks for explaining anyway.
That was enlightening :)
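
A handcrafted version could be as small as the following Python
sketch. It assumes whitespace-delimited columns where only the last
one (the file name) may contain spaces; the function name parse_line
and the default of three columns are hypothetical and would need to be
adapted to the real format.

```python
def parse_line(line, n_columns=3):
    """Split a line into n_columns fields; the last field keeps its spaces."""
    # Splitting at most n_columns - 1 times leaves the remainder of the
    # line (the file name) untouched in the final field.
    fields = line.rstrip("\r\n").split(None, n_columns - 1)
    if len(fields) != n_columns:
        raise ValueError(f"expected {n_columns} columns, got {len(fields)}")
    return fields
```

Like the lexer approach, this does not cope with file names that
contain embedded newlines.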

cheers
--
Torsten



