[antlr-interest] Comment rule matches links

Martin Probst mail at martin-probst.com
Tue Aug 26 04:10:51 PDT 2008


>> Island grammars under lexer control will probably not cut it, as the
>> '/' token is ambiguous in many languages, e.g.
>> "int x = 5 / 3;" vs "match(/3/, ...);". The grammar/lexer switching
>> has to be done in the parser.
>>
> That's right, but how can I implement an island grammar under parser
> control if the string matching already was done in the lexer?

You can build a lexer that doesn't tokenize the whole input in one big run
upfront, but instead scans the input on demand, for every LA(xx) call. I did
this in http://code.google.com/p/xqpretty/. My sub-grammar was much more
complex than just a regexp string, but it might still help you.
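
To illustrate the on-demand idea, here is a minimal sketch in Python. It is
not the xqpretty code and not ANTLR's API: the class name LazyTokenStream, the
token patterns, and the `scanned` counter are all made up for this example.
LA(k) here returns the whole (type, text) token for brevity.

```python
import re

# Hypothetical token patterns for a toy language (not taken from xqpretty).
TOKEN_RE = re.compile(r"\s*(?:(?P<NUM>\d+)|(?P<ID>[A-Za-z_]\w*)|(?P<OP>[=/;(),]))")


class LazyTokenStream:
    """Scans tokens only when the parser asks for them through LA(k)."""

    def __init__(self, text):
        self.text = text
        self.pos = 0        # offset of the first character not yet scanned
        self.buffer = []    # tokens scanned but not yet consumed
        self.scanned = 0    # number of real tokens scanned so far

    def _scan_one(self):
        m = TOKEN_RE.match(self.text, self.pos)
        if m is None:
            # Only whitespace left means end of input; anything else is an error.
            if not self.text[self.pos:].strip():
                return ("EOF", "")
            raise SyntaxError("unexpected character at or after offset %d" % self.pos)
        self.pos = m.end()
        self.scanned += 1
        return (m.lastgroup, m.group(m.lastgroup))

    def LA(self, k):
        """Return the k-th lookahead token (1-based), scanning only as far as needed."""
        while len(self.buffer) < k:
            self.buffer.append(self._scan_one())
        return self.buffer[k - 1]

    def consume(self):
        """Remove and return the next token."""
        tok = self.LA(1)
        self.buffer.pop(0)
        return tok
```

With the input "int x = 5 / 3;", nothing is scanned until the first LA(1)
call, and LA(2) scans exactly one more token. Everything past the lookahead
window is still raw text, which is what leaves room for the parser to decide
how it should be tokenized.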

> Because comments are not part of the program statements, they have to be
> skipped in the lexer, and to keep strings containing //s from being
> skipped, I implemented the string token rule in the lexer as well. So I
> really need a way to handle my regexp problem in the lexer, too - or is
> there another way?

Not that I know of. You do indeed need to handle RegExps in the lexer,
but as I said, you can teach your parser to swap the lexer used for the
regular program for a dedicated lexer whenever it enters a RegExp
context.
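
A rough sketch of that switching, again in Python and again with made-up
names (OnDemandLexer, Parser, tokens_of, and a toy expression grammar that I
am assuming purely for illustration). The parser knows whether it is in
operand or operator position, so it picks the pattern used to scan the next
token: a '/' in operand position starts a regexp, a '/' in operator position
is division.

```python
import re

# Hypothetical patterns: one for the regular program, one just for regexp literals.
PROGRAM_RE = re.compile(r"\s*(?:(?P<NUM>\d+)|(?P<ID>[A-Za-z_]\w*)|(?P<OP>[=/;(),]))")
REGEX_RE = re.compile(r"\s*/(?P<REGEX>(?:\\.|[^/\\\n])*)/")


class OnDemandLexer:
    """Scans one token at a time, using whichever pattern the parser passes in."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek_char(self):
        # Next non-whitespace character, or "" at end of input.
        return self.text[self.pos:].lstrip()[:1]

    def next_token(self, pattern):
        m = pattern.match(self.text, self.pos)
        if m is None:
            raise SyntaxError("cannot scan token at or after offset %d" % self.pos)
        self.pos = m.end()
        return (m.lastgroup, m.group(m.lastgroup))


class Parser:
    """Toy grammar: expr := operand ('/' operand)* ; operand := NUM | ID | REGEX | call."""

    def __init__(self, text):
        self.lexer = OnDemandLexer(text)
        self.tokens = []  # tokens in the order the parser consumed them

    def _next(self, pattern):
        tok = self.lexer.next_token(pattern)
        self.tokens.append(tok)
        return tok

    def expr(self):
        self.operand()
        # Operator position: '/' is division, so stay with the program lexer.
        while self.lexer.peek_char() == "/":
            self._next(PROGRAM_RE)
            self.operand()

    def operand(self):
        # Operand position: '/' can only start a regexp, so switch lexers.
        if self.lexer.peek_char() == "/":
            self._next(REGEX_RE)
            return
        kind, _ = self._next(PROGRAM_RE)
        if kind not in ("NUM", "ID"):
            raise SyntaxError("expected an operand")
        if kind == "ID" and self.lexer.peek_char() == "(":
            self._next(PROGRAM_RE)              # '('
            self.expr()
            while self.lexer.peek_char() == ",":
                self._next(PROGRAM_RE)          # ','
                self.expr()
            if self._next(PROGRAM_RE) != ("OP", ")"):
                raise SyntaxError("expected ')'")


def tokens_of(text):
    parser = Parser(text)
    parser.expr()
    return parser.tokens
```

Here tokens_of("5 / 3") yields a NUM, an OP "/" and a NUM, while
tokens_of("match(/3/, x)") yields a single REGEX token for "/3/". This only
works because the lexer has not run ahead of the parser; with everything
tokenized upfront the '/' would already have been classified.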

Regards,
Martin


