[antlr-interest] Context-sensitive lexing

Mon Nov 19 00:48:28 PST 2007

Steve Bennett wrote:
> .....
> I gather that most programming languages don't have this drama,
> because there are generally two lexing situations: normal text, where
> { and -> are special tokens, or strings/comments, where /* blah -> {
> blah */ is treated as a single token. But what would you do if you
> wanted to actually parse the contents of that comment, rather than
> making it a monolithic token?
> .......
> Steve
>   

For cases like comments, there is an alternative to island grammars.
Since the lexer can recognize the boundaries of the comment, you can
have it return the whole comment to the parser as a single token. The
parser then calls a second lexer/parser, passing it the content of the
comment. This means the comment text is lexed twice, but that should be
fast enough.
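
Here is a rough sketch of the two-pass idea, written in plain Python
rather than ANTLR so it can be run on its own. The token names, the two
regex tables and the helper functions (tokenize, lex_with_comments) are
all made up for illustration, and the token set is just a guess based on
the { and -> example in the quoted message. In a real grammar the parser
action for the comment token would invoke the second lexer/parser; here
a small driver function stands in for that step.

```python
import re

# Outer lexer (hypothetical token set): a comment is one monolithic token.
OUTER = re.compile(r"""
    (?P<COMMENT>/\*.*?\*/)
  | (?P<ARROW>->)
  | (?P<LBRACE>\{)
  | (?P<RBRACE>\})
  | (?P<WORD>\w+)
  | (?P<WS>\s+)
""", re.VERBOSE | re.DOTALL)

# Inner lexer: only ever sees the text between /* and */.
INNER = re.compile(r"""
    (?P<ARROW>->)
  | (?P<LBRACE>\{)
  | (?P<RBRACE>\})
  | (?P<WORD>\w+)
  | (?P<WS>\s+)
""", re.VERBOSE)


def tokenize(pattern, text):
    """Return (kind, text) pairs for the whole input, skipping whitespace."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = pattern.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def lex_with_comments(text):
    """First pass finds comment boundaries; second pass lexes each comment body."""
    result = []
    for kind, value in tokenize(OUTER, text):
        if kind == "COMMENT":
            # Strip the /* and */ delimiters, then lex the content again.
            result.append((kind, tokenize(INNER, value[2:-2])))
        else:
            result.append((kind, value))
    return result


if __name__ == "__main__":
    for token in lex_with_comments("a -> { /* blah -> { blah */ }"):
        print(token)
```

The comment body is scanned once by the outer lexer to find its end and
once more by the inner lexer, which is the double lexing mentioned above.
The two token tables are independent, so the inner one can use whatever
rules the comment language needs.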

Shmuel