[antlr-interest] lexical modes

Martin Probst mail at martin-probst.com
Wed Jun 7 15:00:59 PDT 2006


Hi,

> Something like that...make sense to add?  ANTLR can just switch-on- 
> mode when it enters nextToken() to jump to the appropriate set of  
> lexical rules.

That's exactly what I need. Make it easy and cheap to switch, and you
can solve a lot of problems with that.
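
To make the idea concrete, here is a rough hand-written Java sketch of a
lexer that switches on its current mode at the top of nextToken(). It is
not what ANTLR generates; the class name ModeLexer, the two modes and the
"TYPE:text" token strings are all made up for illustration. It assumes a
tiny regex-like input where '*' is an operator outside brackets and an
ordinary character inside them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written lexer, for illustration only (not ANTLR output).
class ModeLexer {
    enum Mode { DEFAULT, INSIDE_BRACKETS }

    private final String input;
    private int pos = 0;
    private Mode mode = Mode.DEFAULT;

    ModeLexer(String input) {
        this.input = input;
    }

    // Returns the next token as "TYPE:text", or null at end of input.
    String nextToken() {
        if (pos >= input.length()) {
            return null;
        }
        // Switch on the current mode to pick the set of lexical rules.
        switch (mode) {
            case INSIDE_BRACKETS:
                return insideBrackets();
            default:
                return defaultMode();
        }
    }

    // Rules of the default mode: '[' enters the bracket mode.
    private String defaultMode() {
        char c = input.charAt(pos++);
        if (c == '[') {
            mode = Mode.INSIDE_BRACKETS;
            return "LBRACK:[";
        }
        if (c == '*') {
            return "STAR:*";
        }
        return "CHAR:" + c;
    }

    // Rules of the bracket mode: ']' returns to the default mode,
    // everything else (including '*') is a plain CHAR.
    private String insideBrackets() {
        char c = input.charAt(pos++);
        if (c == ']') {
            mode = Mode.DEFAULT;
            return "RBRACK:]";
        }
        return "CHAR:" + c;
    }

    // Convenience helper: collect all tokens of a string.
    static List<String> tokenize(String s) {
        ModeLexer lexer = new ModeLexer(s);
        List<String> out = new ArrayList<>();
        for (String t = lexer.nextToken(); t != null; t = lexer.nextToken()) {
            out.add(t);
        }
        return out;
    }
}
```

Switching is just an assignment to a field, so it costs next to nothing
at runtime, which is the "easy and cheap" part.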

Something has to be done about duplicate rules in that case, e.g.

grammar regex;

CHAR: ...;

mode inside_brackets;

CHAR: ...;

I would opt to treat them as identical from the grammar's point of
view, i.e. have both map to the same token type CHAR. This is probably
useful for whitespace, comments etc. JFlex also has something called
implicit rules: rules that are not specified as belonging to a certain
state can always be matched, except in exclusive states. That's nice
for whitespace and comments. It also lets a rule belong to multiple
states, but I think that's overkill. I don't want ANTLR to implement
JFlex's feature set, I just mention it for comparison.
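
As a sketch of the bookkeeping I have in mind (again in Java, with
invented names; ModeRuleTable is not part of ANTLR or JFlex): every rule
name gets exactly one token type no matter how many modes define it, and
rules declared outside any mode are added to every mode's rule set
unless the mode opts out, which is how I understand JFlex's exclusive
states.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical rule table, for illustration only.
class ModeRuleTable {
    // One token type per rule name, shared across all modes.
    private final Map<String, Integer> tokenTypes = new LinkedHashMap<>();
    // Rules declared outside any mode (the "implicit" rules).
    private final List<String> implicitRules = new ArrayList<>();
    // Rules declared inside a specific mode, in declaration order.
    private final Map<String, List<String>> modeRules = new LinkedHashMap<>();
    // Modes that do not inherit the implicit rules.
    private final Set<String> exclusiveModes = new HashSet<>();

    void declareExclusive(String mode) {
        exclusiveModes.add(mode);
    }

    // Pass mode == null for a rule that is not attached to any mode.
    void addRule(String mode, String name) {
        // A duplicate name reuses the token type assigned the first time.
        tokenTypes.putIfAbsent(name, tokenTypes.size() + 1);
        if (mode == null) {
            implicitRules.add(name);
        } else {
            modeRules.computeIfAbsent(mode, k -> new ArrayList<>()).add(name);
        }
    }

    int tokenType(String name) {
        return tokenTypes.get(name);
    }

    // Rules that may match in the given mode: its own rules first,
    // then the implicit ones unless the mode is exclusive.
    List<String> activeRules(String mode) {
        List<String> active = new ArrayList<>(modeRules.getOrDefault(mode, List.of()));
        if (!exclusiveModes.contains(mode)) {
            active.addAll(implicitRules);
        }
        return active;
    }
}
```

With CHAR added once for the default mode and once for inside_brackets,
tokenType("CHAR") is the same number either way, so the parser never
needs to know which mode produced the token.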

BTW I would prefer a syntax where groups are not separated by
statements but enclosed in braces. E.g.

mode A {
   ruleA1: ...;
   ruleA2: ...;
}

mode B {
   ruleB1: ...;
   ruleB2: ...;
}

But maybe I'm just tainted by XML ;-)

Regards,
Martin


