[antlr-interest] languages without reserved words

Martin Probst mail at martin-probst.com
Wed Mar 8 02:33:21 PST 2006


> > The problem is that every parser would first need to know if an NCNAME
> > comes up to decide to enter the identifier rules. That's always a
> > problem with state switching in the parser - lookahead stops working
> > predictably. There is no way around that - you can't decide to enter the
> > identifier rule before you know there's an identifier coming up, for
> > which you need to know whether you're in the identifier rule ...
> Not quite. There's a line "if (LA(1)==NCNAME)" in the code "to know if an 
> NCNAME comes up". The trick I would need is simply a 
> lexer.testLiterals=false; call right before it. That's all. I could modify 
> the generated code by hand, but that's something I'd rather avoid for now.

The problem is knowing where to check for NCNAMEs and where not to. Your
grammar certainly has some parts where you expect NCNAMEs and other
parts where you have to test for the operators. ANTLR has no way of
knowing which is which.

If it's really exactly one place, then this probably means you're
referring to it from exactly one place, i.e. the "identifier" rule is
only accessed from one point. You can then switch off literal testing in
the calling rule before the branching decision is made, e.g.

  BAR BAZ { lexer.testLiterals = false; } identifier;
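
To make the idea concrete, here is a rough sketch in plain Java. It is not
ANTLR-generated code: the class LiteralToggleSketch, the token type names
and the literals table are all made up for illustration. It also assumes
the lexer hands out tokens lazily, so the flag is read at the moment the
identifier token is requested. If the parser has already buffered that
token as lookahead, flipping the flag comes too late.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a lexer/parser pair; not ANTLR-generated code.
class LiteralToggleSketch {

    static final class Token {
        final String type;   // "KEYWORD", "NCNAME" or "EOF"
        final String text;
        Token(String type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + ":" + text; }
    }

    static final class Lexer {
        // Assumed literals table; a real grammar would define its own.
        private static final Set<String> LITERALS =
            new HashSet<>(Arrays.asList("bar", "baz", "for", "return"));

        private final String[] words;
        private int pos = 0;
        boolean testLiterals = true;  // mirrors the flag discussed in the thread

        Lexer(String input) { this.words = input.trim().split("\\s+"); }

        // Tokens are produced lazily, so the flag value at call time decides
        // whether a word is checked against the literals table.
        Token nextToken() {
            if (pos >= words.length) return new Token("EOF", "");
            String word = words[pos++];
            boolean literal = testLiterals && LITERALS.contains(word);
            return new Token(literal ? "KEYWORD" : "NCNAME", word);
        }
    }

    // Models the rule: "bar" "baz" identifier
    static String parse(String input, boolean switchOff) {
        Lexer lexer = new Lexer(input);
        List<String> seen = new ArrayList<>();
        seen.add(lexer.nextToken().toString());     // "bar"
        seen.add(lexer.nextToken().toString());     // "baz"
        if (switchOff) lexer.testLiterals = false;  // before fetching the identifier
        seen.add(lexer.nextToken().toString());     // the identifier position
        lexer.testLiterals = true;                  // restore for the rest of the input
        return String.join(" ", seen);
    }

    public static void main(String[] args) {
        System.out.println(parse("bar baz for", true));
        System.out.println(parse("bar baz for", false));
    }
}
```

With the flag switched off, the third word "for" comes back as an NCNAME
even though it is in the literals table. With the flag left on, it comes
back as a KEYWORD and the identifier rule would never match.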

