[antlr-interest] Troubles lexing a decimal, (from an antlr beginner)

Wed Jul 25 07:58:04 PDT 2007

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Johannes Luber
> Sent: Wednesday, July 25, 2007 2:34 AM
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Troubles lexing a decimal,(from an antlr
> beginner)
> 
> Igor Murashkin wrote:
> > Hello,
> >
> > Thanks for all the help. I used a syntactic predicate like Jim suggested
> > and it seems to lex everything properly now. I wish I understood more
> > academically why my original lexing syntax didn't work, does ANTLR not
> > put the tokens back and backtrack when it fails to match a rule?
> 
> Backtracking has to be explicitly activated because this option is more
> time consuming than a straight pass.

This was a lexing question. Igor is asking why ANTLR does not generate
code that acts like {f}lex, where you can get through a matching
sequence and then either REJECT manually, or have the algorithm give up
by itself and try the next rule, and so on.

ANTLR generates recursive descent recognizers and so there is no [neat]
way to pop back up the recognition chain and start again. In practice,
this just means you have to get your head around it until you have
expunged {f}lex from your brain. It creates some lexing problems which
are difficult to solve until you have the gestalt. 
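
To make that concrete, here is a rough hand-written sketch in Python of a
recursive descent style number rule. It is not what ANTLR emits, and the
names (DIGITS, lex_number_committed) are made up for illustration. The
point is only that once the rule has seen the '.', it is committed to the
decimal alternative and has no way to fall back to a plain INT:

```python
DIGITS = "0123456789"  # hypothetical helper, not an ANTLR name


def lex_number_committed(text, pos=0):
    """Match INT or INT '.' INT at pos, deciding on one character of lookahead."""
    start = pos
    while pos < len(text) and text[pos] in DIGITS:
        pos += 1
    if pos == start:
        raise ValueError(f"expected a digit at position {start}")

    if pos < len(text) and text[pos] == ".":
        # One character of lookahead picked the decimal alternative.
        # From here on there is no popping back up to try INT instead.
        pos += 1
        frac_start = pos
        while pos < len(text) and text[pos] in DIGITS:
            pos += 1
        if pos == frac_start:
            raise ValueError(f"expected a digit after '.' at position {pos}")
        return ("DECIMAL", text[start:pos], pos)

    return ("INT", text[start:pos], pos)
```

With this sketch "42" and "3.14" lex fine, but "1.foo" raises an error
instead of yielding INT "1" and leaving ".foo" for other rules, which is
the behaviour a {f}lex user would not expect.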

The easiest way is to look at your tokens, merge common roots, and write
the lexing rule so that it branches where the tokens differ, then uses
an action to set the type. You don't need to go to this trouble for
keywords with common roots ('call', 'calling', etc.), but when you are
constructing compounds like INT.INT in the lexer and INT.xxx can mean
something else, then you need to guide the lexer analysis a bit. It may
not be exactly intuitive (at least not at first), but if you start
looking at the generated code, then as a programmer it may help you to
see what is happening, even if you don't have a firm grasp of the
theory.
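
As a rough illustration of the merged-root idea, here is a hand-written
Python sketch (again, not ANTLR output; DIGITS and lex_number are
hypothetical names). There is one rule for the common INT prefix, it only
takes the decimal branch when the '.' is actually followed by a digit,
and it sets the token type at the point where the two tokens differ:

```python
DIGITS = "0123456789"  # hypothetical helper, not an ANTLR name


def lex_number(text, pos=0):
    """Match the shared INT prefix, then branch and set the type afterwards."""
    start = pos
    while pos < len(text) and text[pos] in DIGITS:
        pos += 1
    if pos == start:
        raise ValueError(f"expected a digit at position {start}")

    token_type = "INT"  # default type for the common root

    # Look two characters ahead before committing: only take the decimal
    # branch when '.' is followed by a digit. This plays the role of the
    # syntactic predicate in the grammar.
    if pos + 1 < len(text) and text[pos] == "." and text[pos + 1] in DIGITS:
        pos += 1  # consume '.'
        while pos < len(text) and text[pos] in DIGITS:
            pos += 1
        token_type = "DECIMAL"  # the "action" that sets the type

    return (token_type, text[start:pos], pos)
```

Here "3.14" comes back as DECIMAL, while "1.foo" comes back as INT "1"
with the position left at the '.', so whatever rule handles ".foo" gets
its chance.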

Ter has recently stated that he may look at the algorithm in order to
make it generate some of the 'intuitive' cases as one might expect. Of
course, that will screw up those of us that have got used to the way it
is ;-)

Jim