[antlr-interest] Problem when parsing numerics

Thomas Woelfle thomas.woelfle at interactive-objects.com
Mon Mar 2 08:10:43 PST 2009


Hi,

I've run into an almost identical problem again.

The language to be parsed defines some keywords that begin with a '.'.
In addition, names are allowed, and a lone '.' is a valid token in its
own right.

The reduced lexer grammar that produces the problem is:

DOT: '.';

ARG: ('.ARG')=> '.ARG';

ATT: ('.ATT')=> '.ATT';

NAME
  :
  ('A'..'Z')*;


Valid inputs in the subject language are:

'.' which should result in one token DOT
'.ARG' which should result in one token ARG
'.ATT' which should result in one token ATT
'ALFRED' which should result in one token NAME
'ALFRED.ABACUS' which should result in three tokens NAME DOT NAME
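
To make the expected results concrete, here is a small hand-written
sketch in Python (not ANTLR output; the names KEYWORDS and tokenize are
only for illustration) that produces the token sequences listed above.
It assumes that the dotted keywords take priority over a lone '.' and
that a NAME is a non-empty run of upper-case letters.

```python
# Hand-written reference tokenizer, NOT generated by ANTLR.
# KEYWORDS and tokenize are illustrative names only.
KEYWORDS = {".ARG": "ARG", ".ATT": "ATT"}


def tokenize(text):
    """Return the list of token type names expected for `text`."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the dotted keywords first, so '.ARG' wins over a lone '.'.
        for literal, token_type in KEYWORDS.items():
            if text.startswith(literal, i):
                tokens.append(token_type)
                i += len(literal)
                break
        else:
            # No keyword matched at position i.
            if text[i] == ".":
                tokens.append("DOT")
                i += 1
            elif "A" <= text[i] <= "Z":
                # NAME: consume the longest run of upper-case letters.
                j = i
                while j < len(text) and "A" <= text[j] <= "Z":
                    j += 1
                tokens.append("NAME")
                i = j
            else:
                raise ValueError(
                    f"no viable alternative at character {text[i]!r}"
                )
    return tokens


if __name__ == "__main__":
    for sample in [".", ".ARG", ".ATT", "ALFRED", "ALFRED.ABACUS"]:
        print(sample, "->", tokenize(sample))
```

The key point is the fallback: when '.' is not followed by 'ARG' or
'ATT', the sketch still emits DOT instead of reporting an error.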

Everything works fine except the last input. When lexing that string the 
lexer logs an error "no viable alternative at character '.' " and 
returns only two NAME tokens but no DOT token.

I guess this is the same lookahead problem as before, isn't it?

Any idea how to change the lexer grammar so that it is able to tokenize 
all of the valid inputs?


Regards,
Thomas
> At 20:39 25/02/2009, Thomas Woelfle wrote:
> >thanks for the tip. Using a syntactic predicate works. But to me
> >this seems to be a bug in the algorithm that examines the minimal
> >amount of lookahead since it calculates a wrong minimal lookahead,
> >isn't it?
>
> Yes and no.  Because there's a loop involved, the minimal lookahead is 
> infinite, which is a little hard to express in static code :)
>
> ANTLR probably could be a little bit smarter and generate the synpred 
> for you behind the scenes (which is something I was suggesting myself 
> about a year ago), but the current architecture and its 
> half-ANTLR-v2-ness apparently doesn't lend itself too well to that.  
> Hopefully that'll get better before too long, especially with the 
> upcoming CSharp3 port being self-hosted ;)
>


-- 
Interactive Objects Software GmbH
Basler Strasse 61
79100 Freiburg, Germany

Phone:  +49 761 400 73 0
mailto:thomas.woelfle at interactive-objects.com