[antlr-interest] Problem when parsing numerics

Thomas Woelfle thomas.woelfle at interactive-objects.com
Wed Mar 4 02:15:25 PST 2009


Hi Jim,

Thanks for the reply.

I am still running into the same problem.

The grammar is now:


lexer grammar Simple;

options
{
language = Java;
}

@header
{
  package test;
}

fragment DOT_PROG: ;
fragment DOT_SL: ;
fragment DOT_PRINT: ;
fragment DOT_ADD: ;
fragment DOT_SPP: ;

DOT: '.'
  (
  ('PROG')=>'PROG' {$type=DOT_PROG;}
  |('SL')=>'SL' {$type=DOT_SL;}
  |('PRT')=>'PRT' {$type=DOT_PRINT;}
//  |('ADD')=>'ADD' {$type=DOT_ADD;}
  |('SPP')=>'SPP' {$type=DOT_SPP;}
  )?
  ;

WORD: ('A'..'Z')+;

Given the input ".S", the lexer produces a DOT token followed by a WORD 
token. But as soon as the fourth alternative (the 'ADD' line) is 
uncommented, the same input fails with "no viable alternative at 
character '<EOF>'".
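
For reference, the tokenization the grammar is meant to produce can be 
sketched in plain Java. This is a hand-written illustration only, not 
ANTLR-generated code and not a fix for the generator: the class name 
DotLexerSketch and the string token labels are assumptions made up for 
this example, and the 'ADD' keyword is included as if the fourth 
alternative were enabled. The point is that a keyword is committed to 
only when it is present in full, so ".S" falls back to DOT followed by 
WORD.

```java
import java.util.ArrayList;
import java.util.List;

public class DotLexerSketch {
    // Keyword text -> token label, mirroring the alternatives of the DOT rule.
    // The labels are plain strings here; in the grammar they are token types.
    private static final String[][] KEYWORDS = {
        {"PROG", "DOT_PROG"},
        {"SL",   "DOT_SL"},
        {"PRT",  "DOT_PRINT"},
        {"ADD",  "DOT_ADD"},
        {"SPP",  "DOT_SPP"},
    };

    /** Returns token labels such as "DOT", "DOT_SL" or "WORD(S)". */
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '.') {
                i++;
                String type = "DOT"; // default when no keyword follows
                for (String[] kw : KEYWORDS) {
                    // Commit to a keyword only if all of it is present.
                    if (input.startsWith(kw[0], i)) {
                        type = kw[1];
                        i += kw[0].length();
                        break;
                    }
                }
                tokens.add(type);
            } else if (c >= 'A' && c <= 'Z') {
                // WORD: ('A'..'Z')+
                int start = i;
                while (i < input.length()
                        && input.charAt(i) >= 'A' && input.charAt(i) <= 'Z') {
                    i++;
                }
                tokens.add("WORD(" + input.substring(start, i) + ")");
            } else {
                throw new IllegalArgumentException(
                    "no viable alternative at character '" + c + "'");
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize(".S"));   // [DOT, WORD(S)]
        System.out.println(tokenize(".SL"));  // [DOT_SL]
    }
}
```

The check with startsWith plays the role that the syntactic predicates 
('SL')=> and so on are meant to play in the grammar: look ahead for the 
whole keyword before consuming any of it.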

I've read through the generated lexer code. The major difference 
between the version that works and the version that fails seems to be 
that the working version makes no "dfa.predict" call. I have no idea 
why the ANTLR generator emits DFA-based code in one case and not in 
the other. All in all, this behaviour looks to me like a serious bug 
in ANTLR. I've tried the same lexer grammar in JavaCC without any 
problems. Is there any way to work around this bug without having to 
write a lexer on my own?

Regards,
Thomas
> Thomas Woelfle wrote:
>> Hi,
>>
>> I've run into an almost identical problem again.
>>
>> The language to be parsed defines some keywords that begin with a 
>> '.'. In addition, certain names are allowed, and a '.' on its own 
>> must also be a valid token.
>>
>> The reduced lexer grammar that produces the problem is:
>>
>> DOT: '.';
>>
>> ARG: ('.ARG')=> '.ARG';
>>
>> ATT: ('.ATT')=> '.ATT';
>>
>> NAME
>>   :
>>   ('A'..'Z')*;
>>
>>
>>   
> This token allows a match of an empty string and is going to cause all 
> sorts of problems. You want:
>
> NAME : ('A'..'Z')+;
>
> Then if you still have problems, either do:
>
> DOT : '.';
> ARG: 'ARG';
> ATT : 'ATT';
>
> ident : ID
>         | DOT (ARG|ATT)
>         ;
>
> Or:
>
> fragment ARG : ;  // Define token number and document
> fragment ATT : ; // Define token number and document
> DOT : '.'
>            (  ('ARG')=>'ARG'  { $type = ARG; }
>              | ('ATT')=>'ATT'    { $type = ATT; }
>            )
>     ;
>
> Jim


-- 
Interactive Objects Software GmbH
Basler Strasse 61
79100 Freiburg, Germany

Phone:  +49 761 400 73 0
mailto:thomas.woelfle at interactive-objects.com



Attachments:
SimpleWorks.java: http://www.antlr.org/pipermail/antlr-interest/attachments/20090304/317ecef8/attachment.pl
SimpleFails.java: http://www.antlr.org/pipermail/antlr-interest/attachments/20090304/317ecef8/attachment-0001.pl
Main.java: http://www.antlr.org/pipermail/antlr-interest/attachments/20090304/317ecef8/attachment-0002.pl
Simple.g: http://www.antlr.org/pipermail/antlr-interest/attachments/20090304/317ecef8/attachment-0003.pl

