[antlr-interest] Problems parsing numbers

Bolek Vrany lenochodpetiprsty at seznam.cz
Wed Oct 24 05:30:36 PDT 2007


I simplified the grammar to

grammar SquareD;

rule	:	expr+;

expr	:	litExpr | valExpr | padExpr;

litExpr	:	LITERAL ;
valExpr	:	VALFEATID;
padExpr	:	PAD;

fragment	// Any character allowed in identifiers
IDCHAR	:	('a'..'z'|'A'..'Z'|'_'|'0'..'9');

VALFEATID	:	// Value or short form feature id. It can be either a number
			// or an alphanumeric sequence
			IDCHAR+;
		
fragment		
DIGIT	: 	'0'..'9';

LITERAL	: 	// This is the numerical literal
		'$' DIGIT+ '$'
	|	'$' DIGIT+ '.' DIGIT+ '$'
	;
	
PAD	:	'{PAD}';	
	
// Newline and whitespace	
NEWLINE	:	'\r'? '\n' ;
WS  	:	(' '|'\t'|'\n'|'\r')+ {skip();} ;

and tried to interpret the following inputs in ANTLRWorks:
$12345.567890$ - I get the correct tree here (a single litExpr)
1c234 - I get the correct tree here (a single valExpr)
$1c2345.567890$ - I expect to get a single error, but instead I get two
nodes, valExpr $1c2345 and valExpr .567890. Neither of them is a valid
valExpr
1c2345.567890 - I expect the same behaviour as for $1c2345.567890$
(a single error), and I observe the same problem.
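
To make the expected outcomes concrete, here is a minimal sketch outside ANTLR, using plain Python regular expressions. It only illustrates which whole inputs I consider a single valid token; it is not how the ANTLR lexer or the ANTLRWorks interpreter works internally. The names TOKEN_PATTERNS and classify are made up for this sketch, and the patterns are my transcription of the LITERAL, PAD and VALFEATID rules above.

```python
import re

# Token shapes transcribed from the SquareD grammar above.
# Assumption: these regexes mirror the lexer rules; they are not generated by ANTLR.
TOKEN_PATTERNS = [
    ("LITERAL", re.compile(r"\$[0-9]+(\.[0-9]+)?\$")),  # '$' DIGIT+ ('.' DIGIT+)? '$'
    ("PAD", re.compile(r"\{PAD\}")),                    # '{PAD}'
    ("VALFEATID", re.compile(r"[A-Za-z0-9_]+")),        # IDCHAR+
]


def classify(text):
    """Return the token name if the whole text is exactly one token, else None."""
    for name, pattern in TOKEN_PATTERNS:
        if pattern.fullmatch(text):
            return name
    return None


if __name__ == "__main__":
    for sample in ["$12345.567890$", "1c234", "$1c2345.567890$", "1c2345.567890"]:
        print(sample, "->", classify(sample))
```

With this sketch the first two inputs are classified as LITERAL and VALFEATID, and the last two are rejected (None), which is the single error I would expect from the parser for each of them.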

Austin Hastings wrote:
> You are saying $c...$, but the LITERAL is supposed to be DIGIT+ which 
> doesn't include 'c'. What do you expect to happen for that input?
> 
> =Austin
> 
> Bolek Vrany wrote:
>> Hello,
>>
>> I'm using ANTLR for just a few days. I need to parse a language that has
>> numerical literals enclosed in $$ like $3.14$. It also allows
>> identifiers to start with a digit, so 017 is a valid identifier. I
>> started with creating the grammar
>>
>> grammar Test;
>>
>> rule    :    expr+;
>>
>> expr    :    LITERAL
>>     |     VALFEATID
>>     |     PAD
>>     ;
>>
>> //condition    :    // A single condition
>> //            (VALFEATID | LONGFEATID) ('=' | '>' | '>=' | '<' | '<=' | '<>')
>> //            (VALFEATID | LONGFEATID | LITERAL | PAD);
>>
>> fragment    // Any character allowed in identifiers
>> IDCHAR    :    ('a'..'z'|'A'..'Z'|'_'|'0'..'9');
>>
>> VALFEATID    :    // Value or short form feature id. It can be either a number
>>             // or an alphanumeric sequence
>>             IDCHAR+ ;
>>
>> //LONGFEATID  :    // Long form of feature reference, type 01[A].C3G
>> //        IDCHAR IDCHAR '[' ('A' | 'P' | 'B' | 'E' | 'C' | 'I' | 'L' | 'S') '].' VALFEATID ;
>>
>> fragment       DIGIT    :     '0'..'9';
>>
>> LITERAL    :     // This is the numerical literal
>>         '$' DIGIT+ '$'
>>     |    '$' DIGIT+ '.' DIGIT+ '$'
>>     ;
>>
>> PAD    :    '{PAD}';
>>
>> // Newline and whitespace
>> NEWLINE    :    '\r'? '\n' ;
>> WS      :    (' '|'\t'|'\n'|'\r')+ {skip();} ;
>>
>> but I ran into problems. When I try to parse $c12345.67890" and
>> interpret it inside ANTLRWorks, I end up with a rule having two
>> expressions, the first being $c12345 and the other one .67890. I don't
>> know why.
>>
>> I also tried to copy the first example in The Definitive ANTLR Reference
>> book and it behaved much like this, but then it somehow miraculously
>> stopped doing so. I don't know if it is an installation issue.
>>
>> I use ANTLR 3.0.1, ANTLRWorks 1.1.3, StringTemplate 3.1b1, XJLibrary 2.0
>> and Java 1.6.0_03.
>>
>> Thanks for your help in advance
>>
>>
>>
> 
> 
> 





