[antlr-interest] Smalltalk-like grammar? Easy! Decimal number? Help!

Kevin Twidle kpt at doc.ic.ac.uk
Thu Nov 1 14:39:09 PDT 2007


Oti,

Thanks.  My problem arose because '.' is also a statement separator, so
it was getting caught up with numbers.  My grammar is for a
programming language, so I have to allow spurious zeros.  Here is what
I have ended up with (and it works, thanks to Jim!):

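NUMBERCONSTANT
	// decimal: optional sign, optional fraction, optional exponent
	:	'-'? DIGIT+ DECIMAL (('e'|'E') '-'? DIGIT+)?
	// radix form, e.g. 8r777
	|	DIGIT+ 'r' '-'? BIGDIGITS {setText(readNumber(getText()));}
	// C-style hex, passed on in radix form as 16r...
	|	'0x' d=HEXDIGITS {setText(readNumber("16r"+$d.getText()));}
	;
fragment HEXDIGITS
	:	( DIGIT | 'A'..'F' | 'a'..'f' )+
	;
fragment BIGDIGITS
	:	( DIGIT | 'A'..'Z' )+
	;
fragment DECIMAL
	// consume the '.' only when a digit follows it
	:	(DOT DIGIT) => (DOT DIGIT+)
	|	// empty: the '.' is left to be matched as a statement separator
	;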
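
To illustrate what the (DOT DIGIT) => predicate in DECIMAL does, here is a minimal hand-written sketch in Java. It is not ANTLR-generated code: the class name DotLookahead and the tokenize method are hypothetical, and the word rule is simplified. The point is only the one-character lookahead, where the '.' is taken as part of a number only if a digit follows it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written scanner; illustrates the lookahead only.
class DotLookahead {

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int n = input.length();
        int i = 0;
        while (i < n) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;                       // whitespace is skipped
                continue;
            }
            int start = i;
            if (isDigit(c)) {
                while (i < n && isDigit(input.charAt(i))) i++;
                // Equivalent of (DOT DIGIT) =>: peek past the '.' before consuming it.
                if (i + 1 < n && input.charAt(i) == '.' && isDigit(input.charAt(i + 1))) {
                    i++;                   // the '.' belongs to the number
                    while (i < n && isDigit(input.charAt(i))) i++;
                }
            } else if (isLetter(c)) {
                // Simplified word rule (assumption): a letter, then letters or digits.
                while (i < n && (isLetter(input.charAt(i)) || isDigit(input.charAt(i)))) i++;
            } else if (c == '.') {
                i++;                       // a '.' on its own is a statement separator
            } else {
                throw new IllegalArgumentException("Unexpected character at " + i + ": " + c);
            }
            tokens.add(input.substring(start, i));
        }
        return tokens;
    }

    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    public static void main(String[] args) {
        // The three test lines from the original question.
        System.out.println(tokenize("13.\n13.word.\n14.0.13."));
        // prints [13, ., 13, ., word, ., 14.0, ., 13, .]
    }
}
```

Without the peek at the character after the '.', the scanner would commit to a fraction on "13." and then fail to find a digit, which is the failure described in the original question.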

So I can accept normal numbers with exponents (unary minus only), plus
any-radix numbers, e.g. octal 8r777 or base 23 23r75BT, and C-style hex
such as 0xCC00.  That should be enough to keep people happy!
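
The readNumber helper called from the grammar actions is not shown in this post. Purely as an illustration, here is a minimal sketch of what such a helper might look like, assuming it receives text of the form <radix>r<digits> and returns the value as decimal text. The class name RadixNumbers and this behaviour are assumptions, not the helper actually used with the grammar above.

```java
import java.math.BigInteger;

// Hypothetical stand-in for the readNumber helper referenced in the grammar.
class RadixNumbers {

    // Converts "<radix>r<digits>" (e.g. "16rCC00") to decimal text.
    static String readNumber(String text) {
        int r = text.indexOf('r');         // the radix part is decimal, so the first 'r' is the separator
        if (r <= 0) {
            throw new NumberFormatException("Not a radix literal: " + text);
        }
        int radix = Integer.parseInt(text.substring(0, r));
        // BigInteger accepts an optional leading '-' and radixes 2..36;
        // it throws NumberFormatException for digits that are invalid in the radix.
        return new BigInteger(text.substring(r + 1), radix).toString();
    }

    public static void main(String[] args) {
        System.out.println(readNumber("8r777"));    // prints 511
        System.out.println(readNumber("16rCC00"));  // prints 52224
        System.out.println(readNumber("2r-101"));   // prints -5
    }
}
```

Under this sketch every digit must be smaller than the radix, so for base 23 the digits run 0-9 and A-M, and a literal containing a higher letter is rejected with a NumberFormatException.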

Kevin


On 1 Nov 2007, at 22:12, Oti wrote:

> Hi Kevin and Jim,
>
> the following NUMBER lexer rule works pretty well for me:
>
> NUMBER
> 	:	( '+' | '-' ) ?
> 		( ( ( '1' .. '9' ) ( '0' .. '9' )* ) | '0' )
> 		( '.' ( '0' | ( '0' .. '9' )* ( '1' .. '9' ) ) ) ?
> 		;
>
> Just as an example of how to prevent leading and trailing zeroes.
> It reflects my idea of what a simple "number" should look like,
> so YMMV.
>
> Best wishes,
> Oti.
>
> On 11/1/07, Jim Idle <jimi at temporal-wave.com> wrote:
>> See much discussion of this issue over the last 2 or 3 weeks, but you need a
>> predicate on your number rule, and your DECIMAL and DIGIT and LETTER rules
>> should be fragments (though you probably don't need them as separate rules
>> at all really):
>>
>> NUMBER: ('0'..'9')+ (   ('.' '0'..'9')=> ('.' ('0'..'9')+)
>>                       |
>>                     )
>>       ;
>> fragment
>> LETTER
>>
>>  Etc...
>>
>> Hope that helps :-)
>>
>> Jim
>>
>>> -----Original Message-----
>>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Kevin Twidle
>>> Sent: Thursday, November 01, 2007 8:59 AM
>>> To: antlr-interest at antlr.org
>>> Subject: [antlr-interest] Smalltalk-like grammar? Easy! Decimal number? Help!
>>>
>>> Hi,
>>>
>>> I love ANTLR! I have quite a sophisticated Smalltalk-like grammar in
>>> ANTLR using an AST, which works beautifully.  I have decided to allow
>>> floating point numbers and have tried adding them to the grammar.
>>> Now, Smalltalk uses '.' as a statement separator, and numbers have a
>>> '.' in them - uh-oh.
>>>
>>> A number should have the form:
>>>
>>> 12 or 12.34 but not 12.
>>>
>>> I want to be able to parse
>>>
>>> 13.
>>> 13.word.
>>> 14.0.13.
>>>
>>> to get  13,13,word,14.0,13
>>>
>>> all I get is
>>>
>>> line 1:3 required (...)+ loop did not match anything at character '\n'
>>> line 2:3 required (...)+ loop did not match anything at character 'w'
>>> recoverFromMismatchedToken
>>> BR.recoverFromMismatchedToken
>>> line 3:4 mismatched input '.13' expecting EOF
>>>
>>> with tokens       ord     14.0
>>>
>>> I have simplified my problem to the following grammar.  The problem
>>> is that DECIMAL always matches the first '.' and then fails (I ran
>>> through the code); it never concludes that DECIMAL is absent and that
>>> the '.' must be a statement separator!  I have tried the greedy option,
>>> but then it never matches the DECIMAL.  I have tried reordering,
>>> fragments, greedy and now this mailing list!
>>>
>>> DECIMAL is optional, why does it fail?
>>>
>>> Any help really appreciated!
>>>
>>> Kevin
>>>
>>> grammar Number;
>>> options {output = AST;}
>>>
>>> start :       statement ( DOT statement? )+ EOF;
>>>
>>> statement :   WORD | NUMBER;
>>>
>>> WORD  :       LETTER (LETTER | DIGIT)+;
>>>
>>> NUMBER        :       DIGIT+ DECIMAL?;
>>>
>>> DECIMAL       :       DOT DIGIT+;
>>> DOT   :       '.';
>>> DIGIT :       '0'..'9';
>>> LETTER        :       'a'..'z' | 'A'..'Z';
>>> WS      :
>>>           (' '
>>>           | '\t'
>>>           | '\r' '\n'
>>>           | '\n'
>>>           ) +
>>>           { $channel=HIDDEN; }
>>>       ;
>>
>>
>>
>
>


