[antlr-interest] Smalltalk-like grammar? Easy! Decimal number? Help!
Kevin Twidle
kpt at doc.ic.ac.uk
Thu Nov 1 14:39:09 PDT 2007
Oti,
Thanks. My problem was that '.' is also a statement separator, so it
was getting caught up with numbers. My grammar is for a programming
language, so I have to allow spurious zeros. Here is what I have
ended up with (and it works, thanks to Jim!):
NUMBERCONSTANT
: '-'? DIGIT+ DECIMAL (('e'|'E') '-'? DIGIT+)?
| DIGIT+ 'r' '-'? BIGDIGITS {setText(readNumber(getText()));}
| '0x' d=HEXDIGITS {setText(readNumber("16r"+$d.getText()));}
;
fragment HEXDIGITS
: ( DIGIT | 'A'..'F' | 'a'..'f' )+
;
fragment BIGDIGITS
: ( DIGIT | 'A'..'Z' )+
;
fragment DECIMAL
: (DOT DIGIT) => DOT DIGIT+
| // empty alternative: no fraction, so a following '.' stays a separator
;
So I can accept normal numbers with exponents (unary minus only),
any-radix numbers, e.g. octal 8r777 or base 23 23r75BT, and C-style
hex such as 0xCC00. That should be enough to keep people happy!
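
The readNumber helper called from the lexer actions above is not shown in this message. As a rough sketch only, assuming it receives text of the form <radix>r<digits> (with an optional '-' after the 'r') and returns the value as decimal text, it might look something like the following. The class name RadixNumbers and the choice of a decimal string as the result are assumptions for illustration, not the actual implementation.

```java
import java.math.BigInteger;

// Hypothetical helper: the real readNumber is not shown in this thread.
final class RadixNumbers {

    // Converts text such as "16rCC00", "8r777" or "16r-FF" to decimal text.
    // Assumes the lexer has already guaranteed the <radix>r<digits> shape.
    static String readNumber(String text) {
        int r = text.indexOf('r');
        if (r <= 0) {
            throw new NumberFormatException("missing radix prefix: " + text);
        }
        int radix = Integer.parseInt(text.substring(0, r));
        String digits = text.substring(r + 1); // may start with '-'
        // BigInteger accepts an optional sign and upper- or lower-case
        // letters as digits; it throws NumberFormatException if the radix
        // is outside 2..36 or a digit is not valid in that radix.
        return new BigInteger(digits, radix).toString();
    }
}
```

With this sketch, readNumber("16rCC00") gives "52224" and readNumber("8r777") gives "511". Note that BIGDIGITS accepts any 'A'..'Z' regardless of the radix, so a token such as 23r75BT lexes fine but is rejected by this sketch, because T is not a base-23 digit. How the real helper treats that case is not stated here.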
Kevin
On 1 Nov 2007, at 22:12, Oti wrote:
> Hi Kevin and Jim,
>
> the following NUMBER lexer rule works pretty well for me:
>
> NUMBER
> : ( '+' | '-' ) ?
> ( ( ( '1' .. '9' ) ( '0' .. '9' )* ) | '0' )
> ( '.' ( '0' | ( '0' .. '9' )* ( '1' .. '9' ) ) ) ?
> ;
>
> Just an example of how to prevent leading and trailing zeroes.
> It reflects my understanding of what a simple "number" should look
> like, so YMMV.
>
> Best wishes,
> Oti.
>
> On 11/1/07, Jim Idle <jimi at temporal-wave.com> wrote:
>> See much discussion of this issue over the last 2 or 3 weeks, but
>> you need a predicate on your number rule, and your DECIMAL and
>> DIGIT and LETTER rules should be fragments (though you probably
>> don't need them as separate rules at all really):
>>
>> NUMBER: ('0'..'9')+ ( ('.' '0'..'9')=> ('.' ('0'..'9')+)
>> |
>> )
>> ;
>> fragment
>> LETTER
>>
>> Etc...
>>
>> Hope that helps :-)
>>
>> Jim
>>
>>> -----Original Message-----
>>> From: antlr-interest-bounces at antlr.org
>>> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Kevin Twidle
>>> Sent: Thursday, November 01, 2007 8:59 AM
>>> To: antlr-interest at antlr.org
>>> Subject: [antlr-interest] Smalltalk-like grammar? Easy! Decimal number? Help!
>>>
>>> Hi,
>>>
>>> I love ANTLR! I have a quite sophisticated Smalltalk-like grammar in
>>> ANTLR using an AST which works beautifully. I have decided to allow
>>> floating point numbers and have tried adding them to the grammar.
>>> Now, Smalltalk uses '.' as a statement separator, and numbers have
>>> a '.' in them - uh-oh.
>>>
>>> A number should have the form:
>>>
>>> 12 or 12.34 but not 12.
>>>
>>> I want to be able to parse
>>>
>>> 13.
>>> 13.word.
>>> 14.0.13.
>>>
>>> to get 13,13,word,14.0,13
>>>
>>> all I get is
>>>
>>> line 1:3 required (...)+ loop did not match anything at character
>>> '\n'
>>> line 2:3 required (...)+ loop did not match anything at character
>>> 'w'
>>> recoverFromMismatchedToken
>>> BR.recoverFromMismatchedToken
>>> line 3:4 mismatched input '.13' expecting EOF
>>>
>>> with tokens ord 14.0
>>>
>>> I have simplified my problem to the following grammar. The problem
>>> is that DECIMAL always matches the first '.' and then fails (I
>>> stepped through the code). It never decides that DECIMAL is not
>>> there and that the '.' must therefore be a statement separator!
>>> I have tried the greedy option, but then it never matches the
>>> DECIMAL. I have tried reordering, fragments, greedy and now this
>>> mailing list!
>>>
>>> DECIMAL is optional, why does it fail?
>>>
>>> Any help really appreciated!
>>>
>>> Kevin
>>>
>>> grammar Number;
>>> options {output = AST;}
>>>
>>> start : statement ( DOT statement? )+ EOF;
>>>
>>> statement : WORD | NUMBER;
>>>
>>> WORD : LETTER (LETTER | DIGIT)+;
>>>
>>> NUMBER : DIGIT+ DECIMAL?;
>>>
>>> DECIMAL : DOT DIGIT+;
>>> DOT : '.';
>>> DIGIT : '0'..'9';
>>> LETTER : 'a'..'z' | 'A'..'Z';
>>> WS :
>>> (' '
>>> | '\t'
>>> | '\r' '\n'
>>> | '\n'
>>> ) +
>>> { $channel=HIDDEN; }
>>> ;
>>
>>
>>
>
>
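
The effect of the syntactic predicate suggested in this thread can be illustrated without ANTLR. The hand-written Java sketch below consumes a '.' as part of a number only when a digit follows it, and otherwise leaves it for the statement-separator token. The class DotLookahead and its tokenize method are illustrative names, not part of any grammar above. It is a simplification: it ignores signs, exponents and radix forms, and it accepts single-letter words.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hand-rolled approximation of the NUMBER/DOT decision.
final class DotLookahead {

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int n = input.length();
        int i = 0;
        while (i < n) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) { // whitespace is skipped
                i++;
                continue;
            }
            int start = i;
            if (isDigit(c)) {
                while (i < n && isDigit(input.charAt(i))) i++;
                // The "predicate": take '.' only if a digit follows it.
                if (i + 1 < n && input.charAt(i) == '.'
                        && isDigit(input.charAt(i + 1))) {
                    i++; // consume '.'
                    while (i < n && isDigit(input.charAt(i))) i++;
                }
            } else if (isLetter(c)) {
                while (i < n && (isLetter(input.charAt(i))
                        || isDigit(input.charAt(i)))) i++;
            } else if (c == '.') {
                i++; // a lone '.' is a statement separator
            } else {
                throw new IllegalArgumentException(
                        "unexpected character '" + c + "' at offset " + i);
            }
            tokens.add(input.substring(start, i));
        }
        return tokens;
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    private static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}
```

On the three-line test input from the original question (13. then 13.word. then 14.0.13.), tokenize returns 13 . 13 . word . 14.0 . 13 . which is the desired 13, 13, word, 14.0, 13 with the separators kept as their own tokens.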