[antlr-interest] about range float and stuff

Fri Nov 4 10:03:13 PDT 2011

For what it's worth, I found the Wiki entry I based my suggestion on:
http://www.antlr.org/wiki/pages/viewpage.action?pageId=3604497
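
In case the idea behind the grammar quoted below is easier to follow
without ANTLR in the picture, here is a minimal plain-Java sketch of it:
a scanner that queues two tokens when it sees digits followed by "..",
and hands them out one at a time. The class RangeLexSketch and all of its
methods are made up for illustration and are not part of the ANTLR
runtime; tokens are plain "TYPE text" strings to keep things short, and
the octal check is a simplification.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-alone sketch; nothing in here is ANTLR API.
final class RangeLexSketch {
    // Tokens already recognised but not yet handed to the caller.
    private final Queue<String> pending = new ArrayDeque<>();
    private final String src;
    private int pos = 0;

    RangeLexSketch(String src) { this.src = src; }

    // Plays the role of offer() in the grammar: queue one "TYPE text" token.
    private void offer(String type, String text) { pending.add(type + " " + text); }

    private boolean digitAt(int i) {
        return i < src.length() && src.charAt(i) >= '0' && src.charAt(i) <= '9';
    }

    // Simplification: a leading zero followed by octal digits is OCTAL,
    // everything else is INT (so "09" is labelled INT here).
    private static String intType(String digits) {
        return digits.matches("0[0-7]+") ? "OCTAL" : "INT";
    }

    // Consumes one lexeme at pos; may queue zero, one or two tokens.
    private void scanOne() {
        if (Character.isWhitespace(src.charAt(pos))) { pos++; return; }
        if (src.startsWith("..", pos)) { offer("RANGE", ".."); pos += 2; return; }
        int start = pos;
        while (digitAt(pos)) pos++;
        String digits = src.substring(start, pos);
        if (!digits.isEmpty() && src.startsWith("..", pos)) {
            // Digits directly followed by "..": an integer bound plus the operator.
            offer(intType(digits), digits);
            offer("RANGE", "..");
            pos += 2;
        } else if (pos < src.length() && src.charAt(pos) == '.') {
            // A single dot: ".5", "1.5" or "1." are floats.
            pos++;
            while (digitAt(pos)) pos++;
            String text = src.substring(start, pos);
            if (text.equals(".")) throw new IllegalArgumentException("lone '.' at " + start);
            offer("FLOAT", text);
        } else if (!digits.isEmpty()) {
            offer(intType(digits), digits);
        } else {
            throw new IllegalArgumentException("unexpected character at " + start);
        }
    }

    // Returns the next queued token, scanning more input only when the
    // queue is empty; returns null at end of input.
    String nextToken() {
        while (pending.isEmpty() && pos < src.length()) scanOne();
        return pending.poll();
    }

    static List<String> lexAll(String src) {
        RangeLexSketch lexer = new RangeLexSketch(src);
        List<String> out = new ArrayList<>();
        for (String t = lexer.nextToken(); t != null; t = lexer.nextToken()) out.add(t);
        return out;
    }

    public static void main(String[] args) {
        for (String t : lexAll("..07..8.5 1.9..02 1..3.4")) System.out.println(t);
    }
}
```

Run on the sample input from the message below, this should list the
tokens in the same order as the grammar's output there. The real grammar
gets the same effect by overriding emit() and nextToken() in the
generated lexer rather than by scanning by hand.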

Regards,

Bart.

On Fri, Nov 4, 2011 at 3:11 PM, Bart Kiers <bkiers at gmail.com> wrote:

> You're welcome, Fabien, but note that it most likely looks a lot like
> something I found on the ANTLR Wiki, so I can't claim credit for it
> (perhaps a small part! :)).
> I'll have a look later on and see if I can dig up the Wiki page.
>
> Regards,
>
> Bart.
>
>
> On Fri, Nov 4, 2011 at 3:04 PM, Fabien Hermenier <
> hermenierfabien at gmail.com> wrote:
>
>>  Thanks Bart, I think I have understood your approach and indeed, it
>> seems beautiful and simple.
>> I will try your solution over the weekend.
>>
>> Fabien.
>>
>> On 04/11/11 02:48, Bart Kiers wrote:
>>
>> Hi Fabien,
>>
>>  Handling this in the parser will make your life much harder than it has
>> to be. Doing it in the lexer, you will need a bit of custom code, but I'd go
>> for something similar to this (something like it is on the Wiki somewhere,
>> but I can't find it...):
>>
>>  grammar RangeDemo;
>>
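>>  @lexer::members {
>>
>>   // Tokens waiting to be handed to the parser; one rule may queue several.
>>   java.util.Queue<Token> tokens = new java.util.LinkedList<Token>();
>>
>>   // Convenience helper: emit a token of the given type and text.
>>   public void offer(int ttype, String ttext) {
>>     emit(new CommonToken(ttype, ttext));
>>   }
>>
>>   // Queue every emitted token instead of keeping only the last one.
>>   @Override
>>   public void emit(Token t) {
>>     state.token = t;
>>     tokens.offer(t);
>>   }
>>
>>   // Let the lexer match once (which may fill the queue), then drain it.
>>   @Override
>>   public Token nextToken() {
>>     super.nextToken();
>>     return tokens.isEmpty() ? Token.EOF_TOKEN : tokens.poll();
>>   }
>> }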
>>
>>  parse
>>   :  (t=. {System.out.printf("\%-10s \%s\n", tokenNames[$t.type],
>> $t.text);})* EOF
>>   ;
>>
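>>  // The first two alternatives keep "1..3" from being read as the float "1."
>>  // and queue two tokens (the integer and the RANGE) instead of one FLOAT.
>>  FLOAT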
>>   :  INT '..'   {offer(INT, $INT.text); offer(RANGE, "..");}
>>   |  OCTAL '..' {offer(OCTAL, $OCTAL.text); offer(RANGE, "..");}
>>   |  '.' DIGITS
>>   |  DIGITS '.' DIGITS?
>>   ;
>>
>>  RANGE
>>   :  '..'
>>   ;
>>
>>  INT
>>   :  '1'..'9' DIGIT*
>>   |  '0'
>>   ;
>>
>>  OCTAL
>>   :  '0' ('0'..'7')+
>>   ;
>>
>>  fragment DIGITS : DIGIT+;
>> fragment DIGIT  : '0'..'9';
>>
>>  SPACE
>>   :  (' ' | '\t' | '\r' | '\n') {skip();}
>>   ;
>>
>>  And if you run the class:
>>
>>  import org.antlr.runtime.*;
>>
>>  public class Main {
>>   public static void main(String[] args) throws Exception {
>>     String src = "..07..8.5 1.9..02 1..3.4";
>>     RangeDemoLexer lexer = new RangeDemoLexer(new ANTLRStringStream(src));
>>     RangeDemoParser parser = new RangeDemoParser(new
>> CommonTokenStream(lexer));
>>     System.out.println("Parsing: '" + src + "'");
>>     parser.parse();
>>   }
>> }
>>
>>  You'll see the following being printed to the console:
>>
>>  Parsing: '..07..8.5 1.9..02 1..3.4'
>> RANGE      ..
>> OCTAL      07
>> RANGE      ..
>> FLOAT      8.5
>> FLOAT      1.9
>> RANGE      ..
>> OCTAL      02
>> INT        1
>> RANGE      ..
>> FLOAT      3.4
>>
>>
>>  Regards,
>>
>>  Bart.
>>
>>
>>
>>  On Fri, Nov 4, 2011 at 7:28 AM, Fabien Hermenier <
>> hermenierfabien at gmail.com> wrote:
>>
>>> Hi
>>>
>>> In an earlier version of my language, I had to parse ranges of integers
>>> in various bases. Now I want to include floats. I have read
>>>
>>> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs
>>> but I still have some questions.
>>>
>>> All the work seems to be done at the lexer level, so the types of the
>>> following tokens would be, for example:
>>> 5 : DECIMAL_LITTERAL
>>> 07 : OCTAL_LITTERAL
>>> 7.5: FLOATING_POINT_LITTERAL
>>> 5..7 : DOTDOT
>>>
>>> In the last example, the result is not very convenient because I would
>>> still have to extract the bounds
>>> and work out their types myself, which seems redundant with the
>>> job already performed by the lexer.
>>> Maybe I am missing something?
>>>
>>> I would rather express the range at the parser level, which
>>> seems much more convenient to me:
>>> range: FLOATING_POINT_LITTERAL DOTDOT FLOATING_POINT_LITTERAL.
>>> This way, I would also be able to handle possible spaces between
>>> the bounds and the DOTDOT.
>>>
>>> So, am I right to try to parse ranges at the parser level? Or is there an
>>> easy way to extract the bounds with their types if I do the
>>> job at the lexer level?
>>>
>>> Thanks in advance,
>>> Fabien.