[antlr-interest] about range float and stuff

Fri Nov 4 01:48:00 PDT 2011

Hi Fabien,

Handling this in the parser will make your life much harder than it has to
be. Doing it in the lexer needs a bit of custom code: the idea is to let a
single lexer rule emit more than one token by queueing them. I'd go for
something similar to this (something like it is on the Wiki somewhere, but
I can't find it...):

grammar RangeDemo;

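@lexer::members {

  // Tokens emitted by the lexer rules but not yet handed to the parser.
  java.util.Queue<Token> tokens = new java.util.LinkedList<Token>();

  // Convenience for rules that need to emit more than one token.
  public void offer(int ttype, String ttext) {
    emit(new CommonToken(ttype, ttext));
  }

  // Queue every emitted token instead of keeping only the last one.
  @Override
  public void emit(Token t) {
    state.token = t;
    tokens.offer(t);
  }

  // Match one more rule, then hand out the oldest queued token.
  @Override
  public Token nextToken() {
    super.nextToken();
    return tokens.isEmpty() ? Token.EOF_TOKEN : tokens.poll();
  }
}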

parse
  :  (t=. {System.out.printf("\%-10s \%s\n", tokenNames[$t.type], $t.text);})* EOF
  ;

FLOAT
  :  INT '..'   {offer(INT, $INT.text); offer(RANGE, "..");}
  |  OCTAL '..' {offer(OCTAL, $OCTAL.text); offer(RANGE, "..");}
  |  '.' DIGITS
  |  DIGITS '.' DIGITS?
  ;

RANGE
  :  '..'
  ;

INT
  :  '1'..'9' DIGIT*
  |  '0'
  ;

OCTAL
  :  '0' ('0'..'7')+
  ;

fragment DIGITS : DIGIT+;
fragment DIGIT  : '0'..'9';

SPACE
  :  (' ' | '\t' | '\r' | '\n') {skip();}
  ;

And if you run the class:

import org.antlr.runtime.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String src = "..07..8.5 1.9..02 1..3.4";
    RangeDemoLexer lexer = new RangeDemoLexer(new ANTLRStringStream(src));
    RangeDemoParser parser = new RangeDemoParser(new CommonTokenStream(lexer));
    System.out.println("Parsing: '" + src + "'");
    parser.parse();
  }
}

You'll see the following being printed to the console:

Parsing: '..07..8.5 1.9..02 1..3.4'
RANGE      ..
OCTAL      07
RANGE      ..
FLOAT      8.5
FLOAT      1.9
RANGE      ..
OCTAL      02
INT        1
RANGE      ..
FLOAT      3.4
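
The queueing trick itself does not depend on ANTLR. In case it helps to
see it in isolation, here is a rough hand-written sketch in plain Java. It
is not generated code and does not use the ANTLR runtime; the class
MiniRangeLexer and its methods are names I made up for illustration, and
it only approximates the grammar above (invalid numbers simply throw):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-alone lexer, only meant to illustrate the token queue.
class MiniRangeLexer {
    private final String src;
    private int pos = 0;
    // Tokens produced by one scan step but not yet handed out.
    private final Queue<String> pending = new ArrayDeque<>();

    MiniRangeLexer(String src) { this.src = src; }

    // Returns the next token as "TYPE text", or null at end of input.
    String nextToken() {
        if (pending.isEmpty()) scan();
        return pending.poll();
    }

    private boolean digitAt(int i) {
        return i < src.length() && src.charAt(i) >= '0' && src.charAt(i) <= '9';
    }

    // One scan step; may queue two tokens, e.g. INT followed by RANGE.
    private void scan() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
        if (pos >= src.length()) return;
        if (src.startsWith("..", pos)) {
            pending.offer("RANGE ..");
            pos += 2;
            return;
        }
        int start = pos;
        while (digitAt(pos)) pos++;
        String digits = src.substring(start, pos);
        boolean singleDot = pos < src.length() && src.charAt(pos) == '.'
                && !src.startsWith("..", pos);
        if (singleDot) {
            pos++;                               // consume the '.'
            while (digitAt(pos)) pos++;          // optional fraction digits
            String text = src.substring(start, pos);
            if (text.equals(".")) throw new IllegalArgumentException("lone '.' at " + start);
            pending.offer("FLOAT " + text);
            return;
        }
        if (digits.isEmpty()) {
            throw new IllegalArgumentException("unexpected '" + src.charAt(pos) + "' at " + pos);
        }
        if (digits.matches("0[0-7]+")) {
            pending.offer("OCTAL " + digits);
        } else if (digits.matches("0|[1-9][0-9]*")) {
            pending.offer("INT " + digits);
        } else {
            throw new IllegalArgumentException("bad number '" + digits + "' at " + start);
        }
        // Digits directly followed by '..': queue the RANGE token as well.
        if (src.startsWith("..", pos)) {
            pending.offer("RANGE ..");
            pos += 2;
        }
    }

    // Collects all tokens of the given input.
    static List<String> tokenize(String input) {
        MiniRangeLexer lexer = new MiniRangeLexer(input);
        List<String> out = new ArrayList<>();
        for (String t = lexer.nextToken(); t != null; t = lexer.nextToken()) out.add(t);
        return out;
    }

    public static void main(String[] args) {
        for (String t : tokenize("..07..8.5 1.9..02 1..3.4")) System.out.println(t);
    }
}
```

Run on the same input, it should produce the same ten tokens in the same
order as the listing above. The point is only that "07.." is split into an
OCTAL and a RANGE in a single scan step, so the parser never has to pull
the bounds apart itself.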

Regards,

Bart.

On Fri, Nov 4, 2011 at 7:28 AM, Fabien Hermenier
<hermenierfabien at gmail.com> wrote:

> Hi
>
> In an earlier version of my language, I had to parse ranges of integers
> in various bases. Now I want to include floats. I have read
>
> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs
> but I've still got some questions.
>
> All the work seems to be done at the lexer level, so the types of the
> following tokens will be, for example:
> 5 : DECIMAL_LITTERAL
> 07 : OCTAL_LITTERAL
> 7.5: FLOATING_POINT_LITTERAL
> 5..7 : DOTDOT
>
> In the last example, the result is not very convenient because I will
> still have to extract the bounds
> and compute their types myself, which seems quite redundant with the
> job performed by the lexer.
> Maybe I am missing something?
>
> I would rather be able to express the range at the parser level which
> seems much more convenient to me:
> range: FLOATING_POINT_LITTERAL DOTDOT FLOATING_POINT_LITTERAL.
> In this way, I will also be able to manage the possible spaces between
> the bounds and the DOTDOT.
>
> So, am I right to try to parse ranges at the parser level? Or is there a
> way to easily extract the bounds with their types if I am doing the
> job at the lexer level?
>
> Thanks in advance,
> Fabien.