[antlr-interest] about range float and stuff

Fri Nov 4 10:18:38 PDT 2011

You may prefer whatever solution you like of course (though these are not
the same solution), but you should be accurate about the other solutions
and take the time to read through them.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Bart Kiers
> Sent: Friday, November 04, 2011 10:13 AM
> To: antlr-interest at antlr.org interest
> Subject: Re: [antlr-interest] about range float and stuff
>
> If your (Jim) definition of "without code" means no @members section,
> then I find it a bit of an odd definition since the lexer rules from
> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs
> are littered with `{ ... }` code blocks: not what I'd call "without code".
> I much prefer the solution proposed by Terence in
> http://www.antlr.org/wiki/pages/viewpage.action?pageId=3604497 (which I
> based my suggestion on): far less verbose than the first option, IMO.
>
> Bart.
>
>
> On Fri, Nov 4, 2011 at 5:59 PM, Bart Kiers <bkiers at gmail.com> wrote:
>
> > The only wiki-link posted in this thread is
> > http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs
> > which contains Java code, so you must mean something else (though I
> > have no idea what)...
> >
> > Bart.
> >
> >
> > On Fri, Nov 4, 2011 at 5:47 PM, Jim Idle <jimi at temporal-wave.com> wrote:
> >
> >> The example on the Wiki already does all of this in the lexer, but
> >> without any code.
> >>
> >> Jim
> >>
> >> > -----Original Message-----
> >> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Bart Kiers
> >> > Sent: Friday, November 04, 2011 7:12 AM
> >> > To: Fabien Hermenier
> >> > Cc: antlr-interest at antlr.org
> >> > Subject: Re: [antlr-interest] about range float and stuff
> >> >
> >> > You're welcome Fabien, but note that it most likely looks a lot
> >> > like something I found on the ANTLR Wiki: so I can't claim credit
> >> > for it (perhaps a small part! :)).
> >> > I'll have a look later on and see if I can dig up the Wiki page.
> >> >
> >> > Regards,
> >> >
> >> > Bart.
> >> >
> >> >
> >> > On Fri, Nov 4, 2011 at 3:04 PM, Fabien Hermenier
> >> > <hermenierfabien at gmail.com> wrote:
> >> >
> >> > >  Thanks Bart, I think I have understood your approach and indeed,
> >> > > it seems beautiful and simple.
> >> > > I will try your solution during the weekend.
> >> > >
> >> > > Fabien.
> >> > >
> >> > > On 04/11/11 02:48, Bart Kiers wrote:
> >> > >
> >> > > Hi Fabien,
> >> > >
> >> > >  Handling this in the parser will make your life much harder than
> >> > > it has to be. Doing it in the lexer, you will need a bit of custom
> >> > > code, but I'd go for something similar to this (something like it is
> >> > > on the Wiki somewhere, but I can't find it...):
> >> > >
> >> > >  grammar RangeDemo;
> >> > >
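> >> > >  @lexer::members {
> >> > >
> >> > >   // Tokens waiting to be handed to the parser; one lexer rule may queue several.
> >> > >   java.util.Queue<Token> tokens = new java.util.LinkedList<Token>();
> >> > >
> >> > >   // Queue an extra token with an explicit type and text.
> >> > >   public void offer(int ttype, String ttext) {
> >> > >     emit(new CommonToken(ttype, ttext));
> >> > >   }
> >> > >
> >> > >   // Queue every emitted token instead of keeping only the last one.
> >> > >   @Override
> >> > >   public void emit(Token t) {
> >> > >     state.token = t;
> >> > >     tokens.offer(t);
> >> > >   }
> >> > >
> >> > >   // Let the lexer match one more rule (which may queue several tokens),
> >> > >   // then return the token at the head of the queue.
> >> > >   @Override
> >> > >   public Token nextToken() {
> >> > >     super.nextToken();
> >> > >     return tokens.isEmpty() ? Token.EOF_TOKEN : tokens.poll();
> >> > >   }
> >> > > }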
> >> > >
> >> > >  parse
> >> > >   :  (t=. {System.out.printf("\%-10s \%s\n", tokenNames[$t.type], $t.text);})* EOF
> >> > >   ;
> >> > >
> >> > >  FLOAT
> >> > >   :  INT '..'   {offer(INT, $INT.text); offer(RANGE, "..");}
> >> > >   |  OCTAL '..' {offer(OCTAL, $OCTAL.text); offer(RANGE, "..");}
> >> > >   |  '.' DIGITS
> >> > >   |  DIGITS '.' DIGITS?
> >> > >   ;
> >> > >
> >> > >  RANGE
> >> > >   :  '..'
> >> > >   ;
> >> > >
> >> > >  INT
> >> > >   :  '1'..'9' DIGIT*
> >> > >   |  '0'
> >> > >   ;
> >> > >
> >> > >  OCTAL
> >> > >   :  '0' ('0'..'7')+
> >> > >    ;
> >> > >
> >> > >  fragment DIGITS : DIGIT+;
> >> > > fragment DIGIT  : '0'..'9';
> >> > >
> >> > >  SPACE
> >> > >   :  (' ' | '\t' | '\r' | '\n') {skip();}
> >> > >   ;
> >> > >
> >> > >  And if you run the class:
> >> > >
> >> > >  import org.antlr.runtime.*;
> >> > >
> >> > >  public class Main {
> >> > >   public static void main(String[] args) throws Exception {
> >> > >     String src = "..07..8.5 1.9..02 1..3.4";
> >> > >     RangeDemoLexer lexer = new RangeDemoLexer(new ANTLRStringStream(src));
> >> > >     RangeDemoParser parser = new RangeDemoParser(new CommonTokenStream(lexer));
> >> > >     System.out.println("Parsing: '" + src + "'");
> >> > >     parser.parse();
> >> > >   }
> >> > > }
> >> > >
> >> > >  You'll see the following being printed to the console:
> >> > >
> >> > >  Parsing: '..07..8.5 1.9..02 1..3.4'
> >> > > RANGE      ..
> >> > > OCTAL      07
> >> > > RANGE      ..
> >> > > FLOAT      8.5
> >> > > FLOAT      1.9
> >> > > RANGE      ..
> >> > > OCTAL      02
> >> > > INT        1
> >> > > RANGE      ..
> >> > > FLOAT      3.4
> >> > >
> >> > >
> >> > >  Regards,
> >> > >
> >> > >  Bart.
> >> > >
> >> > >
> >> > >
> >> > >  On Fri, Nov 4, 2011 at 7:28 AM, Fabien Hermenier
> >> > > <hermenierfabien at gmail.com> wrote:
> >> > >
> >> > >> Hi
> >> > >>
> >> > >> In an earlier version of my language, I had to parse ranges of
> >> > >> integers in various bases. Now I want to include floats. I have
> >> > >> read
> >> > >>
> >> > >> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs
> >> > >> but I've still got some questions.
> >> > >>
> >> > >> All the work seems to be done at the lexer level, so the token
> >> > >> types will be, for example:
> >> > >> 5 : DECIMAL_LITTERAL
> >> > >> 07 : OCTAL_LITTERAL
> >> > >> 7.5: FLOATING_POINT_LITTERAL
> >> > >> 5..7 : DOTDOT
> >> > >>
> >> > >> In the last example, the result is not very convenient because I
> >> > >> will still have to extract the bounds and compute their types
> >> > >> myself, which seems redundant with the job performed by the lexer.
> >> > >> Maybe I am missing something?
> >> > >>
> >> > >> I would rather be able to express the range at the parser level,
> >> > >> which seems much more convenient to me:
> >> > >> range: FLOATING_POINT_LITTERAL DOTDOT FLOATING_POINT_LITTERAL.
> >> > >> This way, I will also be able to handle possible spaces
> >> > >> between the bounds and the DOTDOT.
> >> > >>
> >> > >> So, am I right to try to parse ranges at the parser level? Or is
> >> > >> there a way to easily extract the bounds with their types if I
> >> > >> do the job at the lexer level?
> >> > >>
> >> > >> Thanks in advance,
> >> > >> Fabien.
> >> > >>
> >> > >>
> >> > >>
> >> > >>
> >> > >> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> >> > >> Unsubscribe:
> >> > >> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> >> > >>
> >> > >
> >> > >
> >> > >
> >> >
> >>
> >>
> >
> >
>
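
For anyone who wants to try the queue idea from Bart's RangeDemo grammar
without generating an ANTLR lexer, here is a minimal plain-Java sketch. It
is not the wiki grammar and not Bart's code: the class RangeLexerSketch and
its methods are hypothetical names made up for illustration, tokens are
plain "TYPE text" strings, and the scanning is hand-written. It also refills
the queue only when it is empty, whereas the grammar above calls
super.nextToken() on every request. The point it illustrates is the same
one: a single match such as "1.." queues two tokens (INT and RANGE) instead
of one.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for a generated lexer; token names follow RangeDemo.
final class RangeLexerSketch {
    private final String src;
    private int pos = 0;
    // Tokens already recognised but not yet handed out, as "TYPE text" strings.
    private final Queue<String> pending = new ArrayDeque<>();

    RangeLexerSketch(String src) { this.src = src; }

    /** Returns the next token, or null at end of input. */
    String nextToken() {
        if (pending.isEmpty()) scan();
        return pending.poll();
    }

    /** Matches one lexeme and queues one or two tokens for it. */
    private void scan() {
        // Skip the same four whitespace characters as the SPACE rule.
        while (pos < src.length() && " \t\r\n".indexOf(src.charAt(pos)) >= 0) pos++;
        if (pos >= src.length()) return;
        int start = pos;
        if (src.startsWith("..", pos)) {
            pos += 2;
            pending.add("RANGE ..");
            return;
        }
        String whole = digits();
        if (!whole.isEmpty() && src.startsWith("..", pos)) {
            // "1..": hand out the integer and the range operator separately.
            pending.add(intType(whole) + " " + whole);
            pending.add("RANGE ..");
            pos += 2;
        } else if (pos < src.length() && src.charAt(pos) == '.') {
            pos++;
            String frac = digits();
            if (whole.isEmpty() && frac.isEmpty()) {
                throw new IllegalArgumentException("lone '.' at index " + start);
            }
            pending.add("FLOAT " + src.substring(start, pos));
        } else if (!whole.isEmpty()) {
            pending.add(intType(whole) + " " + whole);
        } else {
            throw new IllegalArgumentException("unexpected character at index " + start);
        }
    }

    /** Consumes a possibly empty run of decimal digits and returns it. */
    private String digits() {
        int start = pos;
        while (pos < src.length() && src.charAt(pos) >= '0' && src.charAt(pos) <= '9') pos++;
        return src.substring(start, pos);
    }

    /** Classifies a digit run as INT or OCTAL, mirroring the two grammar rules. */
    private static String intType(String digits) {
        if (digits.equals("0") || digits.charAt(0) != '0') return "INT";
        if (digits.matches("0[0-7]+")) return "OCTAL";
        throw new IllegalArgumentException("not a valid integer: " + digits);
    }

    /** Convenience helper: collects all tokens of the input into a list. */
    static List<String> tokenize(String src) {
        RangeLexerSketch lexer = new RangeLexerSketch(src);
        List<String> out = new ArrayList<>();
        for (String t = lexer.nextToken(); t != null; t = lexer.nextToken()) out.add(t);
        return out;
    }

    public static void main(String[] args) {
        for (String t : tokenize("..07..8.5 1.9..02 1..3.4")) System.out.println(t);
    }
}
```

Running main on the test string from the thread should list the same ten
tokens, in the same order, as the console output quoted above. Unlike the
grammar, the sketch rejects a digit run such as "09" outright rather than
splitting it.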