[antlr-interest] about range float and stuff

Fabien Hermenier hermenierfabien at gmail.com
Mon Nov 7 08:41:06 PST 2011


Hello

I got my code working over the weekend.
In practice, I use a mix of Jim's wiki page "How to lex numbers...."
(a simpler version) and the tip provided by Bart from the wiki page "How
can I emit more than a single token per lexer rule".
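
For readers without the wiki pages at hand, here is a minimal sketch of the
"more than one token per rule" idea in plain Java, without ANTLR. It is not
the generated lexer and not the code from the wiki: the class
RangeLexerSketch, its methods, and the "TYPE:text" string tokens are all
hypothetical names chosen for illustration. Unlike the nextToken() override
quoted further down, this variant drains the queue before matching again.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hand-written stand-in for a generated lexer; every name here is hypothetical. */
public class RangeLexerSketch {
    private final String src;
    private int pos = 0;
    // Tokens produced by one match wait here until nextToken() hands them out.
    private final Queue<String> pending = new ArrayDeque<>();

    public RangeLexerSketch(String src) { this.src = src; }

    private void emit(String type, String text) { pending.add(type + ":" + text); }

    private boolean at(String s) { return src.startsWith(s, pos); }

    private boolean digitAt(int i) { return i < src.length() && Character.isDigit(src.charAt(i)); }

    // A leading zero followed by more digits is reported as OCTAL (digits are not checked to be 0-7).
    private static String intType(String digits) {
        return digits.length() > 1 && digits.charAt(0) == '0' ? "OCTAL" : "INT";
    }

    /** Matches one lexical unit and may queue more than one token for it. */
    private void matchOne() {
        if (at("..")) { pos += 2; emit("RANGE", ".."); return; }
        int start = pos;
        while (digitAt(pos)) pos++;
        String digits = src.substring(start, pos);
        if (!digits.isEmpty() && at("..")) {
            // "1..": the integer and the range operator come out as two tokens.
            pos += 2;
            emit(intType(digits), digits);
            emit("RANGE", "..");
        } else if (at(".") && (digitAt(pos + 1) || !digits.isEmpty())) {
            pos++;                          // consume the decimal point
            while (digitAt(pos)) pos++;     // optional fraction digits
            emit("FLOAT", src.substring(start, pos));
        } else if (!digits.isEmpty()) {
            emit(intType(digits), digits);
        } else {
            throw new IllegalArgumentException("unexpected character at " + pos);
        }
    }

    /** Returns the next token, or null at end of input. */
    public String nextToken() {
        if (pending.isEmpty()) {
            while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
            if (pos < src.length()) matchOne();
        }
        return pending.poll();              // null when nothing is left
    }

    public static List<String> tokenize(String src) {
        RangeLexerSketch lexer = new RangeLexerSketch(src);
        List<String> out = new ArrayList<>();
        for (String t = lexer.nextToken(); t != null; t = lexer.nextToken()) out.add(t);
        return out;
    }

    public static void main(String[] args) {
        for (String t : tokenize("..07..8.5 1.9..02 1..3.4")) System.out.println(t);
    }
}
```

The point of the queue is that the caller still asks for one token at a
time, while a single match such as "1.." can leave two tokens behind.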

The lexer output example from Bart showed me that I make several
mistakes when I work with a combined grammar. In that situation, the
difference between the parser and the lexer is less clear to me.
Splitting the two processes clarifies things (the ANTLRWorks window that
shows the input tokens is then very effective).
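
To illustrate why the split helps: once the lexer delivers the bounds as
separate, typed tokens, the parser side only has to match token types. The
following is a rough sketch in plain Java, not ANTLR output; the class
RangeParserSketch, the Tok type, and the token type names are assumptions
made for this example.

```java
import java.util.Arrays;
import java.util.List;

/** Parser-side sketch: recognises  bound RANGE bound  over an already-lexed token list. */
public class RangeParserSketch {
    /** Minimal token: a type name plus the matched text (hypothetical, not ANTLR's Token). */
    public static final class Tok {
        final String type;
        final String text;
        public Tok(String type, String text) { this.type = type; this.text = text; }
    }

    // Any numeric token type may serve as a bound of the range.
    private static boolean isBound(Tok t) {
        return t.type.equals("INT") || t.type.equals("OCTAL") || t.type.equals("FLOAT");
    }

    /** Describes the range if the tokens are exactly  bound RANGE bound, else throws. */
    public static String parseRange(List<Tok> tokens) {
        if (tokens.size() != 3 || !isBound(tokens.get(0))
                || !tokens.get(1).type.equals("RANGE") || !isBound(tokens.get(2))) {
            throw new IllegalArgumentException("expected: bound RANGE bound");
        }
        Tok lo = tokens.get(0);
        Tok hi = tokens.get(2);
        // The bound types come straight from the lexer; nothing is re-derived here.
        return lo.type + "(" + lo.text + ") to " + hi.type + "(" + hi.text + ")";
    }

    public static void main(String[] args) {
        List<Tok> tokens = Arrays.asList(
                new Tok("INT", "1"), new Tok("RANGE", ".."), new Tok("FLOAT", "3.4"));
        System.out.println(parseRange(tokens));
    }
}
```

Because whitespace is dropped before the tokens reach this stage, spaces
between the bounds and the range operator need no special handling here.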

Thanks for your attention!

Fabien
On 04/11/11 12:50, Bart Kiers wrote:
> I wasn't talking "generally", but about this discussion.
> And I have been subscribed to the ANTLR-interest list for a couple of years,
> so I know you contribute much to the list, for which I am grateful as well: I
> have learned a lot from your (verbose) contributions :)
>
> Bart.
>
>
> On Fri, Nov 4, 2011 at 7:40 PM, Jim Idle<jimi at temporal-wave.com>  wrote:
>
>> I am generally very verbose, but am currently very busy. A quick search
>> will back me up on that.
>>
>> Jim
>>
>>> -----Original Message-----
>>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>>> bounces at antlr.org] On Behalf Of Bart Kiers
>>> Sent: Friday, November 04, 2011 11:30 AM
>>> To: antlr-interest at antlr.org
>>> Subject: Re: [antlr-interest] about range float and stuff
>>>
>>> Jim, this reply is far different than the clipped 1-liners you have
>>> contributed in this discussion so far.
>>>
>>> You can call my responses pedantic, but IMO you yourself are a part of
>>> the problem: you give answers that are hard to interpret because of the
>>> lack of detail you pour into them, so I find it hard to comprehend what
>>> you mean.
>>>
>>> You must see the difference in this last reply of yours and the ones
>>> before it, no? Thank you for this last one, btw.
>>>
>>> Bart.
>>>
>>>
>>> On Fri, Nov 4, 2011 at 6:50 PM, Jim Idle<jimi at temporal-wave.com> wrote:
>>>
>>>> I meant that the code it uses is only for predicates. There are no
>>>> methods called to do the parse (though I never personally object to
>>>> that) or emit the tokens.
>>>>
>>>> The other code is there as examples of how you might handle
>>>> errors or range checks and so on. As you said you did not grasp it by
>>>> reading it, you clearly cannot "win" by trying to make pedantic
>>>> arguments about whether there is any code or not.
>>>>
>>>> Anyway, my original point was that:
>>>>
>>>> a) The OP quoted the example I commented on;
>>>> b) He asked it to do something that it already did;
>>>> c) The example originally quoted, covers all combinations of the use
>>>> of '.' including 1.method(), range and lots more, which is why it
>>>> seems verbose.
>>>>
>>>>
>>>> So, I don't know where you are going with the pedantry, but it is not
>>>> worth my time to follow it any more.
>>>>
>>>> Jim
>>>>
>>>>> -----Original Message-----
>>>>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>>>>> bounces at antlr.org] On Behalf Of Bart Kiers
>>>>> Sent: Friday, November 04, 2011 10:34 AM
>>>>> To: antlr-interest at antlr.org
>>>>> Subject: Re: [antlr-interest] about range float and stuff
>>>>>
>>>>> And if you really meant that the code on
>>>>> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs
>>>>> is "without any code", then I disagree with that definition. Since you
>>>>> didn't comment on that anymore, I presume that _is_ what you were
>>>>> talking about.
>>>>>
>>>>> Bart.
>>>>>
>>>>>
>>>>> On Fri, Nov 4, 2011 at 6:30 PM, Bart Kiers<bkiers at gmail.com> wrote:
>>>>>> I only know that Terence's solution solves the OP's problem AFAIK,
>>>>>> whereas yours I am not sure of: I simply find it too verbose to
>>>>>> fully grasp by only reading through it. Sorry.
>>>>>>
>>>>>> Bart.
>>>>>>
>>>>>>
>>>>>> On Fri, Nov 4, 2011 at 6:18 PM, Jim Idle<jimi at temporal-wave.com> wrote:
>>>>>>> You may prefer whatever solution you like of course (though
>>>>>>> these are not the same solution), but you should be accurate
>>>>>>> about the other solutions and take the time to read through them.
>>>>>>>
>>>>>>> Jim
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>>>>>>>> bounces at antlr.org] On Behalf Of Bart Kiers
>>>>>>>> Sent: Friday, November 04, 2011 10:13 AM
>>>>>>>> To: antlr-interest at antlr.org interest
>>>>>>>> Subject: Re: [antlr-interest] about range float and stuff
>>>>>>>>
>>>>>>>> If your (Jim) definition of "without code" means no @members
>>>>>>>> section, then I find it a bit of an odd definition since the
>>>>>>>> lexer rules from
>>>>>>>> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs
>>>>>>>> are littered with `{ ... }` code blocks: not what I'd call
>>>>>>>> "without code".
>>>>>>>>
>>>>>>>> I much prefer the solution proposed by Terence in
>>>>>>>> http://www.antlr.org/wiki/pages/viewpage.action?pageId=3604497
>>>>>>>> (which I based my suggestion on): far less verbose than the
>>>>>>>> first option, IMO.
>>>>>>>>
>>>>>>>> Bart.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Nov 4, 2011 at 5:59 PM, Bart Kiers<bkiers at gmail.com> wrote:
>>>>>>>>> The only wiki-link posted in this thread is
>>>>>>>>> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs
>>>>>>>>> which contains Java code, so you must mean something else
>>>>>>>>> (of which I have no idea)...
>>>>>>>>>
>>>>>>>>> Bart.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Nov 4, 2011 at 5:47 PM, Jim Idle<jimi at temporal-wave.com> wrote:
>>>>>>>>>> The example on the Wiki already does all of this in the
>>>>>>>>>> lexer, but without any code.
>>>>>>>>>>
>>>>>>>>>> Jim
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Bart Kiers
>>>>>>>>>>> Sent: Friday, November 04, 2011 7:12 AM
>>>>>>>>>>> To: Fabien Hermenier
>>>>>>>>>>> Cc: antlr-interest at antlr.org
>>>>>>>>>>> Subject: Re: [antlr-interest] about range float and stuff
>>>>>>>>>>>
>>>>>>>>>>> You're welcome Fabien, but note that it most likely looks
>>>>>>>>>>> a lot like something I found on the ANTLR Wiki, so I can't
>>>>>>>>>>> claim credit for it (perhaps a small part! :)).
>>>>>>>>>>> I'll have a look later on and see if I can dig up the Wiki
>>>>>>>>>>> page.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>>
>>>>>>>>>>> Bart.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Nov 4, 2011 at 3:04 PM, Fabien Hermenier
>>>>>>>>>>> <hermenierfabien at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks Bart, I think I have understood your approach and
>>>>>>>>>>>> indeed, it seems beautiful and simple.
>>>>>>>>>>>> I will try your solution during the weekend.
>>>>>>>>>>>>
>>>>>>>>>>>> Fabien.
>>>>>>>>>>>>
>>>>>>>>>>>> On 04/11/11 02:48, Bart Kiers wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Fabien,
>>>>>>>>>>>>
>>>>>>>>>>>> Handling this in the parser will make your life much harder
>>>>>>>>>>>> than it has to be. Doing it in the lexer, you will need a bit
>>>>>>>>>>>> of custom code, but I'd go for something similar to this
>>>>>>>>>>>> (something like it is on the Wiki somewhere, but I can't find
>>>>>>>>>>>> it...):
>>>>>>>>>>>>
>>>>>>>>>>>>   grammar RangeDemo;
>>>>>>>>>>>>
>>>>>>>>>>>>   @lexer::members {
>>>>>>>>>>>>
>>>>>>>>>>>>     java.util.Queue<Token> tokens = new java.util.LinkedList<Token>();
>>>>>>>>>>>>
>>>>>>>>>>>>     public void offer(int ttype, String ttext) {
>>>>>>>>>>>>      emit(new CommonToken(ttype, ttext));
>>>>>>>>>>>>    }
>>>>>>>>>>>>
>>>>>>>>>>>>    @Override
>>>>>>>>>>>>    public void emit(Token t) {
>>>>>>>>>>>>      state.token = t;
>>>>>>>>>>>>      tokens.offer(t);
>>>>>>>>>>>>    }
>>>>>>>>>>>>
>>>>>>>>>>>>    @Override
>>>>>>>>>>>>    public Token nextToken() {
>>>>>>>>>>>>      super.nextToken();
>>>>>>>>>>>>      return tokens.isEmpty() ? Token.EOF_TOKEN : tokens.poll();
>>>>>>>>>>>>    }
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>>   parse
>>>>>>>>>>>>    :  (t=. {System.out.printf("\%-10s \%s\n", tokenNames[$t.type], $t.text);})* EOF
>>>>>>>>>>>>    ;
>>>>>>>>>>>>
>>>>>>>>>>>>   FLOAT
>>>>>>>>>>>>    :  INT '..'   {offer(INT, $INT.text); offer(RANGE, "..");}
>>>>>>>>>>>>    |  OCTAL '..' {offer(OCTAL, $OCTAL.text); offer(RANGE, "..");}
>>>>>>>>>>>>    |  '.' DIGITS
>>>>>>>>>>>>    |  DIGITS '.' DIGITS?
>>>>>>>>>>>>    ;
>>>>>>>>>>>>
>>>>>>>>>>>>   RANGE
>>>>>>>>>>>>    :  '..'
>>>>>>>>>>>>    ;
>>>>>>>>>>>>
>>>>>>>>>>>>   INT
>>>>>>>>>>>>    :  '1'..'9' DIGIT*
>>>>>>>>>>>>    |  '0'
>>>>>>>>>>>>    ;
>>>>>>>>>>>>
>>>>>>>>>>>>   OCTAL
>>>>>>>>>>>>    :  '0' ('0'..'7')+
>>>>>>>>>>>>     ;
>>>>>>>>>>>>
>>>>>>>>>>>>   fragment DIGITS : DIGIT+;
>>>>>>>>>>>>   fragment DIGIT  : '0'..'9';
>>>>>>>>>>>>
>>>>>>>>>>>>   SPACE
>>>>>>>>>>>>    :  (' ' | '\t' | '\r' | '\n') {skip();}
>>>>>>>>>>>>    ;
>>>>>>>>>>>>
>>>>>>>>>>>>   And if you run the class:
>>>>>>>>>>>>
>>>>>>>>>>>>   import org.antlr.runtime.*;
>>>>>>>>>>>>
>>>>>>>>>>>>   public class Main {
>>>>>>>>>>>>    public static void main(String[] args) throws Exception {
>>>>>>>>>>>>      String src = "..07..8.5 1.9..02 1..3.4";
>>>>>>>>>>>>      RangeDemoLexer lexer = new RangeDemoLexer(new ANTLRStringStream(src));
>>>>>>>>>>>>      RangeDemoParser parser = new RangeDemoParser(new CommonTokenStream(lexer));
>>>>>>>>>>>>      System.out.println("Parsing: '" + src + "'");
>>>>>>>>>>>>      parser.parse();
>>>>>>>>>>>>    }
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>>   You'll see the following being printed to the console:
>>>>>>>>>>>>
>>>>>>>>>>>>   Parsing: '..07..8.5 1.9..02 1..3.4'
>>>>>>>>>>>> RANGE      ..
>>>>>>>>>>>> OCTAL      07
>>>>>>>>>>>> RANGE      ..
>>>>>>>>>>>> FLOAT      8.5
>>>>>>>>>>>> FLOAT      1.9
>>>>>>>>>>>> RANGE      ..
>>>>>>>>>>>> OCTAL      02
>>>>>>>>>>>> INT        1
>>>>>>>>>>>> RANGE      ..
>>>>>>>>>>>> FLOAT      3.4
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>   Regards,
>>>>>>>>>>>>
>>>>>>>>>>>>   Bart.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>   On Fri, Nov 4, 2011 at 7:28 AM, Fabien Hermenier <hermenierfabien at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi
>>>>>>>>>>>>>
>>>>>>>>>>>>> In an earlier version of my language, I had to parse
>>>>>>>>>>>>> ranges of integers in various bases. Now I want to
>>>>>>>>>>>>> include floats. I have read
>>>>>>>>>>>>> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs
>>>>>>>>>>>>> but I've still got some questions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> All the work seems to be done at the lexer level, so the
>>>>>>>>>>>>> types of the following tokens will be, for example:
>>>>>>>>>>>>> 5 : DECIMAL_LITTERAL
>>>>>>>>>>>>> 07 : OCTAL_LITTERAL
>>>>>>>>>>>>> 7.5: FLOATING_POINT_LITTERAL
>>>>>>>>>>>>> 5..7 : DOTDOT
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the last example, the result is not very convenient
>>>>>>>>>>>>> because I will still have to extract the bounds and compute
>>>>>>>>>>>>> their types myself, which seems quite redundant with the job
>>>>>>>>>>>>> performed by the lexer.
>>>>>>>>>>>>> Maybe I am missing something?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would rather be able to express the range at the
>>>>>>>>>>>>> parser level, which seems much more convenient to me:
>>>>>>>>>>>>> range: FLOATING_POINT_LITTERAL DOTDOT FLOATING_POINT_LITTERAL.
>>>>>>>>>>>>> In this way, I will also be able to manage the possible
>>>>>>>>>>>>> spaces between the bounds and the DOTDOT.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So, am I right to try to parse ranges at the parser level?
>>>>>>>>>>>>> Or is there a solution to easily extract the bounds with
>>>>>>>>>>>>> their types if I am doing the job at the lexer level?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>>> Fabien.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>>>>>>>>>>>>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>


