[antlr-interest] Lexer question

dotnet fr dotnetfr at gmail.com
Thu Jul 27 09:04:31 PDT 2006


Greg,

That works too.
But when I generate the AST, the double does not come out as a single node.
For example, for 10.5 I get:
kid : 10
sib : .
sib : 5
I would like to get 10.5 directly, as one node.

I'm a beginner with ANTLR; perhaps there is a way to get a single node
with your solution.
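
To make the goal concrete, here is a small hand-written scanner in plain
Java. It is only a sketch of the idea, not ANTLR output: the class name
NumberScanner and the "TYPE:text" token strings are made up for
illustration, and a leading '-' is not handled. It reads the digits first,
then looks at one character to decide between INTEGER and DOUBLE, so a
number with a decimal point comes out as a single token:

```java
import java.util.ArrayList;
import java.util.List;

public class NumberScanner {

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** Splits input such as "10;0.50;" into strings like "INTEGER:10" and "DOUBLE:0.50". */
    public static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (isDigit(input.charAt(pos))) {
                int start = pos;
                // Common prefix of both token types: one or more digits.
                while (pos < input.length() && isDigit(input.charAt(pos))) {
                    pos++;
                }
                String type = "INTEGER";
                // One character of lookahead decides which token this is.
                if (pos < input.length() && input.charAt(pos) == '.') {
                    type = "DOUBLE";
                    pos++; // consume the '.'
                    while (pos < input.length() && isDigit(input.charAt(pos))) {
                        pos++;
                    }
                }
                tokens.add(type + ":" + input.substring(start, pos));
            } else {
                pos++; // skip ';', whitespace and anything else
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : scan("10;1500;0.50;10.5;")) {
            System.out.println(token);
        }
    }
}
```

With this, 10.5 is one token whose text is "10.5", which is what I would
like to see as one node in the AST.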

Cheers,

Tomy

2006/7/27, Greg Clemenson <greg at onvoq.com>:
> Tomy,
>
> I would try something like this:
>
> In the Lexer:
>
> INTEGER: ('0'..'9')+;
>
> In the parser:
>
> // "double" is a Java keyword, so the rule (and its generated method)
> // needs another name
> integer: INTEGER;
> real: INTEGER '.'  (INTEGER)?;
>
> Or to be a bit fancier, try this:
>
> integer :  ('-')? INTEGER;
> real    :  ('-')? INTEGER '.'  (INTEGER)?
>         |  ('-')? '.' INTEGER
>         ;
>
>
> The advantage of this approach is that it only requires 1 level of
> lookahead (k=1) and no semantic rules, so the lexer can do its work
> in a single pass over the input characters.
>
> More generally, if multiple rules begin with a non-trivial
> overlapping definition, you are better off creating a separate rule
> to recognize just the overlapping part and rewriting the original
> rules to begin with the overlap rule.  In this case, the overlapping
> part was the INTEGER definition.
>
> Greg
>
> On Jul 27, 2006, at 6:05 AM, dotnet fr wrote:
>
> > Dominik,
> > Thank you for your solution, it works very well.
> > I have another one that looks much the same ;)
> >
> > In the lexer -------------------
> > INTORDOUBLE
> >       : (INTEGER '.') => DOUBLE       { $setType(DOUBLE); }
> >       | INTEGER                       { $setType(INTEGER); }
> >       ;
> >
> > protected
> > DOUBLE                : ('-')? ('0'..'9')+ '.' ('0'..'9')* ;
> > protected
> > INTEGER               : ('0'..'9')+ ;
> >
> > and in the parser ---------------------------
> >
> > startRule : (line)* ;
> >
> > line : DOUBLE | INTEGER;
> >
> > Regards,
> > Tomy
> >
> > 2006/7/27, Dominik Holenstein <dholenstein at gmail.com>:
> >> Tomy,
> >> I have played around with your Lexer and Parser code and have found
> >> this solution:
> >>
> >>
> >> -------------------------------------------------------
> >> ANTLR Grammar (file n.g):
> >>
> >> class NumParser extends Parser;
> >>
> >> startRule : (line)* ;
> >>
> >> line      : ( d:DOUBLE
> >>               { System.out.println("Double: " + d.getText()); }
> >>             | i:INTEGER
> >>               { System.out.println("Integer: " + i.getText()); }
> >>             )
> >>           ;
> >>
> >>
> >> class NumLexer extends Lexer;
> >>
> >> DOUBLE          : ( ('-')? ('0'..'9')+ '.' ('0'..'9')* ) =>
> >>                   ('-')? ('0'..'9')+ '.' ('0'..'9')*
> >>                 | ('0'..'9')+ { $setType(INTEGER); }
> >>                 ;
> >>
> >> INTEGER         : ('0'..'9')+ ;
> >>
> >> SEMICOLON    : ';' { $setType(Token.SKIP); } ;
> >>
> >> NEWLINE        : (('\r''\n')=> '\r''\n'
> >>               | '\r'
> >>               | '\n'
> >>               ) { $setType(Token.SKIP); }
> >>                        ;
> >> WS                  : (' '|'\t') { $setType(Token.SKIP); } ;
> >>
> >> ---------------------------------------------------
> >>
> >> The Java test code (Main.java):
> >>
> >> import java.io.DataInputStream;
> >> import java.io.FileInputStream;
> >> import java.io.FileNotFoundException;
> >>
> >> public class Main {
> >>     public static void main(String[] args) {
> >>         try {
> >>             // Make sure you change the path for your input file
> >>             DataInputStream input = new DataInputStream(
> >>                 new FileInputStream("E:\\ANTLR\\Examples\\Numbers\\input.txt"));
> >>             NumLexer lexer = new NumLexer(input);
> >>             NumParser parser = new NumParser(lexer);
> >>             try {
> >>                 parser.startRule();
> >>             } catch (Exception e) {
> >>                 // Report parse errors instead of silently swallowing them
> >>                 System.out.println("Error while parsing: " + e);
> >>             }
> >>         } catch (FileNotFoundException e) {
> >>             System.out.println("Error: Cannot open file for reading");
> >>         }
> >>     }
> >> }
> >>
> >> --------------------------------------------------------------
> >> Data in the input file (input.txt):
> >> 10;
> >> 1500;
> >> 0.50;
> >> 35;
> >> 7.25;
> >> 3000;
> >>
> >> ---------------------------------------------------------------
> >>
> >> I have added all files as attachments to this e-mail.
> >>
> >> You can set k=1 because of the syntactic predicate, which makes the
> >> parser a bit faster.
> >> The System.out... messages are for testing purposes, so that I can
> >> see the output of the parser in the console. I am working with
> >> Eclipse 3.2 and ANTLR Studio. I am not sure whether this is 'good'
> >> programming style but it works ;-) . Input, feedback and better
> >> solutions are welcome.
> >>
> >> I hope it helps!
> >>
> >> Regards,
> >> Dominik
> >>
> >>
> >>
> >>
> >>
> >> On 7/27/06, dotnet fr <dotnetfr at gmail.com> wrote:
> >> > Hi Dominik,
> >> >
> >> > I'm happy to meet a person like me!
> >> > I'm a beginner with antlr and codeworker too ;)
> >> > Every minute I'm learning something new. Antlr seems very powerful,
> >> > yeah.
> >> > My project is to create first a class generator, then a structure
> >> > generator and finally a structure (or class) loader. That means I
> >> > use both parsing and code generation.
> >> > What do you do with antlr, and what are your interests in computing?
> >> >
> >> > Cheers
> >> > Tomy
> >> >
> >> > 2006/7/27, Dominik Holenstein <dholenstein at gmail.com>:
> >> > > Hi Tomy,
> >> > > I don't know codeworker but will have a look at it.
> >> > > ANTLR is very powerful and with v3 coming in fall it will get
> >> > > much better.
> >> > > I am a beginner with Java and ANTLR so everything is 'difficult'
> >> > > at the moment. But I am progressing and learning every day!
> >> > > I will look at your issue this afternoon.
> >> > >
> >> > > Regards,
> >> > > Dominik
> >> > >
> >> > >
> >> > >
> >> > > On 7/27/06, dotnet fr <dotnetfr at gmail.com> wrote:
> >> > > > Hi Dominik,
> >> > > >
> >> > > > I have seen the Predicated LL(k) Lexing section in the ANTLR
> >> > > > documentation, which deals with this kind of problem. It works
> >> > > > but I don't think it's the best solution ;)
> >> > > > I thought that the antlr lexer tries the first token and, if it
> >> > > > doesn't match, goes on to the second, etc.
> >> > > >
> >> > > > My parser grammar :
> >> > > >
> >> > > > startRule
> >> > > >        :
> >> > > >                nbp_debug
> >> > > >        ;
> >> > > >
> >> > > > protected
> >> > > > debug    :
> >> > > >        (
> >> > > >                DATE
> >> > > >        |       DOUBLE
> >> > > >        |       INTEGER
> >> > > >        |       SEMICOLON
> >> > > >        )*
> >> > > >        ;
> >> > > >
> >> > > > What do you think about Antlr? I have to do the same project
> >> > > > with codeworker and antlr. Antlr seems more difficult to work with.
> >> > > >
> >> > > > Cheers,
> >> > > >
> >> > > > Tomy
> >> > > >
> >> > > > 2006/7/27, Dominik Holenstein <dholenstein at gmail.com>:
> >> > > > > Tomy,
> >> > > > >
> >> > > > > What is your grammar in the parser?
> >> > > > > Thanks.
> >> > > > >
> >> > > > > Dominik
> >> > > > >
> >> > > > >
> >> > > > > On 7/27/06, dotnet fr <dotnetfr at gmail.com> wrote:
> >> > > > > > Hi Everyone,
> >> > > > > >
> >> > > > > > I have a problem with the antlr lexer.
> >> > > > > >
> >> > > > > > In input I have :
> >> > > > > > 10;
> >> > > > > > 1500;
> >> > > > > > 0.50;
> >> > > > > >
> >> > > > > > In my lexer I have :
> >> > > > > > DOUBLE          : ('-')? ('0'..'9')+ '.' ('0'..'9')* ;
> >> > > > > > INTEGER         : ('0'..'9')+ ;
> >> > > > > > SEMICOLON       : ';' ;
> >> > > > > >
> >> > > > > > In my parser and lexer I have k=5.
> >> > > > > >
> >> > > > > > But I get an error; the lexer seems to try its tokens in
> >> > > > > > order. It treats the 10 as a double (the first in the list)
> >> > > > > > and throws an exception
> >> > > > > > (exception: expecting ''.'', found '';'')
> >> > > > > >
> >> > > > > > I want the lexer to back off and try the next token, and
> >> > > > > > throw an exception only if nothing matches.
> >> > > > > >
> >> > > > > > Has anyone else had this problem?
> >> > > > > >
> >> > > > > > Cheers,
> >> > > > > >
> >> > > > > > Tomy


-- 
dotnet

