[antlr-interest] Troubles lexing a decimal, (from an antlr beginner)

Igor Murashkin downtown1 at gmail.com
Tue Jul 24 09:45:22 PDT 2007


Hello,

Let me start by saying that this is my first time using ANTLR. I needed a C#
parser generator, so using flex/bison as I have done before was simply out of
the question, and I figured that learning an LL(k) parser generator would be
a nice change from just using LR(k).

Unfortunately, before I can even get to the parsing, I need to fix my
lexing: right now it does not match decimals properly. Here are the lexer
rules in question:

===============

DOT        : '.'   ;
INTEGER    :    Digit+;
DECIMAL    :    Digit+ '.' Digit+;
fragment Digit
    :    '0'..'9';
IDENT    :     ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;

NL    :    ('\r\n'    // DOS/Windows
      |     '\r'      // Macintosh
      |     '\n')     // Unix
      { $channel=HIDDEN; };

WS    :    (' '
      |     '\t'
      |     '\f')
      { $channel=HIDDEN; };

===============

Unfortunately, with simple input such as this, the lexer fails with an
EarlyExitException:

===============
console.flushBuffer
general.holdMsec 1000
object 1.doSomeAction withThis
===============
The third line should produce "IDENT INTEGER DOT IDENT IDENT" but instead it
tries to match "1." as a DECIMAL and then once it sees the "d" it fails and
throws an EarlyExitException.
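
To make the expected result concrete, here is a minimal Python sketch of a
longest-match tokenizer over the same rules. It is not ANTLR and not the
generated lexer; the names TOKEN_RULES, HIDDEN and tokenize are made up for
this illustration, and the regular expressions are my own transcription of
the rules above.

```python
import re

# Hypothetical transcription of the lexer rules above into regular expressions.
TOKEN_RULES = [
    ("DECIMAL", re.compile(r"[0-9]+\.[0-9]+")),
    ("INTEGER", re.compile(r"[0-9]+")),
    ("IDENT", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("DOT", re.compile(r"\.")),
    ("WS", re.compile(r"[ \t\f]")),
    ("NL", re.compile(r"\r\n|\r|\n")),
]

# Rules whose tokens go to the hidden channel in the grammar.
HIDDEN = {"WS", "NL"}


def tokenize(text):
    """Return the visible token names, taking the longest match at each position."""
    names = []
    pos = 0
    while pos < len(text):
        best_name, best_end = None, pos
        for name, pattern in TOKEN_RULES:
            match = pattern.match(text, pos)
            # Keep the rule that consumes the most characters from pos.
            if match and match.end() > best_end:
                best_name, best_end = name, match.end()
        if best_name is None:
            raise ValueError(f"no token matches at offset {pos}")
        if best_name not in HIDDEN:
            names.append(best_name)
        pos = best_end
    return names


print(tokenize("object 1.doSomeAction withThis"))
```

Because the DECIMAL pattern only succeeds when a digit follows the period,
this sketch falls back to INTEGER for the "1" and then matches the "." as a
DOT, which is the behaviour I was expecting from the generated lexer.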

I am completely unsure what is going on. I tried setting k=2 in the options,
figuring that if it looked at the period AND the next character it would get
('.', 'd'), which clearly does not match the DECIMAL rule, but then I just
got a bunch of warnings in my lexer grammar, so I removed the k=2 line
altogether. Looking at the generated code, though, it's always calling LA(1);
maybe there should be a way to get it to call LA(2)?
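
What I have in mind with the two-character lookahead is roughly the
following toy model in Python. It only illustrates the idea and is an
assumption on my part, not how ANTLR actually makes its predictions; the
function predict_after_digits is hypothetical.

```python
def predict_after_digits(text, pos, k):
    """Choose INTEGER or DECIMAL using k characters of lookahead.

    pos is the offset just after a run of digits has been consumed.
    """
    lookahead = text[pos:pos + k]
    if not lookahead.startswith("."):
        return "INTEGER"
    if k == 1:
        # With a single character of lookahead, the '.' alone commits
        # the decision to DECIMAL.
        return "DECIMAL"
    # With two or more characters, also require a digit after the '.'.
    has_digit = len(lookahead) >= 2 and "0" <= lookahead[1] <= "9"
    return "DECIMAL" if has_digit else "INTEGER"


print(predict_after_digits("1.doSomeAction", 1, 1))
print(predict_after_digits("1.doSomeAction", 1, 2))
```

With k=1 the model commits to DECIMAL on "1.doSomeAction", which matches the
failure I am seeing; with k=2 it sees ('.', 'd') and picks INTEGER instead.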

I am probably also misunderstanding how the whole lexing process works.
The generated code contains some DFAs, which would imply that some kind of
regular language is at work here? Or does it still use LL(k) parsing even
for lexing?

I'm going to try to get the book ASAP too; it probably explains some of
this.