[antlr-interest] Syntaxerror not found if first symbol is wrong

Benjamin Niemann pink at odahoda.de
Thu Aug 14 11:31:07 PDT 2008


Hi Oliver,

On Wed, Aug 13, 2008 at 11:12 PM, Oliver B. Fischer <o.b.fischer at gmx.de> wrote:
> I am reading the input from ANTLRStringStream, so I don't get an EOF.
> How to fix this?

You should. EOF really means 'end of input', and all streams
return it when they reach their end.
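
To make that concrete, here is a minimal sketch in plain Java. It does not
use the ANTLR runtime: the Tokens class, the token type values and the
parses() helper are all made up for illustration, and only mimic the
(a b ';')+ loop from the generated code quoted below. It is meant to show
that a stream built from a fixed input still reports EOF once it is used
up, and that requiring EOF after the loop is what rejects a dangling ID.

```java
import java.util.Arrays;
import java.util.List;

class EofSketch {
    // Hypothetical token types for this sketch; ANTLR-generated values differ.
    static final int EOF = -1, INT = 4, ID = 5, SEMI = 6;

    /** Stand-in for a token stream: la1() yields EOF once the input is used up. */
    static final class Tokens {
        private final List<Integer> types;
        private int pos = 0;

        Tokens(List<Integer> types) { this.types = types; }

        int la1() { return pos < types.size() ? types.get(pos) : EOF; }

        void match(int expected) {
            if (la1() != expected) {
                throw new IllegalStateException("mismatched token at index " + pos);
            }
            pos++;
        }
    }

    /** Models program : (a b ';')+ ; optionally followed by EOF. */
    static boolean parses(List<Integer> input, boolean requireEof) {
        Tokens in = new Tokens(input);
        try {
            int cnt = 0;
            while (in.la1() == INT) {   // the only way into the loop body
                in.match(INT);          // a
                in.match(ID);           // b
                in.match(SEMI);         // ';'
                cnt++;
            }
            if (cnt == 0) {             // corresponds to the EarlyExitException case
                throw new IllegalStateException("early exit: no iteration matched");
            }
            if (requireEof) {
                in.match(EOF);          // fails if any token is left over
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<Integer> dangling = Arrays.asList(INT, ID, SEMI, ID);
        System.out.println(parses(Arrays.asList(ID), false));            // first token wrong
        System.out.println(parses(dangling, false));                     // loop exits quietly
        System.out.println(parses(dangling, true));                      // EOF required
        System.out.println(parses(Arrays.asList(INT, ID, SEMI), true));  // clean input
    }
}
```

In this model the input with a trailing ID is accepted when the rule stops
at the loop and rejected once EOF has to follow, which is the same effect
adding EOF to the program rule has.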

-Ben

> Benjamin Niemann schrieb:
> | Hi Oliver,
> |
> | If the very first token is an ID, the code will throw an
> | EarlyExitException - doesn't it? (alt1 sticks to its default value 2,
> | which is handled by the default case in the switch, which in turn will
> | raise the exception, because cnt1 is 0)
> |
> | If you have a sequence INT ID ';' ID, then the loop will exit after
> | the semicolon, and the parser will not complain about the dangling ID.
> | If that's what the problem is, then you can solve it by using
> |
> | program : (a b ';')+  EOF;
> |
> | -Ben
> |
> | On Sun, Aug 10, 2008 at 6:11 PM, Oliver B. Fischer <o.b.fischer at gmx.de> wrote:
> | Hello,
> |
> | my grammar does not recognize a syntax error if the very first
> | symbol found is not an expected one:
> |
> | My grammar looks like this:
> |
> | INT : 'int' ;
> |
> | ID : ('a'..'z')+ ;
> |
> | WS : (' '|'\t'|'\n'|'\r')+ { $channel=HIDDEN; };
> |
> | program : (a b ';')+ ;
> |
> | a : INT ;
> |
> | b : ID ;
> |
> | So, any valid input must start with 'int', but if the first symbol found
> | by the lexer is an ID, the generated parser does not recognize the error.
> |
> | ANTLR generates the following code:
> | public final void program() throws RecognitionException {
> |   try {
> |     {
> |       int cnt1 = 0;
> |       loop1:
> |       do {
> |         int alt1 = 2;
> |         int LA1_0 = input.LA(1);
> |
> |         if ((LA1_0 == INT)) {
> |           alt1 = 1;
> |         }
> |
> |         switch (alt1) {
> |           case 1: {
> |             pushFollow(FOLLOW_a_in_program154);
> |             a();
> |             _fsp--;
> |
> |             pushFollow(FOLLOW_b_in_program156);
> |             b();
> |             _fsp--;
> |
> |             match(input, SEM, FOLLOW_SEM_in_program158);
> |
> |           }
> |           break;
> |
> |           default:
> |             if (cnt1 >= 1) break loop1;
> |             EarlyExitException eee =
> |                     new EarlyExitException(1, input);
> |             throw eee;
> |         }
> |         cnt1++;
> |       } while (true);
> |
> |     }
> |
> |   }
> |
> |   catch (RecognitionException e) {
> |     throw e;
> |   }
> |   finally {
> |   }
> |   return;
> | }
> |
> | So, the ID token falls through the switch-statement. How can I avoid this?
> |
> | Thank you for your help!
> |
> | Bye
> |
> | Oliver
> |
>
> --
> Oliver B. Fischer, Schönhauser Allee 64, 10437 Berlin
> Tel. +49 30 44793251, Mobil: +49 178 7903538
> Mail: o.b.fischer at gmx.de Blog: http://www.sw-blog.net
