[antlr-interest] Syntaxerror not found if first symbol is wrong

Benjamin Niemann pink at odahoda.de
Sun Aug 10 10:53:47 PDT 2008


Hi Oliver,

If the very first token is an ID, the generated code should throw an
EarlyExitException, shouldn't it? alt1 keeps its default value 2, so
the switch takes the default case, and because cnt1 is still 0 at that
point, the exception is raised.

If you have a sequence INT ID ';' ID, then the loop will exit after
the semicolon (cnt1 is 1 by then), and the parser will not complain
about the dangling ID. If that's the problem, you can solve it by
requiring EOF at the end of the rule:

program : (a b ';')+ EOF ;
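
To make the two cases concrete, here is a minimal, self-contained sketch
that imitates the control flow of the generated (...)+ loop. It does not
use the ANTLR runtime: the class PlusLoopDemo, the token numbers, and the
EarlyExit and Mismatch exceptions are all hypothetical stand-ins chosen
for illustration, not what ANTLR actually emits.

```java
import java.util.Arrays;
import java.util.List;

public class PlusLoopDemo {
    // Hypothetical token type numbers; the real generated constants will differ.
    static final int EOF = -1, INT = 4, ID = 5, SEMI = 6;

    /** Simplified stand-in for ANTLR's EarlyExitException. */
    static class EarlyExit extends RuntimeException {
        EarlyExit(String message) { super(message); }
    }

    /** Simplified stand-in for a mismatched-token error. */
    static class Mismatch extends RuntimeException {
        Mismatch(String message) { super(message); }
    }

    /** Lookahead: the token type at index pos, or EOF past the end of input. */
    static int la(List<Integer> tokens, int pos) {
        return pos < tokens.size() ? tokens.get(pos) : EOF;
    }

    /** Checks the token at pos and returns the index of the next token. */
    static int match(List<Integer> tokens, int pos, int expected) {
        if (la(tokens, pos) != expected) {
            throw new Mismatch("unexpected token at index " + pos);
        }
        return pos + 1;
    }

    /**
     * Imitates program : (a b ';')+ and, if requireEof is set, a trailing EOF.
     * Returns the number of tokens consumed.
     */
    static int program(List<Integer> tokens, boolean requireEof) {
        int pos = 0;
        int cnt = 0; // plays the role of cnt1 in the generated code
        while (true) {
            if (la(tokens, pos) != INT) {
                if (cnt >= 1) break;               // at least one iteration done: leave quietly
                throw new EarlyExit("loop body never matched");
            }
            pos = match(tokens, pos, INT);         // a
            pos = match(tokens, pos, ID);          // b
            pos = match(tokens, pos, SEMI);        // ';'
            cnt++;
        }
        if (requireEof && la(tokens, pos) != EOF) {
            throw new Mismatch("expected EOF at index " + pos);
        }
        return pos;
    }

    public static void main(String[] args) {
        List<Integer> dangling = Arrays.asList(INT, ID, SEMI, ID);

        // Without EOF the trailing ID is silently ignored; prints 3.
        System.out.println(program(dangling, false));

        try {
            program(dangling, true);               // with EOF the trailing ID is rejected
        } catch (Mismatch e) {
            System.out.println("rejected: " + e.getMessage());
        }

        try {
            program(Arrays.asList(ID), false);     // ID as the very first token
        } catch (EarlyExit e) {
            System.out.println("early exit: " + e.getMessage());
        }
    }
}
```

In this sketch, an ID as the first token is rejected either way; only the
dangling ID after a complete "int x;" depends on whether the rule ends
in EOF.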

-Ben

On Sun, Aug 10, 2008 at 6:11 PM, Oliver B. Fischer <o.b.fischer at gmx.de> wrote:
> Hello,
>
> My grammar is not able to recognize a syntax error if the very first
> symbol found is not an expected one.
>
> My grammar looks like this:
>
> INT : 'int' ;
>
> ID : ('a'..'z')+ ;
>
> WS : (' '|'\t'|'\n'|'\r')+ { $channel=HIDDEN; };
>
> program : (a b ';')+ ;
>
> a : INT ;
>
> b : ID ;
>
> So, any valid input must start with 'int', but if the first symbol found
> by the lexer is an ID, the generated parser does not recognize the error.
>
> ANTLR generates the following code:
> public final void program() throws RecognitionException {
>   try {
>     {
>       int cnt1 = 0;
>       loop1:
>       do {
>         int alt1 = 2;
>         int LA1_0 = input.LA(1);
>
>         if ((LA1_0 == INT)) {
>           alt1 = 1;
>         }
>
>         switch (alt1) {
>           case 1: {
>             pushFollow(FOLLOW_a_in_program154);
>             a();
>             _fsp--;
>
>             pushFollow(FOLLOW_b_in_program156);
>             b();
>             _fsp--;
>
>             match(input, SEM, FOLLOW_SEM_in_program158);
>
>           }
>           break;
>
>           default:
>             if (cnt1 >= 1) break loop1;
>             EarlyExitException eee =
>                     new EarlyExitException(1, input);
>             throw eee;
>         }
>         cnt1++;
>       } while (true);
>
>     }
>
>   }
>
>   catch (RecognitionException e) {
>     throw e;
>   }
>   finally {
>   }
>   return;
> }
>
> So, the ID token falls through the switch-statement. How can I avoid this?
>
> Thank you for your help!
>
> Bye
>
> Oliver
>
>
>
> - --
> Oliver B. Fischer, Schönhauser Allee 64, 10437 Berlin
> Tel. +49 30 44793251, Mobil: +49 178 7903538
> Mail: o.b.fischer at gmx.de Blog: http://www.sw-blog.net
>