[antlr-interest] a simple (not for me :)) grammar problem

Mon Jan 7 15:57:57 PST 2008

I have at least a partial answer to my own question.
The generated lexer class contains the method mTokens, which throws a
NoViableAltException when the illegal space between the 3 and
the dot is encountered. This is caught in the nextToken method of
the standard Lexer class. That catch calls reportError and recover. I
don't want it to recover, though.

The techniques in section 10.4 of the book, "Exiting the Recognizer
upon First Error" won't work here. I suppose I could try to override
the Lexer nextToken method, but that seems like too much work. The
only solution I've come up with so far is to override the recover
method in the generated Lexer class so it does a System.exit(1).

Is there a better way to stop processing when the lexer throws a
NoViableAltException?
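
One possibly lighter alternative to System.exit(1), sketched here under
assumptions: since the catch in nextToken calls reportError before
recover, overriding reportError to throw an unchecked exception would
let the error escape nextToken without killing the JVM. The sketch
below does not use ANTLR at all. SketchLexer, SketchRecognitionException,
FailFastLexer and tokenize are hypothetical stand-ins that only mimic
the catch/report/recover flow described above, so the control flow can
be run and checked in isolation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins only: none of these are ANTLR classes.
final class FailFastLexerSketch {

    /** Stand-in for a checked recognition error such as NoViableAltException. */
    static class SketchRecognitionException extends Exception {
        final int charPosition;
        SketchRecognitionException(int charPosition, char offending) {
            super("no viable alternative at character '" + offending + "'");
            this.charPosition = charPosition;
        }
    }

    /** Stand-in base lexer: nextToken catches the error, reports it, then recovers. */
    static class SketchLexer {
        protected final String input;
        protected int pos = 0;

        SketchLexer(String input) { this.input = input; }

        /** Returns the next token's text, or null at end of input. */
        String nextToken() {
            while (pos < input.length()) {
                try {
                    return matchToken();
                } catch (SketchRecognitionException e) {
                    reportError(e); // the hook overridden in FailFastLexer
                    recover(e);     // skips the offending character
                }
            }
            return null;
        }

        /** Matches DOT, NUMBER (digits) or IDENTIFIER (lowercase letters). */
        String matchToken() throws SketchRecognitionException {
            char c = input.charAt(pos);
            int start = pos;
            if (c == '.') { pos++; return "."; }
            if (c >= '0' && c <= '9') {
                while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
                return input.substring(start, pos);
            }
            if (c >= 'a' && c <= 'z') {
                while (pos < input.length() && input.charAt(pos) >= 'a' && input.charAt(pos) <= 'z') pos++;
                return input.substring(start, pos);
            }
            throw new SketchRecognitionException(pos, c);
        }

        void reportError(SketchRecognitionException e) {
            System.err.println("line 1:" + e.charPosition + " " + e.getMessage());
        }

        void recover(SketchRecognitionException e) { pos++; }
    }

    /** Fail-fast variant: turn the first reported error into an unchecked exception. */
    static class FailFastLexer extends SketchLexer {
        FailFastLexer(String input) { super(input); }

        @Override
        void reportError(SketchRecognitionException e) {
            throw new IllegalStateException("lexer error at 1:" + e.charPosition, e);
        }
    }

    /** Drains a lexer into a list of token texts. */
    static List<String> tokenize(SketchLexer lexer) {
        List<String> tokens = new ArrayList<String>();
        for (String t = lexer.nextToken(); t != null; t = lexer.nextToken()) {
            tokens.add(t);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The default lexer reports the space and carries on.
        System.out.println(tokenize(new SketchLexer("3 .14.hello")));
        try {
            tokenize(new FailFastLexer("3 .14.hello"));
        } catch (IllegalStateException e) {
            System.out.println("stopped: " + e.getMessage());
        }
    }
}
```

If the same idea carries over to the generated lexer, the override would
go in @lexer::members, and the caller could catch the unchecked
exception around the parse call instead of having the process exit.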

On Jan 7, 2008 4:18 PM, Mark Volkmann <r.mark.volkmann at gmail.com> wrote:
> Here is my attempt to stop parsing after the lexer gets a
> NoViableAltException. It doesn't stop. Can someone tell me why? Here's
> the output I get when I process "3 .14.hello". Note that the space
> between the 3 and the dot isn't allowed by the grammar.
>
> matched!
> line 1:1 no viable alternative at character ' '
>
> grammar Sample;
>
> @lexer::members {
>   protected void mismatch(IntStream input, int ttype, BitSet follow)
>   throws RecognitionException {
>     throw new MismatchedTokenException(ttype, input);
>   }
>
>   public void recoverFromMismatchedSet(
>     IntStream input, RecognitionException e, BitSet follow)
>   throws RecognitionException {
>     throw e;
>   }
> }
>
> @parser::members {
>   protected void mismatch(IntStream input, int ttype, BitSet follow)
>   throws RecognitionException {
>     throw new MismatchedTokenException(ttype, input);
>   }
>
>   public void recoverFromMismatchedSet(
>     IntStream input, RecognitionException e, BitSet follow)
>   throws RecognitionException {
>     throw e;
>   }
> }
>
> @lexer::rulecatch {
>   catch (RecognitionException e) {
>     throw e;
>   }
> }
>
> @parser::rulecatch {
>   catch (RecognitionException e) {
>     throw e;
>   }
> }
>
> start
>   options { backtrack = true; }
>   : (floatValue | integerValue) DOT IDENTIFIER
>     { System.out.println("matched!"); }
>   ;
>
> floatValue: NUMBER DOT NUMBER;
> integerValue: NUMBER;
>
> DOT: '.';
> IDENTIFIER: LETTER+;
> NUMBER: DIGIT+;
> fragment LETTER: 'a'..'z';
> fragment DIGIT: '0'..'9';
>
> NEWLINE: '\r'? '\n' { skip(); };
>
>
> On Jan 7, 2008 4:15 PM, Mark Volkmann <r.mark.volkmann at gmail.com> wrote:
> > On Jan 7, 2008 2:35 PM, Fırat Küçük <firatkucuk at gmail.com> wrote:
> > > no,
> > > this is what i did.
> > >
> > > this grammar parses "3     .    4    . hello".
> >
> > The solution I emailed out doesn't parse that because it doesn't skip
> > whitespace. Well, I should be more clear. This is the output I get.
> >
> >      [java] line 1:1 no viable alternative at character ' '
> >      [java] matched!
> >      [java] line 1:2 no viable alternative at character ' '
> >      [java] line 1:4 no viable alternative at character ' '
> >      [java] line 1:5 no viable alternative at character ' '
> >      [java] line 1:8 no viable alternative at character ' '
> >      [java] line 1:9 no viable alternative at character ' '
> >      [java] line 1:11 no viable alternative at character ' '
> >      [java] line 1:12 no viable alternative at character ' '
> >
> > So you see I get the message "matched!", but I also get all the "no
> > viable alternative" messages. What we need is a way to make the parser
> > stop when it gets one of those. I think this is addressed in section
> > 10.4. I'll try that and let you know what happens.
> >
> >
> > > so as gavin said.
> > >
> > >
> > > "It's not a solution if it doesn't work :)"
> > >
> > >
> > >
> > >  2008/1/7, Mark Volkmann <r.mark.volkmann at gmail.com>:
> > >
> > > > On Jan 7, 2008 6:24 AM, Gavin Lambert <antlr at mirality.co.nz> wrote:
> > > > > At 21:20 7/01/2008, Fırat Küçük wrote:
> > > > >  >
> > > > >  >this is my simple solution:
> > > > >
> > > > > It's not a solution if it doesn't work :)
> > > > >
> > > > > Try doing what I suggested.  You really should handle the floats
> > > > > in the lexer, since you don't have to worry about whitespace
> > > > > weirdness that way.  And if you do it the way I said, it should
> > > > > work.
> > > >
> > > > I think this is what you want or at least really close.
> > > >
> > > > grammar Sample;
> > > >
> > > > start
> > > >   options { backtrack = true; }
> > > >   : (floatValue | integerValue) DOT IDENTIFIER
> > > >     { System.out.println("matched!"); }
> > > >   ;
> > > >
> > > > floatValue: NUMBER DOT NUMBER;
> > > > integerValue: NUMBER;
> > > >
> > > > DOT: '.';
> > > > IDENTIFIER: LETTER+;
> > > > NUMBER: DIGIT+;
> > > > fragment LETTER: 'a'..'z';
> > > > fragment DIGIT: '0'..'9';
> > > >
> > > > NEWLINE: '\r'? '\n' { skip(); };
> > > >
> > > > --
> > > > R. Mark Volkmann
> > > > Object Computing, Inc.
> > > >
> > >
> > >
> > >
> > >
> > > --
> > > Öğr. Gör. Fırat Küçük
> > > ADAMYO Distance Learning
> > > SAKARYA University / TURKEY

-- 
R. Mark Volkmann
Object Computing, Inc.