[antlr-interest] Why won't this match...

Mark Volkmann r.mark.volkmann at gmail.com
Sun Feb 24 15:23:45 PST 2008


On Sun, Feb 24, 2008 at 5:12 PM, Mark Volkmann
<r.mark.volkmann at gmail.com> wrote:
>
> On Sun, Feb 24, 2008 at 9:40 AM, alan brown <listbrownie at gmail.com> wrote:
>  > It must be something obvious but why won't this language parse the word
>  > 'wibble'?  I would expect the lexer to be unable to match the input to
>  > BIG_TOKEN but successfully match to LITTLE_TOKEN followed by SEMI_TOKEN.  If
>  > I change the BIG_TOKEN definition to 'wobble' then all is well but I don't
>  > know why this is failing.
>  >
>  > Any help is appreciated
>  >
>  > root         : tokenizer2 | tokenizer1 ;
>  >
>  > tokenizer1   : BIG_TOKEN ;
>  > tokenizer2   : LITTLE_TOKEN SEMI_TOKEN ;
>  >
>  > BIG_TOKEN    : 'wibbled' ;
>  > LITTLE_TOKEN : 'wi' ;
>  > SEMI_TOKEN   : 'bble' ;
>
>  This is "bang my head on the wall" frustrating!
>  It looks so simple, but I can't get it to work either!
>  Just when I thought I was getting the hang of it ...
>  I hope someone else has an answer.

Here's the grammar I used to test this.

grammar Wibble;
root: (PREFIX SUFFIX | WHOLE) { System.out.println("got it!"); };
WHOLE: 'wibbled';
PREFIX: 'wi';
SUFFIX: 'bble';
WHITESPACE: ('\r' | '\n') { skip(); };

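I stepped through the generated lexer code with the input "wibble".
It basically does this:
- got a 'w'
- got an 'i'
- got a 'b'
- okay, stop looking: only WHOLE ('wibbled') can start with "wib",
  so the next token must be WHOLE

But that prediction can't succeed. The input has no 'd' at the end, so
WHOLE will never match. The next token needs to be PREFIX ('wi'),
followed by SUFFIX ('bble').
Shouldn't the lexer fall back to PREFIX here instead of committing to
WHOLE?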
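
To make the trace above concrete, here is a toy Python model of the two behaviors. It is only a sketch of what I saw in the debugger, not ANTLR's generated code: the function names (committed_lexer, fallback_lexer) and the three-character lookahead are my own assumptions for illustration. The first function picks a rule from the lookahead and never reconsiders; the second tries every literal at the current position and keeps the longest one that really matches.

```python
# Toy model only: these names and the lookahead depth are assumptions,
# not anything taken from ANTLR's runtime or generated lexer.
TOKENS = {"WHOLE": "wibbled", "PREFIX": "wi", "SUFFIX": "bble"}


def committed_lexer(text, lookahead=3):
    """Choose a rule from the lookahead window, then never reconsider."""
    tokens, pos = [], 0
    while pos < len(text):
        window = text[pos:pos + lookahead]
        # A rule is a candidate if its literal agrees with the window.
        candidates = [(name, lit) for name, lit in TOKENS.items()
                      if lit.startswith(window) or window.startswith(lit)]
        if not candidates:
            raise ValueError(f"no rule can start with {window!r} at offset {pos}")
        # Prefer the rule that accounts for the most lookahead characters;
        # on a tie, prefer the shorter literal.
        name, lit = max(candidates,
                        key=lambda c: (min(len(c[1]), len(window)), -len(c[1])))
        # Committed: if the rest of the literal is missing, give up.
        if not text.startswith(lit, pos):
            raise ValueError(f"committed to {name} at offset {pos}, "
                             f"but input does not match {lit!r}")
        tokens.append(name)
        pos += len(lit)
    return tokens


def fallback_lexer(text):
    """Take the longest literal that fully matches at the current position."""
    tokens, pos = [], 0
    while pos < len(text):
        matches = [(name, lit) for name, lit in TOKENS.items()
                   if text.startswith(lit, pos)]
        if not matches:
            raise ValueError(f"no rule matches at offset {pos}")
        name, lit = max(matches, key=lambda m: len(m[1]))
        tokens.append(name)
        pos += len(lit)
    return tokens


if __name__ == "__main__":
    for lexer in (committed_lexer, fallback_lexer):
        try:
            print(lexer.__name__, lexer("wibble"))
        except ValueError as err:
            print(lexer.__name__, "failed:", err)
```

On "wibble" the committed version raises an error after choosing WHOLE, while the fallback version returns PREFIX followed by SUFFIX. Both return WHOLE for "wibbled".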

-- 
R. Mark Volkmann
Object Computing, Inc.

