[antlr-interest] Keeping lookahead low

Andrew Deren andrew at adersoftware.com
Wed Aug 24 15:38:58 PDT 2005

Why don't you change your ID rule to check whether the matched ID starts
with Thing (or whatever logic you have for THING_ID) and, if it does, call
$setType(THING_ID)?
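
A minimal sketch of that check, not taken from the attached grammar. It
assumes THING_ID means "Thing" followed by one or more digits, which is only
inferred from the test cases quoted below. The class name, method names and
token type numbers are hypothetical; ANTLR generates the real token type
constants.

```java
// Hypothetical helper: decides whether text already matched by the ID rule
// should be retyped as THING_ID.
final class ThingTokenClassifier {
    // Placeholder token type numbers (assumption); the generated lexer
    // supplies the real ID and THING_ID constants.
    static final int ID = 4;
    static final int THING_ID = 5;

    private static final String PREFIX = "Thing";

    private ThingTokenClassifier() {}

    // Assumed rule: "Thing" followed by at least one ASCII digit and nothing else.
    static boolean isThingId(String text) {
        if (!text.startsWith(PREFIX) || text.length() == PREFIX.length()) {
            return false; // "Th", "Thing", "foo" stay plain IDs
        }
        for (int i = PREFIX.length(); i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < '0' || c > '9') {
                return false; // "Thingy" stays a plain ID
            }
        }
        return true;
    }

    // Returns the token type the lexer action would set.
    static int classify(String text) {
        return isThingId(text) ? THING_ID : ID;
    }

    public static void main(String[] args) {
        String[] samples = {"Th", "Thing", "Thing123", "Thingy", "foo"};
        for (String s : samples) {
            String type = classify(s) == THING_ID ? "THING_ID" : "ID";
            System.out.println(s + " -> " + type);
        }
    }
}
```

In an ANTLR 2 lexer, the action at the end of the ID rule could then be
something like: if (ThingTokenClassifier.isThingId($getText)) {
$setType(THING_ID); }. With THING_ID declared in the tokens section rather
than as its own lexer rule, nothing competes with ID, so the lookahead should
not need to grow with the length of the prefix.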

-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Ciaran Treanor
Sent: Wednesday, August 24, 2005 3:26 PM
To: antlr-interest at antlr.org
Subject: [antlr-interest] Keeping lookahead low

Following on from the help provided by Alexey and Olivier yesterday
I've cleaned up a test grammar I was working on and am left with one
question outstanding.

I have a test data file that looks like the following:
System {
 foo = Th       ! case 1:  BROKEN - rhs should be considered an ID
 foo = Thing    ! case 2: BROKEN - rhs should be considered an ID
 foo = Thing123 ! case 3: GOOD - rhs is a THING_ID
 foo = Thingy   ! case 4: GOOD - rhs is a regular id
 foo = foo      ! case 5: GOOD - rhs is a regular id
 Th = foo       ! case 6: BROKEN - lhs should be considered an ID
 Thing = foo    ! case 7: BROKEN - lhs should be considered an ID
 Thing123 = foo ! case 8: Why is error "expecting '}'" instead of expecting
 Thingy = foo   ! case 9: GOOD - lhs is a regular id

Can anyone tell me why the parser fails with the following error when
it encounters 'Th' or 'Thing'?
Exception in thread "main" line 2:11: expecting 'i', found ' '
       at com.ct.test.TestLexer.nextToken(TestLexer.java:120)
       at antlr.TokenBuffer.fill(TokenBuffer.java:69)
       at antlr.TokenBuffer.LA(TokenBuffer.java:80)
       at antlr.LLkParser.LA(LLkParser.java:52)
       at antlr.Parser.match(Parser.java:210)
       at com.ct.test.TestParser.systemBlock(TestParser.java:82)
       at com.ct.test.TestParser.testFile(TestParser.java:61)
       at com.ct.test.TestParser.main(TestParser.java:31)

Increasing lookahead to 6 fixes case 1 and case 2. Unfortunately
increasing the lookahead isn't really an option for me since, in
reality, 'Thingy' is actually a 20 character word.

What's the simplest thing I can do to the grammar to fix the cases
above that I've flagged as broken?

Oh, can anyone explain the error reported for case 8? This case is an
assignment that looks like:

Since the grammar is expecting assignments of the form ID = ( ID |
THING_ID) I would have thought the parser would complain that it found
a THING_ID when it was expecting a regular ID. Instead it complains
about expecting '}'. Why is that?

Thanks a million (oh, grammar and test file attached)
