[antlr-interest] Lookahead problems

Martin Probst mail at martin-probst.com
Thu Sep 16 04:29:17 PDT 2004


Hello,
I have a lookahead problem with my grammar. My parser has k=1, but it
seems to be looking ahead further than it should.

In the state before the steps shown below, my parser has recognized a
"dirAttribute". It looks ahead, finds a "=" and a '"', and then descends
into a dirAttributeValue. That's expected and good. Here is the output
produced with -traceParser -traceLexer:

=== snip ===
> dirAttributeValue; LA(1)== > lexer mNEXT; c==104
  > lexer mQUOT_ATTR_CONTENT; c==104
  < lexer mQUOT_ATTR_CONTENT; c==123
 < lexer mNEXT; c==123
"
 > lexer mNEXT; c==123
  > lexer mLCURLY; c==123
  < lexer mLCURLY; c==32
 < lexer mNEXT; c==32
 > quotAttrValueContent; LA(1)==http://www.w3
 < quotAttrValueContent; LA(1)== > lexer mNEXT; c==32
  > lexer mWS; c==32
  < lexer mWS; c==34
 < lexer mNEXT; c==34
 > lexer mNEXT; c==34
  > lexer mSTRING_LITERAL; c==34
   > lexer mQUOT; c==34
   < lexer mQUOT; c==46
   > lexer mQUOT; c==34
   < lexer mQUOT; c==32
  < lexer mSTRING_LITERAL; c==32
 < lexer mNEXT; c==32
{
 > quotAttrValueContent; LA(1)=={
  > attrCommonContent; LA(1)=={
   > expr; LA(1)== > lexer mNEXT; c==32
  > lexer mWS; c==32
  < lexer mWS; c==125
 < lexer mNEXT; c==125
 > lexer mNEXT; c==125
  > lexer mRCURLY; c==125
  < lexer mRCURLY; c==32
 < lexer mNEXT; c==32
.org
[ ca. 15 grammatical steps removed ]
    > literal; LA(1)==.org
     > stringLiteral; LA(1)==.org
     < stringLiteral; LA(1)== > lexer mNEXT; c==32
  > lexer mWS; c==32
  < lexer mWS; c==47
 < lexer mNEXT; c==47
 > lexer mNEXT; c==47
  > lexer mSLASH; c==47
  < lexer mSLASH; c==49
 < lexer mNEXT; c==49
}
    < literal; LA(1)==}

=== snap ===

Now the rule for "attrCommonContent" states:
attrCommonContent:
  /* some more alts */
  | LCURLY expr RCURLY
  ;
With that rule, a lookahead of RCURLY should be sufficient to exit the
attrCommonContent rule. So why does the parser require more lookahead from
the lexer when exiting stringLiteral?

The problem is that within dirAttributeValue the lexer has to produce
tokens differently than within the following expr rules. This means I
have to switch the lexer to a different state (done with actions within {}
in the grammar). I can't switch the state before the parser leaves the
attrCommonContent section (that is, the action has to come directly after
the RCURLY within that rule). But at that point the parser has apparently
already fetched more tokens after the RCURLY, which leads to errors.
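
To illustrate the ordering problem, here is a minimal C++ sketch. It is a
toy model, not ANTLR's runtime: ToyLexer, LookaheadBuffer and
modeOfTokenAfterRCurly are hypothetical names, and the lazily filled
one-token buffer is an assumption made for illustration only. It shows
that any read of LA(1) between matching the RCURLY and switching the lexer
state makes the lexer produce the next token in the old state.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical lexer states; these names are not from the grammar.
enum class Mode { Expr, AttrValue };

struct Token {
    std::string text;
    Mode lexedIn;  // state that was active when the token was produced
};

// Toy lexer: hands out pre-split words and tags each with the current state.
class ToyLexer {
public:
    explicit ToyLexer(std::vector<std::string> words)
        : words_(std::move(words)) {}
    void setMode(Mode m) { mode_ = m; }
    Token next() {
        std::string text = pos_ < words_.size() ? words_[pos_++] : "<EOF>";
        return Token{text, mode_};
    }
private:
    std::vector<std::string> words_;
    std::size_t pos_ = 0;
    Mode mode_ = Mode::Expr;
};

// One-token lookahead buffer that asks the lexer only when LT1() is called.
class LookaheadBuffer {
public:
    explicit LookaheadBuffer(ToyLexer& lexer) : lexer_(lexer) {}
    const Token& LT1() {
        if (queue_.empty()) queue_.push_back(lexer_.next());
        return queue_.front();
    }
    void consume() {
        LT1();               // make sure the current token is loaded
        queue_.pop_front();  // drop it; the next one is not fetched yet
    }
private:
    ToyLexer& lexer_;
    std::deque<Token> queue_;
};

// Returns the state in which the token following "}" was lexed.
Mode modeOfTokenAfterRCurly(bool peekBeforeSwitch) {
    ToyLexer lexer({"}", "rest"});
    LookaheadBuffer input(lexer);
    input.consume();                     // stands in for matching RCURLY
    if (peekBeforeSwitch) input.LT1();   // e.g. something that reads LA(1)
    lexer.setMode(Mode::AttrValue);      // stands in for the grammar action
    return input.LT1().lexedIn;
}
```

In this model, modeOfTokenAfterRCurly(false) yields Mode::AttrValue, while
modeOfTokenAfterRCurly(true) yields Mode::Expr, because the token was
already lexed before the switch took effect.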

My lexer has k=2, and everything uses C++ with the runtime and generator
from antlr-2.7.4. Can anyone help me with this? Am I misunderstanding
ANTLR's behaviour in general, or is this a bug?

Thanks,
Martin


 