[antlr-interest] Problem with semantic predicate in protected lexer rule
Dee Holtsclaw
dee at pestcontroldata.com
Wed Aug 24 13:54:57 PDT 2005
I don't know whether this has come up on the list before, but I have a
grammar containing a protected lexer rule that begins with a semantic
predicate. The rules that reference it match the input properly as long as
the predicate is true. When it is false, an exception is thrown instead of
an alternative (non-predicated) rule being matched. Interestingly, input
that matches none of the referencing rules does match the alternative rule
and everything is peachy. In case it's a code generation issue, the output
language is C++.
The lexer grammar rules in question are (many others omitted for brevity):

    COMMENT   : DOTPREFIX! '.' ;
    BREAK1    : DOTPREFIX! 'b' ;
    CENTER    : DOTPREFIX! 'c' ;
    PERIOD    : '.' ;

    protected
    DOTPREFIX : {getColumn()==1}? '.' ;

With input of ".b", BREAK1 is matched, and input of "foo." matches PERIOD, but
"foo.bar" (where the '.' is not in column 1) throws an exception
"getColumn()=1". I can only presume it's attempting to match ".b" as BREAK1
even though the semantic condition is false.
It seems to me that the DOTPREFIX predicate is not being hoisted high enough
to prevent the rule from being called when the condition is unmet. I've worked
around the problem by eliminating the DOTPREFIX rule and putting the semantic
predicate on each of the referencing rules. But shouldn't a semantic predicate
on a rule be taken into account by the rules that reference it? In other
words, if DOTPREFIX is to be matched only when "getColumn()==1", shouldn't
BREAK1 also require "getColumn()==1" to be true before it is matched?
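
For reference, the workaround looks roughly like the sketch below: the
DOTPREFIX rule is gone and its predicate is repeated at the start of each rule
that used to reference it. The '.'! element is an assumption made here to keep
discarding the leading dot from the token text, as DOTPREFIX! did; the rules
omitted earlier would need the same change.

```antlr
// Sketch of the workaround: DOTPREFIX is removed and its column test is
// copied onto every rule that previously began with DOTPREFIX!.
// '.'! (assumed) drops the leading dot from the token text.
COMMENT : {getColumn()==1}? '.'! '.' ;
BREAK1  : {getColumn()==1}? '.'! 'b' ;
CENTER  : {getColumn()==1}? '.'! 'c' ;

// Unpredicated fallback for a '.' anywhere else on the line.
PERIOD  : '.' ;
```

With the predicate sitting directly on COMMENT, BREAK1 and CENTER, a '.' that
is not in column 1 should fall through to PERIOD rather than raising the
exception.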
In case anyone is curious, this bizarre grammar is used for processing text
similar to nroff/troff. Works great unless the text includes a URL...
Thanks.
Lawrence "Dee" Holtsclaw