[antlr-interest] Problem with semantic predicate in protected lexer rule

Wed Aug 24 13:54:57 PDT 2005

I've no idea whether this has been mentioned on the list before, but I've got 
a grammar which contains a protected lexer rule using a semantic predicate. 
The referencing rules match the input properly unless the semantic condition 
is unmet. In that case an exception is thrown instead of an alternative 
(non-predicated) rule being matched. Interestingly enough, input which does 
not match any of the referencing rules does match the alternative rule and 
everything is peachy. In case it's a code generation issue, the output 
language is C++.

The lexer grammar rules in question are (many others omitted for brevity):

COMMENT   : DOTPREFIX! '.' ;

BREAK1    : DOTPREFIX! 'b' ;

CENTER    : DOTPREFIX! 'c' ;

PERIOD    : '.' ;

protected
DOTPREFIX : {getColumn()==1}? '.' ;

With input of ".b", a BREAK1 is matched, and input of "foo." matches PERIOD, 
but "foo.bar" throws an exception "getColumn()=1". I can only presume it's 
attempting to match ".b" as BREAK1 even though the semantic condition is 
false.

It seems to me that the DOTPREFIX predicate is not being hoisted high enough 
to prevent the rule from being called when the condition is unmet. I've worked 
around the problem by eliminating the DOTPREFIX rule and adding the semantic 
predicate to each of the referencing rules, but shouldn't a semantic predicate 
on a rule be considered by the rules which reference it? In other words, if 
"DOTPREFIX" is to be matched only when "getColumn()==1", then shouldn't BREAK1 
also require "getColumn()==1" to be true before it's matched?
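
For reference, the workaround looks roughly like the sketch below. This is a 
reconstruction rather than the exact rules from my grammar: it assumes the 
predicate is simply moved to the front of each referencing rule, and that the 
leading '.' is still dropped from the token text with '!' as DOTPREFIX! did.

```antlr
// Sketch of the workaround: DOTPREFIX is removed and its predicate is
// repeated at the left edge of every rule that used to reference it.
// The '!' after the first '.' is an assumption made to keep the token
// text the same as with DOTPREFIX!.

COMMENT   : {getColumn()==1}? '.'! '.' ;

BREAK1    : {getColumn()==1}? '.'! 'b' ;

CENTER    : {getColumn()==1}? '.'! 'c' ;

// Unpredicated fallback for a '.' anywhere else on the line.
PERIOD    : '.' ;
```

With the predicate written directly on each rule, "foo.bar" falls through to 
PERIOD for me instead of throwing. The cost is that the column test has to be 
repeated on every rule that needs it.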

In case anyone is curious, this bizarre grammar is used for processing text 
similar to nroff/troff. Works great unless the text includes a URL...

Thanks.

Lawrence "Dee" Holtsclaw