[antlr-interest] Re: [antlr-dev] Hitting method size limit

Thu Jun 22 14:22:08 PDT 2006

[switching to antlr-interest]

Emond,  I changed the grammar to use fragment instead of protected  
and then changed

fragment
NESTED_METHOD_BODY_ACTION
	: {bodyFollow}?=>
	'{'
	( NESTED_METHOD_BODY_ACTION
	| CHAR_LITERAL_ACTION
	| STRING_LITERAL_ACTION
	| .
	)*
	'}'
	;

to use ~('{'|'\''|'"') instead of .  It now terminates, but rules like:

METHOD_SIG_ACTION: {sigFollow}?=>
                    {sigFollow = false;}
                    (~(';'|'{'))+
                    {bodyFollow = true;};

EXPRESSION_ACTION: {exprFollow}?=>
                    {exprFollow = false;}
                    (~';')+;

severely overlap with each other and all the other rules.  Moreover,  
some of these rules are recursive, which means the DFA cannot see  
inside.  ANTLR is going crazy building bigger and bigger DFAs trying  
to see past the recursion.

A better way is to use the filter=true option.  Then just list the  
tokens with your predicates; ANTLR will backtrack, so it will be  
slower, but you can do some really ambiguous stuff.  Rules listed  
first take precedence.

Look at codegen/action.g in the distribution, which is the $x.y  
translator for actions.  It does stuff like:

ENCLOSING_RULE_SCOPE_ATTR
	:	'$' x=ID '.' y=ID	{enclosingRule!=null &&
	                         $x.text.equals(enclosingRule.name) &&
	                         enclosingRule.getLocalAttributeScope($y.text)!=null}?
		...
	;

TOKEN_SCOPE_ATTR
	:	'$' x=ID '.' y=ID	{enclosingRule!=null &&
	                         (enclosingRule.getTokenLabel($x.text)!=null||
	                          isTokenRefInAlt($x.text)) &&
	                         AttributeScope.tokenScope.getAttribute($y.text)!=null}?
		...
	;

LABEL_REF
	:	'$' ID {enclosingRule!=null &&
	            getElementLabel($ID.text)!=null &&
		        enclosingRule.getRuleLabel($ID.text)==null}?
		...
	;

These are all highly ambiguous but are resolved with predicates that  
fail/succeed after the syntax matches.  The default backtracking of  
the filter makes the lexer rewind and try the next rule.  You should see  
a stream of tokens come out like normal even though this is "filter"  
mode.
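
To make the rewind-and-try-next-rule behavior concrete, here is a rough hand-written Java simulation of it. This is not what ANTLR generates. It only assumes the semantics described above: rules are tried in the order listed, a predicate is checked after the syntax has matched, a failed predicate rewinds to the start position so the next rule gets a chance, and input that no rule accepts is skipped one character at a time. The class name, the regex stand-ins for the rule syntax, and the label/attribute sets are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hand-written simulation of filter-mode matching; NOT ANTLR-generated code.
final class FilterLexerSketch {

    // One lexer rule: a syntax pattern plus a predicate checked after the syntax matched.
    static final class Rule {
        final String name;
        final Pattern syntax;
        final Predicate<Matcher> predicate;

        Rule(String name, String regex, Predicate<Matcher> predicate) {
            this.name = name;
            this.syntax = Pattern.compile(regex);
            this.predicate = predicate;
        }
    }

    static List<String> tokenize(String input, List<Rule> rules) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            boolean emitted = false;
            for (Rule rule : rules) {              // rules listed first take precedence
                Matcher m = rule.syntax.matcher(input).region(pos, input.length());
                // The syntax must match at pos, then the predicate decides.
                if (m.lookingAt() && m.end() > pos && rule.predicate.test(m)) {
                    tokens.add(rule.name + ":" + m.group());
                    pos = m.end();
                    emitted = true;
                    break;
                }
                // Otherwise pos is unchanged ("rewind") and the next rule is tried.
            }
            if (!emitted) {
                pos++;                             // no rule accepted: skip one character
            }
        }
        return tokens;
    }

    static List<String> demo() {
        // Hypothetical stand-ins for the enclosingRule lookups in action.g.
        String enclosingRule = "expr";
        Set<String> ruleAttrs = new HashSet<>(Arrays.asList("value"));
        Set<String> tokenLabels = new HashSet<>(Arrays.asList("id"));
        Set<String> tokenAttrs = new HashSet<>(Arrays.asList("text", "line"));
        Set<String> elementLabels = new HashSet<>(Arrays.asList("id", "lbl"));

        // The first two rules have identical syntax; only the predicates tell them apart.
        List<Rule> rules = Arrays.asList(
            new Rule("ENCLOSING_RULE_SCOPE_ATTR", "\\$(\\w+)\\.(\\w+)",
                m -> m.group(1).equals(enclosingRule) && ruleAttrs.contains(m.group(2))),
            new Rule("TOKEN_SCOPE_ATTR", "\\$(\\w+)\\.(\\w+)",
                m -> tokenLabels.contains(m.group(1)) && tokenAttrs.contains(m.group(2))),
            new Rule("LABEL_REF", "\\$(\\w+)",
                m -> elementLabels.contains(m.group(1))));

        return tokenize("a = $expr.value + $id.text + $lbl; $unknown.zzz", rules);
    }

    public static void main(String[] args) {
        for (String token : demo()) {
            System.out.println(token);
        }
    }
}
```

On the sample input, $id.text matches the syntax of the first rule but fails its predicate, so the second rule gets it. $unknown.zzz fails every predicate and is skipped as filler, which is why only real tokens come out the other end.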

Ter