[antlr-interest] Combining rewrite rules and syntactic predicates

Andreas Stefik stefika at gmail.com
Fri Mar 26 08:35:20 PDT 2010


Hello folks,

I've written a number of small compilers and VMs, but am relatively
new to ANTLR. I am working on a custom programming language whose
design is guided by a series of statistical studies on how humans
process language. That makes the language fairly intuitive to use, but
parsing it can get complicated, and it is sometimes hard to tweak
seemingly simple parser rules. I am currently adding a form of if
statement to the language. Here's a snippet of the rule in my parser:

if_statement
	:
	IF expression THEN block END
	((ELSE IF expression THEN block END) => (ELSE IF expression THEN block END))*  // else if blocks
	((ELSE block END) => (ELSE block END))?

	-> ^(IF expression THEN block END
	     (ELSE_IF_STATEMENT ELSE IF expression THEN block END)*
	     (FINAL_ELSE ELSE block END)?)
	;

(assume ELSE_IF_STATEMENT and FINAL_ELSE are appropriately defined
hidden tokens)

The problem I'm having is that I'm getting an
org.antlr.runtime.tree.RewriteEmptyStreamException: token ELSE

I dug into the generated parser, and the exception is thrown from this loop:

while ( stream_block.hasNext() || stream_END.hasNext()
        || stream_expression.hasNext() || stream_IF.hasNext()
        || stream_ELSE_IF_STATEMENT.hasNext() || stream_THEN.hasNext()
        || stream_ELSE.hasNext() ) {
    adaptor.addChild(root_1, stream_ELSE_IF_STATEMENT.nextNode());
    adaptor.addChild(root_1, stream_ELSE.nextNode());
    adaptor.addChild(root_1, stream_IF.nextNode());
    adaptor.addChild(root_1, stream_expression.nextTree());
    adaptor.addChild(root_1, stream_THEN.nextNode());
    adaptor.addChild(root_1, stream_block.nextTree());
    adaptor.addChild(root_1, stream_END.nextNode());
}

Oddly enough, the loop condition appears to evaluate to true: the
streams claim there is another END and another block in the input I
passed in, which is as follows:

if a = b then
    a = a + 1
end

Clearly there is no additional block after the if has ended. Now, again
somewhat oddly, if I change my if_statement rule to something like this:

if_statement
	:
	IF expression THEN block END
	((ELSE if_statement) => (ELSE if_statement))*  // else if blocks
	((ELSE block END) => (ELSE block END))?

	-> ^(IF expression THEN block END
	     (ELSE_IF_STATEMENT ELSE if_statement)*
	     (FINAL_ELSE ELSE block END)?)
	;

The rewrite rule exceptions go away and things generally work fine.
I'm having trouble wrapping my head around why that would be the case.
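
A toy model of the generated loop may help frame the question. This is
a sketch in plain Java, not the ANTLR runtime: ToyStream, runElseIfLoop,
and the use of IllegalStateException are hypothetical stand-ins, and
only three of the seven streams are modeled. It assumes, as described
above, that the block and END streams still report an element while the
ELSE stream is empty. Because the loop condition ORs the hasNext() calls
together, one non-empty stream is enough to enter the body, and the
first read from the empty ELSE stream then fails.

```java
import java.util.ArrayList;
import java.util.List;

class RewriteLoopSketch {

    /** Hypothetical stand-in for a rewrite element stream (not ANTLR's class). */
    static final class ToyStream {
        private final String name;
        private final List<String> items = new ArrayList<>();
        private int cursor = 0;

        ToyStream(String name) { this.name = name; }

        void add(String item) { items.add(item); }

        boolean hasNext() { return cursor < items.size(); }

        String next() {
            if (items.isEmpty()) {
                // Plays the role of RewriteEmptyStreamException in this model.
                throw new IllegalStateException("empty stream: " + name);
            }
            if (cursor >= items.size()) {
                throw new IllegalStateException("exhausted stream: " + name);
            }
            return items.get(cursor++);
        }
    }

    /**
     * Same shape as the generated loop: the condition is an OR over all
     * streams, but the body reads from every stream unconditionally.
     */
    static String runElseIfLoop(ToyStream block, ToyStream end, ToyStream elseTok) {
        StringBuilder out = new StringBuilder();
        try {
            while (block.hasNext() || end.hasNext() || elseTok.hasNext()) {
                out.append(elseTok.next()).append(' ');
                out.append(block.next()).append(' ');
                out.append(end.next()).append(' ');
            }
            return out.toString().trim();
        } catch (IllegalStateException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Assumed situation from the report: block and END report an
        // element, but no ELSE token was ever matched.
        ToyStream block = new ToyStream("block");
        ToyStream end = new ToyStream("END");
        ToyStream elseTok = new ToyStream("ELSE");
        block.add("block");
        end.add("end");
        System.out.println(runElseIfLoop(block, end, elseTok));
        // prints "error: empty stream: ELSE"

        // All streams empty: the loop body is skipped and nothing fails.
        String skipped = runElseIfLoop(
                new ToyStream("block"), new ToyStream("END"), new ToyStream("ELSE"));
        System.out.println("[" + skipped + "]");
        // prints "[]"
    }
}
```

If the real runtime behaves like this model, the open question becomes
why stream_block and stream_END still report elements for an input
with a single if and no else.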

1. Anyone have any clues as to what might be going on here?

2. Is there something going on underneath the surface in the way I'm
combining rewrite rules and syntactic predicates that I'm not
understanding?

Any help would be appreciated,

Stefik

