[antlr-interest] How to resolve ambiguous grammar of property file: example A = B C K = L

John B. Brodie jbb at acm.org
Thu Apr 14 09:57:42 PDT 2005


Greetings!

You wrote (in part):
>I have an ambiguous grammar (properties for some system). I attached the
>grammar file below. Examples of the statements are
>
>A = B ;
>A = B C ;
>A = B K = L ;
>A = B C K = L ;
>A -X = B C K  = L ;
>A -X = B C K - M  = L ;
>
>
>These statements should be grouped like this.
>
>(A = B) ;
>(A = B C) ;
>(A = B ) (K = L) ;
>(A = B C) (K = L) ;
>(A - X = B C) (K - M  = L) ;
>
>The human rule is simple. If an equal sign follows an identifier or an
>identifier-dash-identifier, then it is the name of the property. Otherwise
>it is part of the list.
>The ambiguous grammar is attached.
>I tried different lookaheads but so far was not able to solve the 
>problem. I hope somebody else could propose ideas to try.
>
>....snipped....
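
For what it's worth, the rule you describe only ever needs to look a few
tokens ahead, and it can be sketched in plain Java without ANTLR at all. The
class and method names below (PropertyGrouper, tokenize, nameLength, group)
are made up for illustration, and the tokenizer assumes identifiers are plain
runs of letters. Treat it as a sketch of the rule, not as a replacement for
the grammar:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the "human rule"; not part of the grammar.
class PropertyGrouper {

    // Splits the input into identifiers and the single-character tokens = - ;
    static List<String> tokenize(String s) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isLetter(c)) {
                int j = i;
                while (j < s.length() && Character.isLetter(s.charAt(j))) j++;
                out.add(s.substring(i, j));
                i = j;
            } else if (c == '=' || c == '-' || c == ';') {
                out.add(String.valueOf(c));
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
        }
        return out;
    }

    static boolean isIdent(String tok) {
        return Character.isLetter(tok.charAt(0));
    }

    // Number of tokens in the property name starting at index i:
    // 1 for IDENT '=', 3 for IDENT '-' IDENT '=', 0 if no name starts here.
    static int nameLength(List<String> t, int i) {
        if (!isIdent(t.get(i))) return 0;
        if (i + 1 < t.size() && t.get(i + 1).equals("=")) return 1;
        if (i + 3 < t.size() && t.get(i + 1).equals("-")
                && isIdent(t.get(i + 2)) && t.get(i + 3).equals("=")) return 3;
        return 0;
    }

    // Groups the tokens into strings of the form "(name = values)".
    static List<String> group(String input) {
        List<String> t = tokenize(input);
        List<String> out = new ArrayList<>();
        StringBuilder cur = null;
        int i = 0;
        while (i < t.size()) {
            int n = nameLength(t, i);
            boolean semi = t.get(i).equals(";");
            if (semi || n > 0) {
                // A ';' or a new property name closes the current group.
                if (cur != null) out.add("(" + cur + ")");
                cur = null;
            }
            if (n > 0) {
                // Start a new group with the name tokens and the '='.
                cur = new StringBuilder(String.join(" ", t.subList(i, i + n + 1)));
                i += n + 1;
            } else {
                if (!semi) {
                    if (cur == null) {
                        throw new IllegalArgumentException("value without a name: " + t.get(i));
                    }
                    cur.append(' ').append(t.get(i));
                }
                i++;
            }
        }
        if (cur != null) out.add("(" + cur + ")");
        return out;
    }
}
```

Calling PropertyGrouper.group("A = B C K = L ;") yields the two groups
(A = B C) and (K = L), matching the grouping you asked for.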

Attached please find a non-ambiguous (k=2) version of your grammar. I ran it
through antlr.Tool and got no complaints, but haven't actually tried to parse
any of your examples...

It probably isn't the real answer to your problem, though.  It will produce a
rather different tree than the one you are looking for above.  I think it will
produce a tree something like the following (but again, I haven't actually
tried it):

(#s (#p A) = (#v B) ;)
(#s (#p A) = (#v B (#v C)) ;)
(#s (#p A) = (#v B (#s (#p K) = (#v L) ;)))
(#s (#p A) = (#v B (#v C (#s (#p K) = (#v L) ;))))
(#s (#p A - X) = (#v B (#v C (#s (#p K - M)  = (#v L) ;))))

where #s is the tree token for statement, #p is the tree token for variable
and #v is the tree token for value.

So the next statement actually hangs off the tail of the current value list,
rather than appearing in a flat list of statements (I think, but my knowledge
of tree construction/walking is rather weak, sorry).
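
If the tree really does come out nested like that, it could be flattened
after the fact by following the statement found at the tail of each value
chain. The sketch below uses a made-up Node class shaped like the trees
printed above; it is not ANTLR's AST interface, and StatementFlattener and
its methods are equally hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a tree node; not the ANTLR AST interface.
class Node {
    final String label;     // "#s", "#p", "#v", or the text of a token
    final List<Node> kids;

    Node(String label, Node... kids) {
        this.label = label;
        this.kids = Arrays.asList(kids);
    }
}

class StatementFlattener {

    // Walks a nested "#s" node and returns one string per statement,
    // in source order.
    static List<String> flatten(Node statement) {
        List<String> out = new ArrayList<>();
        Node current = statement;
        while (current != null) {
            StringBuilder text = new StringBuilder();
            Node next = null;
            for (Node kid : current.kids) {
                if (kid.label.equals("#p")) {
                    // Property name: join its tokens, then add the '='.
                    for (Node leaf : kid.kids) text.append(leaf.label).append(' ');
                    text.append('=');
                } else if (kid.label.equals("#v")) {
                    next = appendValues(kid, text);
                }
                // Plain tokens such as "=" and ";" are skipped here.
            }
            out.add("(" + text + ")");
            current = next;
        }
        return out;
    }

    // Appends the identifiers of a "#v" chain to text and returns the "#s"
    // node found at the tail of the chain, or null if there is none.
    private static Node appendValues(Node value, StringBuilder text) {
        Node v = value;
        while (v != null) {
            Node deeper = null;
            for (Node kid : v.kids) {
                if (kid.label.equals("#s")) return kid;
                if (kid.label.equals(";")) continue;
                if (kid.label.equals("#v")) deeper = kid;
                else text.append(' ').append(kid.label);
            }
            v = deeper;
        }
        return null;
    }
}
```

Fed the fourth tree above, flatten returns (A = B C) followed by (K = L).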

Hope this helps...
   -jbb

-------------------------begin attachment-------------------------
header{
package com.equ;
import java.util.*;
}

class EquParser extends Parser;
options {
    k = 2;
    buildAST = true;
}

program:
    ( statement | SEMI )* EOF
;

statement:
     variable EQ value
;

variable:
   IDENT ( MINUS IDENT )?
;

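// After the first IDENT, two tokens of lookahead pick the alternative:
//   IDENT IDENT or IDENT SEMI  -> value (the list continues)
//   IDENT EQ or IDENT MINUS    -> statement (a new property begins)
//   SEMI                       -> end of the list
value:
   IDENT ( value | statement | SEMI )
;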



//----------------------------------------------------------------------------
class EquLexer extends Lexer;
//----------------------------------------------------------------------------
options {
	importVocab=EquParser;       // reuse the token vocabulary of EquParser
	testLiterals=true;          // automatically test for literals
	caseSensitive=false;
	caseSensitiveLiterals = false;
}


IDENT		    :	('a' .. 'z' )+		;
EQ		        :	'=' ;
SEMI			:	';'		;
MINUS			:	'-'		;


// Whitespace -- ignored
WS	:	SPACE
		{ _ttype = Token.SKIP; }
	;

// Whitespace characters -- helper rule used by WS
protected
SPACE	:	(	' '
		|	'\t'
		|	'\f'
			// handle newlines
		|	(	options {generateAmbigWarnings=false;}
			:	"\r\n"  // Evil DOS
			|	'\r'    // Macintosh
			|	'\n'    // Unix (the right way)
			)
			{ newline(); }
		)+
	;

-------------------------end attachment-------------------------

