[antlr-interest] parsing boolean expressions: not not or abc

Jim Idle jimi at temporal-wave.com
Thu Jan 14 08:59:57 PST 2010


Change your grammar to:

grammar T;
options {
	output=AST;
}
tokens {
	EXPR;
}

content :	orexpression EOF
		->^(EXPR orexpression)
	;
	
orexpression 
	:	andexpression (OR^ andexpression)*
	;
andexpression 
	:	expression (AND^ expression)*
	;
expression 
	:	(NOT^)? term
	;
term 	: (
		  t=WORD
		| t=AND
		| t=OR
		| t=NOT
	  )
	  {
	  	$t.setType(WORD);
	  }
	;

NOT 	:	'not'
	;
AND 	:	'and'
	;
OR 	:	'or'
	;
WORD	:	('a'..'z' | '0'..'9' | '%' | '_')+
	;
WS 	:	(' ' | '\r' | '\n' | '\t')  { skip(); }
	;


Note, however, that the grammar has to make some assumptions here. For example, it treats the word 'not' on its own as a term and not (pun not intended) as a syntax error, which is what it would be if that 'not' were the operator and a term were expected to follow it.
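
To see why the parser has to guess at all, it may help to look at what the lexer hands it. The following is a hand-written Python sketch, not ANTLR-generated code: `tokenize` and `as_term` are hypothetical helpers that mimic the lexer rules above and the `$t.setType(WORD)` action. Every 'not' arrives typed as NOT whatever the user meant, so the decision between operator and term is left to the parser.

```python
import re

# Keyword spellings and the token types the lexer rules give them.
KEYWORDS = {"not": "NOT", "and": "AND", "or": "OR"}
# Same character set as the WORD rule: a-z, 0-9, '%' and '_'.
WORD_RE = re.compile(r"[a-z0-9%_]+")


def tokenize(text):
    """Return (type, text) pairs; whitespace is skipped like the WS rule."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos] in " \r\n\t":
            pos += 1
            continue
        match = WORD_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        word = match.group()
        # The longest match wins, so 'noti' is a WORD, but a bare 'not' is NOT.
        tokens.append((KEYWORDS.get(word, "WORD"), word))
        pos = match.end()
    return tokens


def as_term(token):
    """Mimic $t.setType(WORD): keep the text, overwrite the type."""
    _old_type, text = token
    return ("WORD", text)


print(tokenize("not not or abc"))
# [('NOT', 'not'), ('NOT', 'not'), ('OR', 'or'), ('WORD', 'abc')]
print(as_term(("NOT", "not")))
# ('WORD', 'not')
```

Both 'not' tokens come out identical, which is why the term rule has to accept the keyword tokens and retag them afterwards.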

Also, I suspect that your NOT-processing rule should actually be:

expression 
	:	NOT^ expression
	|	term
	;

But this would consume 'not not not' as repeated NOT operators applied to a word, as in NOT NOT WORD.
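
The difference between the two expression rules can be made concrete with a small hand-written recursive-descent sketch in Python. This is not what ANTLR generates, and the disambiguation policy is an assumption made for illustration: 'not' counts as the operator whenever another token follows it, and as a term otherwise. The names `parse`, `recursive_not` and `ParseError` are hypothetical.

```python
class ParseError(Exception):
    pass


def parse(text, recursive_not=False):
    """Parse a filter expression into nested tuples.

    recursive_not=False mirrors  expression : (NOT^)? term ;
    recursive_not=True  mirrors  expression : NOT^ expression | term ;
    """
    tokens = text.split()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def term():
        if peek() is None:
            raise ParseError("expected a term, found end of input")
        return advance()  # any word is a term, keywords included

    def expression():
        # Assumed policy: 'not' is the operator only if a token follows it.
        if peek() == "not" and pos + 1 < len(tokens):
            advance()
            operand = expression() if recursive_not else term()
            return ("NOT", operand)
        return term()

    def binary(operand, keyword, label):
        # Left-associative chain: operand (keyword operand)*
        node = operand()
        while peek() == keyword:
            advance()
            node = (label, node, operand())
        return node

    def andexpression():
        return binary(expression, "and", "AND")

    def orexpression():
        return binary(andexpression, "or", "OR")

    tree = orexpression()
    if peek() is not None:  # plays the role of the EOF in the content rule
        raise ParseError(f"unexpected {peek()!r} at token {pos}")
    return tree


print(parse("not not or abc"))
# ('OR', ('NOT', 'not'), 'abc')
print(parse("not not not", recursive_not=True))
# ('NOT', ('NOT', 'not'))
```

Under this policy the recursive variant rejects 'not not or abc': the second 'not' is taken as an operator, 'or' becomes its operand, and 'abc' is left over. The non-recursive variant in turn rejects 'not not not', because the third 'not' has nothing to attach to.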

If the expression rule gets more complicated, ANTLR may not be able to predict the correct alternative.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of lord.of.board at gmx.de
> Sent: Thursday, January 14, 2010 1:10 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] parsing boolean expressions: not not or abc
> 
> Hello,
> 
> I am trying to build a grammar which accepts boolean expressions for
> filtering. I found some interesting articles on the web, but now I got
> stuck.
> I try to parse something like this:
> 
>   not not or abc
> 
> The first "not" is the boolean operator and the second is a text.
> 
> Or even worse
> 
>   not not and not or and not and
> 
> My grammar looks like this:
> 
> grammar TextFilterGrammar;
> options {
> 	output=AST;
> }
> content :	orexpression
> 	;
> orexpression
> 	:	andexpression (OR^ andexpression)*
> 	;
> andexpression
> 	:	expression (AND^ expression)*
> 	;
> expression
> 	:	(NOT^)? term
> 	;
> term 	:	WORD
> 	;
> 
> NOT 	:	'not'
> 	;
> AND 	:	'and'
> 	;
> OR 	:	'or'
> 	;
> WORD	:	('a'..'z' | '0'..'9' | '%' | '_')+
> 	;
> WS 	:	(' ' | '\r' | '\n' | '\t')  { skip(); }
> 	;
> 
> In ANTLRWorks I always get a MismatchedTokenException when trying to
> parse "not not or ljsdf". Parsing e.g. "not noti or ljsdf" works fine.
> 
> I managed to get it working with quotation marks, but I would prefer to
> have a solution without.
> 
> Best regards,
> Lordi