[antlr-interest] DMQL Grammar - ANTLR Eats Characters

Indhu Bharathi indhu.b at s7software.com
Tue Mar 10 00:48:37 PDT 2009

Try this: 

Today: ( (Today_) => 'TODAY' ) ; 
fragment Today_ 
: 'TODAY' ; 

However, I'm not sure if this is the most elegant way to fix it. 
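
A toy model in Java (the target language of the grammar below) may help show why checking the whole keyword first matters. This is only a sketch, not ANTLR's generated code: the class and method names (KeywordLookaheadDemo, commitEarly, checkWholeKeyword) are made up for illustration, it assumes the lexer picks the Today rule as soon as it has seen "TO", and the recovery step is a simplification.

```java
import java.util.ArrayList;
import java.util.List;

public class KeywordLookaheadDemo {
    static final String KEYWORD = "TODAY";

    // Number of leading characters of KEYWORD that match the input at position i.
    static int matchedPrefix(String input, int i) {
        int k = 0;
        while (k < KEYWORD.length() && i + k < input.length()
                && input.charAt(i + k) == KEYWORD.charAt(k)) {
            k++;
        }
        return k;
    }

    // Assumed behaviour: commit to the keyword rule after seeing only "TO".
    // If the rest does not match, the matched prefix and the offending
    // character are dropped (a simplified stand-in for error recovery).
    static List<String> commitEarly(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            if (input.startsWith("TO", i)) {
                int k = matchedPrefix(input, i);
                if (k == KEYWORD.length()) {
                    tokens.add("Today");
                    i += k;
                } else {
                    i = Math.min(i + k + 1, input.length()); // characters are lost here
                }
            } else {
                tokens.add(String.valueOf(input.charAt(i))); // single-character token
                i++;
            }
        }
        return tokens;
    }

    // Predicate-style behaviour: choose the keyword rule only when the
    // whole keyword is present; otherwise emit single-character tokens.
    static List<String> checkWholeKeyword(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            if (input.startsWith(KEYWORD, i)) {
                tokens.add("Today");
                i += KEYWORD.length();
            } else {
                tokens.add(String.valueOf(input.charAt(i)));
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(commitEarly("TOWN"));        // prints [N]
        System.out.println(checkWholeKeyword("TOWN"));  // prints [T, O, W, N]
        System.out.println(checkWholeKeyword("TODAY")); // prints [Today]
    }
}
```

In this model the early-commit version loses "TOW" from "TOWN", while the version that checks the full keyword keeps every character. The syntactic predicate above is meant to play the role of that full check.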

Read the following thread to understand more on why exactly this happens: 

- Indhu 

----- Original Message ----- 
From: Mihai Danila <viridium at gmail.com> 
To: antlr-interest at antlr.org 
Sent: Tuesday, March 10, 2009 6:30:43 AM GMT+0530 Asia/Calcutta 
Subject: [antlr-interest] DMQL Grammar - ANTLR Eats Characters 


I thought I had my DMQL grammar nailed after several months of no issues, until recently a query failed. I've already massaged the grammar in a few ways so I'm a bit at a loss as to what the problem is this time. Do I have to enumerate all the possible token prefixes (including TO, TOD, TODA, N, NO, A, AN, O) in the alphanumericToken rule to fix this one? Am I missing something? 

Here's the query: 

If I debug this, here's what ANTLR parses: 

Here's the grammar: 

grammar Dmql; 

options { 

tokens { 
Or; And; Not; 
LookupAnd; LookupNot; LookupOr; LookupAny; 
StringList; StringEquals; StringStartsWith; 
StringContains; StringChar; EmptyString; 
RangeList; RangeBetween; RangeGreater; RangeLower; 
} 

@header { package com.stratusdata.dmql.parser.antlr; } 
@lexer::header { package com.stratusdata.dmql.parser.antlr; } 

@rulecatch { 
catch (RecognitionException re) { 
throw re; 
} 
} 

dmql: searchCondition; 
searchCondition: queryClause (('|' | BoolOr) queryClause)* -> ^(Or queryClause+); 
queryClause: booleanElement ((',' | BoolAnd) booleanElement)* -> ^(And booleanElement+); 
booleanElement: queryElement | ('~' | BoolNot) queryElement -> ^(Not queryElement); 
queryElement: '('! (fieldCriteria | searchCondition) ')'!; 

fieldCriteria: field '=' fieldValue -> ^(FieldCriteria field fieldValue); 
field: ('_' | alphanumericToken)+ -> ConstantValue[$field.text]; 
fieldValue: lookupList | stringList | rangeList | nonInteger | period | stringLiteral | empty; 
stringLiteral: StringLiteral; 
empty: '.EMPTY.' -> EmptyString; 

lookupList: lookupOr | lookupAnd | lookupNot | lookupAny; 
lookupOr: '|' lookup (',' lookup)* -> ^(LookupOr lookup+); 
lookupAnd: '+' lookup (',' lookup)* -> ^(LookupAnd lookup+); 
lookupNot: '~' lookup (',' lookup)* -> ^(LookupNot lookup+); 
lookupAny: '.ANY.' -> LookupAny; 
lookup: alphanumeric | stringLiteral; 

stringList: string (',' string)* -> ^(StringList string+); 
string: stringEq | stringStart | stringContains | stringChar; 
stringEq: alphanumeric -> ^(StringEquals alphanumeric); 
stringStart: alphanumeric '*' -> ^(StringStartsWith alphanumeric); 
stringContains: '*' alphanumeric '*' -> ^(StringContains alphanumeric); 
stringChar: alphanumeric? ('?' alphanumeric?)+ -> ^(StringChar ConstantValue[$stringChar.text]); 

rangeList: dateTimeRangeList | dateRangeList | timeRangeList | numericRangeList; 
dateTimeRangeList: dateTimeRange (',' dateTimeRange)* -> ^(RangeList dateTimeRange+); 
dateRangeList: dateRange (',' dateRange)* -> ^(RangeList dateRange+); 
timeRangeList: timeRange (',' timeRange)* -> ^(RangeList timeRange+); 
numericRangeList: numericRange (',' numericRange)* -> ^(RangeList numericRange+); 
dateTimeRange: x=dateTime '-' y=dateTime -> ^(RangeBetween $x $y) 
| x=dateTime '-' -> ^(RangeLower $x) 
| x=dateTime '+' -> ^(RangeGreater $x); 
dateRange: x=date '-' y=date -> ^(RangeBetween $x $y) 
| x=date '-' -> ^(RangeLower $x) 
| x=date '+' -> ^(RangeGreater $x); 
timeRange: x=time '-' y=time -> ^(RangeBetween $x $y) 
| x=time '-' -> ^(RangeLower $x) 
| x=time '+' -> ^(RangeGreater $x); 
numericRange: x=number '-' y=number -> ^(RangeBetween $x $y) 
| x=number '-' -> ^(RangeLower $x) 
| x=number '+' -> ^(RangeGreater $x); 
period: (isoDateTime | isoDate | isoTime) -> ConstantValue[$period.text]; 
dateTime: (isoDateTime | Now) -> ConstantValue[$dateTime.text]; 
date: (isoDate | Today) -> ConstantValue[$date.text]; 
time: isoTime -> ConstantValue[$time.text]; 
number: integer | nonInteger; 
integer: D+ -> ConstantValue[$integer.text]; 
nonInteger: (negativeNumber | positiveDecimal) -> ConstantValue[$nonInteger.text]; 
negativeNumber: '-' D+ ('.' D+)?; 
positiveDecimal: D+ '.' D+; 

timeZoneOffset: ('+' | '-') D D ':' D D; 
isoDate: D D D D '-' D D '-' D D; 
isoTime: D D ':' D D ':' D D ('.' D (D D?)?)?; 
isoDateTime: isoDate 'T' isoTime ('Z' | timeZoneOffset)?; 

alphanumeric: alphanumericToken+ -> ConstantValue[$alphanumeric.text]; 
alphanumericToken: (D | A | BoolNot | BoolAnd | BoolOr | Now | Today | 'T' | 'Z'); 

BoolNot: 'NOT'; 
BoolAnd: 'AND'; 
BoolOr: 'OR'; 
Now: 'NOW'; 
Today: 'TODAY'; 
StringLiteral: ('"' (~('\u0000'..'\u001F' | '\u007F' | '"') | ('""'))* '"'); 
A: (('A'..'Z') | ('a'..'z')); 
D: ('0'..'9'); 
Whitespace: (' ' | '\t' | '\n') { $channel = HIDDEN; }; 
