[antlr-interest] Parsing question

Jim Idle jimi at temporal-wave.com
Thu Aug 2 09:53:50 PDT 2012


If you put it in the lexer, how will you parse:

6 -4

when this is meant to be "subtract 4 from 6"?
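
To make the problem concrete, here is a rough sketch of the two lexing strategies. It is a hand-written regex tokenizer, not ANTLR-generated code; the class name SignedLexDemo, the method tokenize, and the token spellings are all made up for illustration. With the sign folded into the integer token, "6 -4" comes out as two integers and the parser never sees a MINUS. The same thing happens to 2012-01-01, which would explain the "mismatched input '-01' expecting MINUS" error quoted further down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a lexer; names are illustrative only.
final class SignedLexDemo {
    // Variant A: the sign is decided in the lexer (optional '-' inside the integer).
    private static final Pattern SIGNED = Pattern.compile("\\s*(?:(-?\\d+)|(-))");
    // Variant B: unsigned integers only, with '-' always a separate MINUS token.
    private static final Pattern UNSIGNED = Pattern.compile("\\s*(?:(\\d+)|(-))");

    static List<String> tokenize(String input, boolean signInLexer) {
        Matcher m = (signInLexer ? SIGNED : UNSIGNED).matcher(input);
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("unexpected input at offset " + pos);
            }
            // Group 1 is an integer (signed or not); otherwise group 2 matched a bare '-'.
            tokens.add(m.group(1) != null ? "INT(" + m.group(1) + ")" : "MINUS");
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("6 -4", true));   // sign handled in the lexer
        System.out.println(tokenize("6 -4", false));  // sign left to the parser
    }
}
```

Only the second variant hands the parser a token stream from which it can build either a subtraction or a unary minus, depending on where the MINUS appears in the expression.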

Jim
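
For the date check suggested in the quoted message below (accept a plain quoted string, then validate it once the parse is done), something along these lines might do. This is only a sketch: the class DateLiteralCheck and its validate method are hypothetical, it assumes the token text still carries its single quotes, and it uses java.time.LocalDate (Java 8 or later) rather than the older java.util.Date classes.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical helper, meant to be called from an AST verification pass or a parser action.
final class DateLiteralCheck {
    // Takes the raw string token text, quotes included, e.g. "'2001-01-01'".
    // Returns the parsed date, or empty if the text is not a valid yyyy-MM-dd date.
    static Optional<LocalDate> validate(String quoted) {
        if (quoted.length() < 2 || !quoted.startsWith("'") || !quoted.endsWith("'")) {
            return Optional.empty();
        }
        String body = quoted.substring(1, quoted.length() - 1);
        try {
            // LocalDate.parse uses the strict ISO format, so 2001-02-30 is rejected.
            return Optional.of(LocalDate.parse(body));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(validate("'2001-01-01'").isPresent());
        System.out.println(validate("'2001-02-30'").isPresent());
    }
}
```

When the result is empty the caller would report a semantic error at the token's position, so the grammar itself never needs to know what a date looks like.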

> -----Original Message-----
> From: Vinay Pandit [mailto:vpandit at quantivo.com]
> Sent: Thursday, August 02, 2012 9:49 AM
> To: Jim Idle; antlr-interest at antlr.org
> Subject: RE: [antlr-interest] Parsing question
>
> The date parsing made sense to me. I was just wondering about the
> signed and unsigned integer comment. If I decide the sign in the
> parser, I thought it would clutter everything up, which is why I
> moved it into the lexer.
>
> Regards
> Vinay
>
>
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jim Idle
> Sent: Thursday, August 02, 2012 9:41 AM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Parsing question
>
> That's what I thought. You are applying way too much context into the
> mix.
>
> Take out all the special attempts to handle dates in either the lexer or
> the parser and just accept SQUOTE (as in the simple string). Also stop
> trying to have signed and unsigned integers in the lexer - the parser
> will have to do that.
>
> Then when you verify your AST (or as you parse if no AST), call a
> function that validates the date (you can just use standard Java Date
> stuff). Then you issue a semantic error if it is invalid.
>
> In SQL you may not be able to tell this until execution time unless you
> have access to the table metadata so that you can see that a field is a
> date type:
>
> ... WHERE T.myDate < '1964-07-14'
>
>
> Jim
>
>
> > -----Original Message-----
> > From: Vinay Pandit [mailto:vpandit at quantivo.com]
> > Sent: Thursday, August 02, 2012 9:35 AM
> > To: Jim Idle; antlr-interest at antlr.org
> > Subject: RE: [antlr-interest] Parsing question
> >
> > Yes, I think I was not clear enough. Here is what I wanted to do. In
> > SQL we have a date string of the form date '2001-01-01'. I wanted to
> > try and parse this date literal. I was just trying to figure out the
> > dateValue subrule in my earlier mail.
> >
> > Here is the grammar I came up with (which does not seem to work). I am
> > excluding timeLiteral and timestampLiteral for brevity. I was just
> > not sure that I could get rid of the ambiguity by moving things into
> > the lexer. For example, the '2001-01-01' fragment of the input would
> > ultimately match a STRING token, but because I have the 'date' in
> > front of it, the parser should use that rule. I am used to JavaCC,
> > where you can provide lookaheads in order to tackle ambiguities.
> >
> > Hope this email clarifies my problem. Please let me know if you need
> > any more input.
> >
> > Thanks for your help
> > Vinay
> >
> > -------------------------------------------
> > datetimeLiteral
> >     	: dateLiteral | timeLiteral | timestampLiteral;
> >
> > dateLiteral : DATE dateString;
> >
> > dateString : QUOTE dateValue QUOTE;
> >
> > dateValue : UNSIGNED_INTEGER MINUS UNSIGNED_INTEGER MINUS UNSIGNED_INTEGER;
> >
> > The Lexer rules are
> >
> > fragment
> > DIGIT : ('0'..'9');
> > DATE          : ('D'|'d')('A'|'a')('T'|'t')('E'|'e');
> > UNSIGNED_INTEGER : (DIGIT) +;
> > MINUS         : '-' ;
> > QUOTE         : '\'';
> >
> >
> >
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jim Idle
> > Sent: Thursday, August 02, 2012 9:22 AM
> > To: antlr-interest at antlr.org
> > Subject: Re: [antlr-interest] Parsing question
> >
> > OK - your example was not clear enough. You do need a fragment there.
> >
> > However it sounds like you are trying to get the lexer to handle
> > negative numbers and that is usually the wrong way - you want to
> > handle that in the parser's expression tree. However, I might be
> > tempted to handle the date literal in the lexer rather than the
> > parser as you will otherwise create a lot of conflicts.
> >
> >
> > MINUS : '-';
> > fragment DATE :;
> > INTEGER : '0'..'9'+
> >           (('-' '0'..'9'+ '-' '0'..'9')=>('-' '0'..'9'+ '-' '0'..'9'+)
> >           { $type = DATE; })?
> > ;
> >
> > Are you sure that your language allows date strings that are not
> > quote delimited? There is an obvious conflict with the subtract
> > operator unless there are separate expression trees based on context.
> >
> > Jim
> >
> > > -----Original Message-----
> > > From: Vinay Pandit [mailto:vpandit at quantivo.com]
> > > Sent: Wednesday, August 01, 2012 11:14 PM
> > > To: Jim Idle; antlr-interest at antlr.org
> > > Subject: RE: [antlr-interest] Parsing question
> > >
> > > Thanks for the reply. That did not work either.
> > >
> > > Regards
> > > Vinay
> > >
> > > -----Original Message-----
> > > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jim Idle
> > > Sent: Wednesday, August 01, 2012 10:48 PM
> > > To: antlr-interest at antlr.org
> > > Subject: Re: [antlr-interest] Parsing question
> > >
> > > That should be:
> > >
> > > fragment
> > > DIGIT ....
> > >
> > > And you don't need separate parser rules for yearValue and the
> > > other two - they are the same thing, just use UNSIGNED_INTEGER
> > > directly.
> > >
> > > Jim
> > >
> > > > -----Original Message-----
> > > > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Vinay Pandit
> > > > Sent: Wednesday, August 01, 2012 9:44 PM
> > > > To: antlr-interest at antlr.org
> > > > Subject: [antlr-interest] Parsing question
> > > >
> > > > I am trying to parse a date time literal in ANTLR and I am having
> > > > issues with the grammar.
> > > >
> > > > Here are the rules defined in the parser
> > > >
> > > > dateValue : ( yearValue MINUS monthValue MINUS dayValue);
> > > >
> > > > yearValue : datetimeValue ;
> > > >
> > > > monthValue : datetimeValue;
> > > >
> > > > dayValue : datetimeValue;
> > > >
> > > > datetimeValue : UNSIGNED_INTEGER;
> > > >
> > > > The Lexer has
> > > >
> > > > MINUS         : '-' ;
> > > > DIGIT : ('0'..'9');
> > > > UNSIGNED_INTEGER : (DIGIT) +;
> > > >
> > > >
> > > > When I parse a date like 2012-01-01 for the dateValue rule, the
> > > > parser throws an exception.
> > > >
> > > > com. qexpr.ParseException: line 1:4 - mismatched input '-01' expecting MINUS
> > > >                at com.quantivo.qexpr.AbstractQParser.reportError(AbstractQParser.java:77)
> > > >                at com.quantivo.qexpr.SQLGrammar.dateValue(SQLGrammar.java:4730)
> > > >                at com.quantivo.qexpr.model.SQLGrammarTest.testDateValue(SQLGrammarTest.java:25)
> > > > ...
> > > >
> > > > Looking at the error message it is obvious that I am not getting
> > > > the Minus token. Instead the internal token that I get is an INTEGER
> > > > (signed). I tried the greedy=false option, but that did not seem
> > > > to help either. I am running out of ideas as to why the input does
> > > > not match. Obviously I am doing something wrong, but I am not sure
> > > > what!
> > > >
> > > > Regards
> > > > Vinay
> > > >
> > > >
> > > > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > > > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
