[antlr-interest] How can I avoid "mismatched input" error?

Tue Mar 24 05:09:12 PDT 2009

Gabriel Petrovay wrote:
> Hi all,
> 
> I am trying to parse XQuery.

Did you see this link? <http://code.google.com/p/xqpretty/>

Johannes

> It is known that XQuery is not keyword-based: its keywords are not
> reserved. For example, you can have the following function declaration:
> 
> declare function declare:function($declare as element(declare)) as
> element(function) {
>   <function declare="declare function">{$declare/function}</function>
> }
> 
> This declares a function called "function" in the namespace that has the
> prefix "declare". This function takes the parameter "declare" of type
> XML element having the tag name "declare", and also returns an XML
> element having the tag name "function". And so forth...
> :)
> 
> Can I solve this?
> 
> Thanks!
> Gabriel
> 
> On Tue, Mar 24, 2009 at 10:46 AM, Indhu Bharathi <indhu.b at s7software.com
> <mailto:indhu.b at s7software.com>> wrote:
> 
>     I assume when you say 'keyword' you are talking about the keywords
>     (like 'if', 'else', 'type', etc.) in the language you are trying to
>     parse. Those must be finite and the regular approach is to have a
>     production for each. Under that assumption the proposed model is
>     certainly scalable.
> 
>     But I'm not sure whether you mean something else by
>     'keyword'. Are you trying to parse an input containing name-value
>     pairs where name and value can be anything?
> 
>     And what do you mean by Name1, Name2, ... NameN? I don't see any such
>     thing in the grammar. Is name not a plain ID (like a variable name)?
> 
>     Throwing some more light on what exactly you are trying to parse
>     will be helpful.
> 
> 
>     - Indhu
> 
>     Gabriel Petrovay wrote:
>>     Hi Indhu,
>>
>>     I was trying to simplify the example such that I still get the
>>     error and the example is simple enough for everybody to understand
>>     the problem.
>>
>>     Here is the corrected grammar:
>>
>>     //========================================
>>     grammar k;
>>     options {
>>     output=AST;
>>     }
>>
>>     rule : KEYWORD1 (KEYWORD2 Name)? ';' ;
>>
>>     KEYWORD1 : 'keywordA';
>>     KEYWORD2 : 'keywordB';
>>
>>     Name : ('a'..'z' | 'A'..'Z')+ ;
>>     S : ('\t' | ' ' | '\n' | '\r')+  { $channel = HIDDEN; } ;
>>     //========================================
>>
>>     With this the problems you mentioned are eliminated.
>>
>>     As far as I can see, your proposed solution is not scalable if I
>>     have the keywords keywordA, keywordB, ..., keywordZ and the Name
>>     rules Name1, Name2, ..., NameN. Or is it?
>>
>>     Any solution for this?
>>
>>
>>     Regards,
>>     Gabriel
>>
>>
>>     On Tue, Mar 24, 2009 at 9:29 AM, Indhu Bharathi
>>     <indhu.b at s7software.com <mailto:indhu.b at s7software.com>> wrote:
>>
>>         Looks like you are trying to use keyword as identifier. AFAIK,
>>         this cannot be resolved in the lexer. You have to use
>>         predicates in the parser rule. Something like this:
>>
>>         rule : keyKEYWORD1 (keyKEYWORD2 enc=Name)? ';' ;
>>
>>         keyKEYWORD1
>>             :    {input.LT(1).getText().equals("keyword1")}? Name ;
>>
>>         keyKEYWORD2
>>             :    {input.LT(1).getText().equals("keyword2")}? Name ;
>>
>>
>>         One more problem I see is the production "Name : Letter* ;".
>>         A lexer production must not match a zero-length string.
>>
>>         Another problem is that you expect 'keyword1' to be lexed
>>         as Name, but the production for Name doesn't allow digits.
>>
>>         - Indhu
>>
>>         Gabriel Petrovay wrote:
>>>         Hi all,
>>>
>>>         I have the following grammar file:
>>>
>>>         //========================================
>>>         grammar k;
>>>         options {
>>>         output=AST;
>>>         }
>>>
>>>         rule : KEYWORD1 (KEYWORD2 enc=Name)? ';' ;
>>>
>>>         KEYWORD1 : 'keyword1';
>>>         KEYWORD2 : 'keyword2';
>>>
>>>         Name : Letter* ;
>>>         fragment Letter : 'a'..'z' | 'A'..'Z' ;
>>>
>>>         S : ('\t' | ' ' | '\n' | '\r')+  { $channel = HIDDEN; } ;
>>>         //========================================
>>>
>>>
>>>         The following input is rejected by the generated parser.
>>>
>>>         INPUT:
>>>         =====
>>>         keyword1 keyword2 keyword1 ;
>>>
>>>         OUTPUT:
>>>         =======
>>>         line 1:18 mismatched input 'keyword1' expecting Name
>>>         <mismatched token: [@4,18:25='keyword1',<4>,1:18],
>>>         resync=keyword1 keyword2 keyword1 ;>
>>>
>>>
>>>         How can I make the parser recognize this input? I want to
>>>         allow the keywords in the places where any character
>>>         combination is allowed. How can I do this?
>>>
>>>         Regards,
>>>         Gabriel
>>>         ------------------------------------------------------------------------
>>>
>>>
>>>         List: http://www.antlr.org/mailman/listinfo/antlr-interest
>>>         Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>>>           
>>
>>
>>
>>
>>     -- 
>>     MSc Gabriel Petrovay
>>     MCSA, MCDBA, MCAD
>>     Mobile: +41(0)787978034
> 
> 
> 
> 
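
A minimal sketch of one common way to handle this in ANTLR 3, applied to
the simplified grammar quoted above: keep the keyword tokens in the lexer
and add a parser rule that accepts either a plain Name or any keyword
token, then use that rule wherever a name may appear. The rule name
"name" is an assumption made for illustration and does not appear in the
thread, and the sketch has not been checked against the full XQuery
grammar.

//========================================
grammar k;
options {
output=AST;
}

rule : KEYWORD1 (KEYWORD2 name)? ';' ;

// Hypothetical helper rule: a name is a plain Name or any keyword token.
// Each new keyword token has to be added here as another alternative.
name : Name | KEYWORD1 | KEYWORD2 ;

KEYWORD1 : 'keywordA';
KEYWORD2 : 'keywordB';

Name : ('a'..'z' | 'A'..'Z')+ ;
S : ('\t' | ' ' | '\n' | '\r')+  { $channel = HIDDEN; } ;
//========================================

With this, the input "keywordA keywordB keywordA ;" should lex as
KEYWORD1 KEYWORD2 KEYWORD1 ';', and the third token is accepted through
the "name" rule instead of causing a "mismatched input" error. The cost
is one alternative per keyword, but the list lives in a single rule that
every name-like position can share. The predicate approach suggested
earlier in the thread is the alternative when keywords should not be
separate token types at all.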