[antlr-interest] How can I avoid "mismatched input" error?

Indhu Bharathi indhu.b at s7software.com
Tue Mar 24 03:33:50 PDT 2009


Well, then all of the following can be arbitrary names:
1. "declare" and "function" in "declare:function"
2. "declare" in "$declare"
3. "declare" in "element(declare)"
4. "function" in "element(function)"
and so on. In that case we just have to match them with ID
( ID : ('a'..'z')+ ; ).

Something like:
'declare' 'function' ID ':' ID '(' '$' ID 'as' 'element' '(' ID ')' ')' 
'as' 'element' '(' ID ')' ...
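
To make the idea concrete, here is a minimal hand-written sketch in
plain Java. It is not ANTLR-generated code, and the class
ContextualKeywords with its helpers tokenize, lit, id, elementType and
matches is made up for illustration. The lexer knows only ID and
punctuation, and the parser compares the token text wherever a keyword
is expected, much like the predicates suggested earlier in this thread.
One caveat that I believe applies to ANTLR 3: quoted literals such as
'declare' in a parser rule become lexer tokens of their own and would
then win over ID, so comparing the text of an ID token looks like the
safer route.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: keywords are ordinary ID tokens, recognised only by position. */
final class ContextualKeywords {

    /**
     * Lexer: every run of lowercase letters is an ID; ':', '$', '(' and ')'
     * are punctuation. Anything else (whitespace included) is skipped here.
     */
    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile("[a-z]+|[:$()]").matcher(src);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    private final List<String> tokens;
    private int pos = 0;

    private ContextualKeywords(List<String> tokens) {
        this.tokens = tokens;
    }

    /** Matches one exact token text: a keyword in keyword position, or punctuation. */
    private boolean lit(String text) {
        if (pos < tokens.size() && tokens.get(pos).equals(text)) {
            pos++;
            return true;
        }
        return false;
    }

    /** Matches any ID, including words such as "declare" or "function". */
    private boolean id() {
        if (pos < tokens.size() && tokens.get(pos).matches("[a-z]+")) {
            pos++;
            return true;
        }
        return false;
    }

    /** 'as' 'element' '(' ID ')' */
    private boolean elementType() {
        return lit("as") && lit("element") && lit("(") && id() && lit(")");
    }

    /** 'declare' 'function' ID ':' ID '(' '$' ID elementType ')' elementType */
    static boolean matches(String src) {
        ContextualKeywords p = new ContextualKeywords(tokenize(src));
        return p.lit("declare") && p.lit("function")
            && p.id() && p.lit(":") && p.id()
            && p.lit("(") && p.lit("$") && p.id() && p.elementType() && p.lit(")")
            && p.elementType()
            && p.pos == p.tokens.size();   // no trailing tokens allowed
    }

    public static void main(String[] args) {
        System.out.println(matches(
            "declare function declare:function($declare as element(declare)) as element(function)"));
        System.out.println(matches("declare function declare:function($foo as element"));
    }
}
```

The first call accepts the declaration from the original question even
though "declare" and "function" appear both as keywords and as names;
the second call rejects the truncated declaration.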

Later, after building the AST, you can check the rest in a semantic
verification pass. For example, "declare function
declare:function($foo as element" might be invalid.

- Indhu

P.S.: I don't know XQuery, so some XQuery-specific details in this mail
might be wrong. Please adjust :-)


Gabriel Petrovay wrote:
> Hi all,
>
> I am trying to parse XQuery. It is known that XQuery is not keyword 
> based. For example you can have the following function declaration:
>
> declare function declare:function($declare as element(declare)) as 
> element(function) {
>   <function declare="declare function">{$declare/function}</function>
> }
>
> This declares a function called "function" in the namespace with the
> prefix "declare". The function takes a parameter "declare" whose type
> is an XML element with the tag name "declare", and it also returns an
> XML element, with the tag name "function". And so forth...
> :)
>
> Can I solve this?
>
> Thanks!
> Gabriel
>
> On Tue, Mar 24, 2009 at 10:46 AM, Indhu Bharathi 
> <indhu.b at s7software.com> wrote:
>
>     I assume when you say 'keyword' you are talking about the keywords
>     (like 'if', 'else', 'type', etc.) in the language you are trying to
>     parse. Those must be finite and the regular approach is to have a
>     production for each. Under that assumption the proposed model is
>     certainly scalable.
>
>     But I'm starting to wonder whether you mean something else by
>     'keyword'. Are you trying to parse input containing name-value
>     pairs where the name and the value can be anything?
>
>     And what do you mean by Name1, Name2, ... NameN? I don't see any
>     such thing in the grammar. Is Name not a plain ID (like a variable
>     name)?
>
>     Throwing some more light on what exactly you are trying to parse
>     will be helpful.
>
>
>     - Indhu
>
>     Gabriel Petrovay wrote:
>>     Hi Indhu,
>>
>>     I was trying to simplify the example such that I still get the
>>     error and the example is simple enough for everybody to
>>     understand the problem.
>>
>>     Here is the corrected grammar:
>>
>>     //========================================
>>     grammar k;
>>     options {
>>     output=AST;
>>     }
>>
>>     rule : KEYWORD1 (KEYWORD2 Name)? ';' ;
>>
>>     KEYWORD1 : 'keywordA';
>>     KEYWORD2 : 'keywordB';
>>
>>     Name : ('a'..'z' | 'A'..'Z')+ ;
>>     S : ('\t' | ' ' | '\n' | '\r')+  { $channel = HIDDEN; } ;
>>     //========================================
>>
>>     With this the problems you mentioned are eliminated.
>>
>>     As far as I can see, your proposed solution does not scale if I
>>     have the keywords keywordA, keywordB, ..., keywordZ and the Name
>>     rules Name1, Name2, ..., NameN. Or does it?
>>
>>     Any solution for this?
>>
>>
>>     Regards,
>>     Gabriel
>>
>>
>>     On Tue, Mar 24, 2009 at 9:29 AM, Indhu Bharathi
>>     <indhu.b at s7software.com> wrote:
>>
>>         Looks like you are trying to use a keyword as an identifier.
>>         AFAIK, this cannot be resolved in the lexer. You have to use
>>         predicates in the parser rule. Something like this:
>>
>>         rule : keyKEYWORD1 (keyKEYWORD2 enc=Name)? ';' ;
>>
>>         keyKEYWORD1
>>             :    {input.LT(1).getText().equals("keyword1")}? Name ;
>>
>>         keyKEYWORD2
>>             :    {input.LT(1).getText().equals("keyword2")}? Name ;
>>
>>
>>         One more problem I see is the production "Name : Letter* ;".
>>         A lexer rule must not be able to match a zero-length string.
>>
>>         Another problem is that you expect 'keyword1' to be lexed as
>>         Name, but the production for Name doesn't allow digits.
>>
>>         - Indhu
>>
>>         Gabriel Petrovay wrote:
>>>         Hi all,
>>>
>>>         I have the following grammar file:
>>>
>>>         //========================================
>>>         grammar k;
>>>         options {
>>>         output=AST;
>>>         }
>>>
>>>         rule : KEYWORD1 (KEYWORD2 enc=Name)? ';' ;
>>>
>>>         KEYWORD1 : 'keyword1';
>>>         KEYWORD2 : 'keyword2';
>>>
>>>         Name : Letter* ;
>>>         fragment Letter : 'a'..'z' | 'A'..'Z' ;
>>>
>>>         S : ('\t' | ' ' | '\n' | '\r')+  { $channel = HIDDEN; } ;
>>>         //========================================
>>>
>>>
>>>         The following text is not accepted as valid.
>>>
>>>         INPUT:
>>>         =====
>>>         keyword1 keyword2 keyword1 ;
>>>
>>>         OUTPUT:
>>>         =======
>>>         line 1:18 mismatched input 'keyword1' expecting Name
>>>         <mismatched token: [@4,18:25='keyword1',<4>,1:18],
>>>         resync=keyword1 keyword2 keyword1 ;>
>>>
>>>
>>>         How can I make the parser recognize this input? I want to
>>>         allow the keywords wherever any character combination is
>>>         allowed. How can I do this?
>>>
>>>         Regards,
>>>         Gabriel
>>
>>
>>
>>
>>     -- 
>>     MSc Gabriel Petrovay
>>     MCSA, MCDBA, MCAD
>>     Mobile: +41(0)787978034
