[antlr-interest] Adding a Space Leads to Mismatch
Quintin Beukes
quintin.beukes at signio.co.za
Fri Feb 10 01:09:32 PST 2012
I have further simplified the grammar to the following.
Changing the input from "If " to "If" gives a perfectly fine match. With
the trailing space, ALPHANUMERICSPACE is still predicted for the input,
which results in this error:
line 1:3 required (...)+ loop did not match anything at character '<EOF>'
It keeps predicting the wrong input. I have read through tons of
documents and am not seeing how to fix this whilst keeping
ALPHANUMERICSPACE (which is needed to match multiword tokens).
grammar DebugA;
@members {
    public static void main(String[] args) throws Exception {
        DebugALexer lex = new DebugALexer(new ANTLRStringStream("If "));
        Token token;
        while ((token = lex.nextToken())!=null) {
            if ("<EOF>".equals(token.getText())) break;
            System.out.println("Token: " + token.getType() + "/" + token.getText());
        }
    }
}
ruleExpression
: IF NEWLINE?
EOF
;
IF
: 'If';
ALPHANUMERICSPACE
: ('a'..'z' | 'A'..'Z' | '0'..'9')+ (' '+ ('a'..'z' | 'A'..'Z' | '0'..'9')+)*
;
WS
: (' '|'\t')+ {skip();}
;
NEWLINE
: '\r'? '\n'
;
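
For what it is worth, the error position (line 1:3, at '<EOF>') is consistent
with the lexer having already consumed "If" and the space inside
ALPHANUMERICSPACE, entered the optional (' '+ alnum+)* group on the strength
of the space alone, and then failed on the required alnum+ loop. Below is a
minimal hand-written sketch of that behaviour in plain Java. It is not ANTLR's
generated code: the class name CommitSketch, the match method and the
lookPastSpaces flag are all made up for illustration, and the "look past the
spaces" variant only shows what extra lookahead would change, not a confirmed
fix for the grammar.

```java
class CommitSketch {
    static boolean isAlnum(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    // Hand-written matcher for: alnum+ (' '+ alnum+)*
    // Returns the end index of the match, or -1 if matching fails.
    // lookPastSpaces == false: one space is enough to commit to the optional group.
    // lookPastSpaces == true:  the group is entered only if an alnum follows the spaces.
    static int match(String s, boolean lookPastSpaces) {
        int i = 0;
        while (i < s.length() && isAlnum(s.charAt(i))) i++;
        if (i == 0) return -1;                  // leading alnum+ matched nothing

        while (i < s.length() && s.charAt(i) == ' ') {
            int j = i;
            while (j < s.length() && s.charAt(j) == ' ') j++;   // ' '+
            boolean alnumFollows = j < s.length() && isAlnum(s.charAt(j));
            if (!alnumFollows) {
                if (!lookPastSpaces) return -1; // committed: required alnum+ loop is empty
                break;                          // stop before the spaces, leave them to WS
            }
            i = j;
            while (i < s.length() && isAlnum(s.charAt(i))) i++;  // alnum+
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println(match("If", false));      // no space, optional group never entered
        System.out.println(match("If ", false));     // commits on the space, then fails
        System.out.println(match("If ", true));      // stops after "If"
        System.out.println(match("If then", false)); // multiword input still matches
    }
}
```

Even with the extra lookahead, the text "If" would be matched by this rule
rather than by IF, so the ordering of IF and the generic rule still matters.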
Quintin Beukes
On Fri, Feb 10, 2012 at 10:17 AM, Quintin Beukes
<quintin.beukes at signio.co.za> wrote:
> I have tried to skip whitespace and have used tokens. The above
> grammar is mostly just in debug state.
>
> To narrow down the problem even further: the lexer keeps
> predicting the "If " to be ALPHANUMERICSPACE, so the lexer fails. I
> actually cannot see why it would even do that, because this string
> can never match ALPHANUMERICSPACE.
>
> Input:
> (If )
>
> grammar DebugA;
>
> tokens {
> IF = 'If';
> OB = '(';
> CB = ')';
> }
>
> fieldRules
> : rule
> EOF
> ;
>
> rule
> : OB ruleExpression CB NEWLINE
> ;
>
> ruleExpression
> : IF ALPHANUMERIC
> ;
>
> ALPHANUMERIC
> : ('a'..'z' | 'A'..'Z' | '0'..'9')+
> ;
>
> ALPHANUMERICSPACE
> : ('a'..'z' | 'A'..'Z' | '0'..'9')+ (' '+ ('a'..'z' | 'A'..'Z' | '0'..'9')+)*
> ;
>
> WS
> : (' '|'\t')+ {skip();}
> ;
>
> NEWLINE
> : '\r'? '\n'
> ;
>
>
> Quintin Beukes
>
> On Thu, Feb 9, 2012 at 9:30 PM, Jim Idle <jimi at temporal-wave.com> wrote:
>> Don't use 'strings' in your parser, create real tokens and list the
>> keywords and punctuation in the lexer before the generic rule. Also, it
>> does not look like you need the spaces, so try skipping them:
>>
>> LPAREN: '(' ;
>> ...
>> KEYWORD: 'keyword';
>> ....
>> ALPHANUMERICSPACE: 'A'..'Z'+ ... etc
>>
>> WS: (' '|'\t')+ { skip(); } ; // Then remove WS refs in your parser
>>
>>
>> Jim
>>
>>> -----Original Message-----
>>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>>> bounces at antlr.org] On Behalf Of Quintin Beukes
>>> Sent: Thursday, February 09, 2012 11:20 AM
>>> To: antlr-interest at antlr.org
>>> Subject: Re: [antlr-interest] Adding a Space Leads to Mismatch
>>>
>>> I debugged the lexer, and it seems that its predictions for the next
>>> token always match against ALPHANUMERICSPACE.
>>>
>>> How can I resolve such a prediction error? Even if just pointing me to
>>> the wiki.
>>>
>>> thanks,
>>> Quintin Beukes
>>>
>>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
>>> email-address
>>