[antlr-interest] lexer problem (BUG?)

Ruth Karl ruth.karl at gmx.de
Fri Jul 27 07:18:33 PDT 2007



Ruth Karl wrote:
> Hi Andrew,
>
> thanks a lot for finding a smaller example to illustrate the problem.
> (Did you do it for the Java target or for C#, as I did?)

OK, I could have seen that ;-)
But for the C# target it is exactly the same... :-(
>
> Now: what can I do?
> I could (...) try to find a workaround in my grammar, but if it IS a 
> bug, then a similar thing might happen in other cases as well....
>
> Thanks for any further suggestions,
>
> Ruth
>
>
> Andrew Lentvorski wrote:
>> Ruth Karl wrote:
>>> Thanks, but I looked at it several times (even before I ever wrote 
>>> to this list) and I still cannot see why, when I start an input 
>>> with '<sx', the lexer should lose itself in a rule wanting '<script' 
>>> as input (given the grammar I attached in my last posting).
>>> Any other suggestions?
>>
>> Looks like a bug to me:
>>
>> grammar jsp;
>>
>> JAVASCRIPT    :    '<script>' ( options {greedy=false;} : . )* '</script>'
>>                    {System.out.print("J");};
>> ANY           :    . {System.out.print("A");};
>>
>> jsp        :    (ANY | JAVASCRIPT)* EOF;
>>
>> with input:
>>
>> <script>foo</script>
>> <s>bar</s>
>>
>>
>> Produces a token stream of:
>> "<script>foo</script>", "a", "r", "<", "/", "s", ">"
>>
>> aka
>>
>> JAVASCRIPT, ANY, ANY, ANY, ANY, ANY, ANY
>>
>> Something vacuums up the "<s>b"
>>
>> The output is:
>> line 2:2 mismatched character '>' expecting 'c'
>> JAAAAAAAA
>>
>> You might want to file it and see what the response is.
>>
>> -a
>>
>
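
For comparison, here is a small hand-written Java sketch of what I would expect the two rules to produce on Andrew's input. It is not ANTLR-generated code and does not reproduce the bug; the class name ExpectedTokens and the method tokenize are made up for illustration, and it assumes that JAVASCRIPT should only match when the input really starts with '<script>' and a closing '</script>' follows, with every other character falling through to ANY.

```java
public class ExpectedTokens {
    // Hand-written stand-in for the two lexer rules (hypothetical helper,
    // not ANTLR output). Returns one letter per token: J or A.
    static String tokenize(String input) {
        final String open = "<script>";
        final String close = "</script>";
        StringBuilder kinds = new StringBuilder();
        int pos = 0;
        while (pos < input.length()) {
            // JAVASCRIPT: '<script>' followed by the shortest run of
            // characters up to the first '</script>' (non-greedy loop).
            int end = input.startsWith(open, pos)
                    ? input.indexOf(close, pos + open.length())
                    : -1;
            if (end >= 0) {
                kinds.append('J');
                pos = end + close.length();
            } else {
                // ANY: exactly one character.
                kinds.append('A');
                pos++;
            }
        }
        return kinds.toString();
    }

    public static void main(String[] args) {
        // Andrew's two input lines, without a trailing newline.
        System.out.println(tokenize("<script>foo</script>\n<s>bar</s>"));
    }
}
```

With this stand-in, the second line '<s>bar</s>' comes out as ten single-character ANY tokens and nothing is swallowed, which is the behaviour I expected from the generated lexer.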

