[antlr-interest] Whitespace matching

Thu Apr 12 23:12:10 PDT 2012

Both the interpreter and the debugger from ANTLRWorks (1.4.3) parse the
input just fine.

I'm assuming you're not entering "\r" and "\n" as literals, but are
actually entering line breaks in the text areas of ANTLRWorks'
interpreter... Perhaps you've told ANTLRWorks to start parsing with a
rule other than `start`? Anyway, forget about ANTLRWorks for a
moment and whip up a manual test:

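import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CommonTokenStream;

public class Main {
  public static void main(String[] args) throws Exception {
    // TLexer and TParser are the classes ANTLR generates from the grammar.
    TLexer lexer = new TLexer(new ANTLRStringStream("\r\nL\r\n"));
    TParser parser = new TParser(new CommonTokenStream(lexer));
    // Invoke the entry rule directly, bypassing ANTLRWorks.
    parser.start();
  }
}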

Bart.

On Fri, Apr 13, 2012 at 12:09 AM, Jason Jones <jmjones5 at gmail.com> wrote:

> Hi Bart,
>
> I think we're using different versions of ANTLR (or something along those
> lines): with your grammar I get a MismatchedTokenException on the
> input you used, "\r\nL\r\n". I'm currently using ANTLRWorks version
> 1.4.3. Could this be why it works on your end and not on
> mine?
>
> Jason.
>
>
> On 12 April 2012 22:06, Bart Kiers <bkiers at gmail.com> wrote:
>
>> Hi Jason,
>>
>> Then there's something other than what you've posted going wrong, since
>> the parser generated from:
>>
>> start      : program EOF;
>> program    : WHITESPACE line+ WHITESPACE (query WHITESPACE)*;
>> line       : 'L';
>> query      : 'Q';
>> WHITESPACE : (' ' | '\t' | '\r' | '\n')+;
>>
>> parses the input "\r\nL\r\n" just fine.
>>
>> Regards,
>>
>> Bart.
>>
>>
>> On Thu, Apr 12, 2012 at 10:48 PM, Jason Jones <jmjones5 at gmail.com> wrote:
>>
>>> Hi Bart,
>>>
>>> Thanks for the suggestion, although it doesn't work either... The skip
>>> option does work but since I'll be doing something with the whitespace
>>> later I don't want to take this option. Is there something else we're
>>> missing?
>>>
>>> Jason.
>>>
>>>
>>> On 12 April 2012 19:10, Bart Kiers <bkiers at gmail.com> wrote:
>>>
>>>> Hi Jason,
>>>>
>>>> On Thu, Apr 12, 2012 at 6:43 PM, Jason Jones <jmjones5 at gmail.com> wrote:
>>>>
>>>>> ...
>>>>>
>>>>>
>>>>> start : program ;
>>>>> program : WHITESPACE line+ WHITESPACE (query WHITESPACE)*;
>>>>>
>>>>> WHITESPACE  : (' ' | '\t' | '\r' | '\n')* ; // currently only used in string
>>>>>
>>>>>
>>>> A lexer rule must always match at least one character: if it can match
>>>> zero characters, the lexer can (and will) go into an infinite loop.
>>>>
>>>> Do something like this:
>>>>
>>>> start : program ;
>>>> program : WHITESPACE? line+ WHITESPACE? (query WHITESPACE?)*;
>>>> WHITESPACE  : (' ' | '\t' | '\r' | '\n')+ ;
>>>>
>>>> or simply skip spaces like this:
>>>>
>>>> start : program ;
>>>> program : line+ query*;
>>>> WHITESPACE  : (' ' | '\t' | '\r' | '\n')+ {skip();} ;
>>>>
>>>> Regards,
>>>>
>>>> Bart.
>>>>
>>>
>>>
>>
>
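
The point made in the thread about a lexer rule that can match zero
characters can be sketched in plain Java, without ANTLR. This is only a
simplified model, not how ANTLR's generated lexer actually works: the class
ZeroWidthDemo, its tokenize method, and the regex stand-ins for the
WHITESPACE rule are all hypothetical names made up for illustration. The loop
asks the whitespace pattern for a match at the current offset; with the `*`
variant the pattern "succeeds" in front of 'L' without consuming anything, so
the offset never advances. The sketch throws at that point rather than
spinning forever.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ZeroWidthDemo {
    /**
     * Toy tokenizer: tries the whitespace pattern at the current offset and
     * otherwise emits the next character as a literal token. It throws if the
     * whitespace pattern matches without consuming input, which is the
     * situation that would otherwise make this loop run forever.
     */
    static List<String> tokenize(String input, Pattern whitespace) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            Matcher m = whitespace.matcher(input);
            m.region(pos, input.length());
            if (m.lookingAt()) {
                if (m.end() == pos) {
                    // Zero-width match: no progress is possible from here.
                    throw new IllegalStateException(
                        "WHITESPACE matched zero characters at offset " + pos);
                }
                tokens.add("WHITESPACE");
                pos = m.end();
            } else {
                tokens.add("'" + input.charAt(pos) + "'");
                pos++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Stand-in for WHITESPACE : (' ' | '\t' | '\r' | '\n')+ ;
        Pattern oneOrMore = Pattern.compile("[ \\t\\r\\n]+");
        System.out.println(tokenize("\r\nL\r\n", oneOrMore));

        // Stand-in for the original rule ending in '*'.
        Pattern zeroOrMore = Pattern.compile("[ \\t\\r\\n]*");
        try {
            tokenize("\r\nL\r\n", zeroOrMore);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the `+` pattern every successful match consumes at least one character,
so the loop always makes progress on the thread's input "\r\nL\r\n"; with the
`*` pattern the guard trips as soon as the tokenizer reaches 'L'.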