[antlr-interest] Whitespace matching

Bart Kiers bkiers at gmail.com
Fri Apr 13 04:39:04 PDT 2012


You must be doing something wrong/different. Perhaps you're running an old
.class file?
I copied your prolog.g grammar and Main.java file and did this:

wget http://www.antlr.org/download/antlr-3.4-complete.jar
java -cp antlr-3.4-complete.jar org.antlr.Tool prolog.g
javac -cp antlr-3.4-complete.jar *.java
java -cp .:antlr-3.4-complete.jar Main

which didn't produce any error or warning.
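
If you want to rule out the stale .class theory before rebuilding, a rough check like the one below might help. It is only a sketch: the StaleClassCheck class and its isStale helper are names made up for this example, and it assumes prolog.g and the compiled prologLexer.class and prologParser.class all sit in the current directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of ANTLR: flags compiled classes that are
// missing or older than the grammar they were generated from.
public class StaleClassCheck {

    // True when the compiled file is missing or was last modified
    // before the source file.
    static boolean isStale(Path source, Path compiled) throws IOException {
        if (!Files.exists(compiled)) {
            return true;
        }
        return Files.getLastModifiedTime(compiled)
                .compareTo(Files.getLastModifiedTime(source)) < 0;
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout: grammar and class files in the working directory.
        Path grammar = Paths.get("prolog.g");
        if (!Files.exists(grammar)) {
            System.out.println("prolog.g not found in the current directory");
            return;
        }
        for (String name : new String[] {"prologLexer.class", "prologParser.class"}) {
            boolean stale = isStale(grammar, Paths.get(name));
            System.out.println(name + (stale ? ": stale or missing, rebuild" : ": up to date"));
        }
    }
}
```

It compares timestamps only, so it won't notice a grammar that was edited and then compiled without regenerating the .java files first. Re-running the four commands above from a clean directory remains the more reliable test.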

Regards,

Bart.


On Fri, Apr 13, 2012 at 1:06 PM, Jason Jones <jmjones5 at gmail.com> wrote:

> Stranger... Okay, well, I've done a manual test using this class:
>
> import org.antlr.runtime.*;
>
> public class Main {
>     public static void main(String[] args) throws Exception {
>         prologLexer lexer = new prologLexer(new ANTLRStringStream("\r\nL\r\n"));
>         prologParser parser = new prologParser(new CommonTokenStream(lexer));
>         parser.start();
>     }
> }
>
> After running it like so:
>
> $ java -cp .:/usr/local/antlr-3.4/lib/antlr-3.4-complete.jar Main
> line 1:0 mismatched input '\r\n' expecting WHITESPACE
>
> I still seem to be getting the same issue ^. Here's the current grammar
> that I used to create the parser and lexer:
>
>
> start : program EOF;
> program : WHITESPACE line+ WHITESPACE (query WHITESPACE)*;
> line    :       'L';
> query   :       'Q';
>
> WHITESPACE  : (' ' | '\t' | '\r' | '\n')+ ;
>
> Jason.
>
>
> On 13 April 2012 07:12, Bart Kiers <bkiers at gmail.com> wrote:
>
>> Both the interpreter and the debugger from ANTLRWorks (1.4.3) parse the
>> input just fine.
>>
>> I'm assuming you're not entering "\r" and "\n" as literals, but are
>> actually entering line breaks in the text areas of ANTLRWorks'
>> interpreter... Perhaps you've selected ANTLRWorks to start parsing with a
>> different rule than the `start` rule? Anyway, forget about ANTLRWorks for a
>> moment and whip up a manual test:
>>
>> import org.antlr.runtime.*;
>>
>> public class Main {
>>   public static void main(String[] args) throws Exception {
>>     // TLexer and TParser are the classes ANTLR generates from a grammar named T
>>     TLexer lexer = new TLexer(new ANTLRStringStream("\r\nL\r\n"));
>>     TParser parser = new TParser(new CommonTokenStream(lexer));
>>     parser.start();
>>   }
>> }
>>
>>
>> Bart.
>>
>>
>> On Fri, Apr 13, 2012 at 12:09 AM, Jason Jones <jmjones5 at gmail.com> wrote:
>>
>>> Hi Bart,
>>>
>>> I think we're using different versions of ANTLR (or something along
>>> those lines): using your grammar, I get a MismatchedTokenException with
>>> the input you used, "\r\nL\r\n". I'm currently using ANTLRWorks version
>>> 1.4.3. Could this be the reason why it works on your end but not on
>>> mine?
>>>
>>> Jason.
>>>
>>>
>>> On 12 April 2012 22:06, Bart Kiers <bkiers at gmail.com> wrote:
>>>
>>>> Hi Jason,
>>>>
>>>> Then there's something other than what you've posted going wrong, since
>>>> the parser generated from:
>>>>
>>>> start      : program EOF;
>>>> program    : WHITESPACE line+ WHITESPACE (query WHITESPACE)*;
>>>> line       : 'L';
>>>> query      : 'Q';
>>>> WHITESPACE : (' ' | '\t' | '\r' | '\n')+;
>>>>
>>>> parses the input "\r\nL\r\n" just fine.
>>>>
>>>> Regards,
>>>>
>>>> Bart.
>>>>
>>>>
>>>> On Thu, Apr 12, 2012 at 10:48 PM, Jason Jones <jmjones5 at gmail.com> wrote:
>>>>
>>>>> Hi Bart,
>>>>>
>>>>> Thanks for the suggestion, although it doesn't work either... The skip
>>>>> option does work but since I'll be doing something with the whitespace
>>>>> later I don't want to take this option. Is there something else we're
>>>>> missing?
>>>>>
>>>>> Jason.
>>>>>
>>>>>
>>>>> On 12 April 2012 19:10, Bart Kiers <bkiers at gmail.com> wrote:
>>>>>
>>>>>> Hi Jason,
>>>>>>
>>>>>> On Thu, Apr 12, 2012 at 6:43 PM, Jason Jones <jmjones5 at gmail.com> wrote:
>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>>
>>>>>>> start : program ;
>>>>>>> program : WHITESPACE line+ WHITESPACE (query WHITESPACE)*;
>>>>>>>
>>>>>>> WHITESPACE  : (' ' | '\t' | '\r' | '\n')* ; // currently only used in string
>>>>>>>
>>>>>>>
>>>>>> A lexer rule must always match at least one character: if it can
>>>>>> match zero characters, the lexer can get stuck in an infinite loop.
>>>>>>
>>>>>> Do something like this:
>>>>>>
>>>>>> start : program ;
>>>>>> program : WHITESPACE? line+ WHITESPACE? (query WHITESPACE?)*;
>>>>>> WHITESPACE  : (' ' | '\t' | '\r' | '\n')+ ;
>>>>>>
>>>>>> or simply skip spaces like this:
>>>>>>
>>>>>> start : program ;
>>>>>> program : line+ query*;
>>>>>> WHITESPACE  : (' ' | '\t' | '\r' | '\n')+ {skip();} ;
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Bart.
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

