[antlr-interest] Antlr Token Issue

James jameselliot at gmail.com
Tue Apr 3 04:05:20 PDT 2007


Hi,

I am having a problem with keyword literals in my parser rules being turned
into their own tokens, so the same text no longer matches a more general
lexer rule when it appears elsewhere in the input.

Is there a simple way to stop this in my grammar, or do I need to reconsider
my rules?


An example grammar is:

=====================================================
grammar expr;
options {
    k=2;
    backtrack=true;
    memoize=true;
}

@header {
    package tests;
}

@lexer::header {
    package tests;
}

aprog    :    (WS | anitem)+
    ;
anitem    :     'hello' EQUALS QUOTE CHARS QUOTE
        {
            System.out.println("Have quoted text :  " + $CHARS.text);
        }
    ;
CHARS     :     ('a'..'z'|'A'..'Z')+
    ;
QUOTE    :    '"'
    ;
EQUALS    :    '='
    ;
WS    :    (' ' | '\t' | '\n') +
    ;
=========================================================================

A test class is:
========================================================================
package tests;

import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CommonTokenStream;

public class DoTest {

    public static void main(String[] args) throws Throwable {
        if (args.length == 0) {
            System.out.println("Please provide input on command line");
        }
        else {
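            // Lex the command-line argument and buffer the resulting tokens.
            exprLexer l = new exprLexer(new ANTLRStringStream(args[0]));
            CommonTokenStream tokens = new CommonTokenStream();
            tokens.setTokenSource(l);
            exprParser p = new exprParser(tokens);

            // Parse starting from the top-level rule.
            p.aprog();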
        }
    }
}

========================================================================
Sample usage is:
========================================================================

$ java tests.DoTest "hello=\"there\""

Have quoted text :  there

$ java tests.DoTest "hello=\"hello\""

line 1:7 mismatched input 'hello' expecting CHARS
line 1:12 mismatched input '"' expecting EQUALS
line 0:-1 mismatched input '<EOF>' expecting CHARS

========================================================================

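I am guessing that the second "hello" is matched by the tokenizer as the
implicit keyword token for 'hello' (something like HELLO) rather than as
CHARS, so the parser never sees the CHARS it expects between the quotes.
Can I tell the tokenizer not to do this?
Or is there a simple way to refactor this?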
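
The suspected behavior can be sketched in plain Java, without ANTLR. This is
a hand-written illustration, not the generated lexer: the class name, the
method names and the token type name HELLO are all hypothetical (ANTLR picks
its own name for implicit literal tokens). It assumes that a run of letters
equal to "hello" always receives the keyword type, and it also shows one
possible workaround, which is to accept either token type between the quotes,
as a parser rule such as  word : CHARS | 'hello' ;  would.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KeywordTokenSketch {

    // Mirrors CHARS : ('a'..'z'|'A'..'Z')+ from the grammar.
    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Assumption: a letter run equal to "hello" gets the keyword type
    // (called HELLO here, a hypothetical name); any other run gets CHARS.
    public static List<String> tokenTypes(String input) {
        List<String> types = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (isAsciiLetter(c)) {
                int start = i;
                while (i < input.length() && isAsciiLetter(input.charAt(i))) {
                    i++;
                }
                String text = input.substring(start, i);
                types.add(text.equals("hello") ? "HELLO" : "CHARS");
            } else if (c == '"') {
                types.add("QUOTE");
                i++;
            } else if (c == '=') {
                types.add("EQUALS");
                i++;
            } else if (c == ' ' || c == '\t' || c == '\n') {
                while (i < input.length() && " \t\n".indexOf(input.charAt(i)) >= 0) {
                    i++;
                }
                types.add("WS");
            } else {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
        }
        return types;
    }

    // anitem : 'hello' EQUALS QUOTE CHARS QUOTE   (as in the grammar above)
    public static boolean matchesAnitem(List<String> types) {
        return types.equals(Arrays.asList("HELLO", "EQUALS", "QUOTE", "CHARS", "QUOTE"));
    }

    // Hypothetical refactor: allow CHARS or the keyword token between the quotes.
    public static boolean matchesAnitemWithWordRule(List<String> types) {
        if (types.size() != 5) {
            return false;
        }
        String word = types.get(3);
        return types.get(0).equals("HELLO")
                && types.get(1).equals("EQUALS")
                && types.get(2).equals("QUOTE")
                && (word.equals("CHARS") || word.equals("HELLO"))
                && types.get(4).equals("QUOTE");
    }

    public static void main(String[] args) {
        for (String input : new String[] {"hello=\"there\"", "hello=\"hello\""}) {
            List<String> types = tokenTypes(input);
            System.out.println(input + " -> " + types
                    + " original rule: " + matchesAnitem(types)
                    + ", with word rule: " + matchesAnitemWithWordRule(types));
        }
    }
}
```

Under these assumptions the first input satisfies both checks, while the
second satisfies only the relaxed one, which is consistent with the
"expecting CHARS" error shown above.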

Thank you,

James.

(All files attached).