[antlr-interest] strange? eat not match char

Mon Aug 13 09:50:49 PDT 2007

Hi, I wrote a simple grammar (below) and ran the following tests in the
ANTLRWorks interpreter, starting from the "strings" rule. Because the
messages contain whitespace, I wrap each input and each result in a [] pair
so the spaces are visible:
1. [hello world] [ hello world] --> [hello world], ok
2. [ hello word ][ hello word ,][ hello word , ] --> NoViableAltException
3. [ hello word ,s] --> [ hello word ,s]
4. [ hello word , s] --> [s]
Why do I get the results in 3 and 4? They puzzle me :)
In 3, the comma is excluded from CHAR, yet it shows up in the result.
In 4, the message before the comma is eaten, and I do not understand why.
Could someone help?
Thanks.

grammar On16;

/*
options{
    k=2;
    output=AST;
}
*/
tokens{
COMMA    =    ',';
SEMI    =    ';';
COLON    =    ':';
LBAK    =    '{';
RBAK    =    '}';
SQUOTE    =    '\'';
DQUOTE    =    '"';
}
@header{package on;}
@lexer::header{package on;}

//document:    string|strings|object|objects|pairs;
/*******************************************
* parser rules
********************************************/
strings    :    string  (COMMA string)* COMMA?;
name    :    words|WORD;
string    :    words|WORD|STRING;
words    :    WORDS;

WORDS    :    WORD (WHITE WORD)+;
//idname returns [string s] {s = " ";}:
// t=ID { s += t.getText(); }
//(options{greedy=true;}: ws=WS { s += ws.getText(); } t2=ID! { s += t2.getText(); } )*;

//must not be fragment
WORD    :    CHAR+;
STRING    :    SQUOTE (~(SQUOTE))* SQUOTE
    |    DQUOTE (~(DQUOTE))* DQUOTE
    ;

/*******************************************
* lexer rules
********************************************/
fragment
WHITE    :    SPACE+ {$channel=0;};
WS    :    (SPACE | LINE)+ {$channel=HIDDEN;};
//META CHARACTER;
fragment
CHAR    :    ~(COMMA | SEMI | COLON | LBAK | RBAK | SQUOTE | DQUOTE | SPACE);
fragment
SPACE    :    ' ' | '\t' | '\f';
fragment
CRLF    :    '\r' | '\n';
LINE    :    '\r'? '\n';
//WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+;
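
To make the expected behavior concrete, here is a minimal sketch in Python (not ANTLR) of the tokenization the lexer rules above appear intended to produce. It is only a model under stated assumptions: CHAR is read as "any character except , ; : { } ' \" and SPACE", the hidden WS token is dropped, inputs are single-line, and the names `TOKEN_RE` and `tokenize` are hypothetical helpers, not part of ANTLR. It does not reproduce what ANTLRWorks actually does with the grammar.

```python
import re

# Assumption: CHAR means "any character except , ; : { } ' \" and SPACE".
CHAR = r"""[^,;:{}'" \t\f]"""
SPACE = r"[ \t\f]"
WORD = CHAR + "+"

# Alternatives are tried in order, so WORDS (two or more words separated by
# spaces) is listed before the single WORD it would otherwise be shadowed by.
TOKEN_RE = re.compile(
    "(?P<WORDS>" + WORD + "(?:" + SPACE + "+" + WORD + ")+)"
    "|(?P<WORD>" + WORD + ")"
    "|(?P<STRING>'[^']*'|\"[^\"]*\")"
    "|(?P<COMMA>,)"
    "|(?P<WS>" + SPACE + "+)"
)


def tokenize(text):
    """Return (token_name, text) pairs, skipping the hidden WS tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            # Characters such as ; : { } have no token in this small model.
            raise ValueError("no token matches at offset %d: %r" % (pos, text[pos]))
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    # Inputs 3 and 4 from the tests above, without the [] markers.
    print(tokenize(" hello word ,s"))
    print(tokenize(" hello word , s"))
```

Under these assumptions, inputs 3 and 4 both yield WORDS "hello word", then COMMA, then WORD "s", which is the token sequence the "strings" rule would be expected to see.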

-- 
Regards,
向秦贤