[antlr-interest] Last symbol disappeared on wrong expression

Dmitry Pavlov pavlov.dmitry.n at gmail.com
Mon Feb 21 04:33:04 PST 2011


Hi, all.

I'm writing a highlighter for math expressions.
It will be used in a text editor, so on every text change we need to reparse
the expression and highlight it again.
The parser builds an AST from the text, and then several tree parsers
do additional processing.

While testing the grammar I got stuck on the following problem:
if I try to parse the expression: sin("
or even: sin("hello there antlr
then the parser creates an AST with a single error node whose text is
sin(
but if we add the closing quote, as in sin("" or sin("hello there antlr"
then the error node contains all of the input text: sin("hello there antlr"
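
A minimal sketch of one possible explanation, written as a toy Python model
rather than real ANTLR code. It assumes (this is not confirmed anywhere in
this post) that when the STRING rule fails to find its closing quote, the
lexer reports an error and throws away the characters it had consumed, so
the parser never receives a token for them and the error node can only span
the tokens that remain. The names tokenize and covered_text are hypothetical.

```python
import re

# Token patterns modelled on the lexer rules of the grammar in this post.
TOKEN_PATTERNS = [
    ("ID", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("NUMBER", re.compile(r"[0-9]+")),
    ("LEFT_PAREN", re.compile(r"\(")),
    ("RIGHT_PAREN", re.compile(r"\)")),
    ("PARAM_SEPARATOR", re.compile(r",")),
    ("SIGN", re.compile(r"[+-]")),
    ("WS", re.compile(r"[ \t\r\n]")),
]
# Opening quote plus the body allowed by STRING: no backslash, no quote.
STRING_BODY = re.compile(r'"[^\\"]*')


def tokenize(text):
    """Return (tokens, dropped); dropped holds input that no token covers."""
    tokens, dropped, pos = [], [], 0
    while pos < len(text):
        if text[pos] == '"':
            end = STRING_BODY.match(text, pos).end()
            if end < len(text) and text[end] == '"':
                tokens.append(("STRING", text[pos:end + 1]))
                pos = end + 1
            else:
                # ASSUMPTION: the failed rule's characters are discarded.
                dropped.append(text[pos:end])
                pos = end
            continue
        for name, pattern in TOKEN_PATTERNS:
            match = pattern.match(text, pos)
            if match:
                tokens.append((name, match.group()))
                pos = match.end()
                break
        else:
            # No rule matches this character: skip it.
            dropped.append(text[pos])
            pos += 1
    return tokens, dropped


def covered_text(tokens):
    """Text an error node spanning all tokens could show in this model."""
    return "".join(token_text for _, token_text in tokens)


for source in ['sin("', 'sin("hello there antlr', 'sin("hello there antlr"']:
    toks, lost = tokenize(source)
    print(repr(covered_text(toks)), "dropped:", lost)
```

In this model the two unterminated inputs both leave only sin( in the token
stream, while the input with the closing quote keeps every character, which
matches the behaviour described above.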

This was tested in ANTLRWorks with the standard Java target (in debug
mode the parsed input string does not contain the quoted text) and in a
simple app with the ActionScript target.

Is this a bug or a feature? Is there a way to fix this and get all of the
input symbols?

A simplified grammar that reproduces this case:

grammar EatLast;

options {
  output = AST;
}

script: exp=expression EOF!;

expression
    :    additive;

additive
    :    (a=atom->$a)
        (op=SIGN b=atom
            -> ^($op $additive $b))*;

atom
    :    constant
    |    func
    |    LEFT_PAREN expression RIGHT_PAREN -> expression
    ;

constant
    :    NUMBER | STRING;

func
    :    id=ID LEFT_PAREN functionParams? RIGHT_PAREN
        -> ^(ID functionParams?)
    ;

functionParams
    :    expression ( PARAM_SEPARATOR! expression)*
    ;


/*            LEXER RULES            */
PARAM_SEPARATOR  :     ',';

//PARENTHESES
LEFT_PAREN: '(';
RIGHT_PAREN: ')';

//ARITHMETIC OPERATIONS
SIGN: '+' | '-';

//NUMBERS
NUMBER: INT;

fragment
INT :    DIGIT+ ;

ID  :    (LETTER|'_') (LETTER|DIGIT|'_')* ;

//WHITESPACE
WS  :   ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;} ;
//STRING ELEMENTS
STRING
    :  '"' ( ~('\\'|'"') )* '"'
    ;

fragment LETTER: LOWER | UPPER;
fragment LOWER: 'a'..'z';
fragment UPPER: 'A'..'Z';
fragment DIGIT: '0'..'9';


-- 
Sincerely, Pavlov Dmitry

