[antlr-interest] Last symbol disappeared on wrong expression

Dmitry Pavlov pavlov.dmitry.n at gmail.com
Tue Feb 22 01:07:03 PST 2011


I found a great article on the ANTLR wiki that describes exactly what I need:
http://www.antlr.org/wiki/pages/viewpage.action?pageId=5341230
Question is closed.
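
For anyone who finds this thread later: the wiki article is not reproduced
here, so the sketch below is only an illustration of the general idea and not
necessarily the article's solution. As far as I understand the ANTLR 3 lexer,
the STRING rule fails when the closing quote is missing, the lexer reports an
error and throws away the characters it has consumed, and the parser only
receives the tokens for sin( . The Python code mimics that with regular
expressions. The UNTERMINATED_STRING token and the keep_unterminated flag are
hypothetical names introduced for this example; they are not part of the
grammar quoted below.

```python
import re

# Token patterns loosely mirroring the lexer rules of the EatLast grammar.
# UNTERMINATED_STRING is a hypothetical extra rule, not in the original grammar.
TOKEN_SPEC = [
    ("STRING", r'"[^\\"]*"'),
    ("UNTERMINATED_STRING", r'"[^\\"]*'),
    ("NUMBER", r"[0-9]+"),
    ("ID", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("LEFT_PAREN", r"\("),
    ("RIGHT_PAREN", r"\)"),
    ("SIGN", r"[+-]"),
    ("PARAM_SEPARATOR", r","),
    ("WS", r"[ \t\r\n]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text, keep_unterminated=False):
    """Return (type, text) pairs for the tokens a parser would see.

    With keep_unterminated=False an unterminated string is discarded, which
    imitates a lexer that reports an error and drops the consumed characters.
    With keep_unterminated=True it is emitted as its own token instead.
    Characters that match no pattern are skipped silently.
    """
    tokens = []
    for match in MASTER.finditer(text):
        kind = match.lastgroup
        if kind == "WS":
            continue  # whitespace goes to the hidden channel
        if kind == "UNTERMINATED_STRING" and not keep_unterminated:
            continue  # strict mode: the broken string never reaches the parser
        tokens.append((kind, match.group()))
    return tokens


if __name__ == "__main__":
    source = 'sin("hello there antlr'
    print(tokenize(source))
    print(tokenize(source, keep_unterminated=True))
```

In strict mode the input sin("hello there antlr yields only the ID and
LEFT_PAREN tokens, which matches the error node text sin( reported below. With
the extra token enabled, the unterminated string reaches the parser as one
token, so a highlighter can still cover every input character.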

2011/2/21 Dmitry Pavlov <pavlov.dmitry.n at gmail.com>

> Hi, all.
>
> I'm writing a highlighter for math expressions.
> It will be used in a text editor, so on every text change we need to reparse
> the expression and highlight it again.
> The parser builds an AST from the text, and then some tree parsers do
> additional processing.
>
> While testing the grammar I got stuck on the following problem:
> if I try to parse the expression: sin("
> or even: sin("hello there antlr
> then the parser creates an AST with a single error node containing the text
> sin(
> but if we add the closing quote, sin("" or sin("hello there antlr"
> then the error node contains all of the input text: sin("hello there antlr"
>
> This was tested in ANTLRWorks with the standard Java target (in debug
> mode the parsed input string does not contain the quoted text) and in a
> simple app with the ActionScript target.
>
> Is this a bug or a feature? Is there a way to fix this problem and get all
> input symbols?
>
> A simplified grammar that reproduces this case:
>
> grammar EatLast;
>
> options {
>   output = AST;
> }
>
> script: exp=expression EOF!;
>
> expression
>     :    additive;
>
> additive
>     :    (a=atom->$a)
>         (op=SIGN b=atom
>             -> ^($op $additive $b))*;
>
> atom
>     :    constant
>     |    func
>     |    LEFT_PAREN expression RIGHT_PAREN -> expression
>     ;
>
> constant
>     :    NUMBER | STRING;
>
> func
>     :    id=ID LEFT_PAREN functionParams? RIGHT_PAREN -> ^(ID functionParams?)
>     ;
>
> functionParams
>     :    expression ( PARAM_SEPARATOR! expression)*
>     ;
>
>
> /*            LEXER RULES            */
> PARAM_SEPARATOR  :     ',';
>
> //PARENTHESES
> LEFT_PAREN: '(';
> RIGHT_PAREN: ')';
>
> //ARITHMETIC OPERATIONS
> SIGN: '+' | '-';
>
> //NUMBERS
> NUMBER: INT;
>
> fragment
> INT :    DIGIT+ ;
>
> ID  :    (LETTER|'_') (LETTER|DIGIT|'_')* ;
>
> //WHITESPACE
> WS  :   ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;} ;
> //STRING ELEMENTS
> STRING
>     :  '"' ( ~('\\'|'"') )* '"'
>     ;
>
> fragment LETTER: LOWER | UPPER;
> fragment LOWER: 'a'..'z';
> fragment UPPER: 'A'..'Z';
> fragment DIGIT: '0'..'9';
>
>
> --
> Sincerely, Pavlov Dmitry
>



-- 
Sincerely, Pavlov Dmitry

