[antlr-interest] Antlr dropping tokens?
Mark Volkmann
r.mark.volkmann at gmail.com
Mon Jan 21 07:56:34 PST 2008
Could this be related to not having a rule that ends in EOF?
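
A minimal sketch of such a rule, assuming the grammar is built with output=AST and that the junit test currently invokes nameDecl directly. The rule name nameDeclTop is hypothetical and does not appear in the grammar below:

```
// Hypothetical entry rule; the name nameDeclTop is an assumption.
// Requiring EOF makes the parser account for the whole token stream,
// so unconsumed trailing tokens surface as a syntax error rather than
// being left unread after the rule returns.
nameDeclTop
    : nameDecl EOF -> nameDecl
    ;
```

The test would then call this entry rule on the parser instead of nameDecl. If the parse still succeeds without errors and the tree is unchanged, the missing tokens were consumed by the parser and the cause lies elsewhere, for example in the tree rewrite rules.
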
On Jan 21, 2008 9:27 AM, Jon Schewe <jpschewe at mtu.net> wrote:
> Here's a piece of my grammar that I'm testing with junit. It appears
> that some tokens are getting dropped.
> The input is this:
> b[1 ... (param1 - 5 - 1) * 4]
>
> The resulting tree is this:
> (b (SUM (PRODUCT (NUMBER 1))) (SUM (PRODUCT (SUM (PRODUCT param1)
> (PRODUCT (NUMBER -1) (NUMBER 5)) (PRODUCT (NUMBER -1) (NUMBER 1))))))
>
> Where did the "* 4" go? I didn't think ANTLR could drop tokens like
> that. This is using antlr 3.0.1.
>
> The grammar fragment is below:
> /**
> * Name used in a variable declaration.
> */
> nameDecl :
> IDENT -> IDENT
> | IDENT LBRACK lb=subscriptAddExpr[false] ELLIPSIS ub=subscriptAddExpr[false] RBRACK
>     -> ^(IDENT $lb $ub)
> ;
> subscriptAddExpr[boolean negate]
> :
> subscriptMultExpr[negate]
>     ( PLUS subscriptMultExpr[negate]
>     | MINUS subscriptMultExpr[!negate]
>     )*
>     -> ^(SUM subscriptMultExpr+)
>
> ;
>
> /**
> * @param negate if true, negate all expressions by multiplying by -1
> */
> subscriptMultExpr[boolean negate]
> :
> a+=subscriptAtom (a+=subscriptMultHelp)*
>     -> {negate}? ^(PRODUCT ^(NUMBER NUM_INT["-1"]) $a)
>     -> ^(PRODUCT $a)
> ;
>
> subscriptMultHelp : PRODUCT subscriptAtom -> subscriptAtom ;
>
> /**
> * Base type that can be inside a subscript.
> */
> subscriptAtom
> :
> IDENT
> | numint
> | subscriptParExpression
> ;
>
> subscriptParExpression
> :
> LPAREN subscriptAddExpr[false] RPAREN -> subscriptAddExpr
> ;
>
>
> /**
> * A finite integer number. May be negative.
> */
> numint
> :
> MINUS NUM_INT -> ^(NUMBER ^(MINUS NUM_INT))
> | NUM_INT -> ^(NUMBER NUM_INT)
> ;
>
> // ----------- Lexer ---------------------
> // Operators
> LPAREN : '(' ;
> RPAREN : ')' ;
> LBRACK : '[' ;
> RBRACK : ']' ;
> ELLIPSIS : '...' ;
> EQ : '=' ;
> MINUS : '-' ;
> PLUS : '+' ;
> SEMI : ';' ;
> LCURLY : '{' ;
> RCURLY : '}' ;
> LE : '<=' ;
> COLON : ':' ;
> COMMA : ',' ;
> PRODUCT : '*' ;
>
> // Keywords
> IN : 'in' ;
>
> // Functions
> SUMMATION : 'SUM' ;
> LOOP : 'LOOP' ;
> TAN : 'tan' ;
> COS : 'cos' ;
> SIN : 'sin' ;
> LOG : 'log' ;
> LOG10 : 'log10' ;
> EXP : 'exp' ;
> POW : 'pow' ;
>
>
> /** Single-line comments */
> SL_COMMENT
> : '//' ~( '\n'|'\r' )* '\r'? '\n' { $channel=HIDDEN; }
> ;
>
> /** multiple-line comments */
> ML_COMMENT
> : '/*'
> ( options {greedy=false;} : . )*
> '*/'
> {$channel=HIDDEN;}
> ;
>
> IDENT :
> ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
> ;
>
> // a numeric literal
> NUM_INT
> : ('0'..'9')+ EXPONENT?
> ;
>
> NUM_FLOAT
> : DIGITS '.' DIGITS? EXPONENT?
> | '.' DIGITS EXPONENT?
> ;
>
> fragment
> DIGITS : ('0'..'9')+ ;
>
>
> // a protected method to assist in matching floating point numbers
> fragment
> EXPONENT
> : ('e'|'E') ('+'|'-')? ('0'..'9')+
> ;
>
> // Whitespace -- ignored
> WS : ( ' '
> | '\t'
> | '\f'
> // handle newlines
> | ( '\r\n' // Evil DOS
> | '\n' // Unix (the right way)
> )
> )+
> { $channel=HIDDEN; }
> ;
>
>
>
> --
> Jon Schewe | http://mtu.net/~jpschewe
> If you see an attachment named signature.asc, this is my digital
> signature.
> See http://www.gnupg.org for more information.
>
> For I am convinced that neither death nor life, neither angels
> nor demons, neither the present nor the future, nor any
> powers, neither height nor depth, nor anything else in all
> creation, will be able to separate us from the love of God that
> is in Christ Jesus our Lord. - Romans 8:38-39
>
>
>
--
R. Mark Volkmann
Object Computing, Inc.