[antlr-interest] Antlr dropping tokens?

Jon Schewe jpschewe at mtu.net
Mon Jan 21 07:27:27 PST 2008


Here's a piece of my grammar that I'm testing with JUnit.  It appears
that some tokens are getting dropped.
The input is this:
b[1 ... (param1 - 5 - 1) * 4]

The resulting tree is this:
(b (SUM (PRODUCT (NUMBER 1))) (SUM (PRODUCT (SUM (PRODUCT param1)
(PRODUCT (NUMBER -1) (NUMBER 5)) (PRODUCT (NUMBER -1) (NUMBER 1))))))

Where did the "* 4" go?  I didn't think ANTLR could drop tokens like
that.  This is with ANTLR 3.0.1.
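
To pin down exactly which subtree is missing, the observed tree can be
compared with the tree one would expect from the rewrite rules.  The
sketch below is plain Java, not the ANTLR runtime: Node, toStringTree,
upperBound and decl are hypothetical stand-ins that only mimic the
LISP-style output, and the expected shape (a "(NUMBER 4)" child under the
outer PRODUCT) is an assumption from reading the grammar, not something
ANTLR produced.

```java
import java.util.ArrayList;
import java.util.List;

public class TreeDiff {
    /** Hypothetical stand-in for an AST node; not ANTLR's CommonTree. */
    static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text, Node... kids) {
            this.text = text;
            for (Node k : kids) children.add(k);
        }

        /** LISP-style rendering: a leaf prints its text, otherwise (text child child ...). */
        String toStringTree() {
            if (children.isEmpty()) return text;
            StringBuilder sb = new StringBuilder("(").append(text);
            for (Node c : children) sb.append(' ').append(c.toStringTree());
            return sb.append(')').toString();
        }
    }

    static Node number(String digits) {
        return new Node("NUMBER", new Node(digits));
    }

    /** The inner sum for (param1 - 5 - 1), as it appears in the reported tree. */
    static Node innerSum() {
        return new Node("SUM",
            new Node("PRODUCT", new Node("param1")),
            new Node("PRODUCT", number("-1"), number("5")),
            new Node("PRODUCT", number("-1"), number("1")));
    }

    /** Upper bound; withFactor adds the (NUMBER 4) operand that "* 4" is assumed to contribute. */
    static Node upperBound(boolean withFactor) {
        Node product = new Node("PRODUCT", innerSum());
        if (withFactor) product.children.add(number("4"));
        return new Node("SUM", product);
    }

    /** Whole declaration b[1 ... upperBound]. */
    static Node decl(boolean withFactor) {
        Node lower = new Node("SUM", new Node("PRODUCT", number("1")));
        return new Node("b", lower, upperBound(withFactor));
    }

    public static void main(String[] args) {
        System.out.println("observed: " + decl(false).toStringTree());
        System.out.println("expected: " + decl(true).toStringTree());
    }
}
```

The two strings differ only in the "(NUMBER 4)" child of the outer
PRODUCT, so whatever is going wrong happens where that PRODUCT node is
built.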

The grammar fragment is below:
/**
* Name used in a variable declaration.
*/
nameDecl :
  IDENT -> IDENT
| IDENT LBRACK lb=subscriptAddExpr[false] ELLIPSIS ub=subscriptAddExpr[false] RBRACK
    -> ^(IDENT $lb $ub)
;

subscriptAddExpr[boolean negate]
:
  subscriptMultExpr[negate]
  ( PLUS subscriptMultExpr[negate]
  | MINUS subscriptMultExpr[!negate]
  )*
    -> ^(SUM subscriptMultExpr+)
;

/**
* @param negate if true, negate all expressions by multiplying by -1
*/
subscriptMultExpr[boolean negate]
:
  a+=subscriptAtom (a+=subscriptMultHelp)*
    -> {negate}? ^(PRODUCT ^(NUMBER NUM_INT["-1"]) $a)
    ->           ^(PRODUCT $a)
;

subscriptMultHelp : PRODUCT subscriptAtom -> subscriptAtom ;

/**
* Base type that can be inside a subscript.
*/
subscriptAtom
:
  IDENT
| numint
| subscriptParExpression
;

subscriptParExpression
:
LPAREN subscriptAddExpr[false] RPAREN -> subscriptAddExpr
;


/**
* A finite integer number.  May be negative.
*/
numint
:
  MINUS NUM_INT -> ^(NUMBER ^(MINUS NUM_INT))
| NUM_INT -> ^(NUMBER NUM_INT)
;

// ----------- Lexer ---------------------
// Operators
LPAREN          :   '('     ;
RPAREN          :   ')'     ;
LBRACK          :   '['     ;
RBRACK          :   ']'     ;
ELLIPSIS        :   '...'   ;
EQ              :   '='     ;
MINUS           :   '-'     ;
PLUS            :   '+'     ;
SEMI            :   ';'     ;
LCURLY          :   '{'     ;
RCURLY          :   '}'     ;
LE              :   '<='    ;
COLON           :   ':'     ;
COMMA           :   ','     ;
PRODUCT         :   '*'     ;

// Keywords
IN              :   'in'    ;

// Functions
SUMMATION       :   'SUM'   ;
LOOP            :   'LOOP'  ;
TAN             :   'tan'   ;
COS             :   'cos'   ;
SIN             :   'sin'   ;
LOG             :   'log'   ;
LOG10           :   'log10' ;
EXP             :   'exp'   ;
POW             :   'pow'   ;


/** Single-line comments */
SL_COMMENT
  : '//' ~( '\n'|'\r' )* '\r'? '\n' { $channel=HIDDEN; }
    ;

/** multiple-line comments */
ML_COMMENT
    :    '/*'
        ( options {greedy=false;} : . )*
        '*/'
        {$channel=HIDDEN;}
    ;

IDENT :
  ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
  ;

// a numeric literal
NUM_INT
  : ('0'..'9')+ EXPONENT?
  ;

NUM_FLOAT
    :     DIGITS '.' DIGITS? EXPONENT?
    | '.' DIGITS EXPONENT?
    ;

fragment
DIGITS : ('0'..'9')+ ;


// a fragment rule to assist in matching floating point numbers
fragment
EXPONENT
  : ('e'|'E') ('+'|'-')? ('0'..'9')+
  ;

// Whitespace -- ignored
WS    :    (    ' '
        |    '\t'
        |    '\f'
            // handle newlines
        |    (    '\r\n'  // Evil DOS
            |    '\n'    // Unix (the right way)
            )
        )+
        { $channel=HIDDEN; }
    ;
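
One reading of the subscriptMultExpr rule that would account for the
missing operand, offered as a hypothesis rather than a confirmed
diagnosis: "a" is a list label (a+=), but the rewrite references it as a
bare $a, which may pull only one element from the list, whereas $a+ would
emit all of them.  The plain-Java sketch below models just that
difference; ListLabelModel, rewriteSingle and rewriteAll are hypothetical
names, and this is not ANTLR's runtime code.

```java
import java.util.Arrays;
import java.util.List;

public class ListLabelModel {
    /** Models ^(PRODUCT $a): only the first collected element is emitted. */
    static String rewriteSingle(List<String> a) {
        return "(PRODUCT " + a.get(0) + ")";
    }

    /** Models ^(PRODUCT $a+): every collected element is emitted. */
    static String rewriteAll(List<String> a) {
        return "(PRODUCT " + String.join(" ", a) + ")";
    }

    public static void main(String[] args) {
        // Elements collected by a+=subscriptAtom (a+=subscriptMultHelp)* for "(...) * 4";
        // "(SUM ...)" abbreviates the parenthesized subtree.
        List<String> a = Arrays.asList("(SUM ...)", "(NUMBER 4)");
        System.out.println(rewriteSingle(a)); // (PRODUCT (SUM ...))
        System.out.println(rewriteAll(a));    // (PRODUCT (SUM ...) (NUMBER 4))
    }
}
```

If that is what is happening, using $a+ in both rewrite alternatives of
subscriptMultExpr would be the first thing to try.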



-- 
Jon Schewe | http://mtu.net/~jpschewe



