[antlr-interest] Why does antlr not know alternative?

John B. Brodie jbb at acm.org
Tue Jan 10 08:01:39 PST 2012


Greetings!

On 01/10/2012 12:20 AM, James Ladd wrote:
> I fixed this issue with NUMBER by making it a parser rule.  See grammar below.

are comments permitted to be embedded inside your numbers?
your COMMENT rule sends comments to the hidden channel, so the parser
never sees them. with number as a parser rule, something like
123"comment"."another"456 will therefore be parsed as a number.
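
a rough way to see the effect, as a sketch only: the class below does not use
ANTLR at all. it assumes a comment is a double quote, a run of non-quote
characters, and a closing double quote, and it simply deletes such runs to
approximate what a parser reading only the default channel would be left
with. the names HiddenCommentSketch and visibleToParser are made up for this
illustration.

```java
import java.util.regex.Pattern;

class HiddenCommentSketch {
    // rough stand-in for the COMMENT rule: a double quote, any run of
    // non-quote characters, then a closing double quote
    private static final Pattern COMMENT = Pattern.compile("\"[^\"]*\"");

    // returns the characters that remain once comments are taken out of
    // view, approximating what a parser on the default channel works with
    static String visibleToParser(String input) {
        return COMMENT.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String input = "123\"comment\".\"another\"456";
        System.out.println(visibleToParser(input)); // prints 123.456
    }
}
```

what is left is DIGITS '.' DIGITS, which the number rule accepts.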

> number returns [Number number]
>    : (d1=DIGITS r='r')? (m1='-')? d2=DIGITS ('.' d3=DIGITS)? (e='e' (m2='-')? d4=DIGITS)? {number = new Number($d1.text, $r.text, $m1.text, $d2.text, $d2.line,$d3.text, $e.text, $m2.text, $d4.text);}
>    ;
>
> Some numbers I can't parse are "4r1" "2r-3e4" "9e4", and I don't understand why.

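i think the r1 in the first example is being lexed as a single
IDENTIFIER token, since IDENTIFIER is LETTER (LETTER | DIGIT)*, and
likewise for the e4's in the other two examples. if so, the parser never
sees the 'r' or 'e' followed by DIGITS that your number rule expects.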

you might try to dump out the token stream between your lexing and 
parsing phases and see if the stream contains what you expect.
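
to illustrate what such a dump might show, here is a minimal sketch that does
not use ANTLR. it is a hand-written longest-match scanner that models only
the DIGITS and IDENTIFIER rules from the grammar and reports every other
character as OTHER. it ignores the implicit tokens that the 'r', 'e', '-'
and '.' literals create, so the real token stream may differ. the names
TokenDumpSketch and tokenize are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class TokenDumpSketch {
    static boolean isDigit(char c) { return c >= '0' && c <= '9'; }

    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // consumes the longest run that fits DIGITS (DIGIT+) or
    // IDENTIFIER (LETTER (LETTER | DIGIT)*); anything else is OTHER
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            int start = i;
            char c = input.charAt(i);
            if (isDigit(c)) {
                while (i < input.length() && isDigit(input.charAt(i))) i++;
                tokens.add("DIGITS(" + input.substring(start, i) + ")");
            } else if (isLetter(c)) {
                while (i < input.length()
                        && (isLetter(input.charAt(i)) || isDigit(input.charAt(i)))) i++;
                tokens.add("IDENTIFIER(" + input.substring(start, i) + ")");
            } else {
                i++;
                tokens.add("OTHER(" + c + ")");
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"4r1", "2r-3e4", "9e4"}) {
            System.out.println(s + " -> " + tokenize(s));
        }
    }
}
```

under this model "4r1" comes out as DIGITS(4) IDENTIFIER(r1), so there is no
separate 'r' token left for the number rule to match.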

>
> There is also an issue parsing '#-' and again i'm not sure why since '#' occurs only in the symbol constant rule.
'#' is also in the array_constant rule, but i do not think that is 
relevant to this particular problem.

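you have a '-' literal in your number parser rule and a '-' in your
BINARY_SELECTOR lexer rule, and of course these are not the same token
type. the lexer has to pick one of them for a given '-', and the parser
will only accept it where that particular type is expected.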

again dumping the token stream will probably point out the issue...
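
the point about token types can be sketched without ANTLR. the class below is
a toy model, not generated code: MINUS_LITERAL stands in for whatever
implicit token type the '-' literal gets, and all names here are made up for
the illustration. it only shows that a match is decided by token type, not
by token text.

```java
class TokenTypeSketch {
    // hypothetical token types: one for the '-' literal used in the
    // number rule, one for the BINARY_SELECTOR lexer rule
    enum Type { MINUS_LITERAL, BINARY_SELECTOR }

    static final class Token {
        final Type type;
        final String text;

        Token(Type type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // a parser alternative compares token types only; the text is ignored
    static boolean matches(Token token, Type expected) {
        return token.type == expected;
    }

    public static void main(String[] args) {
        Token minus = new Token(Type.MINUS_LITERAL, "-");
        // same text "-", but the symbol rule wants a BINARY_SELECTOR
        System.out.println(matches(minus, Type.BINARY_SELECTOR)); // prints false
        System.out.println(matches(minus, Type.MINUS_LITERAL));   // prints true
    }
}
```

which of the two types the lexer really assigns to the '-' in '#-' is exactly
what a token dump would tell you.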


>
> Any suggestions?

all of the above is just speculation since you did not supply a grammar 
that i could actually run in order to reproduce your problem. hope this 
helps anyway...

>
> Rgs, James.
>
>
> grammar Temp;
>
> options {
>    language = Java;
> }
> @header {
>    package st.redline.compiler;
> }
> @lexer::header {
>    package st.redline.compiler;
> }
> @lexer::members {
>    List<RecognitionException>  exceptions = new ArrayList<RecognitionException>();
>    public List<RecognitionException>  getExceptions() { return exceptions; }
>    public void reportError(RecognitionException e) { super.reportError(e); exceptions.add(e); }
> }
>
> program
>    : primary* EOF
>    ;
>
> primary returns [Primary primary]
>    : WHITESPACE?
>      ( IDENTIFIER {primary = new Identifier($IDENTIFIER.text, $IDENTIFIER.line);}
>      | number {primary = $number.number;}
>      | symbol_constant {primary = $symbol_constant.symbolConstant;}
>      | CHARACTER_CONSTANT {primary = new CharacterConstant($CHARACTER_CONSTANT.text.substring(1), $CHARACTER_CONSTANT.line);}
>      | STRING {primary = new StringConstant($STRING.text, $STRING.line);}
>      | array_constant {primary = $array_constant.arrayConstant; }
>      | block {primary = $block.block;}
>      | '(' expression WHITESPACE? ')'
>      )
>    ;
>
> statements returns [Statements statements]
>    : non_empty_statements? { statements = $non_empty_statements.statements; }
>    ;
>
> non_empty_statements returns [Statements statements]
>    : WHITESPACE? a='^'  expression '.' {statements = new AnswerStatement($a.line, $expression.expression);}
>    ;
>
> expression returns [Expression expression]
>    :  WHITESPACE? IDENTIFIER WHITESPACE? ':=' e=expression {expression = new AssignmentExpression($IDENTIFIER.text, $IDENTIFIER.line, $e.expression);}
>    | simple_expression {expression = $simple_expression.simpleExpression;}
>    ;
>
> simple_expression returns [SimpleExpression simpleExpression]
>    @init { simpleExpression = new SimpleExpression(); }
>    : primary {simpleExpression.add($primary.primary);}
>    ;
>
> block returns [Block block]
>    : o= '[' WHITESPACE? block_arguments? WHITESPACE? temporaries? ']' {block = new Block($o.line, $block_arguments.blockArguments, $temporaries.temporaries);}
>    ;
>
> temporaries returns [List<Identifier>  temporaries]
>    @init { temporaries = new ArrayList<Identifier>(); }
>    : ('|' | '||' | '|' WHITESPACE? '|')  WHITESPACE? (IDENTIFIER WHITESPACE? {temporaries.add(new Identifier($IDENTIFIER.text, $IDENTIFIER.line));})+ '|' WHITESPACE?
>    ;
>
> block_arguments returns [List<BlockArgument>  blockArguments]
>    @init { blockArguments = new ArrayList<BlockArgument>(); }
>    : (BLOCK_ARGUMENT WHITESPACE? {blockArguments.add(new BlockArgument($BLOCK_ARGUMENT.text.substring(1), $BLOCK_ARGUMENT.line));})+ '|'? WHITESPACE?
>    ;
>
> array_constant returns [ArrayConstant arrayConstant]
>    : h='#' array {arrayConstant = new ArrayConstant($array.array, $h.line);}
>    ;
>
> array returns [Array array]
>    @init { array = new Array(); }
>    : '(' (array_element {array.add($array_element.arrayElement);})* ')'
>    ;
>
> array_element returns [ArrayElement arrayElement]
>    : WHITESPACE
>    | number {arrayElement = $number.number;}
>    | symbol {arrayElement = $symbol.symbol;}
>    | STRING {arrayElement = new StringConstant($STRING.text, $STRING.line);}
>    | CHARACTER_CONSTANT {arrayElement = new CharacterConstant($CHARACTER_CONSTANT.text.substring(1), $CHARACTER_CONSTANT.line);}
>    | array {arrayElement = $array.array;}
>    ;
>
> symbol_constant returns [SymbolConstant symbolConstant]
>    : '#' symbol {symbolConstant = new SymbolConstant($symbol.symbol.value(), $symbol.symbol.line());}
>    ;
>
> symbol returns [Symbol symbol]
>    @init { symbol = new Symbol(); }
>    :  BINARY_SELECTOR {symbol.valueAndLine($BINARY_SELECTOR.text, $BINARY_SELECTOR.line);}
>    | IDENTIFIER {symbol.valueAndLine($IDENTIFIER.text, $IDENTIFIER.line); }
>    | (KEYWORD {symbol.addValueAndLine($KEYWORD.text, $KEYWORD.line);} )+    // Decision can match input such as "KEYWORD" using multiple alternatives: 1, 2
>    ;
>
> number returns [Number number]
>    : (d1=DIGITS r='r')? (m1='-')? d2=DIGITS ('.' d3=DIGITS)? (e='e' (m2='-')? d4=DIGITS)? {number = new Number($d1.text, $r.text, $m1.text, $d2.text, $d2.line,$d3.text, $e.text, $m2.text, $d4.text);}
>    ;
>
> WHITESPACE:        (' '|'\t'|'\r'|'\n')+;
> COMMENT:        '"' .* '"' {$channel = HIDDEN;};
> BINARY_SELECTOR:    ('-' (SPECIAL_CHAR)?) | (SPECIAL_CHAR)+;
> KEYWORD:        IDENTIFIER ':';
> BLOCK_ARGUMENT:    ':' IDENTIFIER;
> IDENTIFIER:        LETTER (LETTER | DIGIT)*;
> CHARACTER_CONSTANT:    '$' ('\'' | '"' | SPECIAL_CHAR | NORMAL_CHAR | DIGIT | LETTER);
> STRING:        '\'' (~'\''|'\'\'')* '\'';
> DIGITS:        DIGIT+;
>
> fragment LETTER:        ('a'..'z' | 'A'..'Z');
> fragment DIGIT:        '0'..'9';
> fragment SPECIAL_CHAR:        '+'|'/'|'\\'|'*'|'~'|'<'|'>'|'='|'@'|'%'|'|'|'&'|'?'|'!'|',';
> fragment NORMAL_CHAR:        '['|']'|'{'|'}'|'('|')'|'^'|'_'|';'|'$'|'#'|':'|'.'|'\'';
>
> *end*
