[antlr-interest] Ambiguity between floating point literal and method call

Ross Bamford roscoml at gmail.com
Wed Nov 2 05:18:08 PDT 2011


Thanks, Jim. I'd seen that FAQ page before and had tried integrating that
approach into my grammar, but I still can't get it to work. Parsing input
such as "1.foo()" results in the 1 and its period being matched together
(outputting '1.'), so my parser never sees the INTEGER DOT ID production
and I get NoViableAlt exceptions. Interestingly, after integrating the
changes you suggested, method calls on hex literals also no longer work,
although they do with my "normal" literal lexing.
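
To pin down the behaviour I'm after, here is a minimal sketch in plain
Python (not ANTLR, and not taken from the FAQ). The name lex_number is made
up for illustration, and the token names simply mirror my lexer rules. It
assumes I give up the "1." form of float literal, so that a dot is only
absorbed into a number when a digit follows it:

```python
import re

# Hypothetical stand-in for the numeric lexer rules. Alternatives are tried
# in order; a '.' is consumed only when at least one digit follows it.
_NUMBER = re.compile(r"""
    (?P<HEX_LITERAL>0[xX][0-9a-fA-F]+)
  | (?P<FLOATING_POINT_LITERAL>\d+\.\d+(?:[eE][+-]?\d+)?|\d+[eE][+-]?\d+)
  | (?P<DECIMAL_LITERAL>\d+)
""", re.VERBOSE)


def lex_number(text, pos=0):
    """Return (token_type, lexeme) for the number starting at pos, or None."""
    m = _NUMBER.match(text, pos)
    if m is None:
        return None
    # lastgroup is the name of the alternative that matched.
    return m.lastgroup, m.group()


print(lex_number("1.foo()"))
print(lex_number("1.5.foo()"))
print(lex_number("0xFF.foo()"))
```

With that rule, "1.foo()" yields a DECIMAL_LITERAL for the "1" and leaves
the '.' in the input to be lexed as a separate DOT, which is what the
literal DOT func_call_expr alternative in meth_call needs.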

I would very much like to look at the JavaFX source and see how it's done
over there. Unfortunately, I have very limited Internet service here (I live
in a very rural area). Do you know whether the source can be browsed online,
so that I don't have to download the whole tree?

Thanks again,
Ross

On Thu, Oct 27, 2011 at 12:02 AM, Jim Idle <jimi at temporal-wave.com> wrote:

> Please see the FAQ:
>
> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point%2C+dot%2C+range%2C+time+specs
>
> Which you can modify for your purpose, then you can add INTEGER DOT ID in
> your parser. If you were to download the source code for the JavaFX
> compiler, you will see that it supports that exact syntax.
>
>
> Jim
>
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Ross Bamford
> > Sent: Wednesday, October 26, 2011 3:37 PM
> > To: antlr-interest at antlr.org
> > Subject: [antlr-interest] Ambiguity between floating point literal and
> > method call
> >
> > Hi all,
> >
> > Have posted here recently, and thanks again for all your help in
> > getting my various problems fixed. I'm implementing a basic scripting
> > language for use in embedded systems, and I've come across another
> > problem that, after much googling and tinkering, I still can't seem to
> > fix. In this language, numbers are first-class objects, and I need to
> > be able to call methods on them, in the standard way, e.g. 1.foo().
> > However, I'm coming up against a problem whereby the parser can't
> > distinguish between this and floating point literals. I've tried
> > various combinations of predicates and the like, but just don't seem to
> > be able to get it working. Any help would be much appreciated!
> >
> > Thanks in advance,
> > Ross Bamford
> >
> > /* ** GRAMMAR FOLLOWS ** */
> > grammar BasicLang;
> >
> > options {
> >     output=AST;
> >     ASTLabelType=CommonTree;
> >     backtrack=true;
> >     memoize=true;
> > }
> >
> > tokens {
> >   ASSIGN;
> >   METHOD_CALL;
> >   ARGS;
> >   BLOCK;
> >   ORBLOCK;
> >   SELF;
> >   ASSIGN_RECEIVER;
> >   ASSIGN_LOCAL;
> >   FIELD_ACCESS;
> >   LVALUE;
> > }
> >
> > start_rule
> >   :   script
> >   ;
> >
> > script
> >   :   statement+
> >   |   EOF!
> >   ;
> >
> > statement
> >   :   expr terminator!
> >   ;
> >
> > expr
> >   :   assign_expr
> >   |   math_expr
> >   ;
> >
> > assign_expr
> > @init {boolean explicitReceiver=false;}
> >   :   (rec=IDENTIFIER DOT {explicitReceiver=true;})? id=IDENTIFIER ASSIGN expr
> >       -> {explicitReceiver}? ^(ASSIGN ASSIGN_RECEIVER[$rec.getText()] LVALUE[$id.getText()] expr)
> >       -> ^(ASSIGN ASSIGN_LOCAL LVALUE[$id.getText()] expr)
> >   ;
> >
> > math_expr
> >   :   mult_expr ((ADD^|SUB^) mult_expr)*
> >   ;
> >
> > mult_expr
> >   :   pow_expr ((MUL^|DIV^|MOD^) pow_expr)*
> >   ;
> >
> > pow_expr
> >   :   unary_expr ((POW^) unary_expr)*
> >   ;
> >
> > unary_expr
> >   :   NOT? atom
> >   ;
> >
> > meth_call
> > @init {boolean explicitReceiver=false;}
> >   :   (IDENTIFIER DOT {explicitReceiver=true;})? func_call_expr
> >       -> {explicitReceiver}? ^(METHOD_CALL IDENTIFIER func_call_expr)
> >       -> ^(METHOD_CALL SELF func_call_expr)
> >   |   literal DOT func_call_expr -> ^(METHOD_CALL literal func_call_expr)
> >   ;
> >
> > fragment
> > func_call_expr
> >   :   IDENTIFIER^ argument_list block? orblock?
> >   ;
> >
> > fragment
> > block
> >   :   LCURLY TERMINATOR? statement* RCURLY -> ^(BLOCK statement*)
> >   ;
> >
> > fragment
> > orblock
> >   :   OR LCURLY TERMINATOR? statement* RCURLY -> ^(ORBLOCK statement*)
> >   ;
> >
> > fragment
> > argument_list
> >   :   LPAREN (expr (COMMA expr)*)? RPAREN -> ^(ARGS expr expr*)?
> >   ;
> >
> > class_identifier
> >   :     rec=IDENTIFIER DOT id=IDENTIFIER -> ^(FIELD_ACCESS $rec $id)
> >   ;
> >
> > literal
> >   :     DECIMAL_LITERAL
> >   |     OCTAL_LITERAL
> >   |     HEX_LITERAL
> >   |     FLOATING_POINT_LITERAL
> >   |     STRING_LITERAL
> >   |     CHARACTER_LITERAL
> >   ;
> >
> > atom
> >   :     literal
> >   |     meth_call
> >   |     IDENTIFIER
> >   |     class_identifier
> >   |     LPAREN! expr RPAREN!
> >   ;
> >
> > terminator
> >   :     TERMINATOR
> >   |     EOF
> >   ;
> >
> > OR  :   'or';
> >
> > POW :   '^' ;
> > MOD :   '%' ;
> > ADD :   '+' ;
> > SUB :   '-' ;
> > DIV :   '/' ;
> > MUL :   '*' ;
> > NOT :   '!' ;
> >
> > ASSIGN
> >     :   '='
> >     ;
> >
> > LPAREN
> >     :   '('
> >     ;
> >
> > RPAREN
> >     :   ')'
> >     ;
> >
> > LCURLY
> >     :   '{'
> >     ;
> >
> > RCURLY
> >     :   '}'
> >     ;
> >
> > COMMA
> >     :   ','
> >     ;
> >
> > DOT :   '.' ;
> >
> > IDENTIFIER
> >   : ID_LETTER (ID_LETTER|'0'..'9')*
> >   ;
> >
> > fragment
> > ID_LETTER
> >   : '$'
> >   | 'A'..'Z'
> >   | 'a'..'z'
> >   | '_'
> >   ;
> >
> > CHARACTER_LITERAL
> >     :   '\'' ( EscapeSequence | ~('\''|'\\') ) '\''
> >     ;
> >
> > STRING_LITERAL
> >     :  '"' ( EscapeSequence | ~('\\'|'"') )* '"'
> >     ;
> >
> > HEX_LITERAL : '0' ('x'|'X') HexDigit+ IntegerTypeSuffix? ;
> >
> > DECIMAL_LITERAL : ('0' | '1'..'9' '0'..'9'*) IntegerTypeSuffix? ;
> >
> > OCTAL_LITERAL : '0' ('0'..'7')+ IntegerTypeSuffix? ;
> >
> > fragment
> > HexDigit : ('0'..'9'|'a'..'f'|'A'..'F') ;
> >
> > fragment
> > IntegerTypeSuffix
> >   : ('l'|'L')
> >   | ('u'|'U')  ('l'|'L')?
> >   ;
> >
> > FLOATING_POINT_LITERAL
> >     :   ('0'..'9')+ '.' ('0'..'9')* Exponent? FloatTypeSuffix?
> >     |   '.' ('0'..'9')+ Exponent? FloatTypeSuffix?
> >     |   ('0'..'9')+ Exponent? FloatTypeSuffix?
> >   ;
> >
> > fragment
> > Exponent : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
> >
> > fragment
> > FloatTypeSuffix : ('f'|'F'|'d'|'D') ;
> >
> > fragment
> > EscapeSequence
> >     :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\'|'/')
> >     |   OctalEscape
> >     |   UnicodeEscape
> >     ;
> >
> > fragment
> > OctalEscape
> >     :   '\\' ('0'..'3') ('0'..'7') ('0'..'7')
> >     |   '\\' ('0'..'7') ('0'..'7')
> >     |   '\\' ('0'..'7')
> >     ;
> >
> > fragment
> > UnicodeEscape
> >     :   '\\' 'u' HexDigit HexDigit HexDigit HexDigit
> >     ;
> > COMMENT
> >     :   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
> >     ;
> >
> > LINE_COMMENT
> >     : '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
> >     ;
> >
> > TERMINATOR
> >   : '\r'? '\n'
> >   | ';'
> >   ;
> >
> > WS  :  (' '|'\r'|'\t'|'\u000C') {$channel=HIDDEN;}
> >     |  '...' '\r'? '\n'  {$channel=HIDDEN;}
> >     ;
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address