[antlr-interest] Ambiguity between floating point literal and method call

Ross Bamford roscoml at gmail.com
Thu Nov 3 13:18:15 PDT 2011


Thanks again Jim - got it integrated today and it's all working well now. I
just stubbed out an error collector and grabbed a few methods from the
JavaFX tree to get it working. I had to take out the logic that guards
against floating point hex and octal literals, as I found it was stopping
me from parsing method calls on those literals (which ideally I do want to
be able to do). I'm thinking that I'll probably check for any such invalid
floats either in the parser or the compiler.
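
For what it's worth, a check like that could live in a small helper that runs after lexing. The following is only a sketch in Java (the grammar's action language). The class name RadixFloatCheck, the method isInvalidRadixFloat and the spellings treated as invalid (a hex or octal integer part followed by a fraction) are assumptions for illustration, not something taken from the JavaFX sources.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: flags literals that mix a hex/octal prefix with a fraction. */
class RadixFloatCheck {

    // Assumption: "invalid" means a hex integer part (0x1F) or an octal
    // integer part (017) directly followed by '.' and decimal digits.
    private static final Pattern INVALID_RADIX_FLOAT =
        Pattern.compile("0[xX][0-9a-fA-F]+\\.[0-9]+|0[0-7]+\\.[0-9]+");

    /** Returns true when the whole text is a hex or octal literal with a fraction. */
    static boolean isInvalidRadixFloat(String text) {
        return INVALID_RADIX_FLOAT.matcher(text).matches();
    }
}
```

The parser or compiler could call this on the source text of a literal expression and report an error when it returns true. Plain decimals such as 1.5 or 0.5 pass through untouched.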

Anyway, it's working like a charm now, so thanks again!

Ross

On Wed, Nov 2, 2011 at 4:48 PM, Jim Idle <jimi at temporal-wave.com> wrote:

> You can browse it here. Try not to depart from the lexer until you have
> this working.
>
>
>
>
> http://kenai.com/projects/openjfx-compiler/sources/jfx-debug/show/src/share/classes/com/sun/tools/javafx/antlr?rev=6727
>
>
>
> Jim
>
>
>
> *From:* Ross Bamford [mailto:roscoml at gmail.com]
> *Sent:* Wednesday, November 02, 2011 5:18 AM
> *To:* Jim Idle
> *Cc:* antlr-interest at antlr.org
> *Subject:* Re: [antlr-interest] Ambiguity between floating point literal
> and method call
>
>
>
> Thanks, Jim. I'd seen that FAQ page before, and had played with integrating
> that approach into my grammar, however I still don't seem to be able to get
> it to work - parsing input such as: "1.foo()" results in the 1 and its
> period being matched together (outputting '1.'), meaning that my parser
> never sees the INTEGER DOT ID production, and I get NoViableAlt exceptions.
> Interestingly, after integrating the changes you suggested hex literal
> method calls also no longer work, which they do with my "normal" literal
> lexing.
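
The behaviour wanted here can be illustrated outside ANTLR. Below is a hand-written Java sketch, not the FAQ's rule or the JavaFX lexer: after a run of digits it takes the '.' into the number only when another digit follows, so "1.foo()" comes out as INTEGER DOT ID. The class name DotLookaheadLexer and the token names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written sketch of the one-character lookahead; not ANTLR-generated code. */
class DotLookaheadLexer {

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Mirrors the grammar's ID_LETTER: '$', 'A'..'Z', 'a'..'z', '_'
    private static boolean isIdLetter(char c) {
        return c == '$' || c == '_' || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    /** Returns tokens as "TYPE:text" strings; only the kinds needed for the demo. */
    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == ' ' || c == '\t') {
                i++;                                   // skip whitespace
            } else if (isDigit(c)) {
                int start = i;
                while (i < src.length() && isDigit(src.charAt(i))) i++;
                // Key decision: the '.' belongs to the number only if a digit follows it.
                boolean fraction = i + 1 < src.length()
                        && src.charAt(i) == '.'
                        && isDigit(src.charAt(i + 1));
                if (fraction) {
                    i++;                               // consume the '.'
                    while (i < src.length() && isDigit(src.charAt(i))) i++;
                    tokens.add("FLOAT:" + src.substring(start, i));
                } else {
                    tokens.add("INTEGER:" + src.substring(start, i));
                }
            } else if (isIdLetter(c)) {
                int start = i;
                while (i < src.length()
                        && (isIdLetter(src.charAt(i)) || isDigit(src.charAt(i)))) i++;
                tokens.add("ID:" + src.substring(start, i));
            } else if (c == '.') {
                tokens.add("DOT:.");
                i++;
            } else if (c == '(') {
                tokens.add("LPAREN:(");
                i++;
            } else if (c == ')') {
                tokens.add("RPAREN:)");
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
        }
        return tokens;
    }
}
```

With this rule "1.5" is still a single FLOAT, but a bare "1." is no longer a float. That is the trade-off for allowing method calls on integer literals.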
>
>
>
> I would very much like to look at the JavaFX source and see how it's done
> over there. Unfortunately though I have very limited Internet service here
> (I live in a very rural area) and I wonder if you know if it's browseable
> online rather than having to download the source tree?
>
>
>
> Thanks again,
>
> Ross
>
> On Thu, Oct 27, 2011 at 12:02 AM, Jim Idle <jimi at temporal-wave.com> wrote:
>
> Please see the FAQ:
>
> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point%2C+dot%2C+range%2C+time+specs
>
> Which you can modify for your purpose, then you can add INTEGER DOT ID in
> your parser. If you were to download the source code for the JavaFX
> compiler, you will see that it supports that exact syntax.
>
>
> Jim
>
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Ross Bamford
> > Sent: Wednesday, October 26, 2011 3:37 PM
> > To: antlr-interest at antlr.org
> > Subject: [antlr-interest] Ambiguity between floating point literal and
> > method call
>
> >
> > Hi all,
> >
> > Have posted here recently, and thanks again for all your help in
> > getting my various problems fixed. I'm implementing a basic scripting
> > language for use in embedded systems, and I've come across another
> > problem that, after much googling and tinkering, I still can't seem to
> > fix. In this language, numbers are first-class objects, and I need to
> > be able to call methods on them, in the standard way, e.g. 1.foo() .
> > However, I'm coming up against a problem whereby the parser can't
> > distinguish between this and floating point literals. I've tried
> > various combinations of predicates and the like, but just don't seem to
> > be able to get it working. Any help would be much appreciated!
> >
> > Thanks in advance,
> > Ross Bamford
> >
> > /* ** GRAMMAR FOLLOWS ** */
> > grammar BasicLang;
> >
> > options {
> >     output=AST;
> >     ASTLabelType=CommonTree;
> >     backtrack=true;
> >     memoize=true;
> > }
> >
> > tokens {
> >   ASSIGN;
> >   METHOD_CALL;
> >   ARGS;
> >   BLOCK;
> >   ORBLOCK;
> >   SELF;
> >   ASSIGN_RECEIVER;
> >   ASSIGN_LOCAL;
> >   FIELD_ACCESS;
> >   LVALUE;
> > }
> >
> > start_rule
> >   :   script
> >   ;
> >
> > script
> >   :   statement+
> >   |   EOF!
> >   ;
> >
> > statement
> >   :   expr terminator!
> >   ;
> >
> > expr
> >   :   assign_expr
> >   |   math_expr
> >   ;
> >
> > assign_expr
> > @init {boolean explicitReceiver=false;}
> >   :   (rec=IDENTIFIER DOT {explicitReceiver=true;})? id=IDENTIFIER ASSIGN expr
> >       -> {explicitReceiver}? ^(ASSIGN ASSIGN_RECEIVER[$rec.getText()] LVALUE[$id.getText()] expr)
> >       -> ^(ASSIGN ASSIGN_LOCAL LVALUE[$id.getText()] expr)
> >   ;
> >
> > math_expr
> >   :   mult_expr ((ADD^|SUB^) mult_expr)*
> >   ;
> >
> > mult_expr
> >   :   pow_expr ((MUL^|DIV^|MOD^) pow_expr)*
> >   ;
> >
> > pow_expr
> >   :   unary_expr ((POW^) unary_expr)*
> >   ;
> >
> > unary_expr
> >   :   NOT? atom
> >   ;
> >
> > meth_call
> > @init {boolean explicitReceiver=false;}
> >   :   (IDENTIFIER DOT {explicitReceiver=true;})? func_call_expr
> >       -> {explicitReceiver}? ^(METHOD_CALL IDENTIFIER func_call_expr)
> >       -> ^(METHOD_CALL SELF func_call_expr)
> >   |   literal DOT func_call_expr -> ^(METHOD_CALL literal func_call_expr)
> >   ;
> >
> > fragment
> > func_call_expr
> >   :   IDENTIFIER^ argument_list block? orblock?
> >   ;
> >
> > fragment
> > block
> >   :   LCURLY TERMINATOR? statement* RCURLY -> ^(BLOCK statement*)
> >   ;
> >
> > fragment
> > orblock
> >   :   OR LCURLY TERMINATOR? statement* RCURLY -> ^(ORBLOCK statement*)
> >   ;
> >
> > fragment
> > argument_list
> >   :   LPAREN (expr (COMMA expr)*)? RPAREN -> ^(ARGS expr expr*)?
> >   ;
> >
> > class_identifier
> >   :     rec=IDENTIFIER DOT id=IDENTIFIER -> ^(FIELD_ACCESS $rec $id)
> >   ;
> >
> > literal
> >   :     DECIMAL_LITERAL
> >   |     OCTAL_LITERAL
> >   |     HEX_LITERAL
> >   |     FLOATING_POINT_LITERAL
> >   |     STRING_LITERAL
> >   |     CHARACTER_LITERAL
> >   ;
> >
> > atom
> >   :     literal
> >   |     meth_call
> >   |     IDENTIFIER
> >   |     class_identifier
> >   |     LPAREN! expr RPAREN!
> >   ;
> >
> > terminator
> >   :     TERMINATOR
> >   |     EOF
> >   ;
> >
> > OR  :   'or';
> >
> > POW :   '^' ;
> > MOD :   '%' ;
> > ADD :   '+' ;
> > SUB :   '-' ;
> > DIV :   '/' ;
> > MUL :   '*' ;
> > NOT :   '!' ;
> >
> > ASSIGN
> >     :   '='
> >     ;
> >
> > LPAREN
> >     :   '('
> >     ;
> >
> > RPAREN
> >     :   ')'
> >     ;
> >
> > LCURLY
> >     :   '{'
> >     ;
> >
> > RCURLY
> >     :   '}'
> >     ;
> >
> > COMMA
> >     :   ','
> >     ;
> >
> > DOT :   '.' ;
> >
> > IDENTIFIER
> >   : ID_LETTER (ID_LETTER|'0'..'9')*
> >   ;
> >
> > fragment
> > ID_LETTER
> >   : '$'
> >   | 'A'..'Z'
> >   | 'a'..'z'
> >   | '_'
> >   ;
> >
> > CHARACTER_LITERAL
> >     :   '\'' ( EscapeSequence | ~('\''|'\\') ) '\''
> >     ;
> >
> > STRING_LITERAL
> >     :  '"' ( EscapeSequence | ~('\\'|'"') )* '"'
> >     ;
> >
> > HEX_LITERAL : '0' ('x'|'X') HexDigit+ IntegerTypeSuffix? ;
> >
> > DECIMAL_LITERAL : ('0' | '1'..'9' '0'..'9'*) IntegerTypeSuffix? ;
> >
> > OCTAL_LITERAL : '0' ('0'..'7')+ IntegerTypeSuffix? ;
> >
> > fragment
> > HexDigit : ('0'..'9'|'a'..'f'|'A'..'F') ;
> >
> > fragment
> > IntegerTypeSuffix
> >   : ('l'|'L')
> >   | ('u'|'U')  ('l'|'L')?
> >   ;
> >
> > FLOATING_POINT_LITERAL
> >     :   ('0'..'9')+ '.' ('0'..'9')* Exponent? FloatTypeSuffix?
> >     |   '.' ('0'..'9')+ Exponent? FloatTypeSuffix?
> >     |   ('0'..'9')+ Exponent? FloatTypeSuffix?
> >   ;
> >
> > fragment
> > Exponent : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
> >
> > fragment
> > FloatTypeSuffix : ('f'|'F'|'d'|'D') ;
> >
> > fragment
> > EscapeSequence
> >     :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\'|'/')
> >     |   OctalEscape
> >     |   UnicodeEscape
> >     ;
> >
> > fragment
> > OctalEscape
> >     :   '\\' ('0'..'3') ('0'..'7') ('0'..'7')
> >     |   '\\' ('0'..'7') ('0'..'7')
> >     |   '\\' ('0'..'7')
> >     ;
> >
> > fragment
> > UnicodeEscape
> >     :   '\\' 'u' HexDigit HexDigit HexDigit HexDigit
> >     ;
> > COMMENT
> >     :   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
> >     ;
> >
> > LINE_COMMENT
> >     : '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
> >     ;
> >
> > TERMINATOR
> >   : '\r'? '\n'
> >   | ';'
> >   ;
> >
> > WS  :  (' '|'\r'|'\t'|'\u000C') {$channel=HIDDEN;}
> >     |  '...' '\r'? '\n'  {$channel=HIDDEN;}
> >     ;
> >
>
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address

