[antlr-interest] Can anyone help with a basic grammar problem in Antlr 3?

Michael Bedward michael.bedward at gmail.com
Thu Oct 13 17:04:13 PDT 2011


Hi Ross,

For a bit of a newbie that's a nice grammar - much neater than any of mine :)

If you rearrange your expr rule so that the assign_expr is the first
alternative...

expr
 :   assign_expr
 |   math_expr
 |   meth_call_expr
 ;

...I think that the grammar should be able to parse things like a = 1 + (b = 2)
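
To see why the order of alternatives can matter once backtracking is switched on, here is a toy Python model of ordered choice. It is only a sketch and not what ANTLR generates: the function names (parse, parse_expr and friends) are made up for illustration, the mini-grammar only knows '=', '+' and parentheses, and it assumes that an atom may also be a plain identifier, which the grammar as posted does not allow.

```python
import re

# Hypothetical mini-grammar (an assumption, much smaller than BasicLang):
#   expr   : alternatives tried in the given order, first match wins
#   assign : ID '=' expr
#   math   : atom ('+' atom)*
#   atom   : NUM | ID | '(' expr ')'
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_$][A-Za-z_$0-9]*)|(.))")


def tokenize(text):
    """Split text into (kind, text) pairs, skipping whitespace."""
    tokens = []
    for num, ident, op in TOKEN_RE.findall(text):
        if num:
            tokens.append(("NUM", num))
        elif ident:
            tokens.append(("ID", ident))
        elif op.strip():
            tokens.append((op, op))
    return tokens


def parse_atom(toks, i, order):
    if i < len(toks):
        kind, text = toks[i]
        if kind in ("NUM", "ID"):
            return text, i + 1
        if kind == "(":
            inner = parse_expr(toks, i + 1, order)
            if inner and inner[1] < len(toks) and toks[inner[1]][0] == ")":
                return inner[0], inner[1] + 1
    return None


def parse_math(toks, i, order):
    left = parse_atom(toks, i, order)
    if left is None:
        return None
    tree, i = left
    while i < len(toks) and toks[i][0] == "+":
        right = parse_atom(toks, i + 1, order)
        if right is None:
            return None
        tree, i = ("+", tree, right[0]), right[1]
    return tree, i


def parse_assign(toks, i, order):
    if i + 1 < len(toks) and toks[i][0] == "ID" and toks[i + 1][0] == "=":
        rhs = parse_expr(toks, i + 2, order)
        if rhs:
            return ("=", toks[i][1], rhs[0]), rhs[1]
    return None


RULES = {"assign": parse_assign, "math": parse_math}


def parse_expr(toks, i, order):
    """Ordered choice: return the result of the first alternative that matches."""
    for name in order:
        result = RULES[name](toks, i, order)
        if result is not None:
            return result
    return None


def parse(text, order):
    """Return a nested-tuple tree, or None unless the whole input is consumed."""
    toks = tokenize(text)
    result = parse_expr(toks, 0, order)
    if result is None or result[1] != len(toks):
        return None
    return result[0]
```

In this model, parse("a = 1 + (b = 2)", ["assign", "math"]) builds a tree, while parse("a = 1 + (b = 2)", ["math", "assign"]) returns None: the math alternative happily matches the lone "a" and stops, so the "=" is left over. The real grammar has more going on, so treat this as an intuition aid rather than a diagnosis.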

Michael


On 14 October 2011 10:38, Ross Bamford <roscoml at gmail.com> wrote:
> Hi Guys,
>
> I'm a bit of an Antlr newbie - I've successfully created and used Antlr 2
> grammars in the past but mostly by trial and error, and occasionally random
> hacking until it "worked"... I've recently become involved in a project that
> requires a very simple scripting language, and have decided to use Antlr 3
> for this, but I'm getting stuck quite early on - I think I have a
> fundamental problem in my grammar but after much hacking at it and trying
> various ideas I got from Google, I'm still hitting a bit of a brick wall.
>
> Basically I'm at the point where I have mathematical expressions and various
> literal types implemented, and am adding in function and method call
> handling - I want to be able to call methods with or without an explicit
> receiver, and in my language parentheses are optional (I know that
> complicates matters a bit but it's what I need for this project). I've
> written the grammar so far against a set of functional tests, and all is
> well with most of my syntax. Here is my grammar:
>
> /* ********* GRAMMAR *********** */
> grammar BasicLang;
>
> options {
>    output=AST;
>    ASTLabelType=CommonTree;
>    backtrack=true;
>    memoize=true;
> }
>
> tokens {
>  ASSIGN;
>  METHOD_CALL;
>  SELF;
> }
>
> @parser::members {
>  /* throw exceptions rather than silently failing... */
>  protected void mismatch(IntStream input, int ttype, BitSet follow)
>    throws RecognitionException
>  {
>    throw new MismatchedTokenException(ttype, input);
>  }
>  public Object recoverFromMismatchedSet(IntStream input, RecognitionException e, BitSet follow)
>    throws RecognitionException
>  {
>    throw e;
>  }
> }
>
> @rulecatch {
> // throw exceptions rather than silently failing...
> catch (RecognitionException e) {
>  throw e;
> }
> }
>
> start_rule
>  :   script
>  ;
>
> script
>  :   statement*
>  ;
>
> statement
>  :   expr terminator!
>  ;
>
> expr
>  :   math_expr
>  |   assign_expr
>  |   meth_call_expr
>  ;
>
> meth_call_expr
>  :   (IDENTIFIER DOT)? func_call_expr -> ^(METHOD_CALL IDENTIFIER? func_call_expr)
>  |   (STRING_LITERAL DOT)? func_call_expr -> ^(METHOD_CALL STRING_LITERAL? func_call_expr)
>  ;
>
> fragment
> func_call_expr
>  :   IDENTIFIER^ argument_list
>  ;
>
> fragment
> argument_list
>  :   LPAREN!? (expr (COMMA! expr)*)? RPAREN!?
>  ;
>
> assign_expr
>  :   IDENTIFIER ASSIGN expr -> ^(ASSIGN IDENTIFIER expr)
>  ;
>
> math_expr
>  :   mult_expr ((ADD^|SUB^) mult_expr)*
>  ;
>
> mult_expr
>  :   pow_expr ((MUL^|DIV^|MOD^) pow_expr)*
>  ;
>
> pow_expr
>  :   unary_expr ((POW^) unary_expr)*
>  ;
>
> unary_expr
>  :   NOT? atom
>  ;
>
> atom
>  :     literal
>  |     LPAREN! expr RPAREN!
>  ;
>
> literal
>  :     HEX_LITERAL
>  |     DECIMAL_LITERAL
>  |     OCTAL_LITERAL
>  |     FLOATING_POINT_LITERAL
> //  |     REGEXP_LITERAL
>  |     STRING_LITERAL
>  ;
>
> terminator
>  :     TERMINATOR
>  |     EOF
>  ;
>
> POW :   '^' ;
> MOD :   '%' ;
> ADD :   '+' ;
> SUB :   '-' ;
> DIV :   '/' ;
> MUL :   '*' ;
> NOT :   '!' ;
>
> ASSIGN
>    :   '='
>    ;
>
> LPAREN
>    :   '('
>    ;
>
> RPAREN
>    :   ')'
>    ;
>
> COMMA
>    :   ','
>    ;
>
> DOT :   '.' ;
>
> CHARACTER_LITERAL
>    :   '\'' ( EscapeSequence | ~('\''|'\\') ) '\''
>    ;
>
> STRING_LITERAL
>    :  '"' ( EscapeSequence | ~('\\'|'"') )* '"'
>    ;
>
> /*
> REGEXP_LITERAL
>    :  '/' ( EscapeSequence | ~('\\'|'"') )* '/'
>    ;
> */
>
> HEX_LITERAL : '0' ('x'|'X') HexDigit+ IntegerTypeSuffix? ;
>
> DECIMAL_LITERAL : ('0' | '1'..'9' '0'..'9'*) IntegerTypeSuffix? ;
>
> OCTAL_LITERAL : '0' ('0'..'7')+ IntegerTypeSuffix? ;
>
> fragment
> HexDigit : ('0'..'9'|'a'..'f'|'A'..'F') ;
>
> fragment
> IntegerTypeSuffix
>  : ('l'|'L')
>  | ('u'|'U')  ('l'|'L')?
>  ;
>
> FLOATING_POINT_LITERAL
>    :   ('0'..'9')+ '.' ('0'..'9')* Exponent? FloatTypeSuffix?
>    |   '.' ('0'..'9')+ Exponent? FloatTypeSuffix?
>    |   ('0'..'9')+ Exponent? FloatTypeSuffix?
>  ;
>
> fragment
> Exponent : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
>
> fragment
> FloatTypeSuffix : ('f'|'F'|'d'|'D') ;
>
> fragment
> EscapeSequence
>    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\'|'/')
>    |   OctalEscape
>    ;
>
> fragment
> OctalEscape
>    :   '\\' ('0'..'3') ('0'..'7') ('0'..'7')
>    |   '\\' ('0'..'7') ('0'..'7')
>    |   '\\' ('0'..'7')
>    ;
>
> fragment
> UnicodeEscape
>    :   '\\' 'u' HexDigit HexDigit HexDigit HexDigit
>    ;
> COMMENT
>    :   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
>    ;
>
> LINE_COMMENT
>    : '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
>    ;
>
> IDENTIFIER
>  : ID_LETTER (ID_LETTER|'0'..'9')*
>  ;
>
> fragment
> ID_LETTER
>  : '$'
>  | 'A'..'Z'
>  | 'a'..'z'
>  | '_'
>  ;
>
> TERMINATOR
>  : '\r'? '\n'
>  | ';'
>  ;
>
> WS  :  (' '|'\r'|'\t'|'\u000C') {$channel=HIDDEN;}
>    |  '...' '\r'? '\n'  {$channel=HIDDEN;}
>    ;
>
> /* *************** END *************** */
>
> With this grammar, my tests so far pass, and I'm building trees for simple
> arithmetic operations and the like, including involving variables (e.g. a+1
> and the like), and method calls are working as I expect, including when
> passing method call results as args to another method call. But I cannot get
> input such as "a=b+(c=1)" to parse at all - debugging in AntlrWorks shows me
> that the problem occurs when the parser sees the "b+", at which point it
> throws a NoViableAltException.
>
> I guessed this was because the parser doesn't see the identifier as an atom,
> so tries to parse it with the + symbol. So, I tried adding IDENTIFIER as an
> alternative to the atom rule - but that just broke the parser completely and
> many of my tests failed with an exception - MismatchedSetException.
>
> I've been playing with this for a few days now but no matter what I do, even
> when I get the type of syntax I mentioned above (the assign statement)
> working, I invariably break something else (or more often, everything! :( ).
> I'm really hoping someone out there will take pity on me and give me some
> insight into what I'm doing wrong.
>
> Thanks in advance!
> --
> Ross Bamford - roscoml at gmail.com

