[antlr-interest] AST generation: EXPRESSION TREE example.

Wed Jun 2 13:56:43 PDT 2004

Hi Monty,

I am aware that I'm dealing with heterogeneous ASTs. I have the required
classes from the documentation (CalcAST, BinaryOperatorAST, INTNode,
PLUSNode and MULTNode), and I tried "parser.setASTNodeClass(myPkg.CalcAST);"
since setASTNodeType("myClassName") gave me a deprecation warning. The
CalcAST class extends antlr.BaseAST rather than antlr.CommonAST. What is
the difference? I tried both base classes and the error remains.
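
For what it's worth, here is a minimal sketch of one way a "can't make instance" style failure can arise when a factory builds nodes by class name through reflection. This is not ANTLR's actual code: NodeFactorySketch and tryCreate are hypothetical names, and JDK classes stand in for the node classes. It only illustrates that a name-based lookup needs the fully qualified class name and a usable constructor, so if the node classes live in a package such as myPkg, that may be worth checking.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-in for a factory that instantiates classes by name.
// It is NOT ANTLR's ASTFactory; it only mimics the general reflection pattern.
class NodeFactorySketch {
    // Returns a new instance, or null when the class cannot be found or built.
    static Object tryCreate(String className) {
        try {
            Class<?> c = Class.forName(className);            // needs the fully qualified name
            Constructor<?> ctor = c.getDeclaredConstructor(); // needs a no-argument constructor
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Fully qualified name with a public no-argument constructor: succeeds.
        System.out.println(tryCreate("java.lang.StringBuilder") != null); // prints true
        // Simple name without the package: the lookup fails.
        System.out.println(tryCreate("StringBuilder") != null);           // prints false
        // Class found, but it has no no-argument constructor: fails as well.
        System.out.println(tryCreate("java.lang.Integer") != null);       // prints false
    }
}
```

The sketch swallows the exception and returns null only to keep the example short; a real factory would more likely report the class name in an error message.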

A homogeneous AST works, and I can walk it with a tree walker that extends
TreeParser.
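
For context, a rough sketch of what walking a homogeneous tree amounts to: every node has the same class, and the walker decides what to do from the token type. ANTLR 2 ASTs use a first-child/next-sibling layout (getFirstChild(), getNextSibling()); the class below imitates that layout without depending on ANTLR. All names in it (SiblingTreeSketch, leaf, binary, eval, and the token type constants) are made up for illustration.

```java
// Hypothetical child-sibling node; a single class is used for every node.
class SiblingTreeSketch {
    static final int INT = 1, PLUS = 2, MULT = 3; // made-up token types

    final int type;
    final String text;
    SiblingTreeSketch firstChild;
    SiblingTreeSketch nextSibling;

    SiblingTreeSketch(int type, String text) {
        this.type = type;
        this.text = text;
    }

    // Creates an integer leaf node.
    static SiblingTreeSketch leaf(String text) {
        return new SiblingTreeSketch(INT, text);
    }

    // Creates an operator node whose children are left, then right.
    static SiblingTreeSketch binary(int type, String text,
                                    SiblingTreeSketch left, SiblingTreeSketch right) {
        SiblingTreeSketch n = new SiblingTreeSketch(type, text);
        n.firstChild = left;
        left.nextSibling = right;
        return n;
    }

    // Evaluates the tree by switching on the token type of each node.
    static int eval(SiblingTreeSketch n) {
        switch (n.type) {
            case INT:  return Integer.parseInt(n.text);
            case PLUS: return eval(n.firstChild) + eval(n.firstChild.nextSibling);
            case MULT: return eval(n.firstChild) * eval(n.firstChild.nextSibling);
            default:   throw new IllegalArgumentException("unexpected node type " + n.type);
        }
    }

    public static void main(String[] args) {
        // 3 + 4 * 5
        SiblingTreeSketch tree =
                binary(PLUS, "+", leaf("3"), binary(MULT, "*", leaf("4"), leaf("5")));
        System.out.println(eval(tree)); // prints 23
    }
}
```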

Thanks for your prompt reply!

Bharath.

-----Original Message-----
From: Monty Zukowski [mailto:monty at codetransform.com] 
Sent: Wednesday, June 02, 2004 3:32 PM
To: antlr-interest at yahoogroups.com
Cc: Monty Zukowski
Subject: Re: [antlr-interest] AST generation: EXPRESSION TREE example.

You are trying to create heterogeneous nodes.  Look in the heteroAST 
example in the antlr distribution.
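
To illustrate what "heterogeneous" means here, a small sketch under stated assumptions: each kind of node gets its own class and knows how to evaluate itself, so no switch on the token type is needed. The classes below are hypothetical, loosely named after the ones mentioned in this thread; they do not extend any ANTLR class and are not the code from the heteroAST example.

```java
// Demo entry point for the sketch below.
class HeteroDemoSketch {
    public static void main(String[] args) {
        // 3 + 4 * 5
        CalcNodeSketch tree = new PlusSketch(new IntSketch(3),
                new MultSketch(new IntSketch(4), new IntSketch(5)));
        System.out.println(tree.value()); // prints 23
    }
}

// Common base type: every node can compute its own value.
abstract class CalcNodeSketch {
    abstract int value();
}

// Leaf node holding an integer.
class IntSketch extends CalcNodeSketch {
    private final int v;
    IntSketch(int v) { this.v = v; }
    @Override int value() { return v; }
}

// Shared base for operators with a left and a right operand.
abstract class BinarySketch extends CalcNodeSketch {
    protected final CalcNodeSketch left;
    protected final CalcNodeSketch right;
    BinarySketch(CalcNodeSketch left, CalcNodeSketch right) {
        this.left = left;
        this.right = right;
    }
}

class PlusSketch extends BinarySketch {
    PlusSketch(CalcNodeSketch l, CalcNodeSketch r) { super(l, r); }
    @Override int value() { return left.value() + right.value(); }
}

class MultSketch extends BinarySketch {
    MultSketch(CalcNodeSketch l, CalcNodeSketch r) { super(l, r); }
    @Override int value() { return left.value() * right.value(); }
}
```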

Monty

On Jun 2, 2004, at 1:29 PM, Bharath Sundararaman wrote:

> Hi all,
>
> I looked at the documentation for ASTs
> (http://www.antlr.org/doc/trees.html) and tried the EXPRESSION TREE
> example provided there. The grammar compiles without any errors, but
> when I run the main class I get an error that says: "Invalid class or
> can't make instance, PLUSNode". I get the same for MULTNode and
> INTNode. Am I missing something here?
>
> Ter: the tutorial was very useful, thanks!
>
> Thanks,
>
> Bharath.
>
> -----Original Message-----
> From: Monty Zukowski [mailto:monty at codetransform.com]
> Sent: Monday, May 24, 2004 10:20 AM
> To: antlr-interest at yahoogroups.com
> Cc: Monty Zukowski
> Subject: Re: [antlr-interest] Whitespace problem. (keywords Vs identifiers)
>
>
>
> On May 21, 2004, at 9:03 AM, Bharath Sundararaman wrote:
>
>> Hi Monty,
>>
>> Here's my rule:
>>
>> IDMEAT: i:IDENT {
>>         // Remap the token type based on the identifier text.
>>         if (i.getText().equals("t") || i.getText().equals("T")
>>                 || i.getText().equals("time")) {
>>             $setType(TIME_PREFIX);
>>         }
>>         else if (i.getText().equals("e") || i.getText().equals("E")) {
>>             $setType(Exponent_prefix);
>>         }
>>         else {
>>             $setType(i.getType());
>>         }
>>     };
>>
>
> IDENT will have set the type of the token, so your test could be
> if (i.getType() == T || i.getType() == TIME), etc.
>
> You also aren't testing for '#' and a number, so you will get
> TIME_PREFIX for a variable named 't' no matter what follows.
>
> E9 is a valid identifier, I assume.  That one should probably be
> handled in IDENT:
>
> IDENT
>     :   (('e'|'E') (INT | PLUS | MINUS))=> ('e'|'E')
>         { $setType(Exponent_prefix); }
>     |   ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9')*
>     ;
>
>
>
>> Problem: my time rule (in the parser) is
>> time: TIME_PREFIX HASH Int; and it accepts values like "t#9" or "T#9".
>> Note that there's no space between 't' and '#', and that's what I
>> want. However, for Exponent_prefix, it doesn't work.
>>
>> exponent: Exponent_prefix (PLUS|MINUS)? Int; allows "E 9" or "E+9"
>> but it doesn't allow "E9". I tried to ignore WHITESPACE in the IDMEAT
>> rule, but that can't be the problem because TIME_PREFIX works fine.
>>
>> Any ideas?
>>
>> B.
>>
>> -----Original Message-----
>> From: Monty Zukowski [mailto:monty at codetransform.com]
>> Sent: Thursday, May 20, 2004 12:05 PM
>> To: antlr-interest at yahoogroups.com
>> Cc: Monty Zukowski
>> Subject: Re: [antlr-interest] Keywords Vs Identifiers.
>>
>>
>> I'm sorry, I was in a hurry.  Inspect the generated code and you will
>> see in the ID rule where antlr tests the token text against the
>> literals table and assigns the token type.  To use it in a rule you
>> may need a semantic predicate.  This is a little tricky because you
>> need to use the predicate to choose an alternative.  Hmmm, maybe you
>> could get by with calling the lexer rule directly in your action
>> code.  Yes, in your action where you see the TIME id, call the WS
>> rule and then the INT rule.  If either fails, that's OK: it was not the
>> TIME keyword, it was an ID, so change the type back.  Then call your
>> s,m,ms rule.  The text will still be appended to the token buffer and
>> make it through to the parser.  Try it out and ask when you hit a
>> problem.  I wish I had another 15 minutes to explain fully...
>>
>> Monty
>>
>> On May 20, 2004, at 6:30 AM, Bharath S wrote:
>>
>>> Hi Monty,
>>>
>>> I am unclear about the ID token here. Let's say that the lexer sees
>>> "abc", which is a token of type ID. Please correct me if my
>>> understanding is not right.
>>>
>>> 1. The if (i.getType( )) statement is used to test against literals.
>>> So, if the ID was "INT" instead of "abc", it would return LITERAL_INT
>>> and it would skip that token. Otherwise, it sets the type of "abc" to
>>> ID. Though ID by itself has the testLiterals option set, the IDMEAT
>>> rule would allow both the ID rule and the (TIME : "TIME" Integer;)
>>> rule to co-exist in the lexer.
>>>
>>> 2. This is a better solution because if I had 's', 'm', 'ms' etc. to
>>> denote seconds, minutes and milliseconds, I would have to write a
>>> separate rule for each of them in the parser (if I follow my
>>> solution) to prevent a conflict with the ID rule. Doing it via IDMEAT
>>> will solve the issue and make life easier.
>>>
>>> Thanks for your comments and clarifications!
>>>
>>> Bharath.
>>> ----- Original Message -----
>>> From: "Monty Zukowski" <monty at codetransform.com>
>>> To: <antlr-interest at yahoogroups.com>
>>> Cc: "Monty Zukowski" <monty at codetransform.com>
>>> Sent: Wednesday, May 19, 2004 5:13 PM
>>> Subject: Re: [antlr-interest] Keywords Vs Identifiers.
>>>
>>>
>>>> If you want to handle that in the lexer you need to do it by
>>>> calling the rule that tests the literals table. Here's an example
>>>> from the C grammar:
>>>>
>>>> IDMEAT
>>>>     :   i:ID
>>>>         {
>>>>             if ( i.getType() == LITERAL___extension__ ) {
>>>>                 $setType(Token.SKIP);
>>>>             }
>>>>             else {
>>>>                 $setType(i.getType());
>>>>             }
>>>>         }
>>>>     ;
>>>>
>>>> protected ID
>>>>          options
>>>>                  {
>>>>                  testLiterals = true;
>>>>                  }
>>>>          :       ( 'a'..'z' | 'A'..'Z' | '_' | '$')
>>>>                  ( 'a'..'z' | 'A'..'Z' | '_' | '$' | '0'..'9' )*
>>>>          ;
>>>>
>>>> It's actually tricky to figure out how to lex the following
>>>> whitespace and integer without using a syntactic predicate, but a
>>>> syn pred here will be a performance problem.  I would actually
>>>> recommend using a parser filter; see
>>>> http://www.codetransform.com/filterexample.html
>>>>
>>>> By the way, your parser solution works just fine too, and is
>>>> probably the easiest.
>>>>
>>>> Monty
>>>>
>>>> On May 19, 2004, at 2:55 PM, Bharath wrote:
>>>>
>>>>> Hi Monty,
>>>>>
>>>>> I did. I figured a way out too but I am not sure if it's an 
>>>>> efficient solution. I set a rule in the parser which accepts an 
>>>>> identifier and I extracted the identifier input into a string. If 
>>>>> the string is not "TIME", I throw an exception, otherwise I accept 
>>>>> it. (using getText() method).
>>>>>
>>>>> Please let me know if this is bad practice.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Bharath.
>>>>>
>>>>> -----Original Message-----
>>>>> From: Monty Zukowski [mailto:monty at codetransform.com]
>>>>> Sent: Wednesday, May 19, 2004 4:41 PM
>>>>> To: antlr-interest at yahoogroups.com
>>>>> Cc: Monty Zukowski
>>>>> Subject: Re: [antlr-interest] Keywords Vs Identifiers.
>>>>>
>>>>> See the documentation about "literals"
>>>>>
>>>>> Monty
>>>>>
>>>>> On May 19, 2004, at 8:25 AM, Bharath S wrote:
>>>>>
>>>>>> Hi Antlers,
>>>>>>
>>>>>> I have some rules in my grammar for time literals, which require
>>>>>> that 'TIME' or "time" be prepended to the rule. For example, a
>>>>>> time can be represented as TIME 99secs. The problem is, "TIME" is
>>>>>> not a keyword, so I can't have it in the parser. If I put it in
>>>>>> the lexer, it causes a clash with the IDENTIFIER rule, because the
>>>>>> lexer sees the rules as
>>>>>>
>>>>>> TIME: 'T' 'I' 'M' 'E' (Integer) ; and
>>>>>> IDENTIFIER: ('a'..'z'|'A'..'Z')+;
>>>>>>
>>>>>> as expected. Is there a common workaround for this?
>>>>>>
>>>>>> I can solve this problem by moving a whole bunch of rules in the
>>>>>> parser back to the lexer, just to make the TIME rule protected.
>>>>>> But it doesn't make sense at all.
>>>>>>
>>>>>> Any comments are most welcome.
>>>>>>
>>>>>> Bharath.
>>>>> Monty Zukowski
>>>>>
>>>>> ANTLR & Java Consultant -- http://www.codetransform.com
>>>>> ANSI C/GCC transformation toolkit -- http://www.codetransform.com/gcc.html
>>>>> Embrace the Decay -- http://www.codetransform.com/EmbraceDecay.html
Monty Zukowski

ANTLR & Java Consultant -- http://www.codetransform.com
ANSI C/GCC transformation toolkit -- 
http://www.codetransform.com/gcc.html
Embrace the Decay -- http://www.codetransform.com/EmbraceDecay.html

Yahoo! Groups Links

<*> To visit your group on the web, go to:
     http://groups.yahoo.com/group/antlr-interest/

<*> To unsubscribe from this group, send an email to:
     antlr-interest-unsubscribe at yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
     http://docs.yahoo.com/info/terms/