[antlr-interest] AST generation: EXPRESSION TREE example.
Bharath Sundararaman
bharath at starthis.com
Wed Jun 2 13:56:43 PDT 2004
Hi Monty,
I am aware that I'm dealing with heterogeneous ASTs. I have the required
classes from the documentation (CalcAST, BinaryOperatorAST, INTNode,
PLUSNode and MULTNode) and I tried "parser.setASTNodeClass(myPkg.CalcAST);"
since setASTNodeType("myClassName") gave me a deprecation warning. The CalcAST
class extends antlr.BaseAST and not antlr.CommonAST -- what's the
difference? I tried both base classes and the error remains.
Homogeneous AST works and I can walk it with a treewalker that extends
TreeParser.
Thanks for your prompt reply!
Bharath.
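
The wording of the error suggests that the node class is being located and
instantiated by name at run time. Assuming that is the case (the thread does
not confirm it), an unqualified name such as "PLUSNode" would fail to resolve
if the class actually lives in a package like myPkg, and a class without a
public no-argument constructor would fail to instantiate. The plain-Java
sketch below shows that failure mode using only java.lang.reflect;
NodeFactorySketch, PlusNode and tryCreate are hypothetical names, not ANTLR
API.

```java
import java.lang.reflect.Constructor;

public class NodeFactorySketch {

    /** Hypothetical stand-in for a heterogeneous AST node class (not ANTLR's). */
    public static class PlusNode {
        public PlusNode() {}
    }

    /**
     * Load a class by name and call its public no-argument constructor,
     * roughly what a reflection-based node factory has to do.
     */
    public static String tryCreate(String className) {
        try {
            Class<?> c = Class.forName(className);
            Constructor<?> ctor = c.getConstructor();
            Object node = ctor.newInstance();
            return "ok: " + node.getClass().getName();
        } catch (ReflectiveOperationException e) {
            // Covers an unknown class name, a missing constructor, and similar failures.
            return "failed: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Simple name only: the class loader cannot resolve it.
        System.out.println(tryCreate("PlusNode"));
        // Fully qualified (binary) name: resolves and instantiates.
        System.out.println(tryCreate(PlusNode.class.getName()));
    }
}
```

If this assumption holds, checking that the grammar refers to the node classes
by a name the class loader can resolve, and that each class is public with a
public no-argument constructor, would be a reasonable first step.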
-----Original Message-----
From: Monty Zukowski [mailto:monty at codetransform.com]
Sent: Wednesday, June 02, 2004 3:32 PM
To: antlr-interest at yahoogroups.com
Cc: Monty Zukowski
Subject: Re: [antlr-interest] AST generation: EXPRESSION TREE example.
You are trying to create heterogeneous nodes. Look in the heteroAST
example in the antlr distribution.
Monty
On Jun 2, 2004, at 1:29 PM, Bharath Sundararaman wrote:
> Hi all,
>
> I looked at the documentation for AST
> (http://www.antlr.org/doc/trees.html)
> and I tried the EXPRESSION TREE example provided in the documentation.
> The grammar compiles without any errors but when I run the main class,
> I get an error that says:
> "Invalid class or can't make instance, PLUSNode".
> I get the same for MULTNode and INTNode. Am I missing something here?
>
> Ter :- The tutorial was very useful, thanks!
>
> Thanks,
>
> Bharath.
>
> -----Original Message-----
> From: Monty Zukowski [mailto:monty at codetransform.com]
> Sent: Monday, May 24, 2004 10:20 AM
> To: antlr-interest at yahoogroups.com
> Cc: Monty Zukowski
> Subject: Re: [antlr-interest] Whitespace problem. (keywords Vs
> identifiers)
>
>
>
> On May 21, 2004, at 9:03 AM, Bharath Sundararaman wrote:
>
>> Hi Monty,
>>
>> Here's my rule:
>>
>> IDMEAT : i:IDENT {
>>     if (i.getText().equals("t") || i.getText().equals("T") ||
>>         i.getText().equals("time")) {
>>         $setType(TIME_PREFIX);
>>     }
>>     else if (i.getText().equals("e") || i.getText().equals("E")) {
>>         $setType(Exponent_prefix);
>>     }
>>     else {
>>         $setType(i.getType());
>>     }
>> };
>>
>
> IDENT will have set the type of the token, so your test could be
> if(i.getType()==T | i.getType()==TIME etc.)
>
> You also aren't testing for # and a number, so you will get
> TIME_PREFIX for a variable named 't' no matter what follows.
>
> E9 is a valid identifier, I assume. That one should probably be
> handled in IDENT:
>
> IDENT:
>     (('e'|'E') (INT | PLUS | MINUS))=> ('e'|'E')
>         {$setType(Exponent_prefix);}
>   | ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9')*
>   ;
>
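
The lookahead idea in the IDENT rule above (treat 'e' or 'E' as an exponent
prefix only when a digit or a sign follows it) can be sketched outside ANTLR
in plain Java. This is a hand-written illustration, not generated lexer code;
the class, enum and method names are hypothetical.

```java
public class ExponentLookahead {

    /** Hypothetical token kinds, standing in for Exponent_prefix and IDENT. */
    public enum Kind { EXPONENT_PREFIX, IDENT }

    /**
     * Decide what the letter at position pos starts by peeking one
     * character ahead, without consuming anything.
     */
    public static Kind classify(String input, int pos) {
        char c = input.charAt(pos);
        boolean isE = (c == 'e' || c == 'E');
        if (isE && pos + 1 < input.length()) {
            char next = input.charAt(pos + 1);
            // A digit or a sign after 'e'/'E' means this is an exponent prefix.
            if (Character.isDigit(next) || next == '+' || next == '-') {
                return Kind.EXPONENT_PREFIX;
            }
        }
        // Anything else is treated as the start of an ordinary identifier.
        return Kind.IDENT;
    }

    public static void main(String[] args) {
        System.out.println(classify("E9", 0));
        System.out.println(classify("E+9", 0));
        System.out.println(classify("Exp", 0));
    }
}
```

Note that under this rule "E9" is never an identifier, which matches the
trade-off raised above about E9 otherwise being a valid identifier.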
>
>
>> Problem: My time rule (in the parser) is
>> time: TIME_PREFIX HASH Int; and it takes values like "t#9" or "T#9".
>> Note that there's no space between 't' and '#', and that's what I
>> want. However, for Exponent_prefix, it doesn't work.
>>
>> exponent: Exponent_prefix (PLUS|MINUS)? Int; allows "E 9" or "E+9"
>> but it doesn't allow "E9". I tried to ignore WHITESPACE in the IDMEAT
>> rule but that can't be the problem because TIME_PREFIX works fine.
>>
>> Any ideas?
>>
>> B.
>>
>> -----Original Message-----
>> From: Monty Zukowski [mailto:monty at codetransform.com]
>> Sent: Thursday, May 20, 2004 12:05 PM
>> To: antlr-interest at yahoogroups.com
>> Cc: Monty Zukowski
>> Subject: Re: [antlr-interest] Keywords Vs Identifiers.
>>
>>
>> I'm sorry, I was in a hurry. Inspect the generated code; you will
>> see in the ID rule where antlr tests the token text against the
>> literals table and assigns the token type. To use it in a rule you
>> may need a semantic predicate. This is a little tricky because you
>> need to use the predicate to choose an alternative. Hmmm, maybe you
>> could get by with calling the lexer rule directly in your action
>> code. Yes, in your action where you see the TIME id, call the WS
>> rule and then the INT rule. If either fails that's ok: it was not the
>> TIME keyword, it was an ID, so change the type back. Then call your
>> s,m,ms rule. The text will still be appended to the token buffer and
>> make it through to the parser. Try it out and ask when you hit a
>> problem. I wish I had another 15 minutes to explain fully...
>>
>> Monty
>>
>> On May 20, 2004, at 6:30 AM, Bharath S wrote:
>>
>>> Hi Monty,
>>>
>>> I am unclear about the ID token here. Let's say that lexer sees
>>> "abc" which is a token of type ID. Please correct me if my
>>> understanding is not right.
>>>
>>> 1. The if (i.getType() ...) statement is used to test against literals.
>>> So, if ID was "INT" instead of "abc", it would return LITERAL_INT
>>> and it would skip that token. Otherwise, it sets the type of "abc" to
>>> ID. Though ID by itself has the testLiterals option set, the IDMEAT
>>> rule would allow me to have both ID and the
>>> (TIME : "TIME" Integer;) rule co-exist in the lexer.
>>>
>>> 2. This is a better solution because if I had 's', 'm', 'ms' etc. to
>>> denote seconds, minutes and milliseconds, I would have to write a
>>> separate rule for each one of them in the parser (if I follow my
>>> solution) to prevent a conflict with the ID rule. Doing it via IDMEAT
>>> will solve the issue and make life easier.
>>>
>>> Thanks for your comments and clarifications!
>>>
>>> Bharath.
>>> ----- Original Message -----
>>> From: "Monty Zukowski" <monty at codetransform.com>
>>> To: <antlr-interest at yahoogroups.com>
>>> Cc: "Monty Zukowski" <monty at codetransform.com>
>>> Sent: Wednesday, May 19, 2004 5:13 PM
>>> Subject: Re: [antlr-interest] Keywords Vs Identifiers.
>>>
>>>
>>>> If you want to handle that in the lexer you need to do it by
>>>> calling the rule that tests the literals table. Here's an example
>>>> from the C grammar:
>>>>
>>>> IDMEAT
>>>> :
>>>> i:ID {
>>>>
>>>> if ( i.getType() ==
>>>> LITERAL___extension__ ) {
>>>>
>>>> $setType(Token.SKIP);
>>>> }
>>>> else {
>>>>
>>>> $setType(i.getType());
>>>> }
>>>>
>>>> }
>>>> ;
>>>>
>>>> protected ID
>>>> options
>>>> {
>>>> testLiterals = true;
>>>> }
>>>> : ( 'a'..'z' | 'A'..'Z' | '_' | '$')
>>>> ( 'a'..'z' | 'A'..'Z' | '_' | '$' | '0'..'9' )*
>>>> ;
>>>>
>>>> It's actually tricky to figure out how to lex the following
>>>> whitespace and integer without using a syntactic predicate, but a
>>>> syn pred here will be a performance problem. I would actually
>>>> recommend using a parser filter; see
>>>> http://www.codetransform.com/filterexample.html
>>>>
>>>> By the way, your parser solution works just fine too, and is
>>>> probably the easiest.
>>>>
>>>> Monty
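
The literals-table test described above amounts to matching the identifier
text first and then looking that text up in a table of keywords to pick the
token type. A minimal plain-Java sketch of that lookup follows; the token type
numbers and the table contents are hypothetical (ANTLR generates its own), and
LiteralsTableSketch and typeOf are illustrative names only.

```java
import java.util.HashMap;
import java.util.Map;

public class LiteralsTableSketch {

    // Hypothetical token type numbers, chosen only for this example.
    public static final int ID = 1;
    public static final int LITERAL_TIME = 2;
    public static final int LITERAL_INT = 3;

    // Keyword text mapped to its token type.
    private static final Map<String, Integer> LITERALS = new HashMap<>();
    static {
        LITERALS.put("TIME", LITERAL_TIME);
        LITERALS.put("time", LITERAL_TIME);
        LITERALS.put("INT", LITERAL_INT);
    }

    /** Return the keyword's type if the text is in the table, otherwise ID. */
    public static int typeOf(String text) {
        return LITERALS.getOrDefault(text, ID);
    }

    public static void main(String[] args) {
        System.out.println(typeOf("TIME")); // keyword
        System.out.println(typeOf("abc"));  // ordinary identifier
    }
}
```

Because the lookup happens after the whole identifier has been matched, a
name such as "TIMER" still comes out as an ordinary identifier.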
>>>>
>>>> On May 19, 2004, at 2:55 PM, Bharath wrote:
>>>>
>>>>> Hi Monty,
>>>>>
>>>>> I did. I figured a way out too but I am not sure if it's an
>>>>> efficient solution. I set a rule in the parser which accepts an
>>>>> identifier and I extracted the identifier input into a string. If
>>>>> the string is not "TIME", I throw an exception, otherwise I accept
>>>>> it. (using getText() method).
>>>>>
>>>>> Please let me know if this is bad practice.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Bharath.
>>>>>
>>>>> -----Original Message-----
>>>>> From: Monty Zukowski [mailto:monty at codetransform.com]
>>>>> Sent: Wednesday, May 19, 2004 4:41 PM
>>>>> To: antlr-interest at yahoogroups.com
>>>>> Cc: Monty Zukowski
>>>>> Subject: Re: [antlr-interest] Keywords Vs Identifiers.
>>>>>
>>>>> See the documentation about "literals"
>>>>>
>>>>> Monty
>>>>>
>>>>> On May 19, 2004, at 8:25 AM, Bharath S wrote:
>>>>>
>>>>>> Hi Antlers,
>>>>>>
>>>>>> I have some rules in my grammar for time literals which require
>>>>>> that 'TIME' or "time" be added to the front of the rule. For
>>>>>> example, time can be represented as TIME 99secs. The problem is,
>>>>>> "TIME" is not a keyword and so I can't have it in the parser. If
>>>>>> I throw it in the lexer, it causes a clash with the IDENTIFIER
>>>>>> rule, because the lexer sees the rules as
>>>>>>
>>>>>> TIME: 'T' 'I' 'M' 'E' (Integer) ; and
>>>>>> IDENTIFIER: ('a'..'z'|'A'..'Z')+;
>>>>>>
>>>>>> as expected. Is there a common workaround for this?
>>>>>>
>>>>>> I can solve this problem by moving a whole bunch of rules in the
>>>>>> parser back to the lexer, just to make the TIME rule protected.
>>>>>> But it doesn't make sense at all.
>>>>>>
>>>>>> Any comments are most welcome.
>>>>>>
>>>>>> Bharath.
>>>>> Monty Zukowski
>>>>>
>>>>> ANTLR & Java Consultant -- http://www.codetransform.com
>>>>> ANSI C/GCC transformation toolkit --
>>>>> http://www.codetransform.com/gcc.html
>>>>> Embrace the Decay -- http://www.codetransform.com/EmbraceDecay.html
Yahoo! Groups Links
<*> To visit your group on the web, go to:
http://groups.yahoo.com/group/antlr-interest/
<*> To unsubscribe from this group, send an email to:
antlr-interest-unsubscribe at yahoogroups.com
<*> Your use of Yahoo! Groups is subject to:
http://docs.yahoo.com/info/terms/