[antlr-interest] AST generation: EXPRESSION TREE example.
Bharath Sundararaman
bharath at starthis.com
Wed Jun 2 13:29:00 PDT 2004
Hi all,
I looked at the documentation for AST (http://www.antlr.org/doc/trees.html)
and I tried the EXPRESSION TREE example provided in the documentation. The
grammar compiles without any errors, but when I run the main class I get an
error that says: "Invalid class or can't make instance, PLUSNode". I get the
same for MULTNode and INTNode. Am I missing something here?
Ter :- The tutorial was very useful, thanks!
Thanks,
Bharath.
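One way a message like this can arise, sketched under assumptions: if the AST factory loads node classes by name through reflection (an assumption about ANTLR's internals, not something stated in this thread), then a class name that does not resolve on the classpath, or a class without a usable public constructor, would fail at run time even though the grammar compiles. The stand-alone Java below has no ANTLR dependency; `NodeLoader`, `PlusNode` and `HiddenNode` are hypothetical names used only for illustration.

```java
import java.lang.reflect.Constructor;

final class NodeLoader {
    /** Hypothetical node class with a public no-argument constructor. */
    public static class PlusNode {
        public PlusNode() {}
    }

    /** Hypothetical node class whose only constructor is private. */
    public static class HiddenNode {
        private HiddenNode() {}
    }

    /**
     * Tries to load and instantiate a class by name.
     * Returns null on success, otherwise a short reason for the failure.
     */
    static String tryInstantiate(String className) {
        try {
            Class<?> c = Class.forName(className);
            // getConstructor() only finds public constructors.
            Constructor<?> ctor = c.getConstructor();
            ctor.newInstance();
            return null;
        } catch (ClassNotFoundException e) {
            return "class not found: " + className;
        } catch (ReflectiveOperationException e) {
            return "cannot instantiate: " + className;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryInstantiate(PlusNode.class.getName()));
        System.out.println(tryInstantiate("PLUSNode"));
        System.out.println(tryInstantiate(HiddenNode.class.getName()));
    }
}
```

If something like this is the cause, it may be worth checking that PLUSNode, MULTNode and INTNode are compiled, on the classpath, and referenced by their fully qualified names if they live in a package.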
-----Original Message-----
From: Monty Zukowski [mailto:monty at codetransform.com]
Sent: Monday, May 24, 2004 10:20 AM
To: antlr-interest at yahoogroups.com
Cc: Monty Zukowski
Subject: Re: [antlr-interest] Whitespace problem. (keywords Vs identifiers)
On May 21, 2004, at 9:03 AM, Bharath Sundararaman wrote:
> Hi Monty,
>
> Here's my rule:
>
> IDMEAT : i:IDENT {
>     if (i.getText().equals("t") || i.getText().equals("T") ||
>         i.getText().equals("time")) {
>         $setType(TIME_PREFIX);
>     }
>     else if (i.getText().equals("e") || i.getText().equals("E")) {
>         $setType(Exponent_prefix);
>     }
>     else {
>         $setType(i.getType());
>     }
> };
>
IDENT will have set the type of the token, so your test could be
if (i.getType() == T || i.getType() == TIME /* etc. */)
You also aren't testing for # and a number, so you will get TIME_PREFIX
for a variable named 't' no matter what follows.
E9 is a valid identifier, I assume. That one should probably be
handled in IDENT:
IDENT:
    (('e'|'E') (INT | PLUS | MINUS))=> ('e'|'E')
        {$setType(Exponent_prefix);}
    | ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9')*
    ;
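The lookahead idea behind the syntactic predicate above can be sketched in plain Java. This is only an illustration, not ANTLR-generated code: `ExponentLookahead` and `classify` are hypothetical names, and the token types are represented as strings for readability. An 'e' or 'E' is treated as an exponent prefix only when the next character is a digit, '+' or '-'.

```java
final class ExponentLookahead {
    /**
     * Classifies the token that starts at pos.
     * Assumes input.charAt(pos) is a letter.
     */
    static String classify(String input, int pos) {
        char c = input.charAt(pos);
        boolean isE = (c == 'e' || c == 'E');
        if (isE && pos + 1 < input.length()) {
            // Peek one character ahead without consuming it.
            char next = input.charAt(pos + 1);
            boolean digit = next >= '0' && next <= '9';
            if (digit || next == '+' || next == '-') {
                return "Exponent_prefix";
            }
        }
        return "IDENT";
    }

    public static void main(String[] args) {
        for (String s : new String[] {"E9", "E+9", "Eab", "e", "x9"}) {
            System.out.println(s + " -> " + classify(s, 0));
        }
    }
}
```

One consequence of this choice is that a variable actually named "E9" would no longer lex as a plain identifier, which is the trade-off raised above.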
> Problem: My time rule (in the parser) is
> time: TIME_PREFIX HASH Int; and it accepts values like "t#9" or "T#9".
> Note that there's no space between 't' and '#', and that's what I want.
> However, it doesn't work for Exponent_prefix.
>
> exponent: Exponent_prefix (PLUS|MINUS)? Int; allows "E 9" or "E+9" but
> it doesn't allow "E9". I tried to ignore WHITESPACE in the IDMEAT rule,
> but that can't be the problem because TIME_PREFIX works fine.
>
> Any ideas?
>
> B.
>
> -----Original Message-----
> From: Monty Zukowski [mailto:monty at codetransform.com]
> Sent: Thursday, May 20, 2004 12:05 PM
> To: antlr-interest at yahoogroups.com
> Cc: Monty Zukowski
> Subject: Re: [antlr-interest] Keywords Vs Identifiers.
>
>
> I'm sorry, I was in a hurry. Inspect the generated code, you will see
> in the ID rule where antlr tests the token text against the literals
> table and assigns the token type. To use it in a rule you may need a
> semantic predicate, this is a little tricky because you need to use
> the predicate to choose an alternative--hmmm, maybe you could get by
> with calling the lexer rule directly in your action code. Yes, in
> your action where you see the TIME id, call the WS rule and then the
> INT rule. If either fails, that's OK: it was not the TIME keyword, it
> was an ID, so change the type back. Then call your s,m,ms rule. The
> text will still be appended to the token buffer and make it through to
> the parser. Try it out and ask when you hit a problem. I wish I had
> another 15 minutes to explain fully...
>
> Monty
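The "try WS, then INT, otherwise change the type back" idea described above can be sketched outside ANTLR as follows. This is a minimal illustration under assumptions: `TimeKeywordCheck` and `typeFor` are hypothetical names, token types are plain strings, whitespace means spaces and tabs only, and the follow-up step for the s/m/ms unit is left out.

```java
final class TimeKeywordCheck {
    static final String TIME = "TIME";
    static final String ID = "ID";

    /**
     * Decides the type of the identifier input[idStart, idEnd).
     * It is TIME only if the text is "TIME" or "time" and it is followed
     * by at least one blank and then at least one digit; otherwise ID.
     */
    static String typeFor(String input, int idStart, int idEnd) {
        String text = input.substring(idStart, idEnd);
        if (!text.equals("TIME") && !text.equals("time")) {
            return ID;
        }
        int p = idEnd;
        // Stand-in for calling the WS rule.
        int wsStart = p;
        while (p < input.length()
                && (input.charAt(p) == ' ' || input.charAt(p) == '\t')) {
            p++;
        }
        if (p == wsStart) {
            return ID; // WS failed: change the type back
        }
        // Stand-in for calling the INT rule.
        int intStart = p;
        while (p < input.length()
                && input.charAt(p) >= '0' && input.charAt(p) <= '9') {
            p++;
        }
        if (p == intStart) {
            return ID; // INT failed: change the type back
        }
        return TIME;
    }

    public static void main(String[] args) {
        System.out.println(typeFor("TIME 99secs", 0, 4));
        System.out.println(typeFor("TIME = 3", 0, 4));
    }
}
```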
>
> On May 20, 2004, at 6:30 AM, Bharath S wrote:
>
>> Hi Monty,
>>
>> I am unclear about the ID token here. Let's say that lexer sees "abc"
>> which is a token of type ID. Please correct me if my understanding is
>> not right.
>>
>> 1. The if (i.getType() ...) statement is used to test against literals.
>> So, if the ID was "INT" instead of "abc", it would return LITERAL_INT
>> and it would skip that token. Otherwise, it sets the type of "abc" to
>> ID. Though ID by itself has the testLiterals option set, the IDMEAT
>> rule would allow both ID and the (TIME : "TIME" Integer;) rule to
>> co-exist in the lexer.
>>
>> 2. This is a better solution because if I had 's', 'm', 'ms' etc. to
>> denote seconds, minutes and milliseconds, I would have to write a
>> separate rule for each of them in the parser (if I follow my solution)
>> to prevent a conflict with the ID rule. Doing it via IDMEAT will solve
>> the issue and make life easier.
>>
>> Thanks for your comments and clarifications!
>>
>> Bharath.
>> ----- Original Message -----
>> From: "Monty Zukowski" <monty at codetransform.com>
>> To: <antlr-interest at yahoogroups.com>
>> Cc: "Monty Zukowski" <monty at codetransform.com>
>> Sent: Wednesday, May 19, 2004 5:13 PM
>> Subject: Re: [antlr-interest] Keywords Vs Identifiers.
>>
>>
>>> If you want to handle that in the lexer you need to do it by calling
>>> the rule that tests the literals table, here's an example from the C
>>> grammar:
>>>
>>> IDMEAT
>>>     : i:ID {
>>>         if ( i.getType() == LITERAL___extension__ ) {
>>>             $setType(Token.SKIP);
>>>         }
>>>         else {
>>>             $setType(i.getType());
>>>         }
>>>     }
>>>     ;
>>>
>>> protected ID
>>> options
>>> {
>>> testLiterals = true;
>>> }
>>> : ( 'a'..'z' | 'A'..'Z' | '_' | '$')
>>> ( 'a'..'z' | 'A'..'Z' | '_' | '$' | '0'..'9' )*
>>> ;
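The behaviour of the two rules above can be approximated in plain Java as a table lookup followed by a type decision. This is a sketch under assumptions, not ANTLR's generated code: `LiteralsTable`, `idType`, `idMeatType` and the integer constants (including the stand-in for Token.SKIP) are hypothetical, and the table holds just two sample literals.

```java
import java.util.HashMap;
import java.util.Map;

final class LiteralsTable {
    static final int SKIP = -1;                 // stand-in for Token.SKIP
    static final int ID = 1;
    static final int LITERAL___extension__ = 2;
    static final int LITERAL_int = 3;

    private static final Map<String, Integer> LITERALS = new HashMap<>();
    static {
        LITERALS.put("__extension__", LITERAL___extension__);
        LITERALS.put("int", LITERAL_int);
    }

    /** Mirrors the protected ID rule: text found in the table gets the literal's type. */
    static int idType(String text) {
        return LITERALS.getOrDefault(text, ID);
    }

    /** Mirrors IDMEAT: skip __extension__, otherwise pass the type through. */
    static int idMeatType(String text) {
        int type = idType(text);
        return type == LITERAL___extension__ ? SKIP : type;
    }

    public static void main(String[] args) {
        System.out.println(idMeatType("abc"));
        System.out.println(idMeatType("int"));
        System.out.println(idMeatType("__extension__"));
    }
}
```

Note that in this sketch only `__extension__` is skipped; other literals such as `int` keep their literal type and still reach the parser.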
>>>
>>> It's actually tricky to figure out how to lex the following
>>> whitespace and integer without using a syntactic predicate, but a
>>> syn pred here will be a performance problem. I would actually
>>> recommend using a parser filter; see
>>> http://www.codetransform.com/filterexample.html
>>>
>>> By the way, your parser solution works just fine too, and is probably
>>> the easiest.
>>>
>>> Monty
>>>
>>> On May 19, 2004, at 2:55 PM, Bharath wrote:
>>>
>>>> Hi Monty,
>>>>
>>>> I did. I figured a way out too but I am not sure if it's an
>>>> efficient solution. I set a rule in the parser which accepts an
>>>> identifier and I extracted the identifier input into a string. If
>>>> the string is not "TIME", I throw an exception, otherwise I accept
>>>> it. (using getText() method).
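The parser-side check described above amounts to validating the identifier's text inside the rule's action. A minimal Java sketch of that check, with `TimePrefixCheck` and `requireTimePrefix` as hypothetical names and `IllegalArgumentException` standing in for whatever exception the real parser action throws:

```java
final class TimePrefixCheck {
    /** Returns the text unchanged if it is the TIME prefix; otherwise throws. */
    static String requireTimePrefix(String idText) {
        if (!"TIME".equals(idText)) {
            throw new IllegalArgumentException(
                "expected TIME but found: " + idText);
        }
        return idText;
    }

    public static void main(String[] args) {
        System.out.println(requireTimePrefix("TIME"));
        try {
            requireTimePrefix("abc");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```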
>>>>
>>>> Please let me know if this is bad practice.
>>>>
>>>> Thanks!
>>>>
>>>> Bharath.
>>>>
>>>> -----Original Message-----
>>>> From: Monty Zukowski [mailto:monty at codetransform.com]
>>>> Sent: Wednesday, May 19, 2004 4:41 PM
>>>> To: antlr-interest at yahoogroups.com
>>>> Cc: Monty Zukowski
>>>> Subject: Re: [antlr-interest] Keywords Vs Identifiers.
>>>>
>>>> See the documentation about "literals"
>>>>
>>>> Monty
>>>>
>>>> On May 19, 2004, at 8:25 AM, Bharath S wrote:
>>>>
>>>>> Hi Antlers,
>>>>>
>>>>> I have some rules in my grammar for time literals, which require
>>>>> that "TIME" or "time" be prepended to the rule. For example, time
>>>>> can be represented as TIME 99secs. The problem is that "TIME" is not
>>>>> a keyword, so I can't have it in the parser. If I put it in the
>>>>> lexer, it causes a clash with the IDENTIFIER rule, because the lexer
>>>>> sees the rules as
>>>>>
>>>>> TIME: 'T' 'I' 'M' 'E' (Integer) ; and
>>>>> IDENTIFIER: ('a'..'z'|'A'..'Z')+;
>>>>>
>>>>> as expected. Is there a common workaround for this?
>>>>>
>>>>> I can solve this problem by moving a whole bunch of rules from the
>>>>> parser back to the lexer, just to make the TIME rule protected.
>>>>> But that doesn't make sense at all.
>>>>>
>>>>> Any comments are most welcome.
>>>>>
>>>>> Bharath.
>>>> Monty Zukowski
>>>>
>>>> ANTLR & Java Consultant -- http://www.codetransform.com
>>>> ANSI C/GCC transformation toolkit -- http://www.codetransform.com/gcc.html
>>>> Embrace the Decay -- http://www.codetransform.com/EmbraceDecay.html
Monty Zukowski
ANTLR & Java Consultant -- http://www.codetransform.com
ANSI C/GCC transformation toolkit --
http://www.codetransform.com/gcc.html
Embrace the Decay -- http://www.codetransform.com/EmbraceDecay.html
Yahoo! Groups Links
<*> To visit your group on the web, go to:
http://groups.yahoo.com/group/antlr-interest/
<*> To unsubscribe from this group, send an email to:
antlr-interest-unsubscribe at yahoogroups.com
<*> Your use of Yahoo! Groups is subject to:
http://docs.yahoo.com/info/terms/