From kferrio at gmail.com  Fri Jan  1 11:10:54 2010
From: kferrio at gmail.com (Kyle Ferrio)
Date: Fri, 1 Jan 2010 12:10:54 -0700
Subject: [antlr-interest] Repost: ANTLRworks: Why do these rules behave
	differently in the embedded interpreter?
Message-ID: <4608cec11001011110w6f5d7401pc8f0f9a5a730e566@mail.gmail.com>

Hi,

I originally posted the question below on 13 December.  I'm guessing I
didn't get any replies because it rolled off the end of everyone's
inbox during the holiday season.  So please excuse the repost; I'd be
grateful if someone could tell me whether I'm on the right track.
Since posting this question, I have observed similar (not identical)
behavior in the ANTLR IDE for Eclipse.  My guess (please confirm or
debunk) is that the built-in interpreters build the concrete syntax
tree by (correctly) pursuing the first viable alternative at each
decision point but (unfortunately) not necessarily rewinding the input
stream upon encountering an exception.  Since posting this question,
at least one other person has independently encountered the same
problem, in connection with Scott Stanchfield's excellent ANTLR 3
video tutorials [ http://javadude.com/articles/antlr3xtut/index.html
].  I've been using ANTLR for a little over a year, almost exclusively
by running the ANTLR tool from the command line.  I'm just a CLI guy.
So I'm encountering questions with ANTLRWorks perhaps later than I
should.

Now, here's my previous post, with new comments indicated in square brackets:

This question is so rudimentary that I am almost embarrassed to ask.
But since I almost never try to use ANTLRWorks for my parsers, I'll
risk injuring my pride in exchange for learning something.

If I paste the Expr.g *verbatim* from
http://www.antlr.org/works/help/tutorial/content/Expr.g into
ANTLRWorks 1.3.1 and feed it the following test input:

3+1
3-1

both run (via the Run menu) fine and produce the expected numerical
outputs.  But for the same test input, the ANTLRWorks interpreter
produces the expected parse tree for only 3+1 and gives a
MismatchedTokenException on the '-' in 3-1.  If I reverse the '+' and
'-' alternatives in rule expr, the results are also reversed: it's the
second alternative that goes bad in the ANTLRWorks interpreter.

Thinking this might have something to do with the embedded actions
which the interpreter does not understand, I stripped them all out.
That leaves us with the following rule, for which the interpreter runs
without error on our test input:

expr
  :  multExpr ( ( '+' multExpr | '-' multExpr ) )*
  ;

[This is potentially ambiguous.  Does a token bind more tightly to
another token, or to the binary operator '|' for alternatives?  Yes,
we know the official ANTLR answer, but I'm questioning my
understanding of the specific implementation embodied in ANTLRWorks.
See next rule.]

So I figured [maybe wrongly?] I was right about actions causing
problems.  But wait.  Let's dig deeper.  This second rule

expr
  :  multExpr ( ( '+' multExpr ) | ( '-' multExpr ) )*
  ;

works in the interpreter as expected for the first alternative (used
for 3+1) but produces a MismatchedTokenException for the second
alternative (used for 3-1).

And better yet, this third rule

expr
  :  multExpr ( ( ( '+' multExpr ) | ( '-' multExpr ) ) )*
  ;

works great in the interpreter for both 3+1 and 3-1, just like the
first rule does.

All three rules actually run (from the Run menu) as expected.  Of
course, running them isn't very interesting with the actions stripped
out, but they do run without error.  So I suspect that they would all
produce equally viable parsers outside ANTLRWorks, but I have not
checked.  Have I stumbled onto an issue with the interpreter embedded
in ANTLRWorks, or have I done something silly? (Or both?)
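
For reference, all three parenthesizations of rule expr describe the
same language.  Below is a minimal hand-written sketch of the kind of
one-token-lookahead loop that recognizes it.  It is not the code ANTLR
generates; the class name ExprSketch is hypothetical, and multExpr is
reduced to a run of digits as a simplifying assumption.

```java
/** Hand-written sketch of: expr : multExpr ( '+' multExpr | '-' multExpr )* ; */
public class ExprSketch {
    private final String input;
    private int pos = 0;

    private ExprSketch(String input) { this.input = input; }

    /** Returns true if the whole line matches rule expr. */
    public static boolean parses(String line) {
        ExprSketch p = new ExprSketch(line);
        return p.expr() && p.pos == line.length();
    }

    // expr : multExpr ( '+' multExpr | '-' multExpr )* ;
    private boolean expr() {
        if (!multExpr()) return false;
        // One character of lookahead picks the alternative before anything
        // is consumed, so the input never has to be rewound.
        while (pos < input.length()) {
            char la = input.charAt(pos);
            if (la == '+' || la == '-') {
                pos++;                          // match the operator
                if (!multExpr()) return false;  // operand must follow
            } else {
                break;                          // no viable alternative: leave the loop
            }
        }
        return true;
    }

    // Assumption: multExpr is reduced to one or more digits for brevity.
    private boolean multExpr() {
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
        return pos > start;
    }

    public static void main(String[] args) {
        for (String line : new String[] {"3+1", "3-1"}) {
            System.out.println(line + " -> " + parses(line));
        }
    }
}
```

Because the choice between '+' and '-' is made from the lookahead
character alone, extra grouping parentheses in the grammar should make
no difference to a recognizer written this way.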

Thanks [and Happy New Year],
Kyle

From jimi at temporal-wave.com  Fri Jan  1 11:20:01 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Fri, 01 Jan 2010 11:20:01 -0800
Subject: [antlr-interest] Repost: ANTLRworks: Why do these rules behave
	differently in the embedded interpreter?
In-Reply-To: <4608cec11001011110w6f5d7401pc8f0f9a5a730e566@mail.gmail.com>
Message-ID: <02af690b17664b429cb288d6615f2638@temporal-wave.com>

The interpreter is just a quick testing device and is easily fooled by grammar rules; use the debugger instead of the interpreter and all will be fine.

Jim



From david-sarah at jacaranda.org  Fri Jan  1 11:45:22 2010
From: david-sarah at jacaranda.org (David-Sarah Hopwood)
Date: Fri, 01 Jan 2010 19:45:22 +0000
Subject: [antlr-interest] Repost: ANTLRworks: Why do these rules behave
 differently in the embedded interpreter?
In-Reply-To: <02af690b17664b429cb288d6615f2638@temporal-wave.com>
References: <02af690b17664b429cb288d6615f2638@temporal-wave.com>
Message-ID: <4B3E50D2.6030301@jacaranda.org>

Jim Idle wrote:
> The interpreter is just a quick testing device and is easily fooled by grammar rules, use the debugger and not the interpreter and all will be fine.

Yes, but what Kyle pointed out seems like an obvious bug in the
interpreter in a case that it is supposed to be able to handle.

Either bugs like this should be fixed, or there is no point in having
the interpreter at all, and it should be removed, with its functionality
being replaced by the debugger.


-- 
David-Sarah Hopwood  ?  http://davidsarah.livejournal.com


From parrt at cs.usfca.edu  Fri Jan  1 12:31:37 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Fri, 1 Jan 2010 12:31:37 -0800
Subject: [antlr-interest] Repost: ANTLRworks: Why do these rules behave
	differently in the embedded interpreter?
In-Reply-To: <4B3E50D2.6030301@jacaranda.org>
References: <02af690b17664b429cb288d6615f2638@temporal-wave.com>
	<4B3E50D2.6030301@jacaranda.org>
Message-ID: <6C54C56B-42D4-44D2-8994-77157A5BB472@cs.usfca.edu>


On Jan 1, 2010, at 11:45 AM, David-Sarah Hopwood wrote:

> Jim Idle wrote:
>> The interpreter is just a quick testing device and is easily fooled by grammar rules, use the debugger and not the interpreter and all will be fine.
> 
> Yes, but what Kyle pointed out seems like an obvious bug in the
> interpreter in a case that it is supposed to be able to handle.
> 
> Either bugs like this should be fixed, or there is no point in having
> the interpreter and all and it should be removed, with its functionality
> being replaced by the debugger.

yup. it's on my to-do list.
T

From parrt at cs.usfca.edu  Fri Jan  1 13:02:15 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Fri, 1 Jan 2010 13:02:15 -0800
Subject: [antlr-interest] change to list text/html processing
Message-ID: <A941AC69-3699-4154-A14E-7435AD0727DC@cs.usfca.edu>

Hi, Graham W pointed out that "there has been a sharp increase in HTML-only emails this past year", which yield empty messages with links to an attachment.

I think we should auto-convert those to text.  I.e., should Mailman convert text/html parts to plain text? This conversion happens after MIME attachments have been stripped.  I'll be turning on filtering and who knows what else this will mess up.  Anybody care if I try this feature out?

Ter

From kferrio at gmail.com  Fri Jan  1 13:58:39 2010
From: kferrio at gmail.com (Kyle Ferrio)
Date: Fri, 1 Jan 2010 14:58:39 -0700
Subject: [antlr-interest] Repost: ANTLRworks: Why do these rules behave
	differently in the embedded interpreter?
In-Reply-To: <6C54C56B-42D4-44D2-8994-77157A5BB472@cs.usfca.edu>
References: <02af690b17664b429cb288d6615f2638@temporal-wave.com>
	<4B3E50D2.6030301@jacaranda.org>
	<6C54C56B-42D4-44D2-8994-77157A5BB472@cs.usfca.edu>
Message-ID: <4608cec11001011358n116b0135q1fb8442fb827ecfb@mail.gmail.com>

Thanks, folks.

David-Sarah makes a good point, which I would make perhaps just a bit
differently: I see the interpreter as a kind of learning
tool, a crutch if you will.  If it goes wonky in a way that makes it
obvious to a knave that it's at fault, fine.  But if it goes wonky in
a way which causes the student to wonder, "Is this me, or is this the
tool?" then learning is impeded at a critical juncture, even if the
ultimate resolution (assuming a persistent student) does produce
deeper insight.

Note: I manage a commercial software development group serving highly
specialized engineering customers.  More than one customer has told me
that he or she would rather have "a tool that fails in an obvious way"
more often than "a tool which fails in an ambiguous way less often."
{Quotes added to assist parsing.  :) }  It's not about being right or
wrong.  It's about knowing when you can trust your tools, and when you
shouldn't.

Kyle



From parrt at cs.usfca.edu  Fri Jan  1 16:28:35 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Fri, 1 Jan 2010 16:28:35 -0800
Subject: [antlr-interest] Repost: ANTLRworks: Why do these rules behave
	differently in the embedded interpreter?
In-Reply-To: <4608cec11001011358n116b0135q1fb8442fb827ecfb@mail.gmail.com>
References: <02af690b17664b429cb288d6615f2638@temporal-wave.com>
	<4B3E50D2.6030301@jacaranda.org>
	<6C54C56B-42D4-44D2-8994-77157A5BB472@cs.usfca.edu>
	<4608cec11001011358n116b0135q1fb8442fb827ecfb@mail.gmail.com>
Message-ID: <DA4993C3-BA05-483F-81CA-94599C533B9C@cs.usfca.edu>

yeah, the interp has never been quite right...just haven't had time to  
fix. now that book is done (printed just about now I guess) I can get  
back to ANTLR.
Ter


From parrt at cs.usfca.edu  Fri Jan  1 16:52:54 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Fri, 1 Jan 2010 16:52:54 -0800
Subject: [antlr-interest] change to list text/html processing
In-Reply-To: <A941AC69-3699-4154-A14E-7435AD0727DC@cs.usfca.edu>
References: <A941AC69-3699-4154-A14E-7435AD0727DC@cs.usfca.edu>
Message-ID: <51F86334-6BCE-4710-8CB2-C745206023C9@cs.usfca.edu>

Ok, I just ran:

for f in */*.html; do sed -i 's///g' $f; done

to fix all the old  port numbers.  that helps as we can see the  
attachments from before I changed it.

I'll update mailman too.  Thanks to Graham for finding this and making
correct suggestions.

Ter


From parrt at cs.usfca.edu  Fri Jan  1 17:07:13 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Fri, 1 Jan 2010 17:07:13 -0800
Subject: [antlr-interest] ok, set mailman to convert text/html to text/plain
Message-ID: <F259016E-E77C-4AB4-B520-8CED91E7B1AB@cs.usfca.edu>

I'm also trying to send this as HTML by using a bold font.

This is MONACO.

Ter

From parrt2000 at yahoo.com  Fri Jan  1 17:13:21 2010
From: parrt2000 at yahoo.com (Terence Parr)
Date: Fri, 1 Jan 2010 17:13:21 -0800 (PST)
Subject: [antlr-interest] testing from yahoo
Message-ID: <769637.13438.qm@web81007.mail.mud.yahoo.com>

I hope it sends some complicated stuff to the list.
	1. 
________________________________
a
	2. b
	3. c
Ter

From parrt at cs.usfca.edu  Fri Jan  1 17:16:22 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Fri, 1 Jan 2010 17:16:22 -0800
Subject: [antlr-interest] images scrubbed out test
Message-ID: <F73CAC83-6646-4F41-8320-7CEAB5EA0C74@cs.usfca.edu>

Hmm...missing my images i think.  here's a picture of booboo the kitten:

-------------- next part --------------


Ter

From parrt at cs.usfca.edu  Fri Jan  1 17:17:40 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Fri, 1 Jan 2010 17:17:40 -0800
Subject: [antlr-interest] images scrubbed out test
In-Reply-To: <F73CAC83-6646-4F41-8320-7CEAB5EA0C74@cs.usfca.edu>
References: <F73CAC83-6646-4F41-8320-7CEAB5EA0C74@cs.usfca.edu>
Message-ID: <1AC859E1-A4A1-416D-ADC2-F0CE397B2363@cs.usfca.edu>

ah ha! they were removed. ok, trying again. here's booboo:

-------------- next part --------------
A non-text attachment was scrubbed...
Name: booboo-kitten.jpg
Type: image/jpeg
Size: 17021 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20100101/05b14eec/attachment.jpg 
-------------- next part --------------


Ter



From parrt at cs.usfca.edu  Fri Jan  1 17:23:15 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Fri, 1 Jan 2010 17:23:15 -0800
Subject: [antlr-interest] and more attachments (I have to specify mime types
	to allow)
Message-ID: <5377988E-9C02-4072-A67F-B02B3C93431C@cs.usfca.edu>

Test: Java text attachment and PDF (same image again).
Ter

-------------- next part --------------
A non-text attachment was scrubbed...
Name: MyForm.java
Type: application/octet-stream
Size: 305 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20100101/c33641b1/attachment.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: booboo-kitten.pdf
Type: application/pdf
Size: 19228 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20100101/c33641b1/attachment.pdf 

From kferrio at gmail.com  Fri Jan  1 17:53:24 2010
From: kferrio at gmail.com (Kyle Ferrio)
Date: Fri, 1 Jan 2010 18:53:24 -0700
Subject: [antlr-interest] Repost: ANTLRworks: Why do these rules behave
	differently in the embedded interpreter?
In-Reply-To: <DA4993C3-BA05-483F-81CA-94599C533B9C@cs.usfca.edu>
References: <02af690b17664b429cb288d6615f2638@temporal-wave.com>
	<4B3E50D2.6030301@jacaranda.org>
	<6C54C56B-42D4-44D2-8994-77157A5BB472@cs.usfca.edu>
	<4608cec11001011358n116b0135q1fb8442fb827ecfb@mail.gmail.com>
	<DA4993C3-BA05-483F-81CA-94599C533B9C@cs.usfca.edu>
Message-ID: <4608cec11001011753w51e7ad7aoab78301f146fda96@mail.gmail.com>

On Fri, Jan 1, 2010 at 5:28 PM, Terence Parr <parrt at cs.usfca.edu> wrote:
> yeah, the interp has never been quite right...just haven't had time to
> fix. now that book is done (printed just about now I guess) I can get
> back to ANTLR.
> Ter

Cool.  Having not looked at the code, I might have guessed that Jean
Bovet was the guy to talk to about the AW interp.  I'm glad I posted
to the list.  Yep, looks like your book will ship at just the right
time to maximize the number of questions you get during midterms.
lol.  There is no rest for the creative mind.

I had a crazy (read: probably deeply flawed) idea while playing with
ANTLRWorks and the ANTLR IDE.  As I tried to reason, black-box style,
about what might be going on inside the interp, I thought about how
the Java-targeted output always ran fine.  And so I began to appreciate
all over again some of the challenges faced by anyone trying to write
a fault-tolerant interpreter.  I realized that much of the work is
probably redundant with what has already been done for the target
codegen.  So, I thought, why not just build and run the target?  Sure,
codegen takes a second, and compiling to bytecode takes another
second.  So what?  Small price for knowing it's right.  Ok, but what
about drawing concrete syntax trees?  No problem, just insert actions.
Ok, but what about debugging with single stepping and peeking into
state variables?  No problem, just insert a callback to the GUI at
each decision point.

In fact, it might even be possible to make predicates work in such an
interp, by either "gating off" the callbacks or just "marking in the
debugger" when we're processing a predicate.  Sure, an "instrumented
parser" may be an ugly way to implement an interp.  But if fidelity to
the final product is a goal, as in emulators, then speed and beauty
may be negotiable.

How far out in left field am I?  I realize that this line of reasoning
may be trying to solve a problem outside the intended scope of
ANTLRWorks.  I assumed without justification that the interp would
"tell me how my grammar would perform."  But that is not at all the
same as "being a tool for demonstrating simple cases."  So I have no
basis for critiquing the interp, and I'm surely not suggesting a
course of action.  And before I dig a deeper hole for myself, I
preemptively apologize for not having time to implement any of this.
But maybe there's the germ of a class project in this for someone.
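
The "instrumented parser" idea above can be sketched roughly as
follows.  This is only an illustration under stated assumptions: the
ParseListener interface, the RecordingListener class and the
hand-written parser are all hypothetical names invented here, not
ANTLR or ANTLRWorks APIs, and multExpr is reduced to a run of digits.

```java
import java.util.ArrayList;
import java.util.List;

public class InstrumentedParserSketch {
    /** Hypothetical callback interface that a GUI front end might implement. */
    public interface ParseListener {
        void enterRule(String rule);
        void exitRule(String rule);
        void decision(String rule, int alt);   // alt 0 means "exit the loop"
        void consume(String text);
    }

    /** Stand-in for the GUI: records every event as a line of text. */
    public static class RecordingListener implements ParseListener {
        public final List<String> events = new ArrayList<>();
        public void enterRule(String rule) { events.add("enter " + rule); }
        public void exitRule(String rule) { events.add("exit " + rule); }
        public void decision(String rule, int alt) { events.add(rule + " alt " + alt); }
        public void consume(String text) { events.add("consume " + text); }
    }

    private final String input;
    private final ParseListener listener;
    private int pos = 0;

    public InstrumentedParserSketch(String input, ParseListener listener) {
        this.input = input;
        this.listener = listener;
    }

    // expr : multExpr ( '+' multExpr | '-' multExpr )* ;
    public void expr() {
        listener.enterRule("expr");
        multExpr();
        while (true) {
            char la = pos < input.length() ? input.charAt(pos) : '\0';
            int alt = la == '+' ? 1 : la == '-' ? 2 : 0;
            listener.decision("expr", alt);    // callback at the decision point
            if (alt == 0) break;
            listener.consume(String.valueOf(la));
            pos++;
            multExpr();
        }
        listener.exitRule("expr");
    }

    // Assumption: multExpr is reduced to one or more digits for brevity.
    private void multExpr() {
        listener.enterRule("multExpr");
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
        if (pos == start) throw new IllegalStateException("digit expected at " + pos);
        listener.consume(input.substring(start, pos));
        listener.exitRule("multExpr");
    }

    public static void main(String[] args) {
        RecordingListener gui = new RecordingListener();
        new InstrumentedParserSketch("3-1", gui).expr();
        gui.events.forEach(System.out::println);
    }
}
```

A front end could build a parse tree from the enter/exit events and
pause on each decision event to implement single stepping, while the
parsing logic itself stays identical to what would run in production.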

Kind Regards,
Kyle

From antlr at mirality.co.nz  Sat Jan  2 02:11:11 2010
From: antlr at mirality.co.nz (Gavin Lambert)
Date: Sat, 02 Jan 2010 23:11:11 +1300
Subject: [antlr-interest] Repost: ANTLRworks: Why do these rules behave
 differently in the embedded interpreter?
In-Reply-To: <4608cec11001011753w51e7ad7aoab78301f146fda96@mail.gmail.co
 m>
References: <02af690b17664b429cb288d6615f2638@temporal-wave.com>
	<4B3E50D2.6030301@jacaranda.org>
	<6C54C56B-42D4-44D2-8994-77157A5BB472@cs.usfca.edu>
	<4608cec11001011358n116b0135q1fb8442fb827ecfb@mail.gmail.com>
	<DA4993C3-BA05-483F-81CA-94599C533B9C@cs.usfca.edu>
	<4608cec11001011753w51e7ad7aoab78301f146fda96@mail.gmail.com>
Message-ID: <20100102101120.5DA603418431@www.antlr.org>

At 14:53 2/01/2010, Kyle Ferrio wrote:
 >So, I thought, why not just build and run the target?
 >Sure, codegen takes a second, and compiling to bytecode takes
 >another second.  So what?  Small price for knowing it's right.
 >Ok, but what about drawing concrete syntax trees?  No problem,
 >just insert actions.

Either I'm misinterpreting what you're talking about, or you're 
describing what the ANTLRWorks debugger already does.

 >In fact, it might even be possible to make predicates work in
 >such an interp, by either "gating off" the callbacks or just
 >"marking in the debugger" when we're processing a predicate.

The problem with predicates is that they're arbitrary target 
language code; ANTLR simply doesn't have enough information to 
emulate their functionality (for semantic predicates, at least; 
syntactic predicates could be dealt with correctly).  But that's 
what the Debug Remote feature is for.


From kferrio at gmail.com  Sat Jan  2 11:25:08 2010
From: kferrio at gmail.com (kferrio at gmail.com)
Date: Sat, 2 Jan 2010 19:25:08 +0000
Subject: [antlr-interest] Repost: ANTLRworks: Why do these rules behave
	differently in the embedded interpreter?
Message-ID: <653987599-1262460308-cardhu_decombobulator_blackberry.rim.net-2078986118-@bda428.bisx.prod.on.blackberry>

Thanks, Gavin.  Sorry to top-post, but I think it might be clearer than multiply  interspersed remarks.  

I think you understood my logic.  If anyone is unclear, it's me.  

I take your point about what the debugger already does, which invites the question 'why does the interp need to be distinct from the debugger?'  

Your point about remote debugging is also well taken.  So maybe my question should be, when would I want the interp?   Do you guys really use it?

 Since I challenged the correctness of the interp, the implication was that it needs fixing.  But if it's not the right tool for me in the first place then it does not need fixing.  My bad.  It's more about expectations (soft requirements) than about correctness.  I know of two people in addition to myself who have/had perfectly plausible but perhaps inappropriate expectations of the interpreter.  If we are not uniquely misguided then the question may be more than an academic curiosity.  

I'm sure people who live and breathe ANTLR would not be confused.  Occasional users like me may not be so expert.

Thanks for setting me straight about the debugger.   I'm slow but deliberate.  :)

Kyle 


From parrt at cs.usfca.edu  Mon Jan  4 16:32:35 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Mon, 4 Jan 2010 16:32:35 -0800
Subject: [antlr-interest] Consequence of Bug ANTLR-413 in GUnit
In-Reply-To: <200912311445.41271.kaleb.pederson@gmail.com>
References: <200912311445.41271.kaleb.pederson@gmail.com>
Message-ID: <E91D625B-0AB4-4E4A-9468-1B4CB9DFF457@cs.usfca.edu>

fixed. :)
It will go out in next release. thanks, Kaleb.
Ter
On Dec 31, 2009, at 2:45 PM, Kaleb Pederson wrote:

> I just got around to debugging an issue in GUnit's tree walking capabilities only to find that it's caused by a bug I reported previously: 
> 
> ANTLR-413.
> 
> The failure of the CommonTreeNodeStream to pass in the adaptor to the tree iterator results in numerous ClassCastExceptions being thrown in GUnit's tree walking tests.
> 
> Since the fix is:
> 
> -    it = new TreeIterator(root);
> +    it = new TreeIterator(adaptor, root);
> 
> in CommonTreeNodeStream.java... I hope it can be fixed for the next release :).
> 
> Thanks.
> 
> --
> Kaleb Pederson
> 
> Blog - http://kalebpederson.com
> Twitter - http://twitter.com/kalebpederson
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
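
To illustrate why a missing adaptor can surface as a
ClassCastException, here is a small self-contained sketch.  It does
not use the ANTLR runtime: Adaptor, DefaultNode, CustomNode and Walker
are hypothetical stand-ins, and the sketch assumes that a walker
constructed without an adaptor falls back to a default adaptor that
casts every node to the stock node type.

```java
import java.util.ArrayList;
import java.util.List;

public class AdaptorSketch {
    /** Hypothetical adaptor: the only component that knows the node type. */
    public interface Adaptor {
        int childCount(Object node);
        Object child(Object node, int i);
    }

    /** Stand-in for the stock node type that the default adaptor expects. */
    public static class DefaultNode {
        public final List<DefaultNode> kids = new ArrayList<>();
    }

    /** A user-defined node type that is unrelated to DefaultNode. */
    public static class CustomNode {
        public final List<CustomNode> kids = new ArrayList<>();
    }

    public static class DefaultAdaptor implements Adaptor {
        public int childCount(Object n) { return ((DefaultNode) n).kids.size(); }
        public Object child(Object n, int i) { return ((DefaultNode) n).kids.get(i); }
    }

    public static class CustomAdaptor implements Adaptor {
        public int childCount(Object n) { return ((CustomNode) n).kids.size(); }
        public Object child(Object n, int i) { return ((CustomNode) n).kids.get(i); }
    }

    public static class Walker {
        private final Adaptor adaptor;
        private final Object root;

        /** Assumed fallback: with no adaptor given, use the default one. */
        public Walker(Object root) { this(new DefaultAdaptor(), root); }

        public Walker(Adaptor adaptor, Object root) {
            this.adaptor = adaptor;
            this.root = root;
        }

        public int countNodes() { return count(root); }

        // Every access to the tree goes through the adaptor.
        private int count(Object node) {
            int n = 1;
            for (int i = 0; i < adaptor.childCount(node); i++) {
                n += count(adaptor.child(node, i));
            }
            return n;
        }
    }

    public static void main(String[] args) {
        CustomNode root = new CustomNode();
        root.kids.add(new CustomNode());
        root.kids.add(new CustomNode());

        // Analogue of the fix: hand the tree's own adaptor to the walker.
        System.out.println(new Walker(new CustomAdaptor(), root).countNodes());

        // Analogue of the bug: the default adaptor casts to the wrong type.
        try {
            new Walker(root).countNodes();
        } catch (ClassCastException e) {
            System.out.println("ClassCastException from the default adaptor");
        }
    }
}
```

The one-argument constructor only works for trees built from the
default node type, which mirrors the one-line change quoted above:
passing the adaptor along keeps the walker agnostic about node types.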


From wclodius at los-alamos.net  Mon Jan  4 20:49:25 2010
From: wclodius at los-alamos.net (William B. Clodius)
Date: Mon, 4 Jan 2010 21:49:25 -0700
Subject: [antlr-interest] Errors associated with target languages
Message-ID: <AD1EEAC2-8B71-4882-8E07-BECB8473DE46@los-alamos.net>

I am experimenting with generating parsers and lexers for a complicated grammar using as many available target languages as possible mostly to see how legible the code is as a possible guide to my hobby language syntax. For Java, Python, and C I have no obvious problems. For Delphi I am consistently getting the error (even for a very simple lexer)

error(10):  internal error: Exprtoken.g : java.lang.IllegalArgumentException: Can't find template actionGate.st; group hierarchy is [Delphi]

I have also been experimenting with creating the infrastructure for a target language. I have made a mistake somewhere and am getting a different error for the simple lexer

error(10):  internal error: Class org.antlr.tool.Grammar has no such attribute: recognizername in template context [outputFile lexer] : java.lang.NoSuchFieldException: recognizername

Are there any suggestions as to how to fix these errors?


From jp.raven at worldonline.fr  Tue Jan  5 06:52:10 2010
From: jp.raven at worldonline.fr (Jean-Pierre LAMBERT)
Date: Tue, 05 Jan 2010 15:52:10 +0100
Subject: [antlr-interest] Parser generation takes hours
Message-ID: <4B43521A.6000501@worldonline.fr>

Hello everybody,

I'm currently rewriting an LR parser's grammar for use with ANTLR. With
the grammar in its current state, ANTLR literally runs for hours before
it outputs errors about it.

My work is not finished; I have removed all left recursion but I still
have to do the left-factorisation. The problem is that, since ANTLR runs
for hours before I get the errors, it isn't very practical for me to fix
the grammar.

Do you have any suggestions in this case? What could be done so that
ANTLR would take only tens of minutes? Is there something essential that
I missed about ANTLR and LL grammars? How should ANTLR rules be written
to avoid such a problem?

Thanks in advance, any advice will be welcome.

JP

From parrt at cs.usfca.edu  Tue Jan  5 09:22:24 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Tue, 5 Jan 2010 09:22:24 -0800
Subject: [antlr-interest] Parser generation takes hours
In-Reply-To: <4B43521A.6000501@worldonline.fr>
References: <4B43521A.6000501@worldonline.fr>
Message-ID: <7B529C9C-6516-4DD1-8E78-1A8B518BCAD4@cs.usfca.edu>

Very strange. ANTLR has a fail-safe, so it should not be able to do that.
What command-line options do you use? Are you running it from the command
line or from ANTLRWorks?
Ter


From jimi at temporal-wave.com  Tue Jan  5 10:47:38 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Tue, 05 Jan 2010 10:47:38 -0800
Subject: [antlr-interest] Errors associated with target languages
In-Reply-To: <AD1EEAC2-8B71-4882-8E07-BECB8473DE46@los-alamos.net>
Message-ID: <34aa83daaa8b1247b9c3559e4e725358@temporal-wave.com>

To be honest, I don't think the Delphi target is being maintained. Perhaps the original author will comment? For your purposes, I think that looking at C, Java and C# would give you all the information you need: the generated code follows essentially the same patterns regardless of the language, with the implementation oriented toward the target language.

Jim



From jimi at temporal-wave.com  Tue Jan  5 11:04:07 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Tue, 05 Jan 2010 11:04:07 -0800
Subject: [antlr-interest] Parser generation takes hours
In-Reply-To: <7B529C9C-6516-4DD1-8E78-1A8B518BCAD4@cs.usfca.edu>
Message-ID: <d8a7ac87a17da14fb702a0ccef2f1a20@temporal-wave.com>

Perhaps you could send us your grammar too? You might find that you just need to comment out one or two rules until you get to reworking them.

Jim



From gokul007 at gmail.com  Tue Jan  5 22:42:08 2010
From: gokul007 at gmail.com (Gokulakannan Somasundaram)
Date: Wed, 6 Jan 2010 12:12:08 +0530
Subject: [antlr-interest] Parser generation takes hours
In-Reply-To: <4B43521A.6000501@worldonline.fr>
References: <4B43521A.6000501@worldonline.fr>
Message-ID: <9362e74e1001052242s192e7ae7u4beef375108e297d@mail.gmail.com>

Hi Jean,
         I faced a similar issue when I tried migrating an LR parser. It is
definitely caused by recursion. The way I tracked it down is rather crude,
but I thought I would let you know.
         Try to split the grammar into multiple sections (groups of rules)
and add them one by one. You don't need to wait until the errors are
emitted: as soon as parser generation takes more than 3-4 minutes, just
stop it. The last section added, the one that caused the increase, most
probably contains the problematic rules. Bear with me if this approach
looks very awkward.
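
A minimal sketch of this procedure in Python, under stated assumptions:
the grammar has already been split by hand into text sections, and
`generate` is a hypothetical callable (not part of ANTLR) that runs the
tool on the combined text and raises `subprocess.TimeoutExpired` when the
time limit is hit.

```python
import subprocess


def find_slow_section(sections, generate, timeout_s=240):
    """Add sections one by one; return the index of the first section
    whose addition makes generation time out, or None if none does."""
    included = []
    for index, section in enumerate(sections):
        included.append(section)
        try:
            generate("\n".join(included), timeout_s)
        except subprocess.TimeoutExpired:
            # Generation was stopped early: the section just added is
            # the most likely culprit.
            return index
    return None


def run_antlr(grammar_text, timeout_s, path="Partial.g"):
    """Hypothetical `generate` implementation. The file name and the
    command line are assumptions; adjust them to your own setup
    (ANTLR 3 expects the file name to match the grammar name)."""
    with open(path, "w") as handle:
        handle.write(grammar_text)
    # subprocess.run kills the child and raises TimeoutExpired on timeout.
    subprocess.run(["java", "org.antlr.Tool", path], timeout=timeout_s)
```

Calling `find_slow_section(sections, run_antlr)` would then point at the
section to inspect first; the 240-second default mirrors the 3-4 minute
cut-off mentioned above.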

Thanks,
Gokul.


From gokul007 at gmail.com  Tue Jan  5 22:46:01 2010
From: gokul007 at gmail.com (Gokulakannan Somasundaram)
Date: Wed, 6 Jan 2010 12:16:01 +0530
Subject: [antlr-interest] Resetting the Lexer and Parser in C-Target
Message-ID: <9362e74e1001052246x5d1394acg3400f424cc5dff3d@mail.gmail.com>

Hi,
    I have a grammar with close to 1000 rules, because of which the size of
the parser in the C target is close to 8k. Looking at the parser, I see it
has a function pointer for each of my rules. This portion is never going to
change, so I was wondering whether there is a way to reset the parser and
re-use it instead of allocating and initializing it from scratch. I am
trying to put together something more specific to my project; in the
meantime, I thought I would ask whether there is an easy way to do this.

Thanks,
Gokul.

From jimi at temporal-wave.com  Tue Jan  5 23:46:03 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Tue, 05 Jan 2010 23:46:03 -0800
Subject: [antlr-interest] Resetting the Lexer and Parser in C-Target
In-Reply-To: <9362e74e1001052246x5d1394acg3400f424cc5dff3d@mail.gmail.com>
Message-ID: <463de4cfabd70245bece05e1894cb50d@temporal-wave.com>

Yes, the next release [of the C runtime] generates a reuse() method for all components of the sequence and reuses all memory allocations. This is a big performance win if you have many inputs to parse. Also, the next release has a universal input stream that deals with UTFxx (with or without BOM) and EBCDIC. This release is a good few weeks away yet though and is tied to ANTLR v3 using ANTLR v3 for the various recognizers.

Jim



From christian.kihm at googlemail.com  Wed Jan  6 02:36:01 2010
From: christian.kihm at googlemail.com (Christian Kihm)
Date: Wed, 6 Jan 2010 11:36:01 +0100
Subject: [antlr-interest] each keyword allowed as Identifier
Message-ID: <cae780b1001060236uf03fed3y27d3c3286e1f63a4@mail.gmail.com>

Hi,

I am trying to parse a log file which was probably never intended to be
parsed. It is the log file of a poker client. My problem is that there
are almost no constraints on player names.

A player name can be a sequence of any characters from the full Unicode
range. The only constraints are:

min  length = 4
max length = 12
no leading or trailing white space
white spaces in between are allowed, but never more than one in a row
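
To make these constraints concrete, here is a minimal sketch in Python.
The function name is hypothetical, and it assumes that length is counted
in Unicode code points, which the log format does not confirm.

```python
def is_valid_playername(name: str) -> bool:
    """Check a candidate name against the four constraints listed above."""
    # min length = 4, max length = 12 (counted in code points: an assumption)
    if not 4 <= len(name) <= 12:
        return False
    # no leading or trailing white space
    if name[0].isspace() or name[-1].isspace():
        return False
    # white space inside is allowed, but never more than one in a row
    for left, right in zip(name, name[1:]):
        if left.isspace() and right.isspace():
            return False
    return True
```

Both names from the examples below, "The Player (" and "posts small:",
pass this check, which shows why the check alone cannot tell a name apart
from the surrounding log text.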

Here are some examples:

INPUT:
Seat 9: The Player ( ($76 in chips)

Where the Playername is  "The Player ("

INPUT:

posts small:: posts small blind $2


Where the Playername is "posts small:"


I have no clue how to solve this problem. I already tried some things I
found in the FAQs, like:

- syncing to the follow set (article "Custom Syntax Error Recovery"),
which doesn't work if a token of the follow set is also part of the name
- non-greedy matching ( .+ to match the name)
- listing all tokens in the rule playername, which doesn't work because
the playername can consist of not just one token but a sequence of
tokens

In general it must be possible, because there are several commercial
tools out there which are able to parse these log files. So I hope one of
you has an idea.

Thanks and regards,
Christian

From ttmrichter at gmail.com  Wed Jan  6 03:33:23 2010
From: ttmrichter at gmail.com (Michael Richter)
Date: Wed, 6 Jan 2010 19:33:23 +0800
Subject: [antlr-interest] Issue with antlrworks 1.3.1 and JDK 1.6 update 17?
Message-ID: <ee970b291001060333j78fde473nfc0efad9fa93b03f@mail.gmail.com>

I did a recent round of upgrading software on my machines (real and virtual)
and somewhere in the process I've got ANTLRworks in unusable shape.  (I
tried reporting this through the antlr.org web site but it doesn't seem to
have taken.)

On *every* machine I have access to (both real and virtual, running Windows
XP or Linux) I get the following pretty nasty behaviour:

   1. *java -jar antlrworks.jar* (I can also use javaw on Windows for a
   similar, more annoying effect.)
   2. *The splash screen pops up briefly.*
   3. *The "New Document" dialogue replaces it.*
   4. I hit "Cancel" (or alternatively press "Esc" on the keyboard).

At this point, no matter the platform, no matter what I try, I have a dead
executable until I hit Ctrl+C (or, if I used javaw, I kill it in the task
manager).  I've tried this on Ubuntu 9.04, on Slackware 13.0 (virtualized),
on Windows XP (four different machines, one virtualized) and get this
behaviour consistently.  Whatever's supposed to happen when I cancel the new
document dialogue freezes and can only unfreeze through lethal injection of
Ctrl+C.  (There are, of course, no messages on the console that could tell
me what's going on.)

The behaviour on Windows after this if I choose "OK" is acceptable.  Up
comes the wizard for a new project which works normally and, more
importantly, can be cancelled and gets me into the ANTLRworks GUI.  It's a
bit obnoxious having to go that route, but it works.  If I choose to use the
wizard everything works as expected.

The behaviour on Linux is less acceptable.  The new project wizard pops up
but the text input focus is on ANTLRworks' editor window and CANNOT be put
into the wizard at all on any spot.  I have to cancel the wizard to get to
the main window (which then works as expected).  This also happens if I go
File -> New from the main window: I simply cannot get text input into any
field of the new project wizard.

The last time I did anything with ANTLRworks was v1.3.0 using JDK 1.6 update
16.  I did not see this behaviour then at all, so something has happened
between then and now.

Any advice for debugging this further?

From jp.raven at worldonline.fr  Wed Jan  6 03:44:35 2010
From: jp.raven at worldonline.fr (Jean-Pierre LAMBERT)
Date: Wed, 06 Jan 2010 12:44:35 +0100
Subject: [antlr-interest] Parser generation takes hours
In-Reply-To: <7B529C9C-6516-4DD1-8E78-1A8B518BCAD4@cs.usfca.edu>
References: <4B43521A.6000501@worldonline.fr>
	<7B529C9C-6516-4DD1-8E78-1A8B518BCAD4@cs.usfca.edu>
Message-ID: <4B4477A3.7050908@worldonline.fr>

I'm using the command line.

The last time, I used these options, but they do not seem to change
anything: -report -Xmultithreaded -verbose

Originally I did not use any of these options. I was just experimenting;
I should remove them now.



From jp.raven at worldonline.fr  Wed Jan  6 03:47:06 2010
From: jp.raven at worldonline.fr (Jean-Pierre LAMBERT)
Date: Wed, 06 Jan 2010 12:47:06 +0100
Subject: [antlr-interest] Parser generation takes hours
In-Reply-To: <d8a7ac87a17da14fb702a0ccef2f1a20@temporal-wave.com>
References: <d8a7ac87a17da14fb702a0ccef2f1a20@temporal-wave.com>
Message-ID: <4B44783A.8040409@worldonline.fr>

Sorry but I'm unable to send you my grammar. My boss doesn't want this 
grammar to get out of the company.

If I'm able to narrow the problem to a small subset of my grammar I may 
share it with everybody, however.



From jp.raven at worldonline.fr  Wed Jan  6 03:52:08 2010
From: jp.raven at worldonline.fr (Jean-Pierre LAMBERT)
Date: Wed, 06 Jan 2010 12:52:08 +0100
Subject: [antlr-interest] Parser generation takes hours
In-Reply-To: <9362e74e1001052242s192e7ae7u4beef375108e297d@mail.gmail.com>
References: <4B43521A.6000501@worldonline.fr>
	<9362e74e1001052242s192e7ae7u4beef375108e297d@mail.gmail.com>
Message-ID: <4B447968.9070806@worldonline.fr>

Thank you for the feedback. I find it very interesting that we have
basically done the same kind of task (translating an LR parser to ANTLR)
and then encountered the same problem doing it.

Furthermore it's very encouraging to know that you could overcome it.

I have already started to remove parts of the grammar and the problem is 
still there.

Your advice is very helpful, thanks again.

JP



From jp.raven at worldonline.fr  Wed Jan  6 06:17:38 2010
From: jp.raven at worldonline.fr (Jean-Pierre LAMBERT)
Date: Wed, 06 Jan 2010 15:17:38 +0100
Subject: [antlr-interest] Parser generation takes hours
In-Reply-To: <4B447968.9070806@worldonline.fr>
References: <4B43521A.6000501@worldonline.fr>	<9362e74e1001052242s192e7ae7u4beef375108e297d@mail.gmail.com>
	<4B447968.9070806@worldonline.fr>
Message-ID: <4B449B82.5050102@worldonline.fr>

After investigating the problem further, it looks like I have rounded up
the faulty rules.


In my grammar I have four sets of productions that are mutually
(indirectly) left-recursive. After removing the left recursion from all
four, I have the "3 hours parser generation" problem.

If I remove any one of these four sets from the grammar, then after
removing left recursion the parser generation takes less than 5 minutes,
which is the expected behavior.


I will try tackling the other problems of the grammar (namely
left-factorisation, for a start) and I will see later whether that
changes anything when I add back all four sets of mutually
left-recursive rules.


Thanks everybody.


JP



From gokul007 at gmail.com  Wed Jan  6 07:59:27 2010
From: gokul007 at gmail.com (Gokulakannan Somasundaram)
Date: Wed, 6 Jan 2010 21:29:27 +0530
Subject: [antlr-interest] Parser generation takes hours
In-Reply-To: <4B449B82.5050102@worldonline.fr>
References: <4B43521A.6000501@worldonline.fr>
	<9362e74e1001052242s192e7ae7u4beef375108e297d@mail.gmail.com>
	<4B447968.9070806@worldonline.fr> <4B449B82.5050102@worldonline.fr>
Message-ID: <9362e74e1001060759i3a58cbccmbad1ae04b971e3c6@mail.gmail.com>

Hi JP,
        One of the toughest problems in the migration for me was resolving
the left factoring. I couldn't find any easy way of doing it. If your
project is a big one, then the solution for left factoring discussed on the
ANTLR website may well not work for you.
        If your parser is not performance critical, you can use syntactic
predicates heavily to solve the issue. But then the compilation time (at
least if you keep a higher value for k) and the run time will be longer.

My approach was approximately like this.

If there is a set of rules like
a: b|c;
b : d f | X;
c : d e | Y;

I rewrote the rules like this
a: d ( b_minus_d | c_minus_d ) | b_without_d | c_without_d;

b: d f | X;
b_minus_d : f;
b_without_d : X;

c : d e | Y;
c_minus_d : e;
c_without_d : Y;

This helped me to  keep the relevant rules together and with minimal code
repetition in the actions.
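
As a rough illustration of the factoring step only (not of how ANTLR
itself works), here is a small Python sketch. It assumes rule `a` has
already been expanded by hand into its four alternatives, that every
alternative is a non-empty tuple of symbols, and that only the first
symbol is factored; the function names are hypothetical.

```python
def left_factor(alternatives):
    """Group alternatives by their first symbol, keeping input order."""
    groups = {}
    for alt in alternatives:
        groups.setdefault(alt[0], []).append(alt[1:])
    return groups


def render(groups):
    """Print the grouped alternatives back in grammar-like notation."""
    parts = []
    for head, tails in groups.items():
        if len(tails) == 1:
            # Nothing shares this prefix: keep the alternative as it was.
            parts.append(" ".join((head,) + tails[0]))
        else:
            # Several alternatives share `head`: pull it out in front.
            inner = " | ".join(" ".join(tail) for tail in tails)
            parts.append(f"{head} ( {inner} )")
    return " | ".join(parts)


# Rule a with b and c expanded: d f | X | d e | Y
expanded_a = [("d", "f"), ("X",), ("d", "e"), ("Y",)]
factored = render(left_factor(expanded_a))
```

Here `factored` comes out as `d ( f | e ) | X | Y`, which has the same
shape as the rewritten rule `a` above once `b_minus_d`, `c_minus_d`,
`b_without_d` and `c_without_d` are substituted.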

If you find any other elegant way to resolve left factoring (without
using syntactic predicates), please do let me know.

Thanks,
Gokul.


From gokul007 at gmail.com  Wed Jan  6 08:14:07 2010
From: gokul007 at gmail.com (Gokulakannan Somasundaram)
Date: Wed, 6 Jan 2010 21:44:07 +0530
Subject: [antlr-interest] Request for preinclude_c option
Message-ID: <9362e74e1001060814k7a28abd3tf1213a25e8bbfe25@mail.gmail.com>

Hi Jim,
       One more request that would help people who develop parsers with
C++. As you might know, C++ headers (at least the ones with templates) need
to be included before the C headers in order to avoid a lot of cumbersome
errors. Currently we have the following options:
a) to include something before the antlr headers in the .h file (preinclude)
b) to include something after the antlr headers in the .h file
c) to include something after the headers in the .cpp file

So the fourth permutation (including something before the headers in the
.cpp file) might help people who develop with C++, without making the
headers heavy.

Thanks,
Gokul.

From jimi at temporal-wave.com  Wed Jan  6 08:32:59 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Wed, 06 Jan 2010 08:32:59 -0800
Subject: [antlr-interest] each keyword allowed as Identifier
In-Reply-To: <cae780b1001060236uf03fed3y27d3c3286e1f63a4@mail.gmail.com>
Message-ID: <2d115e12737faf4c9cfb151f2d512c43@temporal-wave.com>

Please search antlr.markmail.org for using keywords as identifiers and for the word 'poker' as there must now be about 50 people who have written ANTLR parsers for this! If someone would donate a parser to the ANTLR grammar list it would save a lot of people a lot of time, but I suggest it would not save most people money ;-)

Anyway:

id: ID | MIN | MAX | ...... etc ;

Then use this instead of ID. I suspect, though, that these log files are easier to 'parse' in a manual fashion, where you lex in context.
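
A minimal sketch of what lexing in context could look like, in Python, using only the two sample lines from the original question. The line formats, the whole-dollar amounts, and all names in the code are assumptions drawn from those two samples, not a description of the real log format.

```python
import re

# The fixed text around the name anchors the match; the name itself is
# just "4 to 12 of anything", as described in the question.
SEAT = re.compile(r"Seat (\d+): (.{4,12}) \(\$(\d+) in chips\)")
SMALL_BLIND = re.compile(r"(.{4,12}): posts small blind \$(\d+)")


def parse_line(line):
    """Return a tuple describing the line, or None if it is not recognised."""
    match = SEAT.fullmatch(line)
    if match:
        return ("seat", int(match.group(1)), match.group(2), int(match.group(3)))
    match = SMALL_BLIND.fullmatch(line)
    if match:
        return ("small_blind", match.group(1), int(match.group(2)))
    return None
```

With these patterns, "Seat 9: The Player ( ($76 in chips)" yields the name "The Player (" and "posts small:: posts small blind $2" yields "posts small:". A name that itself contains one of the fixed phrases can still be ambiguous, so this is a starting point rather than a complete answer.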

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Christian Kihm
> Sent: Wednesday, January 06, 2010 2:36 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] each keyword allowed as Identifier
> 
> Hi,
> 
> I am trying to parse a log file which was probably never intended to be
> parsed. It is a log file of a poker client. My problem is that there
> are nearly no constraints on player names.
> 
> A player name can be a sequence of any characters from the full Unicode
> range. The only constraints are:
> 
> min length = 4
> max length = 12
> no leading or trailing white space
> white space in between is allowed, but never more than one in a row
> 
> Here are some examples:
> 
> INPUT:
> Seat 9: The Player ( ($76 in chips)
> 
> Where the Playername is  "The Player ("
> 
> INPUT:
> 
> posts small:: posts small blind $2
> 
> 
> Where the Playername is "posts small:"
> 
> 
> I have no clue how to solve this problem. I already tried some things I
> found in the FAQs, like:
> 
> - syncing to the follow set (article "Custom Syntax Error Recovery"),
> which doesn't work if a token of the follow set is also part of the name
> - non-greedy matching ( .+ to match the name)
> - a list of all tokens in the rule playername, which doesn't work because
> the playername can consist not just of one token but of a sequence of
> tokens
> 
> Generally it must be possible, because there are several commercial
> tools out there which are able to parse these log files. So I hope one of
> you has an idea.
> 
> Thanks and regards,
> Christian
> 


From jimi at temporal-wave.com  Wed Jan  6 09:08:32 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Wed, 06 Jan 2010 09:08:32 -0800
Subject: [antlr-interest] Parser generation takes hours
In-Reply-To: <4B44783A.8040409@worldonline.fr>
Message-ID: <ec80f4b4c2c012409d6c87dba405f967@temporal-wave.com>

OK - just try it without any options then and if the behavior changes add back each option in turn and see which one affects it. If you can pin it down a bit, then it can be fixed (assuming that there is a bug here).

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Jean-Pierre LAMBERT
> Sent: Wednesday, January 06, 2010 3:47 AM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Parser generation takes hours
> 
> Sorry but I'm unable to send you my grammar. My boss doesn't want this
> grammar to get out of the company.
> 
> If I'm able to narrow the problem to a small subset of my grammar I may
> share it with everybody, however.
> 
> 
> On 05/01/2010 20:04, Jim Idle wrote:
> > Perhaps you could send us your grammar too? You might find that you
> just need to comment out one or two rules until you get to reworking
> them.
> >
> > Jim


From jimi at temporal-wave.com  Wed Jan  6 09:22:15 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Wed, 06 Jan 2010 09:22:15 -0800
Subject: [antlr-interest] Request for preinclude_c option
In-Reply-To: <9362e74e1001060814k7a28abd3tf1213a25e8bbfe25@mail.gmail.com>
Message-ID: <10e2f2e0de8ec14dac09a612afa82f66@temporal-wave.com>

Guess I am not quite following this - would not using the @header section solve this? All headers should protect themselves against multiple #include of course.

I can add an  @preinclude easily enough but I don't want to clutter the options unless I must of course. @header is inserted before the #include of the generated header file.

Also, I am not sure that you really need to do this. You should place any code using C++ templates and headers etc in external files and create an API that you call from action code. That API should have a header and I can't see that including that header after <NAME>.h should be a problem. That doesn't mean that there isn't one, just that I am not seeing why. Can you post an example to the list? If @header won't do it and there is a valid reason, then I will certainly add another @option to fix it.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Gokulakannan Somasundaram
> Sent: Wednesday, January 06, 2010 8:14 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Request for preinclude_c option
> 
> Hi Jim,
>        One more request that would help people who develop parsers
> for C++. As you might know, C++ headers (at least the ones with
> templates) need to be included before the C headers in order to
> avoid a lot of cumbersome errors. Currently we have the following options:
> a) to include something before the antlr headers in the .h file (preinclude)
> b) to include something after the antlr headers in the .h file
> c) to include something after the headers in the .cpp file
> 
> So the fourth permutation (including something before the headers in the
> .cpp file) might help people who develop with C++ without making the
> headers heavy.
> 
> Thanks,
> Gokul.
> 


From laurie at holoweb.net  Wed Jan  6 11:58:54 2010
From: laurie at holoweb.net (Laurie Harper)
Date: Wed, 6 Jan 2010 14:58:54 -0500
Subject: [antlr-interest] tree rewrite: breaking apart subtrees
Message-ID: <FEFCE222-7000-48DC-8684-ACA5ECC441FB@holoweb.net>

I'm trying to construct a parser/translator that will transform an  
extended version of a C-like language 'X' into standard 'X'. I can't  
figure out quite what I need in my tree grammar to get the result I  
want... For example, I have an input AST that looks something like this:

(VARDECL integer
     (VARIABLE ivar1 (LITERAL 1))
     (VARIABLE ivar2 (LITERAL 2))
     (VARIABLE ivar3))
(VARDECL integer
     (VARIABLE ivar4))

I need to rewrite it to look like this:

(VARDECL integer
     (VARIABLE ivar1 (LITERAL 1)))
(VARDECL integer
     (VARIABLE ivar2 (LITERAL 2)))
(VARDECL integer
     (VARIABLE ivar3))
(VARDECL integer
     (VARIABLE ivar4))

My tree grammar contains a rule like this:

vars		: ^(VARDECL type (^(VARIABLE ID literal?))+)
	-> ^(VARDECL type)+ ^(VARIABLE ID literal)+;

but that's not giving a result that's even close to right :-) I've  
tried all sorts of variations as I try to puzzle out the tree rewrite  
syntax, to no avail. Can anyone offer any insight?
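
To pin down the transformation I am after, independent of ANTLR's rewrite syntax, here is a minimal Python sketch over nested tuples. The tuple representation and the split_vardecls name are just assumptions for illustration; this is not a tree grammar, only a statement of the intended result.

```python
def split_vardecls(trees):
    """Rewrite each (VARDECL type var1 var2 ...) into one VARDECL per variable."""
    result = []
    for tree in trees:
        if tree and tree[0] == "VARDECL":
            # First child is the type; the rest are VARIABLE subtrees.
            _, type_name, *variables = tree
            for variable in variables:
                # The optional LITERAL child travels with its VARIABLE.
                result.append(("VARDECL", type_name, variable))
        else:
            result.append(tree)
    return result


source = [
    ("VARDECL", "integer",
        ("VARIABLE", "ivar1", ("LITERAL", "1")),
        ("VARIABLE", "ivar2", ("LITERAL", "2")),
        ("VARIABLE", "ivar3")),
    ("VARDECL", "integer",
        ("VARIABLE", "ivar4")),
]
```

Calling split_vardecls(source) yields four VARDECL trees, each holding the type and exactly one VARIABLE, which is the shape I would like the tree grammar to emit.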

Thanks,

L.


From kaleb.pederson at gmail.com  Wed Jan  6 12:51:03 2010
From: kaleb.pederson at gmail.com (Kaleb Pederson)
Date: Wed, 6 Jan 2010 12:51:03 -0800
Subject: [antlr-interest] Issue with antlrworks 1.3.1 and JDK 1.6 update
	17?
In-Reply-To: <ee970b291001060333j78fde473nfc0efad9fa93b03f@mail.gmail.com>
References: <ee970b291001060333j78fde473nfc0efad9fa93b03f@mail.gmail.com>
Message-ID: <f14c01621001061251s7e5ba048mc67a394b97994527@mail.gmail.com>

On Wed, Jan 6, 2010 at 3:33 AM, Michael Richter <ttmrichter at gmail.com> wrote:
> I did a recent round of upgrading software on my machines (real and virtual)
> and somewhere in the process I've got ANTLRworks in unusable shape.  (I
> tried reporting this through the antlr.org web site but it doesn't seem to
> have taken.)
>
> On *every* machine I have access to (both real and virtual, running Windows
> XP or Linux) I get the following pretty nasty behaviour:
[...snip...]

> The behaviour on Linux is less acceptable.  The new project wizard pops up
> but the text input focus is on ANTLRworks' editor window and CANNOT be put
> into the wizard at all on any spot.  I have to cancel the wizard to get to
> the main window (which then works as expected).

My AW preferences were set to load the last file on each invocation,
which seems to work fine.  I changed my preferences to use the
wizard, after which I started seeing some problems.

I started up AW, the 'New Document' dialog showed up.  I hit Cancel.
The UI disappeared but the application kept running.  I did a 'kill
-QUIT $AW_PID' and received the attached dump (I know Ter's been
playing with the mailing list filters and things, so we'll see if it
actually goes through).  The dump shows that AW is awaiting feedback,
but with no GUI present, it will never receive it.  This happens with
both 1.3 and 1.3.1, although the dump is for the 1.3.1.

> This also happens if I go
> File -> New from the main window: I simply cannot get text input into any
> field of the new project wizard.

I can replicate this behavior on Linux.  Does the following workaround
work for you:

a) Click OK (using an empty grammar name)
b) Dismiss the dialog that says you used an empty grammar name
c) Left click in the grammar name input field to give it focus
d) Now type in the wizard as usual?

On a related note, I've seen this behavior on many different Java
applications, so I'm not sure if it's Java related, or if it's just an
error that is easy to make when writing the application using Java.

> Any advice for debugging this further?

I also tried removing the AW preferences and disabling focus-stealing
prevention in my window manager, but neither of those helped either.

Looks like a couple of real bugs to me.

--
Kaleb Pederson

Blog - http://kalebpederson.com
Twitter - http://twitter.com/kalebpederson
-------------- next part --------------
2010-01-06 12:22:44
Full thread dump Java HotSpot(TM) 64-Bit Server VM (14.3-b01 mixed mode):

"Timer-1" prio=10 tid=0x0000000040e40800 nid=0x69a1 in Object.wait() [0x00007f06eb31b000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)                             
        at java.lang.Object.wait(Native Method)                                          
        - waiting on <0x00007f0721c6d7c8> (a java.util.TaskQueue)                        
        at java.util.TimerThread.mainLoop(Timer.java:509)                                
        - locked <0x00007f0721c6d7c8> (a java.util.TaskQueue)                            
        at java.util.TimerThread.run(Timer.java:462)                                     

"DestroyJavaVM" prio=10 tid=0x00007f06ec3ec000 nid=0x698d waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE                                                                

"Timer-0" daemon prio=10 tid=0x00007f06ec8f7800 nid=0x699f in Object.wait() [0x00007f06eb41c000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)                                    
        at java.lang.Object.wait(Native Method)                                                 
        - waiting on <0x00007f072148e720> (a java.util.TaskQueue)                               
        at java.util.TimerThread.mainLoop(Timer.java:509)                                       
        - locked <0x00007f072148e720> (a java.util.TaskQueue)                                   
        at java.util.TimerThread.run(Timer.java:462)                                            

"AWT-XAWT" daemon prio=10 tid=0x00007f06ec2d2000 nid=0x699b runnable [0x00007f06f0288000]
   java.lang.Thread.State: RUNNABLE
        at sun.awt.X11.XToolkit.waitForEvents(Native Method)
        at sun.awt.X11.XToolkit.run(XToolkit.java:548)
        at sun.awt.X11.XToolkit.run(XToolkit.java:523)
        at java.lang.Thread.run(Thread.java:619)

"Java2D Disposer" daemon prio=10 tid=0x0000000040b04800 nid=0x699a in Object.wait() [0x00007f06f0389000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00007f072226ddf8> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
        - locked <0x00007f072226ddf8> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
        at sun.java2d.Disposer.run(Disposer.java:125)
        at java.lang.Thread.run(Thread.java:619)

"Low Memory Detector" daemon prio=10 tid=0x0000000040b1c000 nid=0x6998 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x0000000040b19000 nid=0x6997 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x0000000040b16000 nid=0x6996 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x0000000040b14000 nid=0x6995 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x0000000040af6800 nid=0x6994 in Object.wait() [0x00007f06f1fde000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00007f072226e3f0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
        - locked <0x00007f072226e3f0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x0000000040aef000 nid=0x6993 in Object.wait() [0x00007f06f20df000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00007f072226e540> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:485)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
        - locked <0x00007f072226e540> (a java.lang.ref.Reference$Lock)

"VM Thread" prio=10 tid=0x0000000040ae8800 nid=0x6992 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x00000000409cf800 nid=0x698e runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x00000000409d1800 nid=0x698f runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x00000000409d3000 nid=0x6990 runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x00000000409d5000 nid=0x6991 runnable

"VM Periodic Task Thread" prio=10 tid=0x0000000040b1e800 nid=0x6999 waiting on condition

JNI global references: 1281

Heap
 PSYoungGen      total 18496K, used 13856K [0x00007f07212e0000, 0x00007f0722780000, 0x00007f0735ce0000)
  eden space 15872K, 70% used [0x00007f07212e0000,0x00007f0721ddf770,0x00007f0722260000)
  from space 2624K, 98% used [0x00007f0722260000,0x00007f07224e88a0,0x00007f07224f0000)
  to   space 2624K, 0% used [0x00007f07224f0000,0x00007f07224f0000,0x00007f0722780000)
 PSOldGen        total 42240K, used 488K [0x00007f06f7ee0000, 0x00007f06fa820000, 0x00007f07212e0000)
  object space 42240K, 1% used [0x00007f06f7ee0000,0x00007f06f7f5a000,0x00007f06fa820000)
 PSPermGen       total 21248K, used 17747K [0x00007f06f2ae0000, 0x00007f06f3fa0000, 0x00007f06f7ee0000)
  object space 21248K, 83% used [0x00007f06f2ae0000,0x00007f06f3c34f10,0x00007f06f3fa0000)

From jp.raven at worldonline.fr  Wed Jan  6 13:31:39 2010
From: jp.raven at worldonline.fr (Jean-Pierre LAMBERT)
Date: Wed, 06 Jan 2010 22:31:39 +0100
Subject: [antlr-interest] Parser generation takes hours
In-Reply-To: <4B449B82.5050102@worldonline.fr>
References: <4B43521A.6000501@worldonline.fr>	<9362e74e1001052242s192e7ae7u4beef375108e297d@mail.gmail.com>	<4B447968.9070806@worldonline.fr>
	<4B449B82.5050102@worldonline.fr>
Message-ID: <4B45013B.8070600@worldonline.fr>

Looks like I finally hit the nail on the head!

After doing some crucial left-factorizations, I can put all four sets of 
mutually left-recursive productions all together, and the parser 
generation takes only a couple of minutes. It seems even faster than 
with only three sets without left-factorization.


It definitely looks like ANTLR is *very* sensitive to left-factorization
in rules.

In some ways that is quite normal, since LL parsers require it.
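
For anyone unfamiliar with the term, here is a minimal Python sketch of textbook left-factoring. It only illustrates the general transformation; it is not what ANTLR does internally and it has nothing to do with my actual grammar. The left_factor function, the list-of-symbols representation and the generated tail rule names are all made up for the example, and it performs a single pass only (the new tail rules are not factored again).

```python
def left_factor(rule, alternatives):
    """Factor the common leading symbols out of a rule's alternatives.

    `alternatives` is a list of symbol lists. Returns a dict mapping rule
    names to their new alternatives; [] stands for the empty alternative.
    """
    # Group the alternatives by their first symbol.
    groups = {}
    for alt in alternatives:
        groups.setdefault(alt[0] if alt else None, []).append(alt)

    grammar = {rule: []}
    for first, group in groups.items():
        if first is None or len(group) == 1:
            # Nothing to factor: keep the alternative as it is.
            grammar[rule].extend(group)
            continue
        # Longest prefix shared by every alternative in the group.
        prefix_len = 0
        while (all(len(alt) > prefix_len for alt in group)
               and len({alt[prefix_len] for alt in group}) == 1):
            prefix_len += 1
        tail_rule = f"{rule}_{first}_tail"
        grammar[rule].append(group[0][:prefix_len] + [tail_rule])
        grammar[tail_rule] = [alt[prefix_len:] for alt in group]
    return grammar
```

Given the alternatives ID '=' expr, ID '(' args ')' and 'return' expr for a rule stat, the two alternatives starting with ID are merged into ID followed by a new tail rule, so a single token of lookahead is enough to choose between them.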


A big thank you for the help. It was greatly appreciated.


JP


On 06/01/2010 15:17, Jean-Pierre LAMBERT wrote:
> After investigating the problem further, it looks like I have rounded up
> the faulty rules.
>
>
> In my grammar I have four sets of productions that are mutually
> (indirectly) left-recursive. After removing left-recursion, I have the
> "3 hours parser generation" problem.
>
> If I remove any one of these four sets from the grammar, then after removing
> left-recursion the parser generation takes less than 5 minutes, which is
> the expected behavior.
>
>
> I will try tackling the other problems of the grammar (namely left
> factorisation, for a start) and I will see later whether that changes anything
> when I put back all four sets of mutually left-recursive rules.
>
>
> Thanks everybody.
>
>
> JP
>
>
>
> On 06/01/2010 12:52, Jean-Pierre LAMBERT wrote:
>> I have already started to remove parts of the grammar and the problem is
>> still there.
>>
>>
>> On 06/01/2010 07:42, Gokulakannan Somasundaram wrote:
>>> Hi Jean,
>>>             I ran into a similar issue when I tried migrating
>>> an LR parser. It is definitely because of the recursion. The
>>> way I tracked it down is fairly crude, but I thought I would let you know.
>>>             Try to split the grammar into multiple sections (groups of
>>> rules) and add them one by one. You don't need to wait until the
>>> errors are emitted. As soon as the parser generation takes more than 3-4
>>> minutes, just stop the generation. The last section added, the one that
>>> caused the increase, most probably contains the problematic code. Bear
>>> with me if this approach looks very awkward.
>>>
>>> Thanks,
>>> Gokul.
>>>
>>> On Tue, Jan 5, 2010 at 8:22 PM, Jean-Pierre LAMBERT
>>> <jp.raven at worldonline.fr<mailto:jp.raven at worldonline.fr>>   wrote:
>>>
>>>       Hello everybody,
>>>
>>>       I'm currently rewriting an LR parser for use with ANTLR. As a result,
>>>       ANTLR works literally for hours before it outputs errors about my
>>>       grammar.
>>>
>>>       My work is not finished; I have removed all left-recursion but I still
>>>       have to do the left-factorisation. The problem is that since ANTLR works
>>>       for hours before I get the errors, it isn't very practical for me to fix
>>>       the grammar.
>>>
>>>       Do you have any suggestions in this case? What could be done so that
>>>       ANTLR would take only a dozen minutes or so? Is there something crucial that
>>>       I missed about ANTLR and LL grammars? How should ANTLR rules be written
>>>       to avoid such a problem?
>>>
>>>       Thanks in advance, any advice will be welcome.
>>>
>>>       JP
>>>

From jp.raven at worldonline.fr  Wed Jan  6 13:42:35 2010
From: jp.raven at worldonline.fr (Jean-Pierre LAMBERT)
Date: Wed, 06 Jan 2010 22:42:35 +0100
Subject: [antlr-interest] Parser generation takes hours
In-Reply-To: <ec80f4b4c2c012409d6c87dba405f967@temporal-wave.com>
References: <ec80f4b4c2c012409d6c87dba405f967@temporal-wave.com>
Message-ID: <4B4503CB.20005@worldonline.fr>

Well, now that I have managed to get past the problem... I don't know
whether there is a bug or not.

Looking at LL parser theory, it may not be that surprising: a
combinatorial explosion when building an LL parser for a grammar that
needs more left-factorization.


Besides, I'm working on quite a big grammar, and that probably plays a
role here. ANTLR probably handles such rules quite well on not-so-big
grammars.


Finally, all this mess occurred on a non-working grammar. So I have the
feeling that it's not that big an issue for the user.

Well, one entry in some FAQ for people migrating LR parsers to ANTLR
would have done the trick, I'd say.

If I had just been reassured that fixing the left-factorization would
solve the problem, I would simply have worked on left-factorizing my
grammar and stopped worrying. In the absence of any advice on the matter,
I kind of panicked instead. :-)


JP


On 06/01/2010 18:08, Jim Idle wrote:
> OK - just try it without any options then and if the behavior changes add back each option in turn and see which one affects it. If you can pin it down a bit, then it can be fixed (assuming that there is a bug here).
>
> Jim
>
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>> bounces at antlr.org] On Behalf Of Jean-Pierre LAMBERT
>> Sent: Wednesday, January 06, 2010 3:47 AM
>> To: antlr-interest at antlr.org
>> Subject: Re: [antlr-interest] Parser generation takes hours
>>
>> Sorry but I'm unable to send you my grammar. My boss doesn't want this
>> grammar to get out of the company.
>>
>> If I'm able to narrow the problem to a small subset of my grammar I may
>> share it with everybody, however.
>>
>>
>> On 05/01/2010 20:04, Jim Idle wrote:
>>> Perhaps you could send us your grammar too? You might find that you
>> just need to comment out one or two rules until you get to reworking
>> them.
>>>
>>> Jim

From parrt at cs.usfca.edu  Wed Jan  6 16:28:05 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Wed, 6 Jan 2010 16:28:05 -0800
Subject: [antlr-interest] printed finally: Language implementation patterns
Message-ID: <7275E965-B513-43DB-8E90-338B78F6EE2E@cs.usfca.edu>

Hiya,

Just to let you know that Language implementation patterns is now available as a physically printed book. hooray!  If you would like to put up a review at Amazon (whatever your honest opinion is, good or bad), here's the link:

http://tinyurl.com/y8r9kts

Apparently it's important to have a lot of different reviews/comments in terms of marketing.

Thanks,
Terence

From bios.bob.frankel at gmail.com  Wed Jan  6 17:24:20 2010
From: bios.bob.frankel at gmail.com (Bob Frankel)
Date: Wed, 06 Jan 2010 17:24:20 -0800
Subject: [antlr-interest] tracking token position when original file is
	pre-processed
Message-ID: <4B4537C4.3000001@gmail.com>

my language has a simple pre-processor that expands text of the form 
${<env-var-name>} as a first phase of translation; the expanded stream 
is then input to my ANTLRInputStream, where it proceeds onward to the 
lexer/parser in the usual fashion.  said another way, neither the lexer 
nor the parser is aware of the ${...} construct.

needless to say, character-position information (e.g., token start/stop)
is relative to the expanded stream and not the original file; this
creates a problem, of course, when error indicators are not correctly
positioned in the original source file (as i'm doing through some editor
integration inside eclipse).

is there some pattern and/or (simple!) example that illustrates a
technique for managing this situation?  is there some way (say) i might
embed the equivalent of #line directives in the expanded stream, which
are then stripped further downstream while adjusting token offsets?


From antlr at mirality.co.nz  Wed Jan  6 17:29:02 2010
From: antlr at mirality.co.nz (Gavin Lambert)
Date: Thu, 07 Jan 2010 14:29:02 +1300
Subject: [antlr-interest] Parser generation takes hours
In-Reply-To: <4B4503CB.20005@worldonline.fr>
References: <ec80f4b4c2c012409d6c87dba405f967@temporal-wave.com>
	<4B4503CB.20005@worldonline.fr>
Message-ID: <20100107012920.415513418415@www.antlr.org>

At 10:42 7/01/2010, Jean-Pierre LAMBERT wrote:
 >Well, now that I succeeded to get passed the problem... I don't
 >know if there is a bug or not.
 >
 >Looking at LL parser theory, it may not be that surprising --
 >combinatory explosions when building an LL parser who needs more
 >left-factorizations.

It's probably not a bug that it choked on it, but it might be a 
bug that it didn't *detect* that it was choking on it and give you 
an error message instead... :)

But error detection in general in ANTLR is fairly rudimentary at 
the moment.  Hopefully that'll get better once ANTLR v3 is 
self-hosted.  (That will be in 3.3, won't it?)


From jimi at temporal-wave.com  Wed Jan  6 18:14:34 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Wed, 06 Jan 2010 18:14:34 -0800
Subject: [antlr-interest] tracking token position when original file is
	pre-processed
In-Reply-To: <4B4537C4.3000001@gmail.com>
Message-ID: <2e55fea51941eb4bb139c5413deb901a@temporal-wave.com>

Easiest is to have the preprocessor mark the input stream like cpp does:

# 555 "myfile.c"

And then add a reference to the file into the token or similar.

You can also incorporate the preprocessor into your lexer and stack input streams, provided the pre-processing does not itself need a parser.
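
If in-stream markers are awkward, a related alternative (not the cpp-style marker approach above) is to have the preprocessor keep a side table of offsets. Below is a minimal Python sketch of that idea; it assumes variables are written ${NAME} with word characters only, and the expand and original_offset helpers and the segment layout are hypothetical names for illustration. The same table could be consulted when reporting a token's start/stop back to the editor.

```python
import bisect
import re

VAR = re.compile(r"\$\{(\w+)\}")


def expand(text, env):
    """Expand ${NAME} references and record where each piece came from.

    Returns (expanded_text, segments); each segment is a tuple
    (expanded_start, original_start, is_expansion).
    """
    out, segments, pos, out_len = [], [], 0, 0
    for match in VAR.finditer(text):
        literal = text[pos:match.start()]
        if literal:
            segments.append((out_len, pos, False))
            out.append(literal)
            out_len += len(literal)
        value = env.get(match.group(1), "")
        if value:
            segments.append((out_len, match.start(), True))
            out.append(value)
            out_len += len(value)
        pos = match.end()
    if pos < len(text):
        segments.append((out_len, pos, False))
        out.append(text[pos:])
    return "".join(out), segments


def original_offset(segments, expanded_offset):
    """Map an offset in the expanded text back to the original text.

    Assumes `segments` is non-empty (the input text was not empty).
    """
    starts = [segment[0] for segment in segments]
    index = bisect.bisect_right(starts, expanded_offset) - 1
    expanded_start, original_start, is_expansion = segments[index]
    if is_expansion:
        # Everything inside an expansion points at the start of its ${...}.
        return original_start
    return original_start + (expanded_offset - expanded_start)
```

Offsets that fall inside expanded text all map to the start of the ${...} reference, which is usually where an error marker belongs in the original file.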


Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Bob Frankel
> Sent: Wednesday, January 06, 2010 5:24 PM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] tracking token position when original file is
> pre-processed
> 
> my language has a simple pre-processor that expands text of the form
> ${<env-var-name>} as a first phase of translation; the expanded stream
> is then input to my ANTLRInputStream, where it proceeds onward to the
> lexer/parser in the usual fashion.  said another way, neither the lexer
> nor the parser is aware of the ${...} construct.
> 
> needless to say, character-position information (e.g., token start/stop)
> is relative to the expanded stream and not the original file; this
> creates a problem, of course, when error indicators are not correctly
> positioned in the original source file (as i'm doing through some
> editor integration inside eclipse).
> 
> is there some pattern and/or (simple!) example that illustrates a
> technique for managing this situation?  is there some way (say) i might
> embed the equivalent of #line directives in the expanded stream, which
> are then stripped further downstream while adjusting token offsets?


From jsrs701 at yahoo.com  Wed Jan  6 18:42:20 2010
From: jsrs701 at yahoo.com (J. Stephen Riley Silber)
Date: Wed, 6 Jan 2010 18:42:20 -0800 (PST)
Subject: [antlr-interest] printed finally: Language implementation
	patterns
In-Reply-To: <7275E965-B513-43DB-8E90-338B78F6EE2E@cs.usfca.edu>
References: <7275E965-B513-43DB-8E90-338B78F6EE2E@cs.usfca.edu>
Message-ID: <456784.77255.qm@web33306.mail.mud.yahoo.com>

Here's how my evening went:

	1. Finish up work stuff at the office.
	2. Check ANTLR email--oh look!  An exhortation to write an Amazon review!
	3. Write Amazon review.  (I can't remember, is one star good or bad?)

	4. Go home and check snail mail.
	5. Do a happy dance, since there's the book!  And it looks glorious!  (Though it feels so familiar... :-)

Congrats, Ter, it looks great!  And I love having it in dead tree format, too!


________________________________
From: Terence Parr <parrt at cs.usfca.edu>
To: "antlr-interest at antlr.org interest" <antlr-interest at antlr.org>; stringtemplate-interest List <stringtemplate-interest at antlr.org>
Sent: Wed, January 6, 2010 4:28:05 PM
Subject: [antlr-interest] printed finally: Language implementation patterns

Hiya,

Just to let you know that Language implementation patterns is now available as a physically printed book. hooray!  If you would like to put up a review at Amazon (whatever your honest opinion is, good or bad), here's the link:

http://tinyurl.com/y8r9kts

Apparently it's important to have a lot of different reviews/comments in terms of marketing.

Thanks,
Terence


From manharg at yahoo.com  Wed Jan  6 19:21:12 2010
From: manharg at yahoo.com (Manhar Goindi)
Date: Wed, 6 Jan 2010 19:21:12 -0800 (PST)
Subject: [antlr-interest] Fw: ANTLR Parser Queries
Message-ID: <716065.77567.qm@web57208.mail.re3.yahoo.com>


--- On Wed, 1/6/10, Manhar Goindi <manharg at yahoo.com> wrote:

> From: Manhar Goindi <manharg at yahoo.com>
> Subject: ANTLR Parser Queries
> To: antlr-interest at antlr.org
> Date: Wednesday, January 6, 2010, 7:18 PM
> Hi,
> 
> We are using the ANTLR Parser and found it to be useful in
> generating C# code.  However, we would like to know the
> following about this parser's capabilities:
> 
> 1.  Is it possible to advance the parser
> from some input text position to skip some portion of the
> text and let it resume parsing from some new position in the
> input text?
> 2.  Is it possible to delink the default
> ANTLR lexical analyzer from ANTLR Parser and link the ANTLR
> Parser to some custom Lexical Analyzer?
> 
> 
> Thanks & Best Regards,
> Manhar Goindi
> 
> 
> 
> 


From wclodius at los-alamos.net  Wed Jan  6 21:04:07 2010
From: wclodius at los-alamos.net (William B. Clodius)
Date: Wed, 6 Jan 2010 22:04:07 -0700
Subject: [antlr-interest] Undesirable ANTLRWorks behavior
Message-ID: <C780B50B-CF0B-4D3E-9C5C-820A478E6E06@los-alamos.net>

ANTLRWorks 3.2 on Mac OS X 10.6.2 (the regular download from the ANTLR site, so I don't believe it is Eclipse hosted) is showing several odd behaviors on a large grammar file and a .stg file I am editing, which I suspect are related.

First, the syntax checking runs after every keystroke. As most changes in words or strings etc. result in an invalid token the console gets flooded with errors. This would be greatly reduced if the checking were performed only after any carriage return, or better yet, when the code is generated.

Second, it will sometimes give messages that it is running short of memory that can be temporarily fixed by closing other applications, particularly applications that run Java code.

Third, it will sometimes slow down to a crawl with no messages. I suspect, but don't know how to prove, that this is a symptom of stressed garbage collection due to short memory.

It would not surprise me that both of the last two are indirect symptoms of the first problem. In particular I suspect it is keeping track of edits to an unnecessary level of detail.

From parrt at cs.usfca.edu  Wed Jan  6 21:37:38 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Wed, 6 Jan 2010 21:37:38 -0800
Subject: [antlr-interest] printed finally: Language implementation
	patterns
In-Reply-To: <456784.77255.qm@web33306.mail.mud.yahoo.com>
References: <7275E965-B513-43DB-8E90-338B78F6EE2E@cs.usfca.edu>
	<456784.77255.qm@web33306.mail.mud.yahoo.com>
Message-ID: <7461F5FA-9B6D-4BA3-AEE9-77561F8753BC@cs.usfca.edu>


On Jan 6, 2010, at 6:42 PM, J. Stephen Riley Silber wrote:

> Here's how my evening went:
> Finish up work stuff at the office.
> Check ANTLR email--oh look!  An exhortation to write an Amazon review!
> Write Amazon review.  (I can't remember, is one star good or bad?)
> Go home and check snail mail.
> Do a happy dance, since there's the book!
> And it looks glorious!  (Though it feels so familiar... :-)
>
> Congrats, Ter, it looks great!  And I love having it in dead tree  
> format, too!

Dang it!  I haven't even gotten my copy yet! ;)  Can't wait to hold it in
my paws.

Ter

From gokul007 at gmail.com  Wed Jan  6 21:38:18 2010
From: gokul007 at gmail.com (Gokulakannan Somasundaram)
Date: Thu, 7 Jan 2010 11:08:18 +0530
Subject: [antlr-interest] Request for preinclude_c option
In-Reply-To: <10e2f2e0de8ec14dac09a612afa82f66@temporal-wave.com>
References: <9362e74e1001060814k7a28abd3tf1213a25e8bbfe25@mail.gmail.com>
	<10e2f2e0de8ec14dac09a612afa82f66@temporal-wave.com>
Message-ID: <9362e74e1001062138t38e8a636s16f3deaefe5864e0@mail.gmail.com>

Jim,
   I have tried to put forward my argument.

> Guess I am not quite following this - would not using the @header section
> solve this? All headers should protect themselves against multiple #include
> of course.
>
The @header section places it in both the .h and the .c file, which makes the
headers heavy. Of course, multiple #include is protected against. My request is
to place the section only in the .c file, before the ANTLR headers (#include).


>
> Also, I am not sure that you really need to do this. You should place any
> code using C++ templates and headers etc in external files and create an API
> that you call from action code. That API should have a header and I can't
> see that including that header after <NAME>.h should be a problem. That
> doesn't mean that there isn't one, just that I am not seeing why. Can you
> post an example to the list? If @header won't do it and there is a valid
> reason, then I will certainly add another @option to fix it.
>
Well, at least we have done it in a way that uses the STL and std::bitset in
the action part. Sometimes we even return a std::bitset and
boost::variant, which are all template based. Sometimes, to decide which
token to issue in the lexer, we use a hash map.

I think ANTLR somewhere includes winsock.h, and including winsock2.h after that
causes some issues for us. Basically, we do not face any issues if we
include the ANTLR headers after our own headers. But there is currently no way
to do that without making the generated header files heavy. So I had to
resort to using the @preincludes option.

This is the problem with making the headers heavy. Say I have two headers,
CplusplusLexer.h and CplusplusParser.h. Say that inside the lexer header I
have included a C++ library that has templates. This has to be placed
before the C headers, so CplusplusLexer.h looks like this:

#include <boost/unordered.hpp>
#include <antlr3.h>


Similarly, I have CplusplusParser.h, which looks like this:
#include <bitset>
#include <antlr3.h>

Now in the .cpp file, if I want to do parsing, I have to include both the
lexer header and the parser header. There is then no way for the template
headers to be placed before the ANTLR header, unless I do something like this,
re-declaring the headers before the generated ANTLR files:
#include <boost/unordered.hpp>
#include <bitset>
#include "CplusplusLexer.h"
#include "CplusplusParser.h"

While the fix is straightforward, identifying that this is the problem
will take some time.
The code organization will be better if I don't include them in
CplusplusParser.h and CplusplusLexer.h, and the roundabout fixes may not be
required. There is just one thing to keep in mind: include the ANTLR
headers after the C++ headers (with templates).

Hope I was able to put forward a case.

Thanks,
Gokul.

From Heiko.Folkerts at david-bs.de  Wed Jan  6 22:09:35 2010
From: Heiko.Folkerts at david-bs.de (Heiko Folkerts)
Date: Thu, 7 Jan 2010 07:09:35 +0100
Subject: [antlr-interest] Using paraphrase option when using the C target in
	ANTLR
Message-ID: <93FCBF72DCE7634481C5DF1654D8FF13035A80C8@DC2>

Hi all,
I am currently trying to improve the quality of the error messages generated by our ANTLR-generated parser. Since our error messages are generally in German, I'd like to take advantage of the paraphrase option for rules and tokens to assign a clear name to those things. Unfortunately, I get errors from ANTLR when using the following token definition:
ALPHASTRING
options { paraphrase="Zeichenkette";}
: ('a'..'z' | 'A'..'Z' | '0'..'9' | '/' | '-' | '\u00c0' .. '\u00d6' | '\u00d8' .. '\u00fc')+;

ANTLR reports: unexpected token "Zeichenkette"

So can't I use paraphrases in the C target? I am using antlr3.2.
Is there a workaround for the paraphrases?

Regards
Heiko


Kind regards
Heiko Folkerts
System Development and Design
--
______________________________________________
DAVID GmbH, Wendenring 1, 38114 Braunschweig
Tel.: +49 531 24379-14
Fax.: +49 531 24379-79
E-Mail: mailto:Heiko.Folkerts at david-bs.de
WWW:   http://www.david-bs.de
Registered: Amtsgericht Braunschweig, HRB 3167
Managing Director: Frank Ptok
______________________________________________

From sandworm87 at yahoo.se  Thu Jan  7 01:12:45 2010
From: sandworm87 at yahoo.se (Christer Löfving)
Date: Thu, 7 Jan 2010 09:12:45 +0000 (GMT)
Subject: [antlr-interest] First project ?
Message-ID: <755474.29082.qm@web24715.mail.ird.yahoo.com>

Hi all! I am an experienced software developer, but relatively new to ANTLR. Do any of you have an idea for a suitable "middle-sized" first project to work with, as a way of getting started with the whole thing?
BR/Christer



From jklumpp at harmonia.com  Thu Jan  7 06:15:28 2010
From: jklumpp at harmonia.com (Jared Klumpp)
Date: Thu, 7 Jan 2010 06:15:28 -0800
Subject: [antlr-interest] tree rewrite: breaking apart subtrees
References: <mailman.1.1262808002.9167.antlr-interest@antlr.org>
Message-ID: <B00C7A477A17884DBB97DEC08EE0DF9E0162CB4F@EXVBE010-2.exch010.intermedia.net>

See "Rewrite rule element cardinality" in the Definitive ANTLR Reference (pg. 184). It seems you want something like:

vars            : VARDECL type (VARIABLE ID literal?)+
        -> ^(VARDECL type ^(VARIABLE ID literal))+;
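
The result being asked for (one VARDECL per VARIABLE, with the type repeated each time) can also be pictured outside ANTLR's rewrite syntax. The following is only a rough Python sketch of that target shape; the nested-tuple encoding of the AST and the name split_vardecls are assumptions made for illustration and are not part of any ANTLR API.

```python
# Hypothetical AST encoding: (node_type, child, child, ...) as nested tuples.
def split_vardecls(trees):
    """Emit one VARDECL per VARIABLE child, repeating the type each time."""
    result = []
    for tree in trees:
        if tree[0] != "VARDECL":
            result.append(tree)  # leave unrelated nodes untouched
            continue
        _, type_name, *variables = tree
        for variable in variables:
            result.append(("VARDECL", type_name, variable))
    return result


ast = [
    ("VARDECL", "integer",
     ("VARIABLE", "ivar1", ("LITERAL", 1)),
     ("VARIABLE", "ivar2", ("LITERAL", 2)),
     ("VARIABLE", "ivar3")),
    ("VARDECL", "integer",
     ("VARIABLE", "ivar4")),
]

for node in split_vardecls(ast):
    print(node)
```

Each input declaration with n variables yields n output declarations, which is the cardinality the rewrite rule has to reproduce.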

-J

Date: Wed, 6 Jan 2010 14:58:54 -0500
From: Laurie Harper <laurie at holoweb.net>
Subject: [antlr-interest] tree rewrite: breaking apart subtrees
To: antlr-interest at antlr.org
Message-ID: <FEFCE222-7000-48DC-8684-ACA5ECC441FB at holoweb.net>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes

I'm trying to construct a parser/translator that will transform an 
extended version of a C-like language 'X' into standard 'X'. I can't 
figure out quite what I need in my tree grammar to get the result I 
want... For example, I have an input AST that looks something like this:

(VARDECL integer
     (VARIABLE ivar1 (LITERAL 1))
     (VARIABLE ivar2 (LITERAL 2))
     (VARIABLE ivar3))
(VARDECL integer
     (VARIABLE ivar4))

I need to rewrite it to look like this:

(VARDECL integer
     (VARIABLE ivar1 (LITERAL 1)))
(VARDECL integer
     (VARIABLE ivar2 (LITERAL 2)))
(VARDECL integer
     (VARIABLE ivar3))
(VARDECL integer
     (VARIABLE ivar4))

My tree grammar contains a rule like this:

vars            : ^(VARDECL type (^(VARIABLE ID literal?))+)
        -> ^(VARDECL type)+ ^(VARIABLE ID literal)+;

but that's not giving a result that's even close to right :-) I've 
tried all sorts of variations as I try to puzzle out the tree rewrite 
syntax, to no avail. Can anyone offer any insight?

Thanks,

L.

From marcin.rzeznicki at gmail.com  Thu Jan  7 08:16:18 2010
From: marcin.rzeznicki at gmail.com (Marcin Rzeźnicki)
Date: Thu, 7 Jan 2010 17:16:18 +0100
Subject: [antlr-interest] Problem with AST tree with heterogeneous nodes
Message-ID: <14799bf61001070816g6c418b23r370598cdb8befee7@mail.gmail.com>

Hi all,
I have a curious problem when populating AST tree with custom nodes,
or to be more precise, with their constructors.
If, in the tree grammar (I basically construct AST tree in parser and
in the next step I rewrite it using tree walker), I am using:
STORE<UnresolvedLocal>[$ID, expressionResolver]

then the constructor UnresolvedLocal(int ttype, CommonTree id,
ExpressionResolver expressionResolver)
is picked up as expected.

But, if I am using the following form:
STORE<LValueError> $lhs expression

then it gets transformed to:
new LValueError(stream_STORE.nextNode()), where nextNode() returns
Object, whereas I expected an integer carrying the token type to be used.

Is this a bug of some kind? Can you explain it to me? Thank you very
much in advance.


-- 
Greetings
Marcin Rzeźnicki

From jsrs701 at yahoo.com  Thu Jan  7 09:21:50 2010
From: jsrs701 at yahoo.com (J. Stephen Riley Silber)
Date: Thu, 7 Jan 2010 09:21:50 -0800 (PST)
Subject: [antlr-interest] First project ?
In-Reply-To: <755474.29082.qm@web24715.mail.ird.yahoo.com>
References: <755474.29082.qm@web24715.mail.ird.yahoo.com>
Message-ID: <850996.41243.qm@web33308.mail.mud.yahoo.com>

Hi Christer,

Have you read this article?  "Humans Should Not Have to Grok XML"  http://www.ibm.com/developerworks/xml/library/x-sbxml.html

I really got to know ANTLR3 when I built a system to translate a simple declarative scripting language (like the "{{8, 17, 1964}, instructor}" example in the article, though much richer) into XML.  (I had a large sample of XML files, which represented a scripting language used in a company I worked for--but XML is lousy for coding, so I wanted a scripting language that would compile into that XML.)

It was a good project that really taught me the ins and outs of ANTLR3.  Easy in concept, but tough enough that I had to do some thinking.

Have fun!
Stephen



List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address

From antlr at mirality.co.nz  Thu Jan  7 12:36:16 2010
From: antlr at mirality.co.nz (Gavin Lambert)
Date: Fri, 08 Jan 2010 09:36:16 +1300
Subject: [antlr-interest] Using paraphrase option when using the C
 target in ANTLR
In-Reply-To: <93FCBF72DCE7634481C5DF1654D8FF13035A80C8@DC2>
References: <93FCBF72DCE7634481C5DF1654D8FF13035A80C8@DC2>
Message-ID: <20100107203621.C890A3418383@www.antlr.org>

At 19:09 7/01/2010, Heiko Folkerts wrote:
 >I am currently trying to improve the quality of our error messages
 >generated by our ANTLR generated parser. Since our error messages
 >are generally in German I'd like to take advantage of the paraphrase
 >option for rules and tokens to assign a clear name to those things.
 >Unfortunately I get errors from ANTLR when using the following
 >token definition:
 >ALPHASTRING
 >options { paraphrase="Zeichenkette";}
 >: ('a'..'z' | 'A'..'Z' | '0'..'9' | '/' | '-' | '\u00c0' ..
 >'\u00d6' | '\u00d8' .. '\u00fc')+;
 >
 >ANTLR reports: "unexpected token "Zeichenkette"
 >
 >So can't I use paraphrases in the C target? I am using antlr3.2. 


The paraphrase option is a v2 option; there is no equivalent in 
v3.  If you want to change the text of the error messages then you 
will need to alter the exception text yourself, using the error 
reporting hooks (see the wiki).
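
Whatever hook is used, the replacement for paraphrase usually comes down to a lookup from token name to display name at the point where the message text is built. The Python sketch below shows only that lookup idea; the table, the function names, and the message format are hypothetical, and wiring it into the error reporting hooks of a particular target is not shown.

```python
# Hypothetical lookup table from grammar token names to user-facing names.
PARAPHRASES = {
    "ALPHASTRING": "Zeichenkette",
}


def describe_token(token_name):
    """Return the user-facing name for a token, falling back to the raw name."""
    return PARAPHRASES.get(token_name, token_name)


def format_error(line, column, expected_token):
    """Build a German error message using the paraphrased token name."""
    return "Zeile %d:%d: %s erwartet" % (
        line, column, describe_token(expected_token))


print(format_error(3, 7, "ALPHASTRING"))
```

Tokens without an entry in the table simply keep their grammar name, so the table can be filled in gradually.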


From cross at kojeware.com  Thu Jan  7 13:06:57 2010
From: cross at kojeware.com (Cameron Ross)
Date: Thu, 07 Jan 2010 16:06:57 -0500
Subject: [antlr-interest] An ANTLR-based XMI translator
Message-ID: <4B464CF1.3070704@kojeware.com>

Hi,

I need to construct a program that will translate UML models specified 
in XMI into language 'X'.  I already have an ANTLR-based parser for 
language X that generates a CommonTree as an intermediate form.  I also 
have a TreeWalker  that uses StringTemplate to emit valid X given an AST 
in this intermediate form.  I currently use these two components to 
implement a pretty-printer for language X.  I was thinking that I could 
implement an ANTLR parser that would take XMI models as input and 
generate some (different) intermediate form AST.   I would then 
implement a TreeWalker to convert this AST into the intermediate form 
AST for language X.  This would allow me to use my existing emitter to 
output the model in language X.

1) Does this sound like a reasonable strategy?
2) Is anyone aware of an existing ANTLR3 grammar for XMI?
3) Is there a better way?

Thanks,
Cameron.

From Sanus at gmx.de  Thu Jan  7 14:04:16 2010
From: Sanus at gmx.de (Christian Hoffmann)
Date: Thu, 7 Jan 2010 23:04:16 +0100
Subject: [antlr-interest] c-target tree creation
Message-ID: <1445674875.20100107230416@gmx.de>

Hi,

I'm stumbling over an unhappy circumstance. Normally a tree is built with a
nil node and the children. But if the parser recognises just one line,
the nil node is not used (probably because there are no children).

Example 1 - generated tree has just one node
  Source:
    int a;
  AST:
    <TOK_VAR_DEF(99) at line 2,
      <TOK_VAR_TYPE(100) at line 2, int(49) at line 2>,
      <TOK_VAR_DECL(101) at line 2, a(122) at line 2>
    >


Example 2 - generated tree has two nodes
  Source:
    int a;
    int b;
  AST:
    <nil(0) at line 2,
      <TOK_VAR_DEF(99) at line 2,
        <TOK_VAR_TYPE(100) at line 2, int(49) at line 2>,
        <TOK_VAR_DECL(101) at line 2, a(122) at line 2>
      >,
      <TOK_VAR_DEF(99) at line 3,
        <TOK_VAR_TYPE(100) at line 3, int(49) at line 3>,
        <TOK_VAR_DECL(101) at line 3, b(122) at line 3>
      >
    >


I think it would be easier for walking in a loop if the nil-node is
always created. Is this possible in future versions?

Regards,
Christian

-- 
Christian Hoffmann
?tzenkamp 4
38118 Braunschweig
Tel: 0171/7300609
Web: www.c-hoffmann.de
     www.logical-arts.de


From jimi at temporal-wave.com  Thu Jan  7 14:24:27 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Thu, 07 Jan 2010 14:24:27 -0800
Subject: [antlr-interest] c-target tree creation
In-Reply-To: <1445674875.20100107230416@gmx.de>
Message-ID: <638ecb884e29654ab45d785286fd7743@temporal-wave.com>

Actually, the nil node should never be there, so there must be something awry with your grammar. Try making sure that your top rule looks like:

top : myrule EOF! ;

Jim



From jimi at temporal-wave.com  Thu Jan  7 14:49:59 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Thu, 07 Jan 2010 14:49:59 -0800
Subject: [antlr-interest] c-target tree creation
In-Reply-To: <1166891626.20100107233333@gmx.de>
Message-ID: <2c1f92302f6e8a40abdae0162e37e119@temporal-wave.com>

You should always rewrite the top rule so that you have a single root node; that is your problem. The nil node is created to hold the children and, because you don't rewrite it, it stays there, but it is not created when there is just a single node. So try this:

translation_unit
    @init{ _pParser->m_bError = false; _pParser->m_ScopeDelimiter = "@"; }
    : telement EOF
      -> ^(TUNIT telement)
    ;

telement
    : ( pragma
      | expression_statement
      )*
    ;

And you will be all set.
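
The effect of such a rewrite on the walking code can be illustrated outside ANTLR: once every result is wrapped in one imaginary root, the loop over top-level statements needs no special case for zero, one, or many children. This is only a Python sketch; the tuple encoding and the function names are assumptions, and TUNIT simply mirrors the imaginary token used in the rule above.

```python
# Hypothetical tree encoding: (node_type, child, child, ...) as nested tuples.
def make_translation_unit(statements):
    """Always wrap the parsed statements in a single TUNIT root."""
    return ("TUNIT", *statements)


def walk(tree):
    """Visit every top-level statement and return the node types seen."""
    _, *children = tree
    return [child[0] for child in children]


one = make_translation_unit([("TOK_VAR_DEF", "int", "a")])
two = make_translation_unit([("TOK_VAR_DEF", "int", "a"),
                             ("TOK_VAR_DEF", "int", "b")])
print(walk(one))
print(walk(two))
```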

Jim

PS: Please use the list rather than emailing me directly :-)


> -----Original Message-----
> From: Christian Hoffmann [mailto:Sanus at gmx.de]
> Sent: Thursday, January 07, 2010 2:34 PM
> To: Jim Idle
> Subject: Re: [antlr-interest] c-target tree creation
> 
> Hi Jim,
> 
> this is my top-rule...
> 
> translation_unit
>         @init{ _pParser->m_bError = false; _pParser->m_ScopeDelimiter =
> "@"; }
>         : ( pragma
>           | expression_statement
>           )* EOF!
>         ;
> 
> Can you see a problem?
> 
> Thx
> Chris


From denis.debarbieux at ateji.com  Fri Jan  8 03:05:18 2010
From: denis.debarbieux at ateji.com (Denis Debarbieux)
Date: Fri, 08 Jan 2010 12:05:18 +0100
Subject: [antlr-interest] Parser generation takes hours
In-Reply-To: <9362e74e1001052242s192e7ae7u4beef375108e297d@mail.gmail.com>
References: <4B43521A.6000501@worldonline.fr>
	<9362e74e1001052242s192e7ae7u4beef375108e297d@mail.gmail.com>
Message-ID: <4B47116E.9000407@ateji.com>

Hi everybody,
> One of most tough problem in the migration for me was to resolve the
> left factoring. 

I am surprised by this discussion.

I thought that there were algorithms that automatically remove left
recursion and perform left factoring. Did I learn those algorithms at
school only for them never to be used on real problems? Why does ANTLR not use
them?
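
For reference, the textbook transformation for immediate left recursion rewrites A -> A a | b as A -> b A' and A' -> a A' | epsilon. The Python sketch below implements only that immediate case on a toy grammar representation; the list-of-symbols encoding and the "_tail" naming are assumptions made for illustration, and it says nothing about how ANTLR itself analyses grammars.

```python
def remove_immediate_left_recursion(name, alternatives):
    """Rewrite A -> A a | b as A -> b A' and A' -> a A' | epsilon.

    Each alternative is a list of symbols; [] stands for epsilon.
    """
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == name]
    others = [alt for alt in alternatives if not alt or alt[0] != name]
    if not recursive:
        return {name: alternatives}
    tail = name + "_tail"  # hypothetical name for the helper rule
    return {
        name: [alt + [tail] for alt in others],
        tail: [alt + [tail] for alt in recursive] + [[]],
    }


rules = remove_immediate_left_recursion(
    "expr", [["expr", "+", "term"], ["term"]])
for rule, alts in rules.items():
    print(rule, "->", " | ".join(" ".join(a) or "epsilon" for a in alts))
```

Note that the rewritten grammar recognises the same input but no longer has the left-leaning shape of the original rule, which matters when actions or tree construction depend on that shape.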

Regards

Denis

Gokulakannan Somasundaram wrote:
> Hi Jean,
>          I faced up with a similar issue, when i tried the migration of  a
> LR parser. But it's definitely because of recursion stuffs. The way i
> removed is sort of layman stuff, but thought of just informing you.
>          Try to split the grammar into multiple sections(group of rules) and
> try to add them one-by-one. You don't need to wait till the errors are
> emitted. As soon as the parser generation takes more than 3-4 mins, just
> stop the generation. The last section, which resulted in the increase most
> probably contains the problematic code. Bear with me, if this approach looks
> very awkward.
>
> Thanks,
> Gokul.
>   


From Heiko.Folkerts at david-bs.de  Fri Jan  8 05:32:56 2010
From: Heiko.Folkerts at david-bs.de (Heiko Folkerts)
Date: Fri, 8 Jan 2010 14:32:56 +0100
Subject: [antlr-interest] Doxygen errors when using the C Target with ANTLR
Message-ID: <93FCBF72DCE7634481C5DF1654D8FF13035A819E@DC2>

Hi all,
When I run doxygen over the code generated by ANTLR using the C target, I get the following error message in doxygen.log:

C:/Projekte/modelisar/trunk/src/TFSSParser/grammar/TFSSBaseParser.h:303: Warning: argument 'POinter' of command @param is not found in the argument list of TFSSBaseParser_tfs_SCOPE_struct::void(ANTLR3_CDECL *free)
C:/Projekte/modelisar/trunk/src/TFSSParser/grammar/TFSSBaseParser.h:303: Warning: The following parameters of TFSSBaseParser_tfs_SCOPE_struct::void(ANTLR3_CDECL *free) are not documented:
  parameter 'free'

I found a fitting place in c.stg and tried to fix the template to solve the problem, but the erroneous code still appears in the generated code. What have I done wrong? Is there any way to fix it?
The code in the mentioned file TFSSBase.h is:
/** Function that the user may provide to be called when the
     *  scope is destroyed (so you can free pANTLR3_HASH_TABLES and so on)
     *
     * \param POinter to an instance of this typedef/struct
     */

Thx
Heiko


 
From fridi70 at gmx.de  Fri Jan  8 07:14:35 2010
From: fridi70 at gmx.de (fridi)
Date: Fri, 08 Jan 2010 16:14:35 +0100
Subject: [antlr-interest] Match anything until a specific phrase
Message-ID: <4B474BDB.4030105@gmx.de>

Hello all,
maybe someone can help me to get this done with ANTLR 3.2

My file has a header starting with 'test', some comments and then 
several blocks named 'Page 1', 'Page 2' etc. with integers, e.g.

test This is a comment and    
        we are not interested in.        
            Today is friday.

Page 1:
    123
    456
    789


I want to have a rule that consumes everything in the header up to the
word 'Page'.
'Page' should not be consumed by the header; it should be consumed by another rule.

So I tried the following:

grammar TestNot;

options {
   language = Java;
}

rule :
   file;

file :
   header PAGE INT ':' INT+ EOF;

header :
   'test' ~PAGE;

PAGE :
   'Page';

INT :
   DIGIT+;

fragment
DIGIT :
   '0'..'9';


Any idea? Thanks in advance.


From steel at kryas.com  Fri Jan  8 07:35:49 2010
From: steel at kryas.com (Stanley Steel)
Date: Fri, 08 Jan 2010 08:35:49 -0700
Subject: [antlr-interest] Binary Message Parsing
Message-ID: <4B4750D5.8050700@kryas.com>

Is ANTLR suitable to build a binary message parser?

From KLPauba at west.com  Fri Jan  8 07:50:05 2010
From: KLPauba at west.com (Pauba, Kevin L)
Date: Fri, 8 Jan 2010 09:50:05 -0600
Subject: [antlr-interest] Can ST be used to generate binary output?
In-Reply-To: <4B4750D5.8050700@kryas.com>
References: <4B4750D5.8050700@kryas.com>
Message-ID: <226316B3E1F749498E28ACA66321D5BA01315F2CDC@oma00cexmbx03.corp.westworlds.com>

I would like to use my ANTLR-based DSL compiler to generate pretty-printed source, documentation (similar to javadocs) and bytecode output using StringTemplate.  The interpreter for this DSL needs bytecode in binary form.

Is ST able to generate binary output (I know it can do the gruntwork for pretty-printing and documentation)?  If so, might you have some pointers on how to do it?

Thanks!

From jimi at temporal-wave.com  Fri Jan  8 10:05:25 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Fri, 08 Jan 2010 10:05:25 -0800
Subject: [antlr-interest] Doxygen errors when using the C Target with
	ANTLR
In-Reply-To: <93FCBF72DCE7634481C5DF1654D8FF13035A819E@DC2>
Message-ID: <ef5a7836596ca041b21d6e19bbe9ce99@temporal-wave.com>

Well, to be honest, I started adding doxygen comments to the generated code, but after looking at what you get from it I decided that it wasn't really of much help. In the next version of ANTLR, doc comments on rules will be passed through to code generation, and that should help.

To change the template you need to either rebuild ANTLR or set your class path up so that it finds your version of C.stg before mine. I suspect though that what you will get is not really that useful. Better to document the grammar than the generated code.

Jim



From jimi at temporal-wave.com  Fri Jan  8 10:09:09 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Fri, 08 Jan 2010 10:09:09 -0800
Subject: [antlr-interest] Match anything until a specific phrase
In-Reply-To: <4B474BDB.4030105@gmx.de>
Message-ID: <880718c24120444aa9e21ed91a6b5f01@temporal-wave.com>

Why don't you just remove the header before sending it to the lexer? Or write a function/method to do input.consume() until you find 'P' then check for 'Page', stop consuming if found, carry on consuming if not. Trigger the method as appropriate in action code for tokens or at lexer start up.
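
The consume-until-'Page' idea can be sketched on a plain string as follows. This is an illustration in Python only: skip_header is a hypothetical helper that works on a str, whereas in a real lexer the equivalent loop would run against the lexer's character stream.

```python
def skip_header(text, marker="Page"):
    """Return the index where `marker` starts, or len(text) if it never occurs."""
    pos = 0
    while pos < len(text):
        # Only do the full comparison when the first character matches.
        if text[pos] == marker[0] and text.startswith(marker, pos):
            return pos
        pos += 1  # "consume" one character of the header
    return pos


sample = "test This is a comment and\n    Today is friday.\n\nPage 1:\n    123\n"
start = skip_header(sample)
print(repr(sample[start:]))
```

Everything before the returned index is header text that can be discarded; the marker itself is left in place for the normal rules to match.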

I would remove the 'literals' from your parser and make real lexer rules. Remember that the lexer runs, then the parser runs; you cannot direct the lexer from the parser.

Jim



From jimi at temporal-wave.com  Fri Jan  8 12:30:55 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Fri, 08 Jan 2010 12:30:55 -0800
Subject: [antlr-interest] Can ST be used to generate binary output?
In-Reply-To: <226316B3E1F749498E28ACA66321D5BA01315F2CDC@oma00cexmbx03.corp.westworlds.com>
Message-ID: <f96d276c563e574ba5d8b409f1f0d9af@temporal-wave.com>

It is usually better to produce an intermediate assembler representation of your bytecode, then have a parser that can assemble that into bytecode. You will, for instance, need to resolve the targets of 'jmp' and things like that, and having an assembly language listing of the bytecode lets you be more productive when debugging and so on. Such assembly/intermediate languages are also good for optimizing phases.
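
As a rough illustration of why the assembly step helps with 'jmp' targets, here is a minimal two-pass assembler sketch in Python. The three-instruction set, the opcode values, and the one-byte operands are all invented for this example and do not correspond to any real bytecode format.

```python
# Hypothetical instruction set: one opcode byte, optional one-byte operand.
OPCODES = {"push": 0x01, "jmp": 0x02, "halt": 0x03}


def assemble(lines):
    """Two-pass assembly: collect label addresses, then emit bytes."""
    labels, address, parsed = {}, 0, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = address  # label marks the next instruction
            continue
        parts = line.split()
        parsed.append(parts)
        address += len(parts)  # one byte per opcode and per operand
    out = bytearray()
    for op, *args in parsed:
        out.append(OPCODES[op])
        for arg in args:
            # An operand is either a known label or an integer literal.
            out.append(labels[arg] if arg in labels else int(arg))
    return bytes(out)


program = ["start:", "push 7", "jmp end", "push 9", "end:", "halt"]
print(assemble(program).hex())
```

Because all labels are collected in the first pass, forward references such as "jmp end" resolve without any back-patching.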

When I need multiple different outputs like this, I create an abstract class with common functionality for code generation and derive a generator for each target from it. I have one code generator create StringTemplates where that is what is needed, and another code generator that produces the bytecode. The tree walker then calls the code generation methods rather than creating templates directly in the tree grammar. You can then do multiple walks with different code generators.
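
A minimal sketch of that arrangement, written in Python purely for illustration: the class names, the two emit methods, and the opcode values are all assumptions, and the text generator merely stands in for one that would build StringTemplates.

```python
from abc import ABC, abstractmethod


class CodeGenerator(ABC):
    """Common interface the tree walker calls; subclasses pick the output form."""

    def __init__(self):
        self.output = []

    @abstractmethod
    def emit_push(self, value):
        ...

    @abstractmethod
    def emit_add(self):
        ...


class TextGenerator(CodeGenerator):
    """Stands in for a template-based generator producing readable text."""

    def emit_push(self, value):
        self.output.append("push %d" % value)

    def emit_add(self):
        self.output.append("add")


class ByteGenerator(CodeGenerator):
    """Produces hypothetical opcodes (0x01 = push, 0x04 = add)."""

    def emit_push(self, value):
        self.output.extend([0x01, value])

    def emit_add(self):
        self.output.append(0x04)


def walk(tree, gen):
    """Post-order walk of ('+', left, right) tuples and integer leaves."""
    if isinstance(tree, int):
        gen.emit_push(tree)
    else:
        _, left, right = tree
        walk(left, gen)
        walk(right, gen)
        gen.emit_add()
    return gen.output


print(walk(("+", 1, ("+", 2, 3)), TextGenerator()))
print(walk(("+", 1, ("+", 2, 3)), ByteGenerator()))
```

The walker never needs to know which output it is producing, so the same tree can be walked once per generator.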

If you are trying to generate Java byte code then write a code generator that interfaces to the ASM package: http://asm.ow2.org/ which is very good. There is also LLVM of course.

Finally I think that you would benefit greatly from reading the new book:

http://pragprog.com/titles/tpdsl/language-implementation-patterns

which will guide you through some worked examples of all of this stuff.

Jim



From KLPauba at west.com  Fri Jan  8 13:53:01 2010
From: KLPauba at west.com (Pauba, Kevin L)
Date: Fri, 8 Jan 2010 15:53:01 -0600
Subject: [antlr-interest] Can ST be used to generate binary output?
In-Reply-To: <f96d276c563e574ba5d8b409f1f0d9af@temporal-wave.com>
References: <226316B3E1F749498E28ACA66321D5BA01315F2CDC@oma00cexmbx03.corp.westworlds.com>
	<f96d276c563e574ba5d8b409f1f0d9af@temporal-wave.com>
Message-ID: <226316B3E1F749498E28ACA66321D5BA01315F2EE5@oma00cexmbx03.corp.westworlds.com>

Thanks (again!) Jim.

I'll take that under advisement.  I've had the LIPs ebook for some time now (just received the hardcopy this week) but haven't dug too deep in it.  I'll do that now.




From ttmrichter at gmail.com  Fri Jan  8 20:32:05 2010
From: ttmrichter at gmail.com (Michael Richter)
Date: Sat, 9 Jan 2010 12:32:05 +0800
Subject: [antlr-interest] Question about idiom.
Message-ID: <ee970b291001082032t7469cc6em9419e23cb39efe47@mail.gmail.com>

I keep coming across a pattern in a grammar I'm working on.  This pattern
looks something like this:

   - A production can be *A*.
   - A production can be *B*.
   - A production can be *A B.*

In the grammar I'm transcribing this from, the notation used is *(A & B)*.
Is there some convenient way to code that in ANTLR's EBNF notation?  I keep
having to do *(A | B | A B)*.  That isn't all that onerous as-is, I
admit, but imagine if A is five tokens long and B is also five tokens long
and then imagine this kind of pattern happening about twenty times in the
grammar.  Is there a way to concisely do this?

From parrt at cs.usfca.edu  Fri Jan  8 21:36:40 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Fri, 8 Jan 2010 21:36:40 -0800
Subject: [antlr-interest] Parser generation takes hours
In-Reply-To: <4B47116E.9000407@ateji.com>
References: <4B43521A.6000501@worldonline.fr>
	<9362e74e1001052242s192e7ae7u4beef375108e297d@mail.gmail.com>
	<4B47116E.9000407@ateji.com>
Message-ID: <7D9F6A30-7A5C-4EB6-B77A-89BDE539F8B1@cs.usfca.edu>

ANTLRWorks can do some left-factoring automatically.
Ter

On Jan 8, 2010, at 3:05 AM, Denis Debarbieux wrote:

> Hi everybody,
>> One of the toughest problems in the migration for me was to resolve the
>> left factoring.
>
> I am surprised by this discussion.
>
> I thought that there are algorithms that automatically remove left
> recursion and do left factoring. Did I learn those algorithms at
> school, but they are never used in real problems? Why does ANTLR not
> use them?
>
> Regards
>
> Denis
>
> Gokulakannan Somasundaram wrote:
>> Hi Jean,
>>         I faced a similar issue when I tried the migration of an
>> LR parser. But it's definitely because of recursion stuff. The way I
>> removed it is sort of layman stuff, but I thought of just informing you.
>>         Try to split the grammar into multiple sections (groups of rules) and
>> try to add them one by one. You don't need to wait till the errors are
>> emitted. As soon as the parser generation takes more than 3-4 mins, just
>> stop the generation. The last section, which resulted in the increase, most
>> probably contains the problematic code. Bear with me if this approach looks
>> very awkward.
>>
>> Thanks,
>> Gokul.
>>


From fridi70 at gmx.de  Sat Jan  9 02:04:05 2010
From: fridi70 at gmx.de (fridi)
Date: Sat, 09 Jan 2010 11:04:05 +0100
Subject: [antlr-interest] Match anything until a specific phrase
In-Reply-To: <880718c24120444aa9e21ed91a6b5f01@temporal-wave.com>
References: <880718c24120444aa9e21ed91a6b5f01@temporal-wave.com>
Message-ID: <4B485495.3050803@gmx.de>


Jim Idle wrote:
> Why don't you just remove the header before sending it to the lexer? 

Yes, that is a good idea, too. I thought it should be possible to get 
this done with ANTLR.

> Or write a function/method to do input.consume() until you find 'P' then check for 'Page', stop consuming if found, carry on consuming if not. Trigger the method as appropriate in action code for tokens or at lexer start up.

Do you have any simple example or hint how to do that?

> I would remove the 'literals' from your parser and make real lexer rules.

Yes, that is what I have done in my real grammar, this one here was just 
an example.
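
On the header question, a minimal sketch of the idea in plain Java, assuming the whole file is available as a String. The class and method names (HeaderSkipper, skipUntil) are made up for this example and the marker is assumed to be non-empty. It shows the "look for 'P', then check for 'Page'" loop without any ANTLR classes; inside an ANTLR 3 lexer the same loop would presumably be written against the character stream using input.LA(i) and input.consume().

```java
public class HeaderSkipper {
    /**
     * Returns the index of the first occurrence of marker in text,
     * or text.length() if the marker never appears.
     * The marker is assumed to be non-empty.
     */
    public static int skipUntil(String text, String marker) {
        int pos = 0;
        while (pos < text.length()) {
            // Cheap check on the first character, then confirm the whole word.
            if (text.charAt(pos) == marker.charAt(0) && text.startsWith(marker, pos)) {
                return pos; // stop here so the marker is left for the next rule
            }
            pos++; // "consume" one more header character
        }
        return pos;
    }

    public static void main(String[] args) {
        String file = "test This is a comment and\n   we are not interested in.\n\nPage 1:\n    123\n";
        // Hand only the part starting at 'Page' to the lexer.
        System.out.print(file.substring(skipUntil(file, "Page")));
    }
}
```

Stripping the header this way before the lexer sees the input keeps the grammar itself free of "match anything" rules.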

Thanks a lot - Fridi

>  Remember that the lexer runs, then the parser runs, you cannot direct the lexer from the parser.
>
> Jim
>
>   
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>> bounces at antlr.org] On Behalf Of fridi
>> Sent: Friday, January 08, 2010 7:15 AM
>> To: antlr-interest at antlr.org
>> Subject: [antlr-interest] Match anything until a specific phrase
>>
>> Hello all,
>> maybe someone can help me to get this done with ANTLR 3.2
>>
>> My file has a header starting with 'test', some comments and then
>> several blocks named 'Page 1',  'Page 2' etc. with integers, i.e.
>>
>> test This is a comment and
>>         we are not interested in.
>>             Today is friday.
>>
>> Page 1:
>>     123
>>     456
>>     789
>>
>>
>> I want to have a rule that consumes everything in the header up to the
>> word 'Page'.
>> 'Page' should not be consumed by the header; it should be consumed by
>> another rule.
>>
>> So I tried the following:
>>
>> grammar TestNot;
>>
>> options {
>>    language = Java;
>> }
>>
>> rule :
>>    file;
>>
>> file :
>>    header PAGE INT ':' INT+ EOF;
>>
>> header :
>>    'test' ~PAGE;
>>
>> PAGE :
>>    'Page';
>>
>> INT :
>>    DIGIT+;
>>
>> fragment
>> DIGIT :
>>    '0'..'9';
>>
>>
>> Any idea? Thanks in advance.
>>
>>


From kroepke at classdump.org  Sat Jan  9 06:41:33 2010
From: kroepke at classdump.org (Kay Röpke)
Date: Sat, 9 Jan 2010 15:41:33 +0100
Subject: [antlr-interest] Question about idiom.
In-Reply-To: <ee970b291001082032t7469cc6em9419e23cb39efe47@mail.gmail.com>
References: <ee970b291001082032t7469cc6em9419e23cb39efe47@mail.gmail.com>
Message-ID: <53C75D1D-B191-42E5-BF37-7E5E50BA35D9@classdump.org>


On Jan 9, 2010, at 5:32 AM, Michael Richter wrote:

> I keep coming across a pattern in a grammar I'm working on.  This pattern
> looks something like this:
> 
>   - A production can be *A*.
>   - A production can be *B*.
>   - A production can be *A B.*
> 
> In the grammar I'm transcribing this from, the notation used is *(A & B)*.
> Is there some convenient way to code that in ANTLR's EBNF notation?  I keep
> having to do *(A | B | A B)*.  As is that isn't all that onerous as-is, I
> admit, but imagine if A is five tokens long and B is also five tokens long
> and then imagine this kind of pattern happening about twenty times in the
> grammar.  Is there a way to concisely do this?

What is the restriction on the parts of the production?
I.e. what differentiates a valid production from an invalid one?

I'll take a wild guess, maybe I'm right ;)
Given the tokens A, B, C, D, I suspect that the allowed combination is any permutation of these tokens,
i.e. A B C D, C B A, D, A, B etc. are all valid inputs?

Then the question is, how do you a) make it easy to write in the grammar and b) still ensure no repeated element in the production.
One way to do it is to use semantic predicates (turning off or validating parts of the grammar depending on semantic information).
Depending on whether you want the FailedPredicateException or not, you would use a gated sempred ( {}?=> ) or a non-gated one ( {}? ).
Gated sempreds "turn off" parts of the grammar, while regular validating predicates do not.

Disclaimer: written in mail, assuming Java target, not enough coffee yadda yadda:

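primaryOne
@init {
// Texts of the tokens already matched by this invocation of the rule.
Set<String> seenToken = new HashSet<String>();
}
	:
	(	// Validating predicate: fails if the upcoming token was already seen.
		{ !seenToken.contains(input.LT(1).getText()) }? prim=primaryOneToken
		{ seenToken.add($prim.start.getText()); }
	)+
	;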

primaryOneToken
	:	'A'
	|	'B'
	|	'C'
	|	'D'
	;

expr	:	primaryOne '&' primaryOne 'A' /*  the 'A' is just to demonstrate that ANTLR will carry on matching input correctly */
	;

That should allow lists of non-repeated A, B, C, D in any order. Maybe there is a more clever way of writing that, but it eludes me right now.

Try it in ANTLRWorks on input like:
A B C & A A
and see what it matches where and what changes if you change the sempred to a gated one.

cheers,
-k

From pureza at gmail.com  Sat Jan  9 09:03:33 2010
From: pureza at gmail.com (Luis Pureza)
Date: Sat, 9 Jan 2010 17:03:33 +0000
Subject: [antlr-interest] Unexpected behavior while using += in a tree
	grammar
In-Reply-To: <3e1533501001090858le8e6d05m43327c6be60ec561@mail.gmail.com>
References: <3e1533501001090858le8e6d05m43327c6be60ec561@mail.gmail.com>
Message-ID: <3e1533501001090903g70140323jfb08fed7984ab76d@mail.gmail.com>

Hi,

I started using ANTLR a few days ago, so let me begin by thanking
everyone who contributed to creating this fantastic project.

Unfortunately, I think I ran into a bug and I'm hoping you might help me.

I'm using a tree grammar where I have the following rule:

expr returns [Expr value]
    | ID                            { $value = new Var($ID.text); }
    | ^(APP fn=expr (args+=expr)+)  { $value = new App($fn.value, $args); }
    ...

Surprisingly, $args is a list of CommonTrees, and not a list of Expr
as I was hoping it would be. Is this a bug or a feature? If it's the
latter, is there any way to "convert" the tree into an Expr?

For now, I'm collecting args manually, with the following workaround:

expr returns [Expr value]
@init {
    List<Expr> ops = new ArrayList<Expr>();
}
    | ^(APP fn=expr (op=expr { ops.add($op.value); })+) { ... }
    | ID                                                { ... }

Thanks!

Luís Pureza

From antonio.petrelli at gmail.com  Sat Jan  9 11:38:33 2010
From: antonio.petrelli at gmail.com (Antonio Petrelli)
Date: Sat, 9 Jan 2010 20:38:33 +0100
Subject: [antlr-interest] Problems with Maven plugin
Message-ID: <aae96ca1001091138x2bc992f5r35ec3057a89694cb@mail.gmail.com>

Hi all
Sorry for being an ANTLR newbie. I would like to use the Maven plugin.
When I try to generate (through mvn compile) sources of the Java.g 1.6
parser, the plugin gives me an error (full log below in the mail):

error(7):  cannot find or open file: null/Java.g

However, if I copy the same file under the "null" directory, it
generates the code!

I am using Maven 2.2.1 under Linux Kubuntu 9.10 amd64, OpenJDK 1.6 b16

You can check it live at this address:
http://svn.eu.apache.org/repos/asf/tiles/sandbox/trunk/tiles-autotag/tiles-autotag-core/

Thanks in advance
Antonio Petrelli

-----------------
Full Log

mvn clean compile -e
+ Error stacktraces are turned on.
[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Building Autotag - Core
[INFO]    task-segment: [clean, compile]
[INFO] ------------------------------------------------------------------------
[INFO] [clean:clean {execution: default-clean}]
[INFO] Deleting directory
/home/antonio/javadev/workspace-sandbox/tiles-autotag/tiles-autotag-core/target
[INFO] [antlr3:antlr {execution: default}]
[INFO] ANTLR: Processing source directory
/home/antonio/javadev/workspace-sandbox/tiles-autotag/tiles-autotag-core/src/main/antlr3
ANTLR Parser Generator  Version 3.2 Sep 23, 2009 14:05:07
error(7):  cannot find or open file: null/Java.g
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] ANTLR caught 1 build errors.
[INFO] ------------------------------------------------------------------------
[INFO] Trace
org.apache.maven.lifecycle.LifecycleExecutionException: ANTLR caught 1
build errors.
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalWithLifecycle(DefaultLifecycleExecutor.java:556)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:535)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:387)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:348)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:180)
        at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:328)
        at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:138)
        at org.apache.maven.cli.MavenCli.main(MavenCli.java:362)
        at org.apache.maven.cli.compat.CompatibleMain.main(CompatibleMain.java:60)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
        at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
        at org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
        at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
Caused by: org.apache.maven.plugin.MojoExecutionException: ANTLR
caught 1 build errors.
        at org.antlr.mojo.antlr3.Antlr3Mojo.execute(Antlr3Mojo.java:397)
        at org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:490)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:694)
        ... 17 more
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1 second
[INFO] Finished at: Sat Jan 09 20:36:42 CET 2010
[INFO] Final Memory: 9M/67M
[INFO] ------------------------------------------------------------------------

-------------

Maven configuration:

<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <artifactId>tiles-autotag</artifactId>
        <groupId>org.apache.tiles</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <groupId>org.apache.tiles</groupId>
    <artifactId>tiles-autotag-core</artifactId>
    <version>1.0-SNAPSHOT</version>
    <name>Autotag - Core</name>
    <description>Core classes for Autotag.</description>
    <build>
        <plugins>
            <plugin>
                <groupId>org.antlr</groupId>
                <artifactId>antlr3-maven-plugin</artifactId>
                <version>3.2</version>
                <configuration>
                    <includes>
                        <include>Java.g</include>
                    </includes>
                </configuration>
                <executions>
                    <execution>
                        <goals>
                            <goal>antlr</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
    <dependencies>
        <dependency>
            <groupId>org.antlr</groupId>
            <artifactId>antlr-runtime</artifactId>
            <version>3.2</version>
            <type>jar</type>
            <scope>compile</scope>
        </dependency>
    </dependencies>
</project>

From antonio.petrelli at gmail.com  Sat Jan  9 11:50:00 2010
From: antonio.petrelli at gmail.com (Antonio Petrelli)
Date: Sat, 9 Jan 2010 20:50:00 +0100
Subject: [antlr-interest] Problems with Maven plugin
In-Reply-To: <aae96ca1001091138x2bc992f5r35ec3057a89694cb@mail.gmail.com>
References: <aae96ca1001091138x2bc992f5r35ec3057a89694cb@mail.gmail.com>
Message-ID: <aae96ca1001091150u6f10de30g7fcf8db5888f04f8@mail.gmail.com>

I forgot to say that I am using the 3.2 version of the plugin and of ANTLR.

Moreover if I move the Java.g file under another subdirectory (say
'foo'), it is created under the 'foo' directory but the "package"
instruction is not included in the Java code.

Thanks
Antonio


From michael.guyver at gmail.com  Sat Jan  9 15:11:04 2010
From: michael.guyver at gmail.com (Michael Guyver)
Date: Sat, 9 Jan 2010 23:11:04 +0000
Subject: [antlr-interest] antlr3-maven-plugin (v3.2): "error(7): cannot find
	or open file: null/MyGrammar.g"
In-Reply-To: <c8ce53501001091507v64e3a6dbs6edb0550b40168c0@mail.gmail.com>
References: <c8ce53501001091507v64e3a6dbs6edb0550b40168c0@mail.gmail.com>
Message-ID: <c8ce53501001091511w70e6000fkaef5bc160f4de65f@mail.gmail.com>

Hi there,

There's a bug in the Antlr3Mojo class that shows up when the grammar
files are stored in the src/main/antlr3 root (for example
src/main/antlr3/MyGrammar.g). Despite scanning and finding the grammar
file (and reporting its location nicely), it results in a 'null' value
being passed back from findSourceSubdir(File,String) such that the
following error occurs:

error(7):  cannot find or open file: null/MyGrammar.g

and results in the following exception trace:

Caused by: org.apache.maven.plugin.MojoExecutionException: ANTLR
caught 2 build errors.
        at org.antlr.mojo.antlr3.Antlr3Mojo.execute(Antlr3Mojo.java:397)
        at org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:451)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:558)
        ... 16 more

I had formerly been using the codehaus 1.0 release and been setting
the output directory to

target/generated-sources/antlr/my/full/package/path/

so that the generated files arrived in the right place. Happily the new
plugin does this for you so simply moving the grammar to

src/main/antlr3/my/full/package/path/MyGrammar.g

solved the problem and meant I didn't have to specify the output
directory either \:D/

Hope this helps any other people perplexed by the issue and that it
might result in a fix (not that I'm dependent on it any longer ;)

Best wishes,

Michael

From ttmrichter at gmail.com  Sat Jan  9 18:04:06 2010
From: ttmrichter at gmail.com (Michael Richter)
Date: Sun, 10 Jan 2010 10:04:06 +0800
Subject: [antlr-interest] Question about idiom.
In-Reply-To: <53C75D1D-B191-42E5-BF37-7E5E50BA35D9@classdump.org>
References: <ee970b291001082032t7469cc6em9419e23cb39efe47@mail.gmail.com>
	<53C75D1D-B191-42E5-BF37-7E5E50BA35D9@classdump.org>
Message-ID: <ee970b291001091804x2505b332pecce74f16c27fcbf@mail.gmail.com>

2010/1/9 Kay Röpke <kroepke at classdump.org>

>
> On Jan 9, 2010, at 5:32 AM, Michael Richter wrote:
>
> > I keep coming across a pattern in a grammar I'm working on.  This pattern
> > looks something like this:
> >
> >   - A production can be *A*.
> >   - A production can be *B*.
> >   - A production can be *A B.*
> >
> > In the grammar I'm transcribing this from, the notation used is *(A & B)*.
> > Is there some convenient way to code that in ANTLR's EBNF notation?  I keep
> > having to do *(A | B | A B)*.  As is that isn't all that onerous as-is, I
> > admit, but imagine if A is five tokens long and B is also five tokens long
> > and then imagine this kind of pattern happening about twenty times in the
> > grammar.  Is there a way to concisely do this?
>
> What is the restriction on the parts of the production?
> I.e. what differentiates a valid production from an invalid one?
>

The restriction is exactly as I put it: You can have A (where A is a
multi-token set of specified order), B (where B is a multi-token set of
specified order) or A B.  It *must* be in the order provided and A and B are
fixed token sets.

Think of it this way: you're declaring a variable.  You have a token for the
variable, then an optional type specification (A -- multiple tokens) and an
optional initializer (B -- multiple tokens).  Both parts are optional, but
you *must* have at least one and the declarations *must* be in the order of
type then initializer if both are present.  The only way I've found to do it
is (A | B | A B), but this is painful when A and B are more than one token
in length and I've got about 20 of these things in the grammar.  This is
just begging for typos.

From jbb at acm.org  Sat Jan  9 18:40:04 2010
From: jbb at acm.org (John B. Brodie)
Date: Sat, 09 Jan 2010 21:40:04 -0500
Subject: [antlr-interest] Question about idiom.
In-Reply-To: <ee970b291001091804x2505b332pecce74f16c27fcbf@mail.gmail.com>
References: <ee970b291001082032t7469cc6em9419e23cb39efe47@mail.gmail.com>
	<53C75D1D-B191-42E5-BF37-7E5E50BA35D9@classdump.org>
	<ee970b291001091804x2505b332pecce74f16c27fcbf@mail.gmail.com>
Message-ID: <1263091204.3473.20.camel@gecko.home.org>

Greetings!

On Sun, 2010-01-10 at 10:04 +0800, Michael Richter wrote:
> 2010/1/9 Kay Röpke <kroepke at classdump.org>
> 
> >
> > On Jan 9, 2010, at 5:32 AM, Michael Richter wrote:
> >
> > > I keep coming across a pattern in a grammar I'm working on.  This pattern
> > > looks something like this:
> > >
> > >   - A production can be *A*.
> > >   - A production can be *B*.
> > >   - A production can be *A B.*
> > >
> > > In the grammar I'm transcribing this from, the notation used is *(A & B)*.
> > > Is there some convenient way to code that in ANTLR's EBNF notation?  I keep
> > > having to do *(A | B | A B)*.  As is that isn't all that onerous as-is, I
> > > admit, but imagine if A is five tokens long and B is also five tokens long
> > > and then imagine this kind of pattern happening about twenty times in the
> > > grammar.  Is there a way to concisely do this?
> >
> > What is the restriction on the parts of the production?
> > I.e. what differentiates a valid production from an invalid one?
> >
> 
> The restriction is exactly as I put it: You can have A (where A is a
> multi-token set of specified order), B (where B is a multi-token set of
> specified order) or A B.  It *must* be in the order provided and A and B are
> fixed token sets.
> 

1) make a parser rule to recognize the sequence of Tokens (and/or other
parser rules) comprising A; and call it, say, as: recognize_A.

2) make a parser rule to recognize the sequence of Tokens (and/or other
parser rules) comprising B; and call it, say, as: recognize_B.

3) make a parser rule of the form:

an_A_or_B_or_AB : recognize_A ( recognize_B )? | recognize_B ;

observe the proper left-factoring in the above...

4) use the above parser rule `an_A_or_B_or_AB` from 3) everywhere you
have the (A|B|A B) stuff.

note that if A and B share a common prefix (e.g. a common left-factor)
you will probably experience issues with the above 4 steps.
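
To make the shape of the rule in 3) concrete, here is a hand-written Java sketch of what `recognize_A ( recognize_B )? | recognize_B` decides. It is only an illustration, not what ANTLR generates. The class name TypeOrInit is made up, and I am assuming, purely for the example, that A is a two-token type specification starting with ':' and B is a two-token initializer starting with '='. Because A and B start with different tokens, one token of lookahead is enough to pick the alternative.

```java
import java.util.Arrays;
import java.util.List;

public class TypeOrInit {
    private final List<String> tokens;
    private int pos = 0;

    public TypeOrInit(List<String> tokens) {
        this.tokens = tokens;
    }

    private boolean lookahead(String text) {
        return pos < tokens.size() && tokens.get(pos).equals(text);
    }

    // Matches a two-token sequence: the given first token plus any one operand.
    private boolean recognizePair(String first) {
        if (!lookahead(first) || pos + 1 >= tokens.size()) {
            return false;
        }
        pos += 2;
        return true;
    }

    // recognize_A : ':' typeName      (hypothetical)
    private boolean recognizeA() { return recognizePair(":"); }

    // recognize_B : '=' initialValue  (hypothetical)
    private boolean recognizeB() { return recognizePair("="); }

    // an_A_or_B_or_AB : recognize_A ( recognize_B )? | recognize_B ;
    public boolean parse() {
        boolean matched;
        if (lookahead(":")) {
            matched = recognizeA();
            if (matched && lookahead("=")) {
                matched = recognizeB(); // the optional B after A
            }
        } else {
            matched = recognizeB();     // B on its own
        }
        return matched && pos == tokens.size();
    }

    public static boolean accepts(String... tokens) {
        return new TypeOrInit(Arrays.asList(tokens)).parse();
    }
}
```

With these assumptions, ": int", "= 3" and ": int = 3" are accepted, while "= 3 : int" and the empty input are rejected.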

> Think of it this way: you're declaring a variable.  You have a token for the
> variable, then an optional type specification (A -- multiple tokens) and an
> optional initializer (B -- multiple tokens).  Both parts are optional, but
> you *must* have at least one and the declarations *must* be in the order of
> type then initializer if both are present.  The only way I've found to do it
> is (A | B | A B), but this is painful when A and B are more than one token
> in length and I've got about 20 of these things in the grammar.  This is
> just begging for typos.

this example REALLY FAILS for me. It is hard for me to envision a
language that can initialize a variable (e.g. B) without any declaration
of that variable (e.g. A). So having a bare naked B under the above
example makes no sense to me. Maybe you meant something like: (A B? C?)
where A is the var decl, B is its type and C is its initial value...


Hope this helps....
   -jbb


From ttmrichter at gmail.com  Sat Jan  9 23:17:50 2010
From: ttmrichter at gmail.com (Michael Richter)
Date: Sun, 10 Jan 2010 15:17:50 +0800
Subject: [antlr-interest] Question about idiom.
In-Reply-To: <1263091204.3473.20.camel@gecko.home.org>
References: <ee970b291001082032t7469cc6em9419e23cb39efe47@mail.gmail.com>
	<53C75D1D-B191-42E5-BF37-7E5E50BA35D9@classdump.org>
	<ee970b291001091804x2505b332pecce74f16c27fcbf@mail.gmail.com>
	<1263091204.3473.20.camel@gecko.home.org>
Message-ID: <ee970b291001092317p491ec96ene8e6f280c6892a50@mail.gmail.com>

2010/1/10 John B. Brodie <jbb at acm.org>

> > Think of it this way: you're declaring a variable.  You have a token for the
> > variable, then an optional type specification (A -- multiple tokens) and an
> > optional initializer (B -- multiple tokens).  Both parts are optional, but
> > you *must* have at least one and the declarations *must* be in the order of
> > type then initializer if both are present.  The only way I've found to do it
> > is (A | B | A B), but this is painful when A and B are more than one token
> > in length and I've got about 20 of these things in the grammar.  This is
> > just begging for typos.
>


> this example REALLY FAILS for me. It is hard for me to envision a
> language that can initialize a variable (e.g. B) without any declaration
> of that variable (e.g. A). So having a bare naked B under the above
> example makes no sense to me. Maybe you meant something like: (A B? C?)
> where A is the var decl, B is its type and C is its initial value...
>

That's what I said.  A token for the variable THEN an optional type
specification (A) and an optional initializer (B).  Three elements in total
with only two of them named.

I'll look over your other possible solutions there.  Having (A B? | B) looks
good enough especially since there's no left-commonality with A and B in ...
I think in any case, actually.

From christian.schladetsch at gmail.com  Sun Jan 10 01:41:09 2010
From: christian.schladetsch at gmail.com (Christian Schladetsch)
Date: Sun, 10 Jan 2010 20:41:09 +1100
Subject: [antlr-interest] New Example Project: HLSL Parser + Tree walker
	using C# Target
Message-ID: <6442c4ae1001100141v2f069f9fk67f2e2e33313eef4@mail.gmail.com>

Hi All,

I just spent a few hours tidying up an FX Parser and adding it to my
GoogleCode depot.

It uses ANTLR 3 to parse HLSL files to an AST, then a Tree Walker and
StringTemplate to write out the HLSL again. The target language is C#.

I think it's all there now, including all dependencies and custom build rules
for VS 2008.

To try it, you will need to check out my repository
<http://code.google.com/p/schladetsch/source/checkout>.
For example:

svn checkout http://schladetsch.googlecode.com/svn/trunk/ .
start Effects\Tools\FXParser\FXParser.sln

Regards,
Christian.

From ttmrichter at gmail.com  Sun Jan 10 02:00:59 2010
From: ttmrichter at gmail.com (Michael Richter)
Date: Sun, 10 Jan 2010 18:00:59 +0800
Subject: [antlr-interest] What is going on here?
Message-ID: <ee970b291001100200j326b1deex698693d0f5837e82@mail.gmail.com>

Here's a snippet from a grammar I'm working on that's just failing in the
most bizarre ways.

grammar junk;

tokens {
    INTERFACE = 'INTERFACE';
    IDENT = 'IDENT';
    END = 'END';
}

interface
    :   INTERFACE IDENT import* declaration* END '.'
    ;

import : ;

When I run antlr on it I get the following output:

error(100): junk.g:10:9: syntax error: antlr: junk.g:10:9: unexpected token: INTERFACE
error(100): junk.g:10:19: syntax error: antlr: junk.g:10:19: unexpected token: IDENT
error(100): junk.g:10:50: syntax error: antlr: junk.g:10:50: unexpected token: '.'
error(100): junk.g:13:1: syntax error: antlr: junk.g:13:1: unexpected token: import
error(150):  grammar file junk.g has no rules
error(100): junk.g:0:0: syntax error: assign.types: <AST>:0:0: unexpected end of subtree
error(100): junk.g:0:0: syntax error: define: <AST>:0:0: unexpected end of subtree
error(10):  internal error: junk.g : java.lang.NullPointerException
org.antlr.grammar.v2.DefineGrammarItemsWalker.trimGrammar(DefineGrammarItemsWalker.java:94)
org.antlr.grammar.v2.DefineGrammarItemsWalker.finish(DefineGrammarItemsWalker.java:77)
org.antlr.grammar.v2.DefineGrammarItemsWalker.grammar(DefineGrammarItemsWalker.java:206)
org.antlr.tool.Grammar.defineGrammarSymbols(Grammar.java:702)
org.antlr.tool.CompositeGrammar.defineGrammarSymbols(CompositeGrammar.java:351)
org.antlr.Tool.process(Tool.java:451)
org.antlr.Tool.main(Tool.java:91)

I have tried renaming the INTERFACE token, the IDENT token, the interface
production, etc. in various combinations and none of it works.  What
incredibly obvious thing am I overlooking?

From andy at andymcm.com  Sun Jan 10 03:21:46 2010
From: andy at andymcm.com (Andy McMullan)
Date: Sun, 10 Jan 2010 11:21:46 +0000
Subject: [antlr-interest] What is going on here?
In-Reply-To: <ee970b291001100200j326b1deex698693d0f5837e82@mail.gmail.com>
References: <ee970b291001100200j326b1deex698693d0f5837e82@mail.gmail.com>
Message-ID: <fea7c4861001100321m639909ccj683ee1cb6ee6834e@mail.gmail.com>

Did you try renaming 'import'?

From stevenraemaekers at gmail.com  Sun Jan 10 03:34:55 2010
From: stevenraemaekers at gmail.com (Steven Raemaekers)
Date: Sun, 10 Jan 2010 12:34:55 +0100
Subject: [antlr-interest] ANTLR compile problem
Message-ID: <46450b021001100334t738a7304ma9c8d2b4cf611ddd@mail.gmail.com>

Hello,

A project I'm working on includes an ANTLR parser. It worked fine a couple
of days ago, but now I get the following error message:

Exception in thread "main" java.lang.NoSuchMethodException:
stevenr.yali.antlr.LogoParser.<init>(org.antlr.runtime.TokenStream,
org.antlr.runtime.debug.DebugEventListener)
    at java.lang.Class.getConstructor0(Class.java:2706)
    at java.lang.Class.getDeclaredConstructor(Class.java:1985)
    at org.deved.antlride.runtime.LaunchParser.launch(LaunchParser.java:118)
    at org.deved.antlride.runtime.LaunchParser.main(LaunchParser.java:228)

It seems like the ANTLR runtime is trying to call a constructor that is not
there, namely a constructor with a TokenStream and a DebugEventListener as
arguments.
Why would ANTLR want to do this? Previously this problem never occurred. Did
it accidentally go into some kind of "debug" mode in which it tries to attach
an event listener?
Why doesn't it just call the constructor it created itself (without a
DebugEventListener)? Maybe there is some kind of debug option I turned on
somewhere?
Eclipse says that the java output file that ANTLR generates does not contain
any errors.
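
For readers unfamiliar with this failure mode, here is a minimal Java sketch of how a reflective constructor lookup produces a NoSuchMethodException. All classes below (`Stream`, `Listener`, `PlainParser`, `DebugParser`, `ConstructorLookup`) are hypothetical stand-ins invented for the sketch; they are not ANTLR or ANTLR IDE classes, and the sketch does not claim to show what the launcher actually does beyond the `getDeclaredConstructor` call visible in the trace.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-ins; these are not ANTLR classes.
interface Listener {}
class Stream {}

class PlainParser {                       // declares only the one-argument constructor
    PlainParser(Stream input) {}
}

class DebugParser {                       // also declares the two-argument constructor
    DebugParser(Stream input) {}
    DebugParser(Stream input, Listener listener) {}
}

final class ConstructorLookup {
    // Returns true when the class declares a (Stream, Listener) constructor.
    static boolean hasDebugConstructor(Class<?> parserClass) {
        try {
            Constructor<?> c = parserClass.getDeclaredConstructor(Stream.class, Listener.class);
            return c != null;
        } catch (NoSuchMethodException e) {
            return false;                 // same exception type as in the trace above
        }
    }

    public static void main(String[] args) {
        System.out.println(hasDebugConstructor(PlainParser.class)); // false
        System.out.println(hasDebugConstructor(DebugParser.class)); // true
    }
}
```

The lookup fails at run time rather than at compile time, which is consistent with Eclipse reporting no errors in the generated file.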

Can somebody please help me?  Thanks.

-- 
Regards,

Steven Raemaekers

From r66092 at freescale.com  Sun Jan 10 19:11:28 2010
From: r66092 at freescale.com (Chen Hongjun-R66092)
Date: Mon, 11 Jan 2010 11:11:28 +0800
Subject: [antlr-interest] An error occurs in template example
Message-ID: <3A45394FD742FA419B760BB8D398F9ED011E1A07@zch01exm26.fsl.freescale.net>

Hi,

I am new to ANTLR and am reading the book The Definitive ANTLR
Reference. When I tried the template example 'template/generator/2pass'
without any modification, I got the error below:

Exception in thread "main" java.util.NoSuchElementException: no such attribute: init in template context [jasminFile]
	at org.antlr.stringtemplate.StringTemplate.checkNullAttributeAgainstFormalArguments(StringTemplate.java:1311)
	at org.antlr.stringtemplate.StringTemplate.getAttribute(StringTemplate.java:684)
	at org.antlr.stringtemplate.language.ActionEvaluator.attribute(ActionEvaluator.java:360)
	at org.antlr.stringtemplate.language.ActionEvaluator.expr(ActionEvaluator.java:136)
	at org.antlr.stringtemplate.language.ActionEvaluator.action(ActionEvaluator.java:84)
	at org.antlr.stringtemplate.language.ASTExpr.write(ASTExpr.java:149)
	at org.antlr.stringtemplate.StringTemplate.write(StringTemplate.java:705)
	at org.antlr.stringtemplate.StringTemplate.toString(StringTemplate.java:1670)
	at org.antlr.stringtemplate.StringTemplate.toString(StringTemplate.java:1661)
	at Test.main(Test.java:45)

I would appreciate any suggestions or ideas!

Thanks,
Hongjun
  

From aurelien.larive at 4dconcept.fr  Mon Jan 11 03:51:57 2010
From: aurelien.larive at 4dconcept.fr (Aurélien LARIVE)
Date: Mon, 11 Jan 2010 12:51:57 +0100
Subject: [antlr-interest] Problems writing a searchbar language
Message-ID: <4B4B10DD.6050200@4dconcept.fr>

Hi,

I'm currently writing a small grammar to parse a searchbar language and 
I'm failing at making whitespaces behave like the AND keyword.

Here is my grammar :

grammar SearchBar;

options {
    output=AST;
}

WS  : ( ' ' | '\t' ) { skip(); } ;
AND : 'AND' ;
OR  : 'OR' ;
NOT : 'NOT' ;
LEFT_PAREN  : '(' ;
RIGHT_PAREN : ')' ;
TERM        : ~(' '|'\t'|'"'|RIGHT_PAREN|LEFT_PAREN|NOT|OR|AND)* ;
QUOTEDTERM  : '"' ~('"')* '"' ;

orexpression
    : andexpression ( OR^ andexpression )*
    ;

andexpression
    : notexpression ( (AND^)? notexpression )*
    ;

notexpression
    : (NOT^)? searchterm
    ;

searchterm
    : TERM
    | QUOTEDTERM
    | LEFT_PAREN! orexpression RIGHT_PAREN!
    ;

And here is my tree grammar :

tree grammar SearchBarEval;

options {
    ASTLabelType=CommonTree;
    tokenVocab=SearchBar;
}

prog
    : expr+ ;

expr returns [XMSExpression expression]
    : ^(OR a=expr b=expr) {
        $expression = new Or($a.expression, $b.expression);
    }
    | ^(AND a=expr b=expr) {
        $expression = new And($a.expression, $b.expression);
    }
    | ^(NOT a=expr) {
        $expression = new Not($a.expression);
    }
    | TERM {
        $expression = new Term($TERM.text);
    }
    | QUOTEDTERM {
        $expression = new QuotedTerm($QUOTEDTERM.text);
    }
    ;

When I try to evaluate, for example, the input 'apples bananas tomatos', 
I only get the Term 'apples'. I understand why I'm having this problem 
but I was unable to find a good solution.

Thanks in advance,

--
Aurélien

From espina.edgar at gmail.com  Mon Jan 11 04:04:16 2010
From: espina.edgar at gmail.com (Edgar Espina)
Date: Mon, 11 Jan 2010 09:04:16 -0300
Subject: [antlr-interest] Problems writing a searchbar language
In-Reply-To: <4B4B10DD.6050200@4dconcept.fr>
References: <4B4B10DD.6050200@4dconcept.fr>
Message-ID: <92b42db61001110404j44d65070yd7f053f457590ea7@mail.gmail.com>

Hi,

 try this:

WS  : ( ' ' | '\t' ) { $channel=HIDDEN; } ;

Regards



-- 
edgar

From aurelien.larive at 4dconcept.fr  Mon Jan 11 05:20:36 2010
From: aurelien.larive at 4dconcept.fr (Aurélien LARIVE)
Date: Mon, 11 Jan 2010 14:20:36 +0100
Subject: [antlr-interest] Problems writing a searchbar language
In-Reply-To: <92b42db61001110404j44d65070yd7f053f457590ea7@mail.gmail.com>
References: <4B4B10DD.6050200@4dconcept.fr>
	<92b42db61001110404j44d65070yd7f053f457590ea7@mail.gmail.com>
Message-ID: <4B4B25A4.5020600@4dconcept.fr>

That does not seem to change anything. Did I miss something?


Edgar Espina wrote:
> Hi,
>
>  try this:
>
> WS  : ( ' ' | '\t' ) { $channel=HIDDEN; } ;
>
> Regards

From aurelien.larive at 4dconcept.fr  Mon Jan 11 07:46:39 2010
From: aurelien.larive at 4dconcept.fr (Aurélien LARIVE)
Date: Mon, 11 Jan 2010 16:46:39 +0100
Subject: [antlr-interest] Operators and rewrite rules equivalence
Message-ID: <4B4B47DF.1070400@4dconcept.fr>

Hi,

I successfully built an AST using the operator notation (^), but I need
to customize my AST construction a bit. Could someone tell me what the
rewrite-rule version of the following rule is?

andexpression
    : notexpression ( AND^ notexpression )*
    ;

I found a similar example at 
http://www.antlr.org/wiki/display/ANTLR3/Tree+construction but I failed 
to apply this to my problem.

Thanks in advance,

--
Aurélien

From aurelien.larive at 4dconcept.fr  Mon Jan 11 08:31:20 2010
From: aurelien.larive at 4dconcept.fr (Aurélien LARIVE)
Date: Mon, 11 Jan 2010 17:31:20 +0100
Subject: [antlr-interest] Problems writing a searchbar language
In-Reply-To: <4B4B10DD.6050200@4dconcept.fr>
References: <4B4B10DD.6050200@4dconcept.fr>
Message-ID: <4B4B5258.8060506@4dconcept.fr>

Below is the e-mail John B. Brodie sent to me, which solved my problem.

John B. Brodie wrote:

Greetings!

(I tried to send this to the mail-list, but the list seems to be
rejecting my e-mail at the moment.... sigh)

When you have an implicit AND (e.g. whitespace), your andexpression
sub-tree will not have any root. It will be just a list of notexpression
sub-trees, which your tree walker is not prepared to handle.

More below.....

On Mon, 2010-01-11 at 12:51 +0100, Aurélien LARIVE wrote:
> Hi,
> 
> I'm currently writing a small grammar to parse a searchbar language and
> I'm failing at making whitespaces behave like the AND keyword.
> 
> Here is my grammar :
> 
> grammar SearchBar;
> 
> options {
>     output=AST;
> }
> 
> WS  : ( ' ' | '\t' ) { skip(); } ;
> AND : 'AND' ;
> OR  : 'OR' ;
> NOT : 'NOT' ;
> LEFT_PAREN  : '(' ;
> RIGHT_PAREN : ')' ;
> TERM        : ~(' '|'\t'|'"'|RIGHT_PAREN|LEFT_PAREN|NOT|OR|AND)* ;
> QUOTEDTERM  : '"' ~('"')* '"' ;
> 
> orexpression
>     : andexpression ( OR^ andexpression )*
>     ;
> 
> andexpression
>     : notexpression ( (AND^)? notexpression )*
>     ;

when the AND is absent e.g. an implied AND via whitespace there will be
no root. so (I THINK) you will just end up with a simple list of
notexpression sub-trees.

suggest these parsing rules instead (tested!):

andexpression
     : notexpression ( and_operator^ notexpression )*
     ;

and_operator : AND | (/*empty*/->AND["implicit_AND"]) ;

NOTE!!! The token spawned for "implicit_AND" above may not contain
meaningful location information (e.g. line number, column, ...whatever).
If that information is important to your application (usually for error
messages), you may need to dig into the details of the "X[...]" ANTLR
meta-notation for token insertion....

> 
> notexpression
>     : (NOT^)? searchterm
>     ;
> 
> searchterm
>     : TERM
>     | QUOTEDTERM
>     | LEFT_PAREN! orexpression RIGHT_PAREN!
>     ;
> 
> And here is my tree grammar :
> 
> tree grammar SearchBarEval;
> 
> options {
>     ASTLabelType=CommonTree;
>     tokenVocab=SearchBar;
> }
> 
> prog
>     : expr+ ;
> 
> expr returns [XMSExpression expression]
>     : ^(OR a=expr b=expr) {
>         $expression = new Or($a.expression, $b.expression);
>     }
>     | ^(AND a=expr b=expr) {
>         $expression = new And($a.expression, $b.expression);
>     }
>     | ^(NOT a=expr) {
>         $expression = new Not($a.expression);
>     }
>     | TERM {
>         $expression = new Term($TERM.text);
>     }
>     | QUOTEDTERM {
>         $expression = new QuotedTerm($QUOTEDTERM.text);
>     }

if you would rather not apply the above suggested parser changes, you
might be able to alter the tree grammar as follows (UNTESTED!):

add an alternative to the expr rule (i think it has to be at the end,
not sure...):
       | implicit_and
>     ;
> 
and then add an implicit_and rule (UNTESTED!):

implicit_and returns [XMSExpression expression]
       : a=expr {$expression = $a.expression;}
           ( b=implicit_and {
               $expression = new And($a.expression, $b.expression);
             }
           )?
       ;
> When I try to evaluate, for example, the input 'apples bananas tomatos',
> I only get the Term 'apples'. I understand why I'm having this problem
> but I was unable to find a good solution.
> 
> Thanks in advance,

Hope this helps....
    -jbb


From aurelien.larive at 4dconcept.fr  Mon Jan 11 08:37:30 2010
From: aurelien.larive at 4dconcept.fr (Aurélien LARIVE)
Date: Mon, 11 Jan 2010 17:37:30 +0100
Subject: [antlr-interest] Operators and rewrite rules equivalence
In-Reply-To: <4B4B47DF.1070400@4dconcept.fr>
References: <4B4B47DF.1070400@4dconcept.fr>
Message-ID: <4B4B53CA.7000001@4dconcept.fr>

Below is the message John B. Brodie sent to me:

(again tried to send a copy of this to the list, but failed)

On Mon, 2010-01-11 at 16:46 +0100, Aurélien LARIVE wrote:
> Hi,
> 
> I successfully built an AST using the operator notation (^), but I need
> to customize my AST construction a bit. Could someone tell me what the
> rewrite-rule version of the following rule is?
> 
> andexpression
>     : notexpression ( AND^ notexpression )*
>     ;
> 
> I found a similar example at 
> http://www.antlr.org/wiki/display/ANTLR3/Tree+construction but I failed 
> to apply this to my problem.
> 

off the top of my head:

andexpression
     : l=notexpression ( AND r=andexpression -> ^(AND $l $r) )?
     ;

but I have a vague memory that the associativity of these two is
different.  You would need to look into that if associativity matters in
your application.
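
To make the associativity concern concrete, here is a small Java sketch (not ANTLR-generated code; the class and method names are made up) that builds the tree both ways for the operands a, b, c. The loop mirrors how `( AND^ notexpression )*` grows a tree, with each AND becoming the new root over what has been built so far. The recursion mirrors the right-recursive rewrite rule above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the two tree shapes; trees are LISP-style strings.
final class AndAssociativity {
    // Left-associative: each new operand wraps the tree built so far.
    static String leftAssoc(List<String> operands) {
        String tree = operands.get(0);
        for (int i = 1; i < operands.size(); i++) {
            tree = "(AND " + tree + " " + operands.get(i) + ")";
        }
        return tree;
    }

    // Right-associative: the first operand is paired with the tree for the rest.
    static String rightAssoc(List<String> operands) {
        if (operands.size() == 1) {
            return operands.get(0);
        }
        String rest = rightAssoc(operands.subList(1, operands.size()));
        return "(AND " + operands.get(0) + " " + rest + ")";
    }

    public static void main(String[] args) {
        List<String> ops = Arrays.asList("a", "b", "c");
        System.out.println(leftAssoc(ops));   // (AND (AND a b) c)
        System.out.println(rightAssoc(ops));  // (AND a (AND b c))
    }
}
```

For a boolean AND both shapes evaluate to the same result, so the difference mostly matters if the tree walker depends on nesting order or if the same pattern is reused for a non-associative operator.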


From jimi at temporal-wave.com  Mon Jan 11 09:36:34 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Mon, 11 Jan 2010 09:36:34 -0800
Subject: [antlr-interest] Problems writing a searchbar language
In-Reply-To: <4B4B10DD.6050200@4dconcept.fr>
Message-ID: <203633f967209c439c19322dff181ddc@temporal-wave.com>

For a start, you need to rewrite the absence of AND as the AND keyword: your SPACE acts as the binary operator AND, so it should not just be ignored.

andexpression
     : notexpression ( andWord^ notexpression )*
     ;

andWord : a=AND -> $a
        |       -> AND
        ;

Then you probably want a root node and a rule that consumes to EOF:

search: orexpression EOF
          -> ^(QUERY orexpression)
      ;

And tree:

prog : ^(QUERY expr)
     ;




From felix_do at web.de  Mon Jan 11 11:57:46 2010
From: felix_do at web.de (Felix Dorner)
Date: Mon, 11 Jan 2010 20:57:46 +0100
Subject: [antlr-interest] Réf. : Re: Maven problems with ANTLR 3.2
In-Reply-To: <41dcd1c202dbd44b842b489d1a12d052@temporal-wave.com>
References: <41dcd1c202dbd44b842b489d1a12d052@temporal-wave.com>
Message-ID: <4B4B82BA.3010102@web.de>

Jim Idle wrote:
>
> Cool - thanks for that Loïc - I will update the build with this once I
> have read the article.
>
> Jim
>
Hi, I cloned antlr from github yesterday and ran into the same issue.
Adding Loïc's tags does indeed seem to help.

btw, will antlr's git repository on github persist in the future or is 
this just an experiment?


Cheers

From parrt at cs.usfca.edu  Mon Jan 11 12:18:54 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Mon, 11 Jan 2010 12:18:54 -0800
Subject: [antlr-interest] Réf. : Re: Maven problems with ANTLR 3.2
In-Reply-To: <4B4B82BA.3010102@web.de>
References: <41dcd1c202dbd44b842b489d1a12d052@temporal-wave.com>
	<4B4B82BA.3010102@web.de>
Message-ID: <407CF246-3671-4A6E-AF58-F50FE7FCAF71@cs.usfca.edu>


On Jan 11, 2010, at 11:57 AM, Felix Dorner wrote:
> btw, will antlr's git repository on github persist in the future or is 
> this just an experiment?

it should persist as it's pulled automagically.
T

From zep_mailinglist at bahj.com  Mon Jan 11 16:46:13 2010
From: zep_mailinglist at bahj.com (Zachary Palmer)
Date: Mon, 11 Jan 2010 19:46:13 -0500
Subject: [antlr-interest] ANTLR Errors on Line Zero
In-Reply-To: <4B4BC4CC.6030001@bahj.com>
References: <4B4BC4CC.6030001@bahj.com>
Message-ID: <4B4BC655.1010803@bahj.com>

Hello, all.  I have what I expected to be a fairly common problem but 
couldn't find a FAQ or Google result that addressed it.  Most of the 
errors coming out of my grammar appear to be for line number zero.  For 
example:

[antlr:antlr3] error(117): .../compiler/grammar/Bsj.g:0:0: missing 
attribute access on rule scope: primary

The "..." was my own edit to eliminate a very long path.  Can anyone 
recommend how I can get line numbers for these errors?  They become very 
difficult to track down after a while.  I have gotten some errors with 
line numbers from various positions in my file; I'm unable to discern a 
pattern.  Any suggestions?

Thanks much!

Cheers,

Zachary Palmer

From egrimm at dds.nl  Tue Jan 12 08:06:24 2010
From: egrimm at dds.nl (Olaf Keijsers)
Date: Tue, 12 Jan 2010 17:06:24 +0100
Subject: [antlr-interest] Using own ASTLabelType and quantification
Message-ID: <E9653B4A7B554014AD8C78DA568E2104@ultramagnus>

Greetings,

I am trying to make a treewalker for my grammar in order to check if it 
contains nondeterminism. I would like to be able to set some properties for 
every node I encounter, so I figured it would be a good idea to use my own 
ASTLabelType.

I have set "ASTLabelType=GrooveTree" in my options, and my grammar uses this 
labeltype now, but I get the following exception when trying to use the 
checker:
java.lang.ClassCastException: org.antlr.runtime.tree.CommonTree cannot be 
cast to groove.control.parse.GrooveTree
 at 
groove.control.parse.GCLDeterminismChecker.program(GCLDeterminismChecker.java:139)

This line contains:
root_0 = (GrooveTree)adaptor.nil();

and is part of the program() method. Somehow I think this is a beginner's 
error, but I cannot find the solution. I have tried to work around it by 
using the default ASTLabelType and keeping a Map<CommonTree,Boolean> to keep 
track of the property I would like, but this seems cumbersome. Could anyone 
point me in a good direction?

Thanks!

Olaf Keijsers 


From jbb at acm.org  Tue Jan 12 10:09:41 2010
From: jbb at acm.org (John B. Brodie)
Date: Tue, 12 Jan 2010 13:09:41 -0500
Subject: [antlr-interest] Using own ASTLabelType and quantification
In-Reply-To: <E9653B4A7B554014AD8C78DA568E2104@ultramagnus>
References: <E9653B4A7B554014AD8C78DA568E2104@ultramagnus>
Message-ID: <1263319781.769.27.camel@gecko.home.org>

Greetings!

On Tue, 2010-01-12 at 17:06 +0100, Olaf Keijsers wrote:
> Greetings,
> 
> I am trying to make a treewalker for my grammar in order to check if it 
> contains nondeterminism. I would like to be able to set some properties for 
> every node I encounter, so I figured it would be a good idea to use my own 
> ASTLabelType.
> 
> I have set "ASTLabelType=GrooveTree" in my options, and my grammar uses this 
> labeltype now, but I get the following exception when trying to use the 
> checker:
> java.lang.ClassCastException: org.antlr.runtime.tree.CommonTree cannot be 
> cast to groove.control.parse.GrooveTree
>  at 
> groove.control.parse.GCLDeterminismChecker.program(GCLDeterminismChecker.java:139)
> 
> This line contains:
> root_0 = (GrooveTree)adaptor.nil();
> 
> and is part of the program() method. Somehow I think this is a beginner's 
> error, but I cannot find the solution. I have tried to work around it by 
> using the default ASTLabelType and keeping a Map<CommonTree,Boolean> to keep 
> track of the property I would like, but this seems cumbersome. Could anyone 
> point me in a good direction?

You need to set up a tree adaptor so that the runtime knows how to
construct your nodes.

These are the things I had to do in order to get my own ASTLabelType.
Note that my AST is called ExprAST, so replace all occurrences of that
string below with yours. Also note that I did this over a year ago using
an earlier version of ANTLR v3, so although this still works (I just re-ran
my tests), today's version of ANTLR may make some of my steps simpler
and/or entirely unnecessary... YMMV

1) in the grammar add the ASTLabelType= option (as you have already
done)

2) create your new tree node class, ensuring that it extends CommonTree.
Here is my ExprAST (note that Type is also one of my classes):

//----begin ExprAST here....
import org.antlr.runtime.Token;
import org.antlr.runtime.tree.*;

public class ExprAST extends CommonTree {

   public Type type;

   public ExprAST() {
      super();
      type = null;
   }

   public ExprAST(Token tok) {
      super(tok);
      type = null;
   }

   public ExprAST(ExprAST tree) {
      super(tree);
      this.type = tree.type;
   }

   public ExprAST(Token tok, Type type) {
      super(tok);
      this.type = type;
   }

   @Override public Tree dupNode() {
      return new ExprAST(this);
   }

   @Override public String toString() {
      final String result;
      if (type==null) {
         result = super.toString();
      } else {
         result = String.format("%s[%s]",
                     super.toString(),type.nickName());
      }
      return result;
   }
}
//----end ExprAST

3) copy org.antlr.runtime.tree.CommonErrorNode from the ANTLR run-time
sources. I called mine ExprASTErrorNode. Edit your copy so that it
extends your new tree node class rather than CommonTree.

4) create an instance of the adaptor class, i do this in my main:

//---begin adaptor code here...
   // Custom adaptor to create ExprAST node type
   private static final TreeAdaptor adaptor = new CommonTreeAdaptor() {
         @Override public Object create(Token payload) {
            return new ExprAST(payload);
         }
         @Override public Object dupNode(Object old) {
            return (old==null)? null : ((ExprAST)old).dupNode();
         }
         @Override public Object errorNode(TokenStream input,
                                           Token start, Token stop,
                                           RecognitionException e) {
            return new ExprASTErrorNode(input, start, stop, e);
         }
      };
//----end adaptor code.

5) call the parser's setTreeAdaptor method with the above adaptor. I invoke
my parser with something similar to this:

//----begin parser invocation code here...
   ExprLexer lexer = new ExprLexer(...whatever....);
   CommonTokenStream tokens = new CommonTokenStream(lexer);
   ExprParser parser = new ExprParser(tokens);
   parser.setTreeAdaptor(adaptor);
   ExprParser.program_return p_result = parser.program();

   ast = p_result.tree;
//----end parser invocation code.
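
A minimal stand-alone sketch of why the dupNode() override in ExprAST
above matters. Node and TypedNode are hypothetical stand-ins for
CommonTree and ExprAST (they are not ANTLR classes), chosen only so the
sketch compiles without the ANTLR run-time. The point it tries to show:
a copy made through the overridden dupNode() keeps both the subclass and
the extra type field.

```java
public class DupNodeSketch {
    // Hypothetical stand-in for CommonTree: holds only the token text.
    static class Node {
        final String text;
        Node(String text) { this.text = text; }
        Node(Node other) { this.text = other.text; }
        Node dupNode() { return new Node(this); }
        @Override public String toString() { return text; }
    }

    // Hypothetical stand-in for ExprAST: adds a type annotation.
    static class TypedNode extends Node {
        String type;
        TypedNode(String text, String type) { super(text); this.type = type; }
        TypedNode(TypedNode other) { super(other); this.type = other.type; }
        // Without this override the copy would be a plain Node and lose its type.
        @Override Node dupNode() { return new TypedNode(this); }
        @Override public String toString() {
            return (type == null) ? super.toString()
                                  : String.format("%s[%s]", super.toString(), type);
        }
    }

    public static void main(String[] args) {
        Node original = new TypedNode("3", "int");
        Node copy = original.dupNode();
        // prints "3[int] true"
        System.out.println(copy + " " + (copy instanceof TypedNode));
    }
}
```

In the real setup the adaptor's dupNode(Object) plays the same role by
delegating to ExprAST.dupNode().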

> 
> Thanks!

Hope this helps...
   -jbb


From antonio.petrelli at gmail.com  Tue Jan 12 12:42:10 2010
From: antonio.petrelli at gmail.com (Antonio Petrelli)
Date: Tue, 12 Jan 2010 21:42:10 +0100
Subject: [antlr-interest] antlr3-maven-plugin (v3.2): "error(7): cannot
	find or open file: null/MyGrammar.g"
In-Reply-To: <c8ce53501001091511w70e6000fkaef5bc160f4de65f@mail.gmail.com>
References: <c8ce53501001091507v64e3a6dbs6edb0550b40168c0@mail.gmail.com>
	<c8ce53501001091511w70e6000fkaef5bc160f4de65f@mail.gmail.com>
Message-ID: <aae96ca1001121242w78c2d763yb64ff66e7abfcfc7@mail.gmail.com>

I just noticed that I replied directly to the poster instead of to the
mailing list, sorry :-)

2010/1/10 Michael Guyver <michael.guyver at gmail.com>:
> I had formerly been using the codehaus 1.0 release and been setting
> the output directory to
>
> target/generated-sources/antlr/my/full/package/path/ so that the
>
> generated files arrived in the right place. Happily the new plugin
> does this for you so simply moving the grammar to
>
> src/main/antlr3/my/full/package/path/MyGrammar.g
>
> solved the problem and meant I didn't have to specify the output
> directory either \:D/

It does not work with Java.g; the package is still the default!
http://openjdk.java.net/projects/compiler-grammar/antlrworks/Java.g
I think this is definitely a double bug.
I would file a bug myself, but I can't because of a strange policy (never
seen anywhere else!) that the ANTLR team has about bugs.

Thanks anyway
Antonio

P.S. Luckily I noticed that I don't need Antlr anymore, thanks to the
Compiler Tree API of JDK 6, so, well, who cares :-D

From wwilbur3 at yahoo.com  Tue Jan 12 14:58:13 2010
From: wwilbur3 at yahoo.com (Warren Wilbur)
Date: Tue, 12 Jan 2010 14:58:13 -0800 (PST)
Subject: [antlr-interest] Issue with antlrworks 1.3.1 and JDK 1.6 update
	17?
In-Reply-To: <mailman.1.1262808002.9167.antlr-interest@antlr.org>
Message-ID: <212800.78090.qm@web65606.mail.ac4.yahoo.com>

Here are a few debugging ideas as I've seen each of these issues before...

1. Try increasing the heap memory for Java on the command line. e.g. to increase to 1GB use: java -Xmx1024M -jar antlrworks-1.3.1.jar

2. Check if you are really using the Sun Java JRE/JDK on Ubuntu Linux (this will give you the right idea: http://www.cyberciti.biz/faq/howto-ubuntu-linux-install-configure-jdk-jre) . If multiple alternatives are installed you might not be... Using another JRE/JDK could be the cause of your problems.

3. Run antlrworks by command line from a terminal. If you have any 'out of memory' errors you will see console messages in the Ubuntu terminal you executed it from.
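
For ideas 1 and 2, a small sketch that prints which JVM is actually running and how much heap it was given. The class name JvmCheck is made up for this example; the system properties and Runtime.maxMemory() are standard Java.

```java
public class JvmCheck {
    // Returns a one-line summary of the running JVM and its maximum heap.
    static String describe() {
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        return String.format("%s %s (%s), max heap %d MB",
                System.getProperty("java.vm.name"),
                System.getProperty("java.version"),
                System.getProperty("java.vendor"),
                maxHeapMb);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Compile it with javac JvmCheck.java and run it with the same java binary and options you use for ANTLRWorks, e.g. java -Xmx1024M JvmCheck. The reported heap can be somewhat lower than the -Xmx value.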

Date: Wed, 6 Jan 2010 19:33:23 +0800
From: Michael Richter <ttmrichter at gmail.com>
Subject: [antlr-interest] Issue with antlrworks 1.3.1 and JDK 1.6
    update 17?
To: antlr-interest at antlr.org
Message-ID:
    <ee970b291001060333j78fde473nfc0efad9fa93b03f at mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

I did a recent round of upgrading software on my machines (real and virtual)
and somewhere in the process I've got ANTLRworks in unusable shape.  (I
tried reporting this through the antlr.org web site but it doesn't seem to
have taken.)

On *every* machine I have access to (both real and virtual, running Windows
XP or Linux) I get the following pretty nasty behaviour:

   1. *java -jar antlrworks.jar* (I can also use javaw on Windows for a
   similar, more annoying effect.)
   2. *The splash screen pops up briefly.*
   3. *The "New Document" dialogue replaces it.*
   4. I hit "Cancel" (or alternatively press "Esc" on the keyboard).

At this point, no matter the platform, no matter what I try, I have a dead
executable until I hit Ctrl+C (or, if I used javaw, I kill it in the task
manager).  I've tried this on Ubuntu 9.04, on Slackware 13.0 (virtualized),
on Windows XP (four different machines, one virtualized) and get this
behaviour consistently.  Whatever's supposed to happen when I cancel the new
document dialogue freezes and can only unfreeze through lethal injection of
Ctrl+C.  (There are, of course, no messages on the console that could tell
me what's going on.)

The behaviour on Windows after this if I choose "OK" is acceptable.  Up
comes the wizard for a new project which works normally and, more
importantly, can be cancelled and gets me into the ANTLRworks GUI.  It's a
bit obnoxious having to go that route, but it works.  If I choose to use the
wizard everything works as expected.

The behaviour on Linux is less acceptable.  The new project wizard pops up
but the text input focus is on ANTLRworks' editor window and CANNOT be put
into the wizard at all on any spot.  I have to cancel the wizard to get to
the main window (which then works as expected).  This also happens if I go
File -> New from the main window: I simply cannot get text input into any
field of the new project wizard.

The last time I did anything with ANTLRworks was v1.3.0 using JDK 1.6 update
16.  I did not see this behaviour then at all, so something has happened
between then and now.

Any advice for debugging this further?


From jimi at temporal-wave.com  Tue Jan 12 15:46:35 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Tue, 12 Jan 2010 15:46:35 -0800
Subject: [antlr-interest] Issue with antlrworks 1.3.1 and JDK 1.6 update
	17?
In-Reply-To: <212800.78090.qm@web65606.mail.ac4.yahoo.com>
Message-ID: <0482aaf0e226ef43abff16e2f78c9db2@temporal-wave.com>

I have never had much success with OpenJDK/JRE; it is better to use Sun's JDK (installed from their web site). Ubuntu was nothing but trouble for me too, but that was 64-bit Ubuntu.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Warren Wilbur
> Sent: Tuesday, January 12, 2010 2:58 PM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Issue with antlrworks 1.3.1 and JDK 1.6
> update 17?
> 
> Here are a few debugging ideas as I've seen each of these issues
> before...
> 
> 1. Try increasing the heap memory for Java on the command line. e.g. to
> increase to 1GB use: java -Xmx1024M -jar antlrworks-1.3.1.jar
> 
> 2. Check if you are really using the Sun Java JRE/JDK on Ubuntu Linux
> (this will give you the right idea: http://www.cyberciti.biz/faq/howto-
> ubuntu-linux-install-configure-jdk-jre) . If multiple alternatives are
> installed you might not be... Using another JRE/JDK could be the cause
> of your problems.
> 
> manager).  I've tried this on Ubuntu 9.04, on Slackware 13.0
> (virtualized),


From nikmd23 at gmail.com  Tue Jan 12 17:52:48 2010
From: nikmd23 at gmail.com (Nik Molnar)
Date: Tue, 12 Jan 2010 20:52:48 -0500
Subject: [antlr-interest] Noob Question
Message-ID: <f69cc8191001121752q3dce26f2o93aada84ef47b00a@mail.gmail.com>

Hello all,

I am rather new to ANTLR and seem to be running into a small issue I can't
figure out.

I'm writing a very simple grammar based on many tutorials online, the
calculator.

This grammar generates C# code that compiles perfectly, and works for the
most part in ANTLRWorks Interpreter, Debugger and in a sample app I made in
.NET to call the generated Parser/Lexer.

The problem I run into is when I put in invalid syntax, expecting an error.
Output like so:

Valid Syntax: "3+3" => Works in interpreter, debugger and compiled .net
code.
Invalid Syntax: "3+/3" => Gives error in interpreter, debugger and compiled
.net code, as expected.
Invalid Syntax: "3_3" => The interpreter shows nothing, the debugger cannot
connect and the .net code hangs for a while then throws an out of memory
exception.

I'm sure I'm doing something wrong in my grammar but don't know what.

I've included it below. Please help me!

Thanks,

grammar Test;

/*options
{
language = 'CSharp2';
}*/

expression
    : amExpression;

amExpression
    :mdExpression ((PLUS|DASH) mdExpression)*
    ;

mdExpression
    :INT ((STAR|SLASH) INT)*
    ;

DASH
    :'-'
    ;

SLASH
    :'/'
    ;

WS
    : (' '
    | '\t'
    | '\n'
    | '\r')*
    { $channel = HIDDEN; }
    ;

STAR
    : '*'
    ;

PLUS
    : '+'
    ;

fragment DIGIT
    : '0'..'9'
    ;

INT
    : (DIGIT)+
    ;

From jbb at acm.org  Tue Jan 12 18:21:03 2010
From: jbb at acm.org (John B. Brodie)
Date: Tue, 12 Jan 2010 21:21:03 -0500
Subject: [antlr-interest] Noob Question
In-Reply-To: <f69cc8191001121752q3dce26f2o93aada84ef47b00a@mail.gmail.com>
References: <f69cc8191001121752q3dce26f2o93aada84ef47b00a@mail.gmail.com>
Message-ID: <1263349263.8618.17.camel@gecko.home.org>

Greetings!

Your WS lexer rule can recognize the empty string; this is VERY bad.

Because WS can recognize the empty string your lexer will enter an
infinite loop when encountering a character it can not deal with - like
the '_' in your example - you have no lexer rule that can handle a '_'.

More below...

On Tue, 2010-01-12 at 20:52 -0500, Nik Molnar wrote:
> Hello all,
> 
> I am rather new to ANTLR and seem to be running into a small issue I can't
> figure out.
> 
> I'm writing a very simple grammar based on many tutorials online, the
> calculator.
> 
> This grammar generates C# code that compiles perfectly, and works for the
> most part in ANTLRWorks Interpreter, Debugger and in a sample app I made in
> .NET to call the generated Parser/Lexer.
> 
> The problem I run into is what I put in invalid syntax, expecting an error.
> Output like so:
> 
> Valid Syntax: "3+3" => Works in interpreter, debugger and compiled .net
> code.
> Invalid Syntax: "3+/3" => Gives error in interpreter, debugger and compiled
> .net code, as expected.
> Invalid Syntax: "3_3" => The interpreter shows nothing, the debugger cannot
> connect and the .net code hangs for a while then throws an out of memory
> exception.

Your lexer will correctly identify the first '3' as an INT. Next your
lexer will see the '_' which it is unable to deal with. BUT since your
WS rule says that the empty string - the non-stuff between the first '3'
and the '_' - is legal, your lexer accepts that empty string as a WS
token and deposits it into the HIDDEN channel. Now the lexer is still
looking at the '_' which it is unable to deal with. BUT since your WS
rule says that the empty string - the non-stuff between the first '3'
and the '_' - is legal, your lexer accepts that empty string as a WS
token and deposits it into the HIDDEN channel. Now the lexer is still
looking at the '_' which it is unable to deal with. BUT since your WS
rule says that the empty string - the non-stuff between the first '3'
and the '_' - is legal, your lexer accepts that empty string as a WS
token and deposits it into the HIDDEN channel. Now the lexer is still
looking at the '_' .... and so nothing good results.

Your .NET app runs out of memory because the infinite sequence of empty
WS tokens appended onto the HIDDEN channel just gobbles up all memory.
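
The loop described above can be imitated with a toy hand-written lexer.
This is only a sketch: it is not ANTLR-generated code, the class and
method names are made up, and the maxTokens cap exists solely so that
the demonstration terminates. The error branch loosely imitates a lexer
that reports the bad character and skips it.

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyTokenLoop {
    // Toy lexer: INT = digit+, WS = whitespace* (allowEmptyWs) or whitespace+.
    // Stops after maxTokens so the demonstration terminates.
    static List<String> tokenize(String input, boolean allowEmptyWs, int maxTokens) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length() && tokens.size() < maxTokens) {
            int start = pos;
            if (Character.isDigit(input.charAt(pos))) {
                while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
                tokens.add("INT(" + input.substring(start, pos) + ")");
            } else {
                while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
                if (pos > start || allowEmptyWs) {
                    // With allowEmptyWs this can be a zero-length token: pos never moves.
                    tokens.add("WS(" + (pos - start) + ")");
                } else {
                    tokens.add("ERROR(" + input.charAt(pos) + ")");
                    pos++; // skip the character no rule can match
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [INT(3), WS(0), WS(0), WS(0), WS(0), WS(0)]
        System.out.println(tokenize("3_3", true, 6));
        // prints [INT(3), ERROR(_), INT(3)]
        System.out.println(tokenize("3_3", false, 6));
    }
}
```

Without the cap, the first call would keep adding zero-length WS tokens
until memory ran out, which matches the behaviour of the .NET app.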

The debugger can not connect because the connection happens after the
lexer has finished tokenizing the input text. Your lexer never finishes
so the debugger won't connect. I bet if you waited long enough you would
eventually run out of memory in this case too.

Same drill for the interpreter....

> 
> I'm sure I'm doing something wrong in my grammar but don't know what.
> 
> I've included it below. Please help me!
> 
> Thanks,
> 
> grammar Test;
> 
> /*options
> {
> language = 'CSharp2';
> }*/
> 
> expression
>     : amExpression;
> 
> amExpression
>     :mdExpression ((PLUS|DASH) mdExpression)*
>     ;
> 
> mdExpression
>     :INT ((STAR|SLASH) INT)*
>     ;
> 
> DASH
>     :'-'
>     ;
> 
> SLASH
>     :'/'
>     ;
> 
> WS
>     : (' '
>     | '\t'
>     | '\n'
>     | '\r')*
>     { $channel = HIDDEN; }
>     ;

the * above should really be a +

be VERY careful with rules that can recognize the empty string, e.g.
rules that consist only of a * or ? subrule.

I have NEVER found an instance where a lexer rule that accepts nothing
(the empty string) does anything that helps.

On RARE occasions, a parser rule that accepts the empty string can be
appropriate, but needs to be examined VERY closely.
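
The difference between * and + can be seen with java.util.regex, used
here purely as an analogy for the WS rule (ANTLR lexers do not use it).
The class and helper names are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmptyMatch {
    // Number of characters the pattern matches at the start of input,
    // or -1 if it does not match there at all.
    static int matchLength(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.lookingAt() ? m.end() : -1;
    }

    public static void main(String[] args) {
        System.out.println(matchLength("[ \\t\\n\\r]*", "_3"));  // prints 0: matches nothing, yet succeeds
        System.out.println(matchLength("[ \\t\\n\\r]+", "_3"));  // prints -1: no match
        System.out.println(matchLength("[ \\t\\n\\r]+", "  3")); // prints 2: real whitespace
    }
}
```

A successful match of length 0 is what lets the * version of WS "win" at
the '_' without consuming any input.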

> 
> STAR
>     : '*'
>     ;
> 
> PLUS
>     : '+'
>     ;
> 
> fragment DIGIT
>     : '0'..'9'
>     ;
> 
> INT
>     : (DIGIT)+
>     ;

Hope this helps...
   -jbb


From ttmrichter at gmail.com  Tue Jan 12 18:30:50 2010
From: ttmrichter at gmail.com (Michael Richter)
Date: Wed, 13 Jan 2010 10:30:50 +0800
Subject: [antlr-interest] Issue with antlrworks 1.3.1 and JDK 1.6 update
	17?
In-Reply-To: <212800.78090.qm@web65606.mail.ac4.yahoo.com>
References: <mailman.1.1262808002.9167.antlr-interest@antlr.org>
	<212800.78090.qm@web65606.mail.ac4.yahoo.com>
Message-ID: <ee970b291001121830t7ea6b43dxc47066574591c61e@mail.gmail.com>

Off the top of my head 2 and 3 are confirmed.  I run my own copy of the Sun
JDK in my user directory precisely because of the whole GNU "java" compiler
fiasco.  gcj is not on my path anywhere and java points to
~/software/jdk<stuff>/bin/java.  And I only ever actually use antlrworks
from the command line (as an alias, to be fair) so it's the only way I can
show you how it's running.  ;)

I'll test the heap memory thing now, though.

....And we're back.  The behaviour is identical with 1GB of heap memory on
both Windows and Linux.

2010/1/13 Warren Wilbur <wwilbur3 at yahoo.com>

> Here are a few debugging ideas as I've seen each of these issues before...
>
> 1. Try increasing the heap memory for Java on the command line. e.g. to
> increase to 1GB use: java -Xmx1024M -jar antlrworks-1.3.1.jar
>
> 2. Check if you are really using the Sun Java JRE/JDK on Ubuntu Linux (this
> will give you the right idea:
> http://www.cyberciti.biz/faq/howto-ubuntu-linux-install-configure-jdk-jre)
> . If multiple alternatives are installed you might not be... Using another
> JRE/JDK could be the cause of your problems.
>
> 3. Run antlrworks by command line from a terminal. If you have any 'out of
> memory' errors you will see console messages in the Ubuntu terminal you
> executed it from.
>
> Date: Wed, 6 Jan 2010 19:33:23 +0800
> From: Michael Richter <ttmrichter at gmail.com>
> Subject: [antlr-interest] Issue with antlrworks 1.3.1 and JDK 1.6
>     update 17?
> To: antlr-interest at antlr.org
> Message-ID:
>     <ee970b291001060333j78fde473nfc0efad9fa93b03f at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> I did a recent round of upgrading software on my machines (real and
> virtual)
> and somewhere in the process I've got ANTLRworks in unusable shape.  (I
> tried reporting this through the antlr.org web site but it doesn't seem to
> have taken.)
>
> On *every* machine I have access to (both real and virtual, running Windows
> XP or Linux) I get the following pretty nasty behaviour:
>
>    1. *java -jar antlrworks.jar* (I can also use javaw on Windows for a
>    similar, more annoying effect.)
>    2. *The splash screen pops up briefly.*
>    3. *The "New Document" dialogue replaces it.*
>    4. I hit "Cancel" (or alternatively press "Esc" on the keyboard).
>
> At this point, no matter the platform, no matter what I try, I have a dead
> executable until I hit Ctrl+C (or, if I used javaw, I kill it in the task
> manager).  I've tried this on Ubuntu 9.04, on Slackware 13.0 (virtualized),
> on Windows XP (four different machines, one virtualized) and get this
> behaviour consistently.  Whatever's supposed to happen when I cancel the
> new
> document dialogue freezes and can only unfreeze through lethal injection of
> Ctrl+C.  (There are, of course, no messages on the console that could tell
> me what's going on.)
>
> The behaviour on Windows after this if I choose "OK" is acceptable.  Up
> comes the wizard for a new project which works normally and, more
> importantly, can be cancelled and gets me into the ANTLRworks GUI.  It's a
> bit obnoxious having to go that route, but it works.  If I choose to use
> the
> wizard everything works as expected.
>
> The behaviour on Linux is less acceptable.  The new project wizard pops up
> but the text input focus is on ANTLRworks' editor window and CANNOT be put
> into the wizard at all on any spot.  I have to cancel the wizard to get to
> the main window (which then works as expected).  This also happens if I go
> File -> New from the main window: I simply cannot get text input into any
> field of the new project wizard.
>
> The last time I did anything with ANTLRworks was v1.3.0 using JDK 1.6
> update
> 16.  I did not see this behaviour then at all, so something has happened
> between then and now.
>
> Any advice for debugging this further?
>
>
>
>
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe:
> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>

From nikmd23 at gmail.com  Tue Jan 12 18:32:50 2010
From: nikmd23 at gmail.com (Nik Molnar)
Date: Tue, 12 Jan 2010 21:32:50 -0500
Subject: [antlr-interest] Noob Question
In-Reply-To: <1263349263.8618.17.camel@gecko.home.org>
References: <f69cc8191001121752q3dce26f2o93aada84ef47b00a@mail.gmail.com>
	<1263349263.8618.17.camel@gecko.home.org>
Message-ID: <f69cc8191001121832w7a1dde59s1437d851bb8831cf@mail.gmail.com>

JOHN!

THANK YOU! You don't know how long I've been struggling with this - and now
that you explain it, it makes perfect sense!

I will heed your warning about * and ? - I see how they match empty strings
now.

Thanks,
Nik

On Tue, Jan 12, 2010 at 9:21 PM, John B. Brodie <jbb at acm.org> wrote:

> Greetings!
>
> Your WS lexer rule can recognize the empty string, this is VERY bad.
>
> Because WS can recognize the empty string your lexer will enter an
> infinite loop when encountering a character it can not deal with - like
> the '_' in your example - you have no lexer rule that can handle a '_'.
>
> More below...
>
> On Tue, 2010-01-12 at 20:52 -0500, Nik Molnar wrote:
> > Hello all,
> >
> > I am rather new to ANTLR and seem to be running into a small issue I
> can't
> > figure out.
> >
> > I'm writing a very simple grammar based on many tutorials online, the
> > calculator.
> >
> > This grammar generates C# code that compiles perfectly, and works for the
> > most part in ANTLRWorks Interpreter, Debugger and in a sample app I made
> in
> > .NET to call the generated Parser/Lexer.
> >
> > The problem I run into is what I put in invalid syntax, expecting an
> error.
> > Output like so:
> >
> > Valid Syntax: "3+3" => Works in interpreter, debugger and compiled .net
> > code.
> > Invalid Syntax: "3+/3" => Gives error in interpreter, debugger and
> compiled
> > .net code, as expected.
> > Invalid Syntax: "3_3" => The interpreter shows nothing, the debugger
> cannot
> > connect and the .net code hangs for a while then throws an out of memory
> > exception.
>
> Your lexer will correctly identify the first '3' as an INT. Next your
> lexer will see the '_' which it is unable to deal with. BUT since your
> WS rule says that the empty string - the non-stuff between the first '3'
> and the '_' - is legal, your lexer accepts that empty string as a WS
> token and deposits it into the HIDDEN channel. Now the lexer is still
> looking at the '_' which it is unable to deal with. BUT since your WS
> rule says that the empty string - the non-stuff between the first '3'
> and the '_' - is legal, your lexer accepts that empty string as a WS
> token and deposits it into the HIDDEN channel. Now the lexer is still
> looking at the '_' which it is unable to deal with. BUT since your WS
> rule says that the empty string - the non-stuff between the first '3'
> and the '_' - is legal, your lexer accepts that empty string as a WS
> token and deposits it into the HIDDEN channel. Now the lexer is still
> looking at the '_' .... and so nothing good results.
>
> Your .NET app runs out of memory because the infinite sequence of empty
> WS tokens appended onto the HIDDEN channel just gobbles up all memory.
>
> The debugger can not connect because the connections happens after the
> lexer has finished tokenizing the input text. Your lexer never finishes
> so the debugger won't connect. I bet if you waited long enuf you would
> eventually run out of memory in this case too.
>
> Same drill for the interpreter....
>
> >
> > I'm sure I'm doing something wrong in my grammar but don't know what.
> >
> > I've included it below. Please help me!
> >
> > Thanks,
> >
> > grammar Test;
> >
> > /*options
> > {
> > language = 'CSharp2';
> > }*/
> >
> > expression
> >     : amExpression;
> >
> > amExpression
> >     :mdExpression ((PLUS|DASH) mdExpression)*
> >     ;
> >
> > mdExpression
> >     :INT ((STAR|SLASH) INT)*
> >     ;
> >
> > DASH
> >     :'-'
> >     ;
> >
> > SLASH
> >     :'/'
> >     ;
> >
> > WS
> >     : (' '
> >     | '\t'
> >     | '\n'
> >     | '\r')*
> >     { $channel = HIDDEN; }
> >     ;
>
> the * above should really be a +
>
> be VERY careful with rules that can recognize the empty string, e.g.
> have just a * or ? operator.
>
> I have NEVER found an instance where a lexer rule that accepts nothing
> (the empty string) does anything that helps.
>
> On RARE occasions, a parser rule that accepts the empty string can be
> appropriate, but needs to be examined VERY closely.
>
> >
> > STAR
> >     : '*'
> >     ;
> >
> > PLUS
> >     : '+'
> >     ;
> >
> > fragment DIGIT
> >     : '0'..'9'
> >     ;
> >
> > INT
> >     : (DIGIT)+
> >     ;
>
> Hope this helps...
>    -jbb
>
>
>

From r66092 at freescale.com  Tue Jan 12 18:35:40 2010
From: r66092 at freescale.com (Chen Hongjun-R66092)
Date: Wed, 13 Jan 2010 10:35:40 +0800
Subject: [antlr-interest]  An error occurs in template example
Message-ID: <3A45394FD742FA419B760BB8D398F9ED011E1D2A@zch01exm26.fsl.freescale.net>

Hi,

I am new to ANTLR and am reading the book The Definitive ANTLR
Reference. When I tried the template example 'template/generator/2pass'
without any modification, I got the error below:

Exception in thread "main" java.util.NoSuchElementException: no such attribute: init in template context [jasminFile]
	at org.antlr.stringtemplate.StringTemplate.checkNullAttributeAgainstFormalArguments(StringTemplate.java:1311)
	at org.antlr.stringtemplate.StringTemplate.getAttribute(StringTemplate.java:684)
	at org.antlr.stringtemplate.language.ActionEvaluator.attribute(ActionEvaluator.java:360)
	at org.antlr.stringtemplate.language.ActionEvaluator.expr(ActionEvaluator.java:136)
	at org.antlr.stringtemplate.language.ActionEvaluator.action(ActionEvaluator.java:84)
	at org.antlr.stringtemplate.language.ASTExpr.write(ASTExpr.java:149)
	at org.antlr.stringtemplate.StringTemplate.write(StringTemplate.java:705)
	at org.antlr.stringtemplate.StringTemplate.toString(StringTemplate.java:1670)
	at org.antlr.stringtemplate.StringTemplate.toString(StringTemplate.java:1661)
	at Test.main(Test.java:45)

I'd appreciate any suggestions or ideas!

Thanks,
Hongjun
  

From parrt at cs.usfca.edu  Tue Jan 12 18:52:28 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Tue, 12 Jan 2010 18:52:28 -0800
Subject: [antlr-interest] An error occurs in template example
In-Reply-To: <3A45394FD742FA419B760BB8D398F9ED011E1D2A@zch01exm26.fsl.freescale.net>
References: <3A45394FD742FA419B760BB8D398F9ED011E1D2A@zch01exm26.fsl.freescale.net>
Message-ID: <DB97243E-6634-4E1A-9FEB-12C972A5A57F@cs.usfca.edu>

the error says you don't have an "init" parameter to the template. do you have one?
Ter
On Jan 12, 2010, at 6:35 PM, Chen Hongjun-R66092 wrote:

> Hi,
> 
> I am new to ANTLR, and am reading the book The Definitive ANTLR
> Reference. When I tried the template example 'template/generator/2pass'
> without any modification, and met an error as below:
> 
> Exception in thread "main" java.util.NoSuchElementException: no such attribute: init in template context [jasminFile]
> 	at org.antlr.stringtemplate.StringTemplate.checkNullAttributeAgainstFormalArguments(StringTemplate.java:1311)
> 	at org.antlr.stringtemplate.StringTemplate.getAttribute(StringTemplate.java:684)
> 	at org.antlr.stringtemplate.language.ActionEvaluator.attribute(ActionEvaluator.java:360)
> 	at org.antlr.stringtemplate.language.ActionEvaluator.expr(ActionEvaluator.java:136)
> 	at org.antlr.stringtemplate.language.ActionEvaluator.action(ActionEvaluator.java:84)
> 	at org.antlr.stringtemplate.language.ASTExpr.write(ASTExpr.java:149)
> 	at org.antlr.stringtemplate.StringTemplate.write(StringTemplate.java:705)
> 	at org.antlr.stringtemplate.StringTemplate.toString(StringTemplate.java:1670)
> 	at org.antlr.stringtemplate.StringTemplate.toString(StringTemplate.java:1661)
> 	at Test.main(Test.java:45)
> 
> I appreciate your any suggestions or ideas!
> 
> Thanks,
> Hongjun
> 
> 


From r66092 at freescale.com  Tue Jan 12 19:12:16 2010
From: r66092 at freescale.com (Chen Hongjun-R66092)
Date: Wed, 13 Jan 2010 11:12:16 +0800
Subject: [antlr-interest] An error occurs in template example
In-Reply-To: <DB97243E-6634-4E1A-9FEB-12C972A5A57F@cs.usfca.edu>
References: <3A45394FD742FA419B760BB8D398F9ED011E1D2A@zch01exm26.fsl.freescale.net>
	<DB97243E-6634-4E1A-9FEB-12C972A5A57F@cs.usfca.edu>
Message-ID: <3A45394FD742FA419B760BB8D398F9ED011E1D45@zch01exm26.fsl.freescale.net>

Hi Terence,

Thanks for your response. For the example 'templates/generator/2pass', I
used the following commands to try it out:

# java org.antlr.Tool *.g
# javac *.java
# java Test < input

Am I missing anything? What is the "init" parameter the template needs,
and how do I provide it?

Thanks again,
Hongjun

> -----Original Message-----
> From: Terence Parr [mailto:parrt at cs.usfca.edu] 
> Sent: Wednesday, January 13, 2010 10:52 AM
> To: Chen Hongjun-R66092
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] An error occurs in template example
> 
> the error says you don't have an "init" parameter to the 
> template. do you have one?
> Ter
> On Jan 12, 2010, at 6:35 PM, Chen Hongjun-R66092 wrote:
> 
> > Hi,
> > 
> > I am new to ANTLR, and am reading the book The Definitive ANTLR 
> > Reference. When I tried the template example 
> 'template/generator/2pass'
> > without any modification, and met an error as below:
> > 
> > Exception in thread "main" java.util.NoSuchElementException: no such attribute: init in template context [jasminFile]
> > 	at org.antlr.stringtemplate.StringTemplate.checkNullAttributeAgainstFormalArguments(StringTemplate.java:1311)
> > 	at org.antlr.stringtemplate.StringTemplate.getAttribute(StringTemplate.java:684)
> > 	at org.antlr.stringtemplate.language.ActionEvaluator.attribute(ActionEvaluator.java:360)
> > 	at org.antlr.stringtemplate.language.ActionEvaluator.expr(ActionEvaluator.java:136)
> > 	at org.antlr.stringtemplate.language.ActionEvaluator.action(ActionEvaluator.java:84)
> > 	at org.antlr.stringtemplate.language.ASTExpr.write(ASTExpr.java:149)
> > 	at org.antlr.stringtemplate.StringTemplate.write(StringTemplate.java:705)
> > 	at org.antlr.stringtemplate.StringTemplate.toString(StringTemplate.java:1670)
> > 	at org.antlr.stringtemplate.StringTemplate.toString(StringTemplate.java:1661)
> > 	at Test.main(Test.java:45)
> > 
> > I appreciate your any suggestions or ideas!
> > 
> > Thanks,
> > Hongjun
> > 
> > 
> 
> 
> 

From scott at javadude.com  Wed Jan 13 10:11:42 2010
From: scott at javadude.com (Scott Stanchfield)
Date: Wed, 13 Jan 2010 13:11:42 -0500
Subject: [antlr-interest] ANTLR 3.x Video Tutorial
Message-ID: <d19d16481001131011h6d54ec2fjb068ba9d6997e061@mail.gmail.com>

Hey all!

I've posted the first parts of my new ANTLR 3.x video tutorial (in Eclipse) at

    http://javadude.com/articles/antlr3xtut

I plan to do vids on all phases of the sample compiler. Right now it
builds a recognizer and shows examples of interpreting an expression (in
the parser grammar, using the GoF Interpreter pattern, and with an ANTLR
tree parser - a good demonstration of how much simpler the tree parser is).

I'd love to hear any comments/suggestions/errors on the tutorials.
They're in 10-30 minute chunks, so if I royally screwed something up I
can redo parts ;)

Note that I did each of these with very little rehearsal, and there
are some spots where I make a mistake and walk through correcting it.
I like doing the tuts this way as they feel more "human" and get to
show a bit more thought process.

-- Scott

----------------------------------------
Scott Stanchfield
http://javadude.com

From espina.edgar at gmail.com  Wed Jan 13 10:39:22 2010
From: espina.edgar at gmail.com (Edgar Espina)
Date: Wed, 13 Jan 2010 15:39:22 -0300
Subject: [antlr-interest] ANTLR 3.x Video Tutorial
In-Reply-To: <d19d16481001131011h6d54ec2fjb068ba9d6997e061@mail.gmail.com>
References: <d19d16481001131011h6d54ec2fjb068ba9d6997e061@mail.gmail.com>
Message-ID: <92b42db61001131039x357eb39flf462ea461a93536b@mail.gmail.com>

Hi Scott,

 All the videos are really awesome. Thank you for choosing ANTLR IDE.

Regards,

edgar

On Wed, Jan 13, 2010 at 3:11 PM, Scott Stanchfield <scott at javadude.com> wrote:

> Hey all!
>
> I've posted the first parts of my new ANTLR 3.x video tutorial (in Eclipse)
> at
>
>    http://javadude.com/articles/antlr3xtut
>
> I plan to do vids on all phases of the sample compiler. Right now it
> builds a recognizer and did examples of interpreting an expression (in
> the parser grammar, using GoF Interpreter Pattern and an ANTLR tree
> parser - good demonstration of how much simpler the tree parser is)
>
> I'd love to hear any comments/suggestions/errors on the tutorials.
> They're in 10-30 minute chunks, so if I royally screwed something up I
> can redo parts ;)
>
> Note that I did each of these with very little rehearsal, and there
> are some spots where I make a mistake and walk through correcting it.
> I like doing the tuts this way as they feel more "human" and get to
> show a bit more thought process.
>
> -- Scott
>
> ----------------------------------------
> Scott Stanchfield
> http://javadude.com
>
>


-- 
edgar

From parrt at cs.usfca.edu  Wed Jan 13 11:05:17 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Wed, 13 Jan 2010 11:05:17 -0800
Subject: [antlr-interest] ANTLR 3.x Video Tutorial
In-Reply-To: <d19d16481001131011h6d54ec2fjb068ba9d6997e061@mail.gmail.com>
References: <d19d16481001131011h6d54ec2fjb068ba9d6997e061@mail.gmail.com>
Message-ID: <741CD6EA-9CF9-47CB-BDE8-CFF3E45683D7@cs.usfca.edu>

Thanks Scott. great stuff. I took a peek.
Ter
On Jan 13, 2010, at 10:11 AM, Scott Stanchfield wrote:

> Hey all!
> 
> I've posted the first parts of my new ANTLR 3.x video tutorial (in Eclipse) at
> 
>    http://javadude.com/articles/antlr3xtut
> 
> I plan to do vids on all phases of the sample compiler. Right now it
> builds a recognizer and did examples of interpreting an expression (in
> the parser grammar, using GoF Interpreter Pattern and an ANTLR tree
> parser - good demonstration of how much simpler the tree parser is)
> 
> I'd love to hear any comments/suggestions/errors on the tutorials.
> They're in 10-30 minute chunks, so if I royally screwed something up I
> can redo parts ;)
> 
> Note that I did each of these with very little rehearsal, and there
> are some spots where I make a mistake and walk through correcting it.
> I like doing the tuts this way as they feel more "human" and get to
> show a bit more thought process.
> 
> -- Scott
> 
> ----------------------------------------
> Scott Stanchfield
> http://javadude.com
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address


From parrt at cs.usfca.edu  Wed Jan 13 11:06:52 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Wed, 13 Jan 2010 11:06:52 -0800
Subject: [antlr-interest] An error occurs in template example
In-Reply-To: <3A45394FD742FA419B760BB8D398F9ED011E1D45@zch01exm26.fsl.freescale.net>
References: <3A45394FD742FA419B760BB8D398F9ED011E1D2A@zch01exm26.fsl.freescale.net>
	<DB97243E-6634-4E1A-9FEB-12C972A5A57F@cs.usfca.edu>
	<3A45394FD742FA419B760BB8D398F9ED011E1D45@zch01exm26.fsl.freescale.net>
Message-ID: <63ACB2EF-E813-4B3F-BDE1-941BF6C77C2B@cs.usfca.edu>

Weird. And you didn't alter the software at all?
Ter
On Jan 12, 2010, at 7:12 PM, Chen Hongjun-R66092 wrote:

> Hi Terence,
> 
> Thanks for your response. For the example 'templates/generator/2pass', I
> used the following commands to try it out:
> 
> # java org.antlr.Tool *.g
> # javac *.java
> # java Test < input
> 
> Do I miss anything? What is the "init" parameter needed by template? How
> to provide this "init" parameter for template?
> 
> Thanks again,
> Hongjun
> 
>> -----Original Message-----
>> From: Terence Parr [mailto:parrt at cs.usfca.edu] 
>> Sent: Wednesday, January 13, 2010 10:52 AM
>> To: Chen Hongjun-R66092
>> Cc: antlr-interest at antlr.org
>> Subject: Re: [antlr-interest] An error occurs in template example
>> 
>> the error says you don't have an "init" parameter to the 
>> template. do you have one?
>> Ter
>> On Jan 12, 2010, at 6:35 PM, Chen Hongjun-R66092 wrote:
>> 
>>> Hi,
>>> 
>>> I am new to ANTLR, and am reading the book The Definitive ANTLR 
>>> Reference. When I tried the template example 
>> 'template/generator/2pass'
>>> without any modification, and met an error as below:
>>> 
>>> Exception in thread "main" java.util.NoSuchElementException: no such
>>> attribute: init in template context [jasminFile]
>>> 	at
>>> 
>> org.antlr.stringtemplate.StringTemplate.checkNullAttributeAgainstForma
>>> lA
>>> rguments(StringTemplate.java:1311)
>>> 	at
>>> 
>> org.antlr.stringtemplate.StringTemplate.getAttribute(StringTemplate.ja
>>> va
>>> :684)
>>> 	at
>>> 
>> org.antlr.stringtemplate.language.ActionEvaluator.attribute(ActionEval
>>> ua
>>> tor.java:360)
>>> 	at
>>> 
>> org.antlr.stringtemplate.language.ActionEvaluator.expr(ActionEvaluator
>>> .j
>>> ava:136)
>>> 	at
>>> 
>> org.antlr.stringtemplate.language.ActionEvaluator.action(ActionEvaluat
>>> or
>>> .java:84)
>>> 	at
>>> org.antlr.stringtemplate.language.ASTExpr.write(ASTExpr.java:149)
>>> 	at
>>> 
>> org.antlr.stringtemplate.StringTemplate.write(StringTemplate.java:705)
>>> 	at
>>> 
>> org.antlr.stringtemplate.StringTemplate.toString(StringTemplate.java:1
>>> 67
>>> 0)
>>> 	at
>>> 
>> org.antlr.stringtemplate.StringTemplate.toString(StringTemplate.java:1
>>> 66
>>> 1)
>>> 	at Test.main(Test.java:45)
>>> 
>>> I appreciate your any suggestions or ideas!
>>> 
>>> Thanks,
>>> Hongjun
>>> 
>>> 
>>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>>> Unsubscribe: 
>>> 
>> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>> 
>> 
>> 


From felix_do at web.de  Wed Jan 13 12:23:04 2010
From: felix_do at web.de (Felix Dorner)
Date: Wed, 13 Jan 2010 21:23:04 +0100
Subject: [antlr-interest] Building the Überjar fails
Message-ID: <4B4E2BA8.7090000@web.de>

Hi,

I do:

mvn -Dmaven.test.skip=true package assembly:assembly

which fails with the output shown below. Any help welcome.

[...]
[INFO] [antlr3:antlr {execution: default}]
ANTLR installation corrupted; cannot find ANTLR messages format file 
org/antlr/tool/templates/messages/formats/antlr.stg
[INFO] 
------------------------------------------------------------------------
[ERROR] FATAL ERROR
[INFO] 
------------------------------------------------------------------------
[INFO] ANTLR ErrorManager panic
[INFO] 
------------------------------------------------------------------------
[INFO] Trace
java.lang.Error: ANTLR ErrorManager panic
    at org.antlr.tool.ErrorManager.panic(ErrorManager.java:955)
    at org.antlr.tool.ErrorManager.setFormat(ErrorManager.java:465)
    at org.antlr.Tool.setMessageFormat(Tool.java:1222)
    at org.antlr.mojo.antlr3.Antlr3Mojo.execute(Antlr3Mojo.java:336)
    at 
org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:490)
    at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:694)
    at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalWithLifecycle(DefaultLifecycleExecutor.java:556)
    at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.forkProjectLifecycle(DefaultLifecycleExecutor.java:1205)
    at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.forkLifecycle(DefaultLifecycleExecutor.java:1033)
    at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:643)
    at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:569)
    at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:539)
    at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:387)
    at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:284)
    at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:180)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:328)
    at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:138)
    at org.apache.maven.cli.MavenCli.main(MavenCli.java:362)
    at 
org.apache.maven.cli.compat.CompatibleMain.main(CompatibleMain.java:60)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
    at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
    at org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
    at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
[INFO] 
------------------------------------------------------------------------
[INFO] Total time: 18 seconds
[INFO] Finished at: Wed Jan 13 21:20:27 CET 2010
[INFO] Final Memory: 36M/89M
[INFO] 
------------------------------------------------------------------------


From jimi at temporal-wave.com  Wed Jan 13 12:29:50 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Wed, 13 Jan 2010 12:29:50 -0800
Subject: [antlr-interest] Building the Überjar fails
In-Reply-To: <4B4E2BA8.7090000@web.de>
Message-ID: <f2c1fdef60a78f4db28351d9dbe2c866@temporal-wave.com>

It's a Maven bug. Do a clean then try again and it will eventually work. I believe that I mention this in the BUILD.txt file, which you should read to the end before trying to build.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Felix Dorner
> Sent: Wednesday, January 13, 2010 12:23 PM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Building the Überjar fails
> 
> Hi,
> 
> I do:
> 
> mvn -Dmaven.test.skip=true package assembly:assembly
> 
> which fails with the output shown below. Any help welcome.
> 
> [...]
> [INFO] [antlr3:antlr {execution: default}]
> ANTLR installation corrupted; cannot find ANTLR messages format file
> org/antlr/tool/templates/messages/formats/antlr.stg
> [INFO]
> -----------------------------------------------------------------------
> -
> [ERROR] FATAL ERROR
> [INFO]
> -----------------------------------------------------------------------
> -
> [INFO] ANTLR ErrorManager panic
> [INFO]
> -----------------------------------------------------------------------
> -
> [INFO] Trace


From r66092 at freescale.com  Wed Jan 13 17:35:27 2010
From: r66092 at freescale.com (Chen Hongjun-R66092)
Date: Thu, 14 Jan 2010 09:35:27 +0800
Subject: [antlr-interest] An error occurs in template example
In-Reply-To: <63ACB2EF-E813-4B3F-BDE1-941BF6C77C2B@cs.usfca.edu>
References: <3A45394FD742FA419B760BB8D398F9ED011E1D2A@zch01exm26.fsl.freescale.net>
	<DB97243E-6634-4E1A-9FEB-12C972A5A57F@cs.usfca.edu>
	<3A45394FD742FA419B760BB8D398F9ED011E1D45@zch01exm26.fsl.freescale.net>
	<63ACB2EF-E813-4B3F-BDE1-941BF6C77C2B@cs.usfca.edu>
Message-ID: <3A45394FD742FA419B760BB8D398F9ED011E1E91@zch01exm26.fsl.freescale.net>

No, I didn't modify anything in this example.

Best Regards,
Hongjun 

> -----Original Message-----
> From: Terence Parr [mailto:parrt at cs.usfca.edu] 
> Sent: Thursday, January 14, 2010 3:07 AM
> To: Chen Hongjun-R66092
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] An error occurs in template example
> 
> Weird. And you didn't alter the software at all?
> Ter
> On Jan 12, 2010, at 7:12 PM, Chen Hongjun-R66092 wrote:
> 
> > Hi Terence,
> > 
> > Thanks for your response. For the example 
> 'templates/generator/2pass', 
> > I used the following commands to try it out:
> > 
> > # java org.antlr.Tool *.g
> > # javac *.java
> > # java Test < input
> > 
> > Do I miss anything? What is the "init" parameter needed by 
> template? 
> > How to provide this "init" parameter for template?
> > 
> > Thanks again,
> > Hongjun
> > 
> >> -----Original Message-----
> >> From: Terence Parr [mailto:parrt at cs.usfca.edu]
> >> Sent: Wednesday, January 13, 2010 10:52 AM
> >> To: Chen Hongjun-R66092
> >> Cc: antlr-interest at antlr.org
> >> Subject: Re: [antlr-interest] An error occurs in template example
> >> 
> >> the error says you don't have an "init" parameter to the 
> template. do 
> >> you have one?
> >> Ter
> >> On Jan 12, 2010, at 6:35 PM, Chen Hongjun-R66092 wrote:
> >> 
> >>> Hi,
> >>> 
> >>> I am new to ANTLR, and am reading the book The Definitive ANTLR 
> >>> Reference. When I tried the template example
> >> 'template/generator/2pass'
> >>> without any modification, and met an error as below:
> >>> 
> >>> Exception in thread "main" 
> java.util.NoSuchElementException: no such
> >>> attribute: init in template context [jasminFile]
> >>> 	at
> >>> 
> >> 
> org.antlr.stringtemplate.StringTemplate.checkNullAttributeAgainstForm
> >> a
> >>> lA
> >>> rguments(StringTemplate.java:1311)
> >>> 	at
> >>> 
> >> 
> org.antlr.stringtemplate.StringTemplate.getAttribute(StringTemplate.j
> >> a
> >>> va
> >>> :684)
> >>> 	at
> >>> 
> >> 
> org.antlr.stringtemplate.language.ActionEvaluator.attribute(ActionEva
> >> l
> >>> ua
> >>> tor.java:360)
> >>> 	at
> >>> 
> >> 
> org.antlr.stringtemplate.language.ActionEvaluator.expr(ActionEvaluato
> >> r
> >>> .j
> >>> ava:136)
> >>> 	at
> >>> 
> >> 
> org.antlr.stringtemplate.language.ActionEvaluator.action(ActionEvalua
> >> t
> >>> or
> >>> .java:84)
> >>> 	at
> >>> org.antlr.stringtemplate.language.ASTExpr.write(ASTExpr.java:149)
> >>> 	at
> >>> 
> >> 
> org.antlr.stringtemplate.StringTemplate.write(StringTemplate.java:705
> >> )
> >>> 	at
> >>> 
> >> 
> org.antlr.stringtemplate.StringTemplate.toString(StringTemplate.java:
> >> 1
> >>> 67
> >>> 0)
> >>> 	at
> >>> 
> >> 
> org.antlr.stringtemplate.StringTemplate.toString(StringTemplate.java:
> >> 1
> >>> 66
> >>> 1)
> >>> 	at Test.main(Test.java:45)
> >>> 
> >>> I appreciate your any suggestions or ideas!
> >>> 
> >>> Thanks,
> >>> Hongjun
> >>> 
> >>> 
> >>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> >>> Unsubscribe: 
> >>> 
> >> 
> http://www.antlr.org/mailman/options/antlr-interest/your-email-addres
> >> s
> >> 
> >> 
> >> 
> 
> 
> 

From lord.of.board at gmx.de  Thu Jan 14 01:10:24 2010
From: lord.of.board at gmx.de (lord.of.board at gmx.de)
Date: Thu, 14 Jan 2010 10:10:24 +0100
Subject: [antlr-interest] parsing boolean expressions: not not or abc
Message-ID: <20100114091024.175140@gmx.net>

Hello,

I am trying to build a grammar that accepts boolean expressions for filtering. I found some interesting articles on the web, but now I am stuck.
I am trying to parse something like this:

  not not or abc

The first "not" is the boolean operator and the second is a text term.

Or, even worse:

  not not and not or and not and

My grammar looks like this:

grammar TextFilterGrammar;
options {
	output=AST;
}
content :	orexpression
	;
orexpression 
	:	andexpression (OR^ andexpression)*
	;
andexpression 
	:	expression (AND^ expression)*
	;
expression 
	:	(NOT^)? term
	;
term 	:	WORD
	;

NOT 	:	'not'
	;
AND 	:	'and'
	;
OR 	:	'or'
	;
WORD	:	('a'..'z' | '0'..'9' | '%' | '_')+
	;
WS 	:	(' ' | '\r' | '\n' | '\t')  { skip(); }
	;

In ANTLRWorks I always get a MismatchedTokenException when trying to parse "not not or ljsdf". Parsing e.g. "not noti or ljsdf" works fine.

I managed to get it working with quotation marks, but I would prefer to have a solution without.

Best regards,
Lordi


From Heiko.Folkerts at david-bs.de  Thu Jan 14 05:00:59 2010
From: Heiko.Folkerts at david-bs.de (Heiko Folkerts)
Date: Thu, 14 Jan 2010 14:00:59 +0100
Subject: [antlr-interest] Tree pattern maching using the C target
Message-ID: <93FCBF72DCE7634481C5DF1654D8FF13035A8401@DC2>

Hi all,
I wrote a little tree pattern matcher for a specific validation we need in our grammar. ANTLR and the C compiler both compile it without problems, but there is no "downup" method for running the matcher. Instead I only see our own rules in the generated parser. So, is the method for running a tree pattern matcher in the C target something other than "downup"? How do I run the matcher?

I tried to find an answer in the C examples, but there was only a tree parser and no tree pattern matcher.

Thx+
Heiko


Kind regards,
Heiko Folkerts
System Development and Design
--
______________________________________________
DAVID GmbH · Wendenring 1 · 38114 Braunschweig
Tel.: +49 531 24379-14
Fax.: +49 531 24379-79
E-Mail: mailto:Heiko.Folkerts at david-bs.de
WWW:   http://www.david-bs.de
Registered at: Amtsgericht Braunschweig, HRB 3167
Managing Director: Frank Ptok
______________________________________________

 
From cummings at kjchome.homeip.net  Thu Jan 14 08:20:28 2010
From: cummings at kjchome.homeip.net (Kevin J. Cummings)
Date: Thu, 14 Jan 2010 11:20:28 -0500
Subject: [antlr-interest] parsing boolean expressions: not not or abc
In-Reply-To: <20100114091024.175140@gmx.net>
References: <20100114091024.175140@gmx.net>
Message-ID: <4B4F444C.10103@kjchome.homeip.net>

On 01/14/2010 04:10 AM, lord.of.board at gmx.de wrote:
> Hello,
> 
> I am trying to build a grammar which accepts boolean expressions for filtering. I found some interesting articles on the web, but now I got stuck.
> I try to parse something like this:
> 
>   not not or abc
> 
> The first "not" is the boolean operator and the second is a text.

NOT term OR term

> Or even worse
> 
>   not not and not or and not and

Gawk!  NOT term AND NOT term AND NOT term ????  It took me a couple of
seconds to figure out how this would be legal!  B^)

The parser is *definitely* going to need help figuring out when "not" is
a NOT and when it is a term!

> My grammar look like this:
> 
> grammar TextFilterGrammar;
> options {
> 	output=AST;
> }
> content :	orexpression
> 	;
> orexpression 
> 	:	andexpression (OR^ andexpression)*
> 	;
> andexpression 
> 	:	expression (AND^ expression)*
> 	;
> expression 
> 	:	(NOT^)? term
> 	;
> term 	:	WORD
> 	;
> 
> NOT 	:	'not'
> 	;
> AND 	:	'and'
> 	;
> OR 	:	'or'
> 	;

So, NOT, AND, and OR are reserved words in your grammar.

> WORD	:	('a'..'z' | '0'..'9' | '%' | '_')+
> 	;
> WS 	:	(' ' | '\r' | '\n' | '\t')  { skip(); }
> 	;
> 
> In ANTLRWorks I always get a MismatchedTokenException when trying to parse "not not or ljsdf". Parsing e.g. "not noti or ljsdf" works fine.
> 
> I managed to get it working with quotation marks, but I would prefer to have a solution without.

"not" will always match your TOKEN named NOT.  It will never be a WORD.
 If you wish to allow it as a term, you might want to change your term
production to be:

term : WORD | NOT | AND | OR
     ;

This should effectively allow "not", "and", and "or" to be keywords
instead of reserved words.
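
To make the lexer behaviour above concrete, here is a minimal plain-Java sketch (not ANTLR-generated code) that classifies whitespace-separated chunks the way the NOT/AND/OR/WORD rules would. The class and method names are hypothetical, and splitting on whitespace is a simplifying assumption; a real ANTLR lexer works character by character.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not part of ANTLR or of the posted grammar.
class KeywordLexerSketch {

    /** Returns the token type that the posted lexer rules would assign to one chunk. */
    static String tokenType(String text) {
        switch (text) {
            case "not": return "NOT";   // exact keyword text always becomes the keyword token
            case "and": return "AND";
            case "or":  return "OR";
            default:
                // Mirrors WORD : ('a'..'z' | '0'..'9' | '%' | '_')+
                if (text.matches("[a-z0-9%_]+")) {
                    return "WORD";
                }
                throw new IllegalArgumentException("no lexer rule matches: " + text);
        }
    }

    /** Assumption: input is split on whitespace, standing in for the skipped WS rule. */
    static List<String> tokenTypes(String input) {
        List<String> types = new ArrayList<>();
        for (String chunk : input.trim().split("\\s+")) {
            if (!chunk.isEmpty()) {
                types.add(tokenType(chunk));
            }
        }
        return types;
    }
}
```

With this classification, "not not or ljsdf" reaches the parser as NOT NOT OR WORD, so a rule that expects a WORD after the first NOT cannot match, whereas "not noti or ljsdf" arrives as NOT WORD OR WORD.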

But then, how do you want the parser to handle the sequence "not not"?
Is that a NOT WORD or NOT NOT?  Given that you are only allowing one
optional NOT in your expression production, adding the operators to your
term production should work.  But, you'll be in a world of hurt if you
change (NOT)? term to (NOT)* term, as then there is no way to know if a
following "not" is a term or a NOT....  [gawk! the puns are getting bad!]

You may need to add a syntactic predicate to your grammar around the NOT
stuff:

expression : (NOT term)=> (NOT^) term
           | term
           ;

should help you out here....

> Best regards,
> Lordi

-- 
Kevin J. Cummings
kjchome at rcn.com
cummings at kjchome.homeip.net
cummings at kjc386.framingham.ma.us
Registered Linux User #1232 (http://counter.li.org)

From lord.of.board at gmx.de  Thu Jan 14 08:24:19 2010
From: lord.of.board at gmx.de (lord.of.board at gmx.de)
Date: Thu, 14 Jan 2010 17:24:19 +0100
Subject: [antlr-interest] parsing boolean expressions: not not or abc
Message-ID: <20100114162419.142900@gmx.net>

I received an email describing a working solution. See below:

-----------

Looks like it should work if you change term  to

  term    :       WORD | NOT | AND | OR ;

I tried it with a few examples and it seems to do ok.

(I added in ASTLabelType=CommonTree; in the options and printed
toStringTree() on the resulting tree and things look good.)

The problem is going to be showing syntax errors - because the
keywords are non-reserved it's more likely that something the user
didn't intend will actually parse.

If you can't stop people from using the keywords as terms, you should
at least discourage it.

Remember PL/I :
   IF IF = THEN THEN THEN = ELSE ELSE ELSE = IF;

Sigh... You can do it, but no one really did (if I recall my PL/I
syntax... been a while) and it was highly discouraged.

Good luck!
-- Scott

> Hello,
>
> I am trying to build a grammar which accepts boolean expressions for filtering. I found some interesting articles on the web, but now I got stuck.
> I try to parse something like this:
>
>   not not or abc
>
> The first "not" is the boolean operator and the second is a text.
>
> Or even worse
>
>   not not and not or and not and
>
> My grammar look like this:
>
> grammar TextFilterGrammar;
> options {
> 	output=AST;
> }
> content :	orexpression
> 	;
> orexpression
> 	:	andexpression (OR^ andexpression)*
> 	;
> andexpression
> 	:	expression (AND^ expression)*
> 	;
> expression
> 	:	(NOT^)? term
> 	;
> term 	:	WORD
> 	;
>
> NOT 	:	'not'
> 	;
> AND 	:	'and'
> 	;
> OR 	:	'or'
> 	;
> WORD	:	('a'..'z' | '0'..'9' | '%' | '_')+
> 	;
> WS 	:	(' ' | '\r' | '\n' | '\t')  { skip(); }
> 	;
>
> In ANTLRWorks I always get a MismatchedTokenException when trying to parse "not not or ljsdf". Parsing e.g. "not noti or ljsdf" works fine.
>
> I managed to get it working with quotation marks, but I would prefer to have a solution without.
>


From jimi at temporal-wave.com  Thu Jan 14 08:59:57 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Thu, 14 Jan 2010 08:59:57 -0800
Subject: [antlr-interest] parsing boolean expressions: not not or abc
In-Reply-To: <20100114091024.175140@gmx.net>
Message-ID: <efb7f6c73c1f9c449d3c84de3349f858@temporal-wave.com>

Change your grammar to:

grammar T;
options {
	output=AST;
}
tokens {
	EXPR;
}

content :	orexpression EOF
		->^(EXPR orexpression)
	;
	
orexpression 
	:	andexpression (OR^ andexpression)*
	;
andexpression 
	:	expression (AND^ expression)*
	;
expression 
	:	(NOT^)? term
	;
term 	: (
		  t=WORD
		| t=AND
		| t=OR
		| t=NOT
	  )
	  {
	  	$t.setType(WORD);
	  }
	;

NOT 	:	'not'
	;
AND 	:	'and'
	;
OR 	:	'or'
	;
WORD	:	('a'..'z' | '0'..'9' | '%' | '_')+
	;
WS 	:	(' ' | '\r' | '\n' | '\t')  { skip(); }
	;


However, note that the grammar has to make some assumptions here, such as that the word 'not' on its own is a term and not (pun not intended) a syntax error in which 'not' is the operator and a term is expected after it.

Also, I suspect that your NOT-processing rule should actually be:

expression 
	:	NOT^ expression
	|	term
	;

But this would consume "not not not" as a repeated NOT, as in NOT NOT WORD.

If the expression rule gets more complicated, then ANTLR may not be able to predict properly.
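
As a rough illustration of the assumptions mentioned above, here is a hand-written recursive-descent sketch in plain Java (not ANTLR output) of the orexpression/andexpression/expression/term rules with keywords allowed as terms. The class name, the whitespace tokenizer, and the disambiguation rule (a leading "not" is the operator whenever another token follows it) are all assumptions made for this sketch; ANTLR's own prediction may decide differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; mirrors the shape of the grammar, not the generated parser.
class KeywordFilterSketch {
    private final List<String> tokens;
    private int pos = 0;

    private KeywordFilterSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    /** Parses the input and returns a LISP-style tree string, e.g. "(or (not a) b)". */
    static String parse(String input) {
        List<String> toks = new ArrayList<>();
        for (String t : input.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                toks.add(t);
            }
        }
        KeywordFilterSketch p = new KeywordFilterSketch(toks);
        String tree = p.orExpression();
        // Equivalent of matching EOF: leftover tokens are an error.
        if (p.pos < p.tokens.size()) {
            throw new IllegalArgumentException("unexpected token: " + p.tokens.get(p.pos));
        }
        return tree;
    }

    private boolean peekIs(String text) {
        return pos < tokens.size() && tokens.get(pos).equals(text);
    }

    // orexpression : andexpression (OR^ andexpression)*
    private String orExpression() {
        String left = andExpression();
        while (peekIs("or")) {
            pos++;
            left = "(or " + left + " " + andExpression() + ")";
        }
        return left;
    }

    // andexpression : expression (AND^ expression)*
    private String andExpression() {
        String left = expression();
        while (peekIs("and")) {
            pos++;
            left = "(and " + left + " " + expression() + ")";
        }
        return left;
    }

    // expression : (NOT^)? term
    // Assumption: "not" is the operator only if some token follows it.
    private String expression() {
        if (peekIs("not") && pos + 1 < tokens.size()) {
            pos++;
            return "(not " + term() + ")";
        }
        return term();
    }

    // term : WORD | AND | OR | NOT   (any token text is accepted as a term)
    private String term() {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("expected a term at end of input");
        }
        return tokens.get(pos++);
    }
}
```

Under these assumptions "not not or abc" becomes (or (not not) abc), while a lone "not" is taken as a term. The same greedy choice rejects "not or abc", which could also be read as the term "not" OR "abc"; that is the kind of ambiguity non-reserved keywords introduce.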

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of lord.of.board at gmx.de
> Sent: Thursday, January 14, 2010 1:10 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] parsing boolean expressions: not not or abc
> 
> Hello,
> 
> I am trying to build a grammar which accepts boolean expressions for
> filtering. I found some interesting articles on the web, but now I got
> stuck.
> I try to parse something like this:
> 
>   not not or abc
> 
> The first "not" is the boolean operator and the second is a text.
> 
> Or even worse
> 
>   not not and not or and not and
> 
> My grammar look like this:
> 
> grammar TextFilterGrammar;
> options {
> 	output=AST;
> }
> content :	orexpression
> 	;
> orexpression
> 	:	andexpression (OR^ andexpression)*
> 	;
> andexpression
> 	:	expression (AND^ expression)*
> 	;
> expression
> 	:	(NOT^)? term
> 	;
> term 	:	WORD
> 	;
> 
> NOT 	:	'not'
> 	;
> AND 	:	'and'
> 	;
> OR 	:	'or'
> 	;
> WORD	:	('a'..'z' | '0'..'9' | '%' | '_')+
> 	;
> WS 	:	(' ' | '\r' | '\n' | '\t')  { skip(); }
> 	;
> 
> In ANTLRWorks I always get a MismatchedTokenException when trying to
> parse "not not or ljsdf". Parsing e.g. "not noti or ljsdf" works fine.
> 
> I managed to get it working with quotation marks, but I would prefer to
> have a solution without.
> 
> Best regards,
> Lordi
> 
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
> email-address


From jimi at temporal-wave.com  Thu Jan 14 09:02:02 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Thu, 14 Jan 2010 09:02:02 -0800
Subject: [antlr-interest] Tree pattern maching using the C target
In-Reply-To: <93FCBF72DCE7634481C5DF1654D8FF13035A8401@DC2>
Message-ID: <b2b1b6414b6c7547bf87a8ed9e4727a8@temporal-wave.com>

Pattern matcher or normal tree walker? The pattern stuff is not implemented in the C target yet. 

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Heiko Folkerts
> Sent: Thursday, January 14, 2010 5:01 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Tree pattern maching using the C target
> 
> Hi all,
> I wrote al litle tree pattern matcher for a specific validation we need
> in our grammar. ANTLR and the C compiler compile it all well but there
> is now "downup" mehtod for running the matcher. Instead I only see our
> own rules in the generated parser. So, is the method to run when using
> a tree pattern macher in the C target different than ^"downup"? How to
> run the matcher?
> 
> I tried to find an answer in the C examples but there was only a
> treeparser and no tree pattern matcher.
> 
> Thx+
> Heiko
> 
> 
> Kind regards,
> Heiko Folkerts
> System Development and Design
> --
> ______________________________________________
> DAVID GmbH · Wendenring 1 · 38114 Braunschweig
> Tel.: +49 531 24379-14
> Fax.: +49 531 24379-79
> E-Mail: mailto:Heiko.Folkerts at david-bs.de
> WWW:   http://www.david-bs.de
> Registered at: Amtsgericht Braunschweig, HRB 3167
> Managing Director: Frank Ptok
> ______________________________________________
> 
> 
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
> email-address


From scott at javadude.com  Thu Jan 14 09:47:55 2010
From: scott at javadude.com (Scott Stanchfield)
Date: Thu, 14 Jan 2010 12:47:55 -0500
Subject: [antlr-interest] parsing boolean expressions: not not or abc
In-Reply-To: <efb7f6c73c1f9c449d3c84de3349f858@temporal-wave.com>
References: <20100114091024.175140@gmx.net>
	<efb7f6c73c1f9c449d3c84de3349f858@temporal-wave.com>
Message-ID: <d19d16481001140947w234e7e9cq2f05513c27262909@mail.gmail.com>

Good catch on changing the type of the token; I had forgotten to do
that on the note I sent...
-- Scott

----------------------------------------
Scott Stanchfield
http://javadude.com


On Thu, Jan 14, 2010 at 11:59 AM, Jim Idle <jimi at temporal-wave.com> wrote:
> Change your grammar to:
>
> grammar T;
> options {
> 	output=AST;
> }
> tokens {
> 	EXPR;
> }
>
> content :	orexpression EOF
> 		->^(EXPR orexpression)
> 	;
>
> orexpression
> 	:	andexpression (OR^ andexpression)*
> 	;
> andexpression
> 	:	expression (AND^ expression)*
> 	;
> expression
> 	:	(NOT^)? term
> 	;
> term 	: (
> 		  t=WORD
> 		| t=AND
> 		| t=OR
> 		| t=NOT
> 	  )
> 	  {
> 	  	$t.setType(WORD);
> 	  }
> 	;
>
> NOT 	:	'not'
> 	;
> AND 	:	'and'
> 	;
> OR 	:	'or'
> 	;
> WORD	:	('a'..'z' | '0'..'9' | '%' | '_')+
> 	;
> WS 	:	(' ' | '\r' | '\n' | '\t')  { skip(); }
>
>
> However note that the grammar has to make some assumptions here such as the word 'not' on its own is a term and not (pun not intended) a syntax error where the not is the operator and should expect a term.
>
> Also I suspect that your not processing rule should actually be:
>
> expression
> 	:	NOT^ expression
> 	|	term
> 	;
>
> But this would eat not not not as a repeated not as in NOT NOT WORD
>
> If the expression rule gets more complicated then ANTLR may not be able to predict properly.
>
> Jim
>
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>> bounces at antlr.org] On Behalf Of lord.of.board at gmx.de
>> Sent: Thursday, January 14, 2010 1:10 AM
>> To: antlr-interest at antlr.org
>> Subject: [antlr-interest] parsing boolean expressions: not not or abc
>>
>> Hello,
>>
>> I am trying to build a grammar which accepts boolean expressions for
>> filtering. I found some interesting articles on the web, but now I got
>> stuck.
>> I try to parse something like this:
>>
>>   not not or abc
>>
>> The first "not" is the boolean operator and the second is a text.
>>
>> Or even worse
>>
>>   not not and not or and not and
>>
>> My grammar look like this:
>>
>> grammar TextFilterGrammar;
>> options {
>> 	output=AST;
>> }
>> content :	orexpression
>> 	;
>> orexpression
>> 	:	andexpression (OR^ andexpression)*
>> 	;
>> andexpression
>> 	:	expression (AND^ expression)*
>> 	;
>> expression
>> 	:	(NOT^)? term
>> 	;
>> term 	:	WORD
>> 	;
>>
>> NOT 	:	'not'
>> 	;
>> AND 	:	'and'
>> 	;
>> OR 	:	'or'
>> 	;
>> WORD	:	('a'..'z' | '0'..'9' | '%' | '_')+
>> 	;
>> WS 	:	(' ' | '\r' | '\n' | '\t')  { skip(); }
>> 	;
>>
>> In ANTLRWorks I always get a MismatchedTokenException when trying to
>> parse "not not or ljsdf". Parsing e.g. "not noti or ljsdf" works fine.
>>
>> I managed to get it working with quotation marks, but I would prefer to
>> have a solution without.
>>
>> Best regards,
>> Lordi
>>
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
>> email-address
>
>
>
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>

From parrt at cs.usfca.edu  Thu Jan 14 16:47:05 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Thu, 14 Jan 2010 16:47:05 -0800
Subject: [antlr-interest] ANTLRWorks plugin for intellij
Message-ID: <F59F53CF-882A-4F74-AB3E-52283DF444A5@cs.usfca.edu>

Hi. How many people use AW as a plugin *inside* IntelliJ?  It really complicates the code and I'm thinking of dumping it; that might make things easier for Eclipse plugins too if they're not worried about IntelliJ plugin code intermingled in AW.

just getting an idea of how many people use it that way.  It's not the best integration with intellij so I use AW standalone personally.

Thanks,
Ter

From bkiers at gmail.com  Thu Jan 14 23:28:20 2010
From: bkiers at gmail.com (Bart Kiers)
Date: Fri, 15 Jan 2010 08:28:20 +0100
Subject: [antlr-interest] ANTLRWorks plugin for intellij
In-Reply-To: <F59F53CF-882A-4F74-AB3E-52283DF444A5@cs.usfca.edu>
References: <F59F53CF-882A-4F74-AB3E-52283DF444A5@cs.usfca.edu>
Message-ID: <af443a971001142328u39dd133bvcbb9afb1d3eaa973@mail.gmail.com>

I first used the plug-in with IntelliJ (v9), but found it a bit buggy: quite
a few error messages (sorry, I can't be more concrete...). I use the standalone
ANTLRWorks (much to my liking!).

Regards,

Bart.


On Fri, Jan 15, 2010 at 1:47 AM, Terence Parr <parrt at cs.usfca.edu> wrote:

> hi. how many people use AW as a plugin *inside* intellij?  it really
> complicates the code and I'm thinking of dumping it; might make it easier
> for eclipse plugins too if they're not worried about intellij plugin code
> intermingled in AW.
>
> just getting an idea of how many people use it that way.  It's not the best
> integration with intellij so I use AW standalone personally.
>
> Thanks,
> Ter
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe:
> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>

From arne.schroeder at gmail.com  Fri Jan 15 00:57:24 2010
From: arne.schroeder at gmail.com (Arne Schröder)
Date: Fri, 15 Jan 2010 09:57:24 +0100
Subject: [antlr-interest] Missing error when tokens are left to parse
Message-ID: <d972facc1001150057q23140056s6453d145a0763817@mail.gmail.com>

Hello,

I am trying to write a parser for an initialization file. This file is
divided into sections which are not enclosed in any delimiters but start with a keyword.

Unfortunately the parser stops when encountering a problem and just ends the
parsing process, without even reporting an error.

To demonstrate the problem I wrote the following example grammar:

file    : section1 section2?
        ;

section1: 'Section1'
        ;

section2: 'Section2'
        ;

ID      : ('a'..'z'|'A'..'Z')+
        ;

SPACE   : ' ' {$channel = HIDDEN;}
        ;

Now using the input "Section1 bla Section2", I would expect the parser to
stop at "bla", throw an UnwantedTokenException, do a SingleTokenDeletion as
error-recovery and just continue parsing "Section2".
What happens is that it stops at "bla", does not recognize it as section2
and just terminates, leaving the two tokens unparsed and not reporting any
error.

So my question is: How can I avoid my parser doing stuff like that without
changing my files' syntax?


Thanks in advance

Arne

From arne.schroeder at gmail.com  Fri Jan 15 01:43:28 2010
From: arne.schroeder at gmail.com (=?ISO-8859-1?Q?Arne_Schr=F6der?=)
Date: Fri, 15 Jan 2010 10:43:28 +0100
Subject: [antlr-interest] [il-antlr-interest: 27542] Missing error when
	tokens are left to parse
In-Reply-To: <1ec078df1001150127r753cb368p3e70c1039d59101d@mail.gmail.com>
References: <d972facc1001150057q23140056s6453d145a0763817@mail.gmail.com> 
	<1ec078df1001150127r753cb368p3e70c1039d59101d@mail.gmail.com>
Message-ID: <d972facc1001150143m51bb493fi6c8ff8a58fe745fa@mail.gmail.com>

Thank you for your quick help. It might work in that case but does not help
me with my real problem. So I will alter the example to have it closer to my
real problem:

file    : section1 section2?
        ;

section1: 'Section1' method*
        ;

section2: 'Section2' method*
        ;

method  : ID LPARENT RPARENT
        ;

ID      : ('a'..'z'|'A'..'Z')+
        ;

LPARENT : '(' ;
RPARENT : ')' ;

SPACE   : ' ' {$channel = HIDDEN;}
        ;

If I now try to parse "Section1 bla()) Section2" something similar happens:
it parses up to the second ")" and then decides to skip the rest. I
definitely do not want the second ")" to be there, i.e. I want the parser to
throw a recognition error and recover.

On Fri, Jan 15, 2010 at 10:27 AM, Akira Akira <akira.lists.1948 at gmail.com>wrote:

> I am not sure if this is what you want, but what about changing to
> something like the following? (the parts I added are in bold)
>
>
> file    : section1 section2?
>        ;
>
> section1: 'Section1' *CONTENTS*
>        ;
>
> section2: 'Section2' *CONTENTS*
>
>        ;
>
> ID      : ('a'..'z'|'A'..'Z')+
>        ;
>
> *CONTENTS      : ('a'..'z'|'A'..'Z')*
>        ;*
>
> SPACE   : ' ' {$channel = HIDDEN;}
>        ;
>
>
>
> 2010/1/15 Arne Schröder <arne.schroeder at gmail.com>
>
>> Hello,
>>
>> I am trying to write a parser for an initialization-file. This file is
>> devided in sections which are not embraced but have a keyword to start
>> them.
>>
>> Unfortunately the parser stops when encountering a problem and just ends
>> the
>> parsing-process, not even reporting an error.
>>
>> For demostration of the problem I wrote the following example-grammar:
>>
>> file    : section1 section2?
>>        ;
>>
>> section1: 'Section1'
>>        ;
>>
>> section2: 'Section2'
>>        ;
>>
>> ID      : ('a'..'z'|'A'..'Z')+
>>        ;
>>
>> SPACE   : ' ' {$channel = HIDDEN;}
>>        ;
>>
>> Now using the input "Section1 bla Section2", I would expect the parser to
>> stop at "bla", throw an UnwantedTokenException, do a SingleTokenDeletion
>> as
>> error-recovery and just continue parsing "Section2".
>> What happens is that it stops at "bla", does not recognize it as section2
>> and just terminates, leaving the two tokens unparsed and not reporting any
>> error.
>>
>> So my question is: How can I avoid my parser doing stuff like that without
>> changing my files' syntax?
>>
>>
>> Thanks in advance
>>
>> Arne
>>

From antlr at mirality.co.nz  Fri Jan 15 03:10:05 2010
From: antlr at mirality.co.nz (Gavin Lambert)
Date: Sat, 16 Jan 2010 00:10:05 +1300
Subject: [antlr-interest] Missing error when tokens are left to  parse
In-Reply-To: <d972facc1001150143m51bb493fi6c8ff8a58fe745fa@mail.gmail.co
 m>
References: <d972facc1001150057q23140056s6453d145a0763817@mail.gmail.com>
	<1ec078df1001150127r753cb368p3e70c1039d59101d@mail.gmail.com>
	<d972facc1001150143m51bb493fi6c8ff8a58fe745fa@mail.gmail.com>
Message-ID: <20100115111014.230933418446@www.antlr.org>

At 22:43 15/01/2010, Arne Schröder wrote:
 >file    : section1 section2?
 >        ;
[...]
 >If I now try to parse "Section1 bla()) Section2" something similar
 >happens:
 >It parses up to the second ")" and then decides to skip the rest.
 >And I definitely do not want the second ")" to be there i.e. want
 >it to throw a recognition-error and recover itself.

Try adding EOF to the end of your top-level rule.  Without that, ANTLR
assumes that it is not required to parse all the input, so if it
successfully parses a section1 it will just decide that the section2
has been omitted (since it's optional).
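
The effect of the missing EOF can be seen in a hand-written
recursive-descent sketch of the first example grammar. This is not the
code ANTLR generates: SectionParser, parseFile and the requireEof flag
are names made up for the illustration, tokens are simply split on
whitespace, and no error recovery is attempted.

```java
import java.util.Arrays;
import java.util.List;

// Hand-written stand-in for
//   file : section1 section2? ;        (requireEof == false)
//   file : section1 section2? EOF ;    (requireEof == true)
// Hypothetical sketch, not ANTLR output.
class SectionParser {
    private final List<String> tokens;
    private int pos = 0;   // index of the next unconsumed token

    SectionParser(String input) {
        this.tokens = Arrays.asList(input.trim().split("\\s+"));
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<EOF>";
    }

    private void match(String expected) {
        if (!peek().equals(expected)) {
            throw new IllegalStateException(
                "expected " + expected + " but found " + peek());
        }
        pos++;
    }

    // Parses the start rule and returns the number of tokens consumed.
    int parseFile(boolean requireEof) {
        match("Section1");
        if (peek().equals("Section2")) {   // section2 is optional
            match("Section2");
        }
        // Without this check the rule simply returns and any
        // remaining tokens are never looked at.
        if (requireEof && pos < tokens.size()) {
            throw new IllegalStateException(
                "expected <EOF> but found " + peek());
        }
        return pos;
    }

    public static void main(String[] args) {
        String input = "Section1 bla Section2";
        // No EOF: one token consumed, no error reported.
        System.out.println(new SectionParser(input).parseFile(false));
        // With EOF: the stray token is reported.
        try {
            new SectionParser(input).parseFile(true);
        } catch (IllegalStateException e) {
            System.out.println("error: " + e.getMessage());
        }
    }
}
```

With the check in place the parser has a reason to look at "bla", which
is the point at which a generated parser can report and recover from
the error.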


From m.y.speyer at inter.nl.net  Fri Jan 15 03:57:12 2010
From: m.y.speyer at inter.nl.net (Marc Speyer)
Date: Fri, 15 Jan 2010 12:57:12 +0100
Subject: [antlr-interest]  Tree pattern maching using the C target
Message-ID: <000901ca95d9$df6dd740$9e4985c0$@y.speyer@inter.nl.net>

Hi all,

I have a similar issue using the C# target. Using the Cymbol.g example of
pattern 17 (Symbol Table for Nested Scopes) from the Language Implementation
Patterns book, I could not get it to work because there is no downup method.
According to the documentation, this method walks the AST using ANTLR's
built-in downup( ) strategy.

Am I correct in assuming that this has not been implemented yet for the C#
target (as Jim implies in his response)? Is it difficult to implement it
myself? I guess it would involve implementing the tree pattern matching
stuff.

Marc
P.S. Hope this email files under the proper subject thread, and apologies in
advance if it isn't (Just subscribed to the mailing list but I could not
find out how to get previous posts from it)
 
> Pattern matcher or normal tree walker? The pattern stuff is not
> implemented in the C target yet.
>
> Jim
>
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>> bounces at antlr.org] On Behalf Of Heiko Folkerts
>> Sent: Thursday, January 14, 2010 5:01 AM
>> To: antlr-interest at antlr.org
>> Subject: [antlr-interest] Tree pattern maching using the C target
>> 
>> Hi all,
>> I wrote a little tree pattern matcher for a specific validation we need
>> in our grammar. ANTLR and the C compiler compile it all well but there
>> is no "downup" method for running the matcher. Instead I only see our
>> own rules in the generated parser. So, is the method to run when using
>> a tree pattern matcher in the C target different from "downup"? How to
>> run the matcher?
>> 
>> I tried to find an answer in the C examples but there was only a
>> treeparser and no tree pattern matcher.
>> 
>> Thx+
>> Heiko
>> 
>> 
>> --


From JALuber at gmx.de  Fri Jan 15 04:58:33 2010
From: JALuber at gmx.de (Johannes Luber)
Date: Fri, 15 Jan 2010 13:58:33 +0100
Subject: [antlr-interest] Tree pattern maching using the C target
In-Reply-To: <000901ca95d9$df6dd740$9e4985c0$@y.speyer@inter.nl.net>
References: <000901ca95d9$df6dd740$9e4985c0$@y.speyer@inter.nl.net>
Message-ID: <20100115125833.242280@gmx.net>

> Hi all,
> 
> I have a similar issue using the C# target. Using the Cymbol.g example of
> pattern 17 Symbol Table for Nested Scopes of the Language Implementation
> Patterns book I could not get it to work because there is no downup
> method.
> According to the documentation this method walks the AST code using
> ANTLR's
> built-in downup( ) strategy. 
> 
> Am I correct assuming that this has not been implemented yet for the C#
> target (as Jim implies in his response). Is it difficult to implement it
> myself? I guess it would involve implementing the tree pattern matching
> stuff.
> 
> Marc

You are correct - there is no official version yet that implements tree pattern matching. I haven't gotten around to the API changes yet (will work on that next week), though I have checked in some untested changes. It would be easiest to base your own code on those for now.

Johannes

> P.S. Hope this email files under the proper subject thread, and apologies
> in
> advance if it isn't (Just subscribed to the mailing list but I could not
> find out how to get previous posts from it)
>  
> > Pattern matcher or normal tree walker? The pattern stuff is not
> implemented in the C target yet. 
> >
> > Jim
> >
> >> -----Original Message-----
> >> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> >> bounces at antlr.org] On Behalf Of Heiko Folkerts
> >> Sent: Thursday, January 14, 2010 5:01 AM
> >> To: antlr-interest at antlr.org
> >> Subject: [antlr-interest] Tree pattern maching using the C target
> >> 
> >> Hi all,
> >> I wrote a little tree pattern matcher for a specific validation we need
> >> in our grammar. ANTLR and the C compiler compile it all well but there
> >> is no "downup" method for running the matcher. Instead I only see our
> >> own rules in the generated parser. So, is the method to run when using
> >> a tree pattern matcher in the C target different from "downup"? How to
> >> run the matcher?
> >> 
> >> I tried to find an answer in the C examples but there was only a
> >> treeparser and no tree pattern matcher.
> >> 
> >> Thx+
> >> Heiko
> >> 
> >> 
> >> --
> 
> 
> 

From frogery at voila.fr  Fri Jan 15 05:53:25 2010
From: frogery at voila.fr (frogery at voila.fr)
Date: Fri, 15 Jan 2010 14:53:25 +0100 (CET)
Subject: [antlr-interest] Overriding the emit function to use custom tokens
Message-ID: <17030639.865681263563605671.JavaMail.www@wwinf4613>

Hello,

I wanted to create a custom token object, so I have seen in the FAQ that I had to "override" the lexer emit function. So I did that this way: 

...
        pLexer = antlrLexerNew(pInput);
        pLexer->pLexer->emit = customEmit;
...

but it was not working.

The customEmit function was never called. So I have debugged and I think there is a bug in antlr3lexer.c. In the nextTokenStr function, shouldn't "emit(lexer)" be replaced by "lexer->emit(lexer);"? What do you think?

Thanks,
Yann


From Gordon.Tyler at quest.com  Fri Jan 15 06:48:49 2010
From: Gordon.Tyler at quest.com (Gordon Tyler)
Date: Fri, 15 Jan 2010 06:48:49 -0800
Subject: [antlr-interest] ANTLRWorks plugin for intellij
In-Reply-To: <F59F53CF-882A-4F74-AB3E-52283DF444A5@cs.usfca.edu>
References: <F59F53CF-882A-4F74-AB3E-52283DF444A5@cs.usfca.edu>
Message-ID: <1FE9A296676737419A8912A6FD22AE1D01E0479ED4@alvxmbw04.prod.quest.corp>

I tried it but I found the ANTLRworks editor too different to the IDEA editor to be comfortable. I was hoping for ANTLR syntax support in the normal IDEA editor.

I haven't tried the standalone ANTLRworks editor.

-----Original Message-----
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Terence Parr
Sent: January 14, 2010 7:47 PM
To: antlr-interest at antlr.org interest
Subject: [antlr-interest] ANTLRWorks plugin for intellij

hi. how many people use AW as a plugin *inside* intellij?  it really complicates the code and I'm thinking of dumping it; might make it easier for eclipse plugins too if they're not worried about intellij plugin code intermingled in AW.

just getting an idea of how many people use it that way.  It's not the best integration with intellij so I use AW standalone personally.

Thanks,
Ter


From jimi at temporal-wave.com  Fri Jan 15 09:05:41 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Fri, 15 Jan 2010 09:05:41 -0800
Subject: [antlr-interest] Missing error when tokens are left to parse
In-Reply-To: <d972facc1001150057q23140056s6453d145a0763817@mail.gmail.com>
References: <d972facc1001150057q23140056s6453d145a0763817@mail.gmail.com>
Message-ID: <1606CC72-9CC1-4F8B-B12B-4DFE70460DCA@temporal-wave.com>

This is an FAQ I think. Your start rule does not end in EOF and so  
ANTLR stops parsing when the next token is not predicted.

Jim

On Jan 15, 2010, at 0:57, Arne Schröder <arne.schroeder at gmail.com> wrote:

> Hello,
>
> I am trying to write a parser for an initialization-file. This file is
> devided in sections which are not embraced but have a keyword to  
> start them.
>
> Unfortunately the parser stops when encountering a problem and just  
> ends the
> parsing-process, not even reporting an error.
>
> For demostration of the problem I wrote the following example-grammar:
>
> file    : section1 section2?
>        ;
>
> section1: 'Section1'
>        ;
>
> section2: 'Section2'
>        ;
>
> ID      : ('a'..'z'|'A'..'Z')+
>        ;
>
> SPACE   : ' ' {$channel = HIDDEN;}
>        ;
>
> Now using the input "Section1 bla Section2", I would expect the  
> parser to
> stop at "bla", throw an UnwantedTokenException, do a  
> SingleTokenDeletion as
> error-recovery and just continue parsing "Section2".
> What happens is that it stops at "bla", does not recognize it as  
> section2
> and just terminates, leaving the two tokens unparsed and not  
> reporting any
> error.
>
> So my question is: How can I avoid my parser doing stuff like that  
> without
> changing my files' syntax?
>
>
> Thanks in advance
>
> Arne
>

From parrt at cs.usfca.edu  Fri Jan 15 09:23:14 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Fri, 15 Jan 2010 09:23:14 -0800
Subject: [antlr-interest] ANTLRWorks plugin for intellij
In-Reply-To: <1FE9A296676737419A8912A6FD22AE1D01E0479ED4@alvxmbw04.prod.quest.corp>
References: <F59F53CF-882A-4F74-AB3E-52283DF444A5@cs.usfca.edu>
	<1FE9A296676737419A8912A6FD22AE1D01E0479ED4@alvxmbw04.prod.quest.corp>
Message-ID: <B965D2A1-3C04-46D8-9560-8BD771FD8D07@cs.usfca.edu>

Yeah, Jean graduated long before he could work on the plugin, I think. He was doing the plugin for "free" and it was an afterthought.

Ok, I'll talk to Jean.
Ter
On Jan 15, 2010, at 6:48 AM, Gordon Tyler wrote:

> I tried it but I found the ANTLRworks editor too different to the IDEA editor to be comfortable. I was hoping for ANTLR syntax support in the normal IDEA editor.
> 
> I haven't tried the standalone ANTLRworks editor.
> 
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Terence Parr
> Sent: January 14, 2010 7:47 PM
> To: antlr-interest at antlr.org interest
> Subject: [antlr-interest] ANTLRWorks plugin for intellij
> 
> hi. how many people use AW as a plugin *inside* intellij?  it really complicates the code and I'm thinking of dumping it; might make it easier for eclipse plugins too if they're not worried about intellij plugin code intermingled in AW.
> 
> just getting an idea of how many people use it that way.  It's not the best integration with intellij so I use AW standalone personally.
> 
> Thanks,
> Ter
> 


From jimi at temporal-wave.com  Fri Jan 15 09:50:03 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Fri, 15 Jan 2010 09:50:03 -0800
Subject: [antlr-interest] Overriding the emit function to use custom
	tokens
In-Reply-To: <17030639.865681263563605671.JavaMail.www@wwinf4613>
Message-ID: <97280a5c20eb8b4788d15c30e61120a0@temporal-wave.com>

No, you have to override nextToken too; it calls emit directly for performance reasons.

However, no one really needs to do this. There is a user defined pointer built in to every token and a function pointer that is called when the token is released (if it is not NULL). So you can just add your custom token stuff there and rely on the default runtime.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of frogery at voila.fr
> Sent: Friday, January 15, 2010 5:53 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Overriding the emit function to use custom
> tokens
> 
> Hello,
> 
> I wanted to create a custom token object, so I have seen in the FAQ
> that I had to "override" the lexer emit function. So I did that this
> way:
> 
> ...
>         pLexer = antlrLexerNew(pInput);
>         pLexer->pLexer->emit = customEmit;
> ...
> 
> but it was not working.
> 
> The customEmit function was never called. So I have debugged and I
> think there is a bug in antlr3lexer.c. In the nextTokenStr function,
> shouldn't "emit(lexer)" be replaced by "lexer->emit(lexer);"? What do
> you think?
> 
> Thanks,
> Yann
> 


From yurushkin at rambler.ru  Fri Jan 15 09:50:49 2010
From: yurushkin at rambler.ru (=?koi8-r?B?4NLV28vJziDtycjBycw=?=)
Date: Fri, 15 Jan 2010 20:50:49 +0300
Subject: [antlr-interest] Fortran lexer problem
Message-ID: <op.u6k46zict3jqlu@win-mupvrp0jyrf>

Good day,

I want to add Fortran 77 comments:

"c xxxxx";
If the first character of a line is 'c', the whole line is a comment.

But I also have a NAME token, which conflicts with such a COMMENT rule
('c' can be a name).

Is it possible to select the rule with my own predicate? Or is there a
cleaner solution to this problem?


-- 
Best regards,
Michael

From jimi at temporal-wave.com  Fri Jan 15 10:20:46 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Fri, 15 Jan 2010 10:20:46 -0800
Subject: [antlr-interest] Fortran lexer problem
In-Reply-To: <op.u6k46zict3jqlu@win-mupvrp0jyrf>
Message-ID: <6482427cf4e64b4f8ad286ed88e1f2c4@temporal-wave.com>

I think Fortran comments that start with C have to have the C in character position 0 (or 1 in Fortran I guess ;-). So your comment rule can be predicated by checking for line position 0 in ANTLR terms.

Jim
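
The column check described above can be sketched as a small
hand-written scanner. This only shows the logic and is not ANTLR
output: FixedFormScanner and scan are hypothetical names, and only
COMMENT and NAME tokens are produced. In an ANTLR 3 grammar the same
test would normally sit in a gated semantic predicate ({...}?=>) on the
comment rule, with the call that returns the column depending on the
target.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: 'c', 'C' or '*' starts a comment only in column 0;
// anywhere else a letter sequence is an ordinary NAME.
class FixedFormScanner {
    static List<String> scan(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;     // index into the source text
        int col = 0;   // character position in the current line
        while (i < src.length()) {
            char ch = src.charAt(i);
            if (col == 0 && (ch == 'c' || ch == 'C' || ch == '*')) {
                // Comment: consume up to, but not including, the newline.
                int end = src.indexOf('\n', i);
                if (end < 0) {
                    end = src.length();
                }
                out.add("COMMENT(" + src.substring(i, end) + ")");
                col += end - i;
                i = end;
            } else if (Character.isLetter(ch)) {
                int start = i;
                while (i < src.length() && Character.isLetter(src.charAt(i))) {
                    i++;
                }
                out.add("NAME(" + src.substring(start, i) + ")");
                col += i - start;
            } else {
                // Skip everything else; a newline resets the column.
                col = (ch == '\n') ? 0 : col + 1;
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(scan("c note\n  c = x"));
        // prints [COMMENT(c note), NAME(c), NAME(x)]
    }
}
```

The second 'c' is not in column 0, so it falls through to the NAME
branch instead of raising an error.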

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Юрушкин Михаил
> Sent: Friday, January 15, 2010 9:51 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Fortran lexer problem
> 
> Good day,
> 
> I want to add comments of Fortran 77:
> 
> "c xxxxx";
> First symbol in column is 'c' - it means that the following line is a
> line
> of comment.
> 
> but I also have NAME token, that will conflict with such COMMENT rule.
> ('c' can be a name).
> 
> Is it possible to select rule by my own predicate? Are there any other
> more clear solvings of
> this problem?
> 
> 
> --
> Best regards,
> Michael
> 


From yurushkin at rambler.ru  Fri Jan 15 10:27:22 2010
From: yurushkin at rambler.ru (=?utf-8?B?0K7RgNGD0YjQutC40L0g0JzQuNGF0LDQuNC7?=)
Date: Fri, 15 Jan 2010 21:27:22 +0300
Subject: [antlr-interest] Fortran lexer problem
In-Reply-To: <6482427cf4e64b4f8ad286ed88e1f2c4@temporal-wave.com>
References: <6482427cf4e64b4f8ad286ed88e1f2c4@temporal-wave.com>
Message-ID: <op.u6k6vwi66gpi6j@win-mupvrp0jyrf>

Excuse me, but how can I specify this condition (the character is the first
one on the line and is 'c')?
Could you send me a piece of lexer grammar?


Jim Idle <jimi at temporal-wave.com> wrote on Fri, 15 Jan 2010 21:20:46 +0300:

> I think Fortran comments that start with C have to have the C in  
> character position 0 (or 1 in Fortran I guess ;-). So your comment rule  
> can be predicated by checking for line position 0 in ANTLR terms.
>
> Jim
>
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>> bounces at antlr.org] On Behalf Of Юрушкин Михаил
>> Sent: Friday, January 15, 2010 9:51 AM
>> To: antlr-interest at antlr.org
>> Subject: [antlr-interest] Fortran lexer problem
>>
>> Good day,
>>
>> I want to add comments of Fortran 77:
>>
>> "c xxxxx";
>> First symbol in column is 'c' - it means that the following line is a
>> line
>> of comment.
>>
>> but I also have NAME token, that will conflict with such COMMENT rule.
>> ('c' can be a name).
>>
>> Is it possible to select rule by my own predicate? Are there any other
>> more clear solvings of
>> this problem?
>>
>>
>> --
>> Best regards,
>> Michael
>>


-- 
Best regards,
Michael

From yurushkin at rambler.ru  Fri Jan 15 10:56:21 2010
From: yurushkin at rambler.ru (=?utf-8?B?0K7RgNGD0YjQutC40L0g0JzQuNGF0LDQuNC7?=)
Date: Fri, 15 Jan 2010 21:56:21 +0300
Subject: [antlr-interest] Fortran lexer problem
In-Reply-To: <op.u6k6vwi66gpi6j@win-mupvrp0jyrf>
References: <6482427cf4e64b4f8ad286ed88e1f2c4@temporal-wave.com>
	<op.u6k6vwi66gpi6j@win-mupvrp0jyrf>
Message-ID: <op.u6k777vr6gpi6j@win-mupvrp0jyrf>

I have the following rule

LINE_COMMENT
     : ({blabla}? ('c' | 'C' | '*') | '!' )  ~('\n')*
         {
             $channel = HIDDEN;
         }
     ;

but it only generates the following code at the end:


            switch (alt31)
             {
                case 1:
                    {
                        if ( !((blabla)) )
                        {
                                CONSTRUCTEX();
                                EXCEPTION->type         = ANTLR3_FAILED_PREDICATE_EXCEPTION;
                                EXCEPTION->message      = (void *)"blabla";
                                EXCEPTION->ruleName     = (void *)"LINE_COMMENT";
                        }


If "blabla" is false, an error is raised, but that is not what I want.

Юрушкин Михаил <yurushkin at rambler.ru> wrote on Fri, 15 Jan 2010 21:27:22 +0300:

> Excuse me, but how can I specify this condition (is it a first symbol and
> symbol='c')?
> Could you send me a piece of lexer grammar?
>
>
> Jim Idle <jimi at temporal-wave.com> wrote on Fri, 15 Jan 2010 21:20:46 +0300:
>
>> I think Fortran comments that start with C have to have the C in
>> character position 0 (or 1 in Fortran I guess ;-). So your comment rule
>> can be predicated by checking for line position 0 in ANTLR terms.
>>
>> Jim
>>
>>> -----Original Message-----
>>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>>> bounces at antlr.org] On Behalf Of Юрушкин Михаил
>>> Sent: Friday, January 15, 2010 9:51 AM
>>> To: antlr-interest at antlr.org
>>> Subject: [antlr-interest] Fortran lexer problem
>>>
>>> Good day,
>>>
>>> I want to add comments of Fortran 77:
>>>
>>> "c xxxxx";
>>> First symbol in column is 'c' - it means that the following line is a
>>> line
>>> of comment.
>>>
>>> but I also have NAME token, that will conflict with such COMMENT rule.
>>> ('c' can be a name).
>>>
>>> Is it possible to select rule by my own predicate? Are there any other
>>> more clear solvings of
>>> this problem?
>>>
>>>
>>> --
>>> Best regards,
>>> Michael
>>>


-- 
Best regards,
Michael

From zep_antlr at bahj.com  Fri Jan 15 14:02:40 2010
From: zep_antlr at bahj.com (Zachary Palmer)
Date: Fri, 15 Jan 2010 17:02:40 -0500
Subject: [antlr-interest] First and Last Token of a Rule
Message-ID: <4B50E600.6090005@bahj.com>

All,

I think this is a pretty simple operation, but I have no idea how to 
execute it.  Suppose I'm in some action code and have a reference to the 
parser.  Is there a way for me to obtain the most recently used token?  
How about the token that started the most recent grammar rule?

For instance, consider the following grammar (using a Java target language):

foo: 'a' bar* 'd' { doStuff(); };
bar: ('b' | 'c') { doStuff(); };

Let's assume we are feeding this grammar the string "abcd".  In that 
case, doStuff is called three times: once after the token 'b' is matched 
in the bar rule, once after the token 'c' is matched in the bar rule, 
and once after the tokens 'a' through 'd' are matched in the foo rule.  
I would like, from within the body of the doStuff method, to obtain the 
first and last token of each rule matched.  So, for instance, if my 
doStuff method looked like this:

void doStuff() {
    Token first = ...; // first token of the current rule
    Token last = ...; // token most recently used
    System.out.println(first.getText() + ", " + last.getText());
}

then the output to the above grammar when provided the input "abcd" 
should be

b,b
c,c
a,d

This is, of course, a representative example; the real situation is a 
bit more complicated.  The catch is that I don't want to add any 
arguments to the doStuff method or do anything else that would require 
me to change every rule in this 3,000 line grammar.  Is there a way that 
I can get the first token of the current rule and the most recently used 
token without tweaking every single grammar rule?

Many thanks for reading!

Zachary Palmer

From jimi at temporal-wave.com  Fri Jan 15 15:41:56 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Fri, 15 Jan 2010 15:41:56 -0800
Subject: [antlr-interest] First and Last Token of a Rule
In-Reply-To: <4B50E600.6090005@bahj.com>
Message-ID: <5e608006169f3d4494c1b7c337411109@temporal-wave.com>

The upcoming token at any point is returned by input.LT(1), the previous token by input.LT(-1)

So:

foo
@init {
 Token sToken = input.LT(1);
}
: A bar* D { doStuff(sToken, input.LT(-1)); }
;

And so on. Also look at things like $start depending on what the output is etc.

However, you will be much better off building an AST then walking the tree to do your actions. 

Jim
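
The LT(1) / LT(-1) bookkeeping can be tried out without a generated
parser. The sketch below is an assumption-laden stand-in:
MiniTokenStream only imitates the lookahead behaviour described above
(LT(1) is the upcoming token, LT(-1) the one just consumed), tokens are
plain strings, and RuleSpanDemo hand-codes the two rules foo and bar
from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a token stream with LT(k) lookahead.
class MiniTokenStream {
    private final List<String> tokens;
    private int p = 0;   // index of the next token to consume

    MiniTokenStream(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    // LT(1) is the upcoming token, LT(-1) the most recently consumed one.
    String LT(int k) {
        if (k == 0) {
            return null;
        }
        int i = (k > 0) ? p + k - 1 : p + k;
        return (i >= 0 && i < tokens.size()) ? tokens.get(i) : null;
    }

    void consume() {
        p++;
    }
}

// Hand-coded versions of
//   foo : 'a' bar* 'd' ;
//   bar : ('b' | 'c') ;
class RuleSpanDemo {
    private final MiniTokenStream input;
    final List<String> spans = new ArrayList<>();

    RuleSpanDemo(MiniTokenStream input) {
        this.input = input;
    }

    void doStuff(String first, String last) {
        spans.add(first + ", " + last);
    }

    void bar() {
        String start = input.LT(1);      // what an @init action would capture
        input.consume();                 // 'b' or 'c'
        doStuff(start, input.LT(-1));
    }

    void foo() {
        String start = input.LT(1);
        input.consume();                 // 'a'
        while ("b".equals(input.LT(1)) || "c".equals(input.LT(1))) {
            bar();
        }
        input.consume();                 // 'd'
        doStuff(start, input.LT(-1));
    }

    static List<String> run(String... tokens) {
        RuleSpanDemo demo = new RuleSpanDemo(new MiniTokenStream(tokens));
        demo.foo();
        return demo.spans;
    }

    public static void main(String[] args) {
        for (String span : run("a", "b", "c", "d")) {
            System.out.println(span);
        }
    }
}
```

For the input "abcd" this records the pairs "b, b", "c, c" and "a, d",
in that order, because each rule captures its own start token before
consuming anything.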

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Zachary Palmer
> Sent: Friday, January 15, 2010 2:03 PM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] First and Last Token of a Rule
> 
> All,
> 
> I think this is a pretty simple operation, but I have no idea how to
> execute it.  Suppose I'm in some action code and have a reference to
> the
> parser.  Is there a way for me to obtain the most recently used token?
> How about the token that started the most recent grammar rule?
> 
> For instance, consider the following grammar (using a Java target
> language):
> 
> foo: 'a' bar* 'd' { doStuff(); };
> bar: ('b' | 'c') { doStuff(); };
> 
> Let's assume we are feeding this grammar the string "abcd".  In that
> case, doStuff is called three times: once after the token 'b' is
> matched
> in the bar rule, once after the token 'c' is matched in the bar rule,
> and once after the tokens 'a' through 'd' are matched in the foo rule.
> I would like, from within the body of the doStuff method, to obtain the
> first and last token of each rule matched.  So, for instance, if my
> doStuff method looked like this:
> 
> void doStuff() {
>     Token first = ...; // first token of the current rule
>     Token last = ...; // token most recently used
>     System.out.println(first.getText() + ", " + last.getText());
> }
> 
> then the output to the above grammar when provided the input "abcd"
> should be
> 
> b,b
> c,c
> a,d
> 
> This is, of course, a representative example; the real situation is a
> bit more complicated.  The catch is that I don't want to add any
> arguments to the doStuff method or do anything else that would require
> me to change every rule in this 3,000 line grammar.  Is there a way
> that
> I can get the first token of the current rule and the most recently
> used
> token without tweaking every single grammar rule?
> 
> Many thanks for reading!
> 
> Zachary Palmer
> 


From zep_antlr at bahj.com  Fri Jan 15 16:09:32 2010
From: zep_antlr at bahj.com (Zachary Palmer)
Date: Fri, 15 Jan 2010 19:09:32 -0500
Subject: [antlr-interest] First and Last Token of a Rule
In-Reply-To: <5e608006169f3d4494c1b7c337411109@temporal-wave.com>
References: <5e608006169f3d4494c1b7c337411109@temporal-wave.com>
Message-ID: <4B5103BC.5030003@bahj.com>

Jim,

Thanks for the reply.  :)   That's good to know.  Any idea about how to 
get the first token in a given rule?  With the information you've given 
me, I could always stick something in an @init and an @after in every 
rule, but I'd definitely like to avoid that.  I guess what I'm really 
wanting is an @allrulesinit and an @allrulesafter (to occur before and 
after the @init and @after, respectively), but it doesn't seem like 
those exist.

In fact, I am building an AST.  The actions I mentioned previously are 
doing just that and every rule I have is of the (unfortunate) form:

foo returns [FooNode ret]
   :
       bar ';'
       {
           $ret = factory.makeFooNode($bar.ret);
       }
   ;

Because I want node creation to be indirected through a factory (and 
because I want a heterogeneous AST), there doesn't seem to be any choice 
but to use this approach.  The people who wrote the ANTLR3 Java 1.5 
grammar I pulled from the ANTLR website seemed to agree; the OpenJDK 
project uses the same approach for their ANTLR parser.  I've gotten 
exactly the tree I needed (built to a different API than the Java 
Compiler API for purposes of my project) and now I want to tag those 
nodes with their start and end tokens.  I might actually have some luck 
with scopes; I should look into that.

Thanks again for the help!

Cheers,

Zach
> The upcoming token at any point is returned by input.LT(1), the previous token by input.LT(-1)
>
> So:
>
> foo
> @init {
>  Token sToken = input.LT(1);
> }
> : A bar* D { doStuff(sToken, input.LT(-1)); }
> ;
>
> And so on. Also look at things like $start depending on what the output is etc.
>
> However, you will be much better off building an AST then walking the tree to do your actions. 
>
> Jim
>
>   
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>> bounces at antlr.org] On Behalf Of Zachary Palmer
>> Sent: Friday, January 15, 2010 2:03 PM
>> To: antlr-interest at antlr.org
>> Subject: [antlr-interest] First and Last Token of a Rule
>>
>> All,
>>
>> I think this is a pretty simple operation, but I have no idea how to
>> execute it.  Suppose I'm in some action code and have a reference to
>> the
>> parser.  Is there a way for me to obtain the most recently used token?
>> How about the token that started the most recent grammar rule?
>>
>> For instance, consider the following grammar (using a Java target
>> language):
>>
>> foo: 'a' bar* 'd' { doStuff(); };
>> bar: ('b' | 'c') { doStuff(); };
>>
>> Let's assume we are feeding this grammar the string "abcd".  In that
>> case, doStuff is called three times: once after the token 'b' is
>> matched
>> in the bar rule, once after the token 'c' is matched in the bar rule,
>> and once after the tokens 'a' through 'd' are matched in the foo rule.
>> I would like, from within the body of the doStuff method, to obtain the
>> first and last token of each rule matched.  So, for instance, if my
>> doStuff method looked like this:
>>
>> void doStuff() {
>>     Token first = ...; // first token of the current rule
>>     Token last = ...; // token most recently used
>>     System.out.println(first.getText() + ", " + last.getText());
>> }
>>
>> then the output to the above grammar when provided the input "abcd"
>> should be
>>
>> b,b
>> c,c
>> a,d
>>
>> This is, of course, a representative example; the real situation is a
>> bit more complicated.  The catch is that I don't want to add any
>> arguments to the doStuff method or do anything else that would require
>> me to change every rule in this 3,000 line grammar.  Is there a way
>> that
>> I can get the first token of the current rule and the most recently
>> used
>> token without tweaking every single grammar rule?
>>
>> Many thanks for reading!
>>
>> Zachary Palmer
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
>> email-address


From wclodius at los-alamos.net  Fri Jan 15 19:04:02 2010
From: wclodius at los-alamos.net (William B. Clodius)
Date: Fri, 15 Jan 2010 20:04:02 -0700
Subject: [antlr-interest] Fortran lexer problem
In-Reply-To: <op.u6k46zict3jqlu@win-mupvrp0jyrf>
References: <op.u6k46zict3jqlu@win-mupvrp0jyrf>
Message-ID: <4C3D1B55-86D5-4873-B23D-88FCA1FE153A@los-alamos.net>

As this is at least your second question on Fortran and ANTLR, I suggest you check out the Open Fortran Project: http://fortran-parser.sourceforge.net/
As to the question regarding Fortran comments: lexing Fortran, particularly the fixed source form, where spacing is not significant, is a pain and not really suited to automated tools such as ANTLR. Check out Sale's algorithm.

On Jan 15, 2010, at 10:50 AM, Юрушкин Михаил wrote:

> Good day,
> 
> I want to handle Fortran 77 comments:
> 
> "c xxxxx";
> If the first character on a line is 'c', that line is a comment.
> 
> But I also have a NAME token that will conflict with such a COMMENT rule
> ('c' can be a name).
> 
> Is it possible to select the rule with my own predicate? Are there any
> other, cleaner solutions to this problem?
> 
> 
> -- 
> Best regards,
> Michael


From christian.schladetsch at gmail.com  Sat Jan 16 00:11:06 2010
From: christian.schladetsch at gmail.com (Christian Schladetsch)
Date: Sat, 16 Jan 2010 19:11:06 +1100
Subject: [antlr-interest] Incremental Parsing, AST creation,
	and ST generation
Message-ID: <6442c4ae1001160011k6bfc4e8erd4fedf13c3787e3b@mail.gmail.com>

Hello,

I am writing a network protocol using ANTLR. My idea is to use ANTLR to
parse incoming packets formed as possibly nested edicts.

Example full transcript of input is:

foo(a=1,b="baz")
{
  bar();
  spam(c=10)
  {
      grok(@b);
   }
}

Now, the key issue I have is to avoid reparsing the entire file when new
input arrives. For example:

foo()
{
   bar();

This is valid input for my grammar. When new input arrives, such as:

  spam()
  {

I want to re-use the previously parsed items (and AST, and code generated by
the AST via StringTemplate), while just adding the new tokens to the parser,
and new nodes to the tree, and new code to my VM.

Basically, I'd like to know if it is possible to generate an AST from a
parser, then add more input to that parser (and have it possibly fail
parsing), then add more to the AST.

It is impractical to have to re-parse the entire input (and re-create the
entire AST) when new input arrives. Full transcripts can be thousands of
lines long.

There is a way to do this, but I would like to see if I can first leverage
ANTLR. I've used ANTLR with great success (and a lot of bleary eyes), but
this is a new application of it for me and I am unsure of the feasibility.

Thanks in advance,
Christian.

From jimi at temporal-wave.com  Sat Jan 16 19:14:21 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Sat, 16 Jan 2010 19:14:21 -0800
Subject: [antlr-interest] Fortran lexer problem
In-Reply-To: <op.u6k777vr6gpi6j@win-mupvrp0jyrf>
References: <6482427cf4e64b4f8ad286ed88e1f2c4@temporal-wave.com>
	<op.u6k6vwi66gpi6j@win-mupvrp0jyrf>
	<op.u6k777vr6gpi6j@win-mupvrp0jyrf>
Message-ID: <BA6CD6B0-6F3D-4D26-843E-654F52BFD846@temporal-wave.com>

You need a gated predicate. Read the getting started articles on the wiki.
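
A gated semantic predicate in ANTLR 3 is written {...}?=> rather than {...}?, so that a false predicate switches the alternative off during prediction instead of raising a failed-predicate exception afterwards. As a rough illustration of the effect (plain Java, not ANTLR output; ColumnGateDemo and its token labels are made up for this sketch), the c/C/* alternative below is only visible at column 0, so 'c' elsewhere on a line falls through to NAME:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini-lexer showing what "gating" an alternative on the column means.
final class ColumnGateDemo {
    // Splits one source line into "COMMENT:..." and "NAME:..." tokens.
    static List<String> tokenize(String line) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < line.length()) {
            char ch = line.charAt(i);
            // The gate: c, C and * only start a comment at column 0.
            boolean gatedComment = i == 0 && (ch == 'c' || ch == 'C' || ch == '*');
            if (gatedComment || ch == '!') {
                out.add("COMMENT:" + line.substring(i)); // comment runs to end of line
                break;
            } else if (Character.isLetter(ch)) {
                int start = i;
                while (i < line.length() && Character.isLetterOrDigit(line.charAt(i))) i++;
                out.add("NAME:" + line.substring(start, i));
            } else {
                i++; // skip blanks and anything this sketch does not model
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("c this is a comment")); // [COMMENT:c this is a comment]
        System.out.println(tokenize("      c = d"));         // [NAME:c, NAME:d]
    }
}
```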

Jim

On Jan 15, 2010, at 10:56, Юрушкин Михаил
<yurushkin at rambler.ru> wrote:

> I have the following term
>
> LINE_COMMENT
>    : ({blabla}? ('c' | 'C' | '*') | '!' )  ~('\n')*
>        {
>            $channel = HIDDEN;
>        }
>    ;
>
> but it only emits the following code at the end:
>
>
>           switch (alt31)
>            {
>            case 1:
>                {
>                    if ( !((blabla)) )
>                    {
>                            CONSTRUCTEX();
>                            EXCEPTION->type         =  
> ANTLR3_FAILED_PREDICATE_EXCEPTION;
>                            EXCEPTION->message      = (void *)"blabla";
>                            EXCEPTION->ruleName     = (void  
> *)"LINE_COMMENT";
>                    }
>
>
> If "blabla" is false, an error is raised, but that is not right.
>
> Юрушкин Михаил <yurushkin at rambler.ru> wrote on
> Fri, 15 Jan 2010 21:27:22 +0300:
>
>> Excuse me, but how can I specify this condition (is it a first  
>> symbol and
>> symbol='c')?
>> Could you send me a piece of lexer grammar?
>>
>>
>> Jim Idle <jimi at temporal-wave.com> wrote on Fri, 15 Jan 2010
>> 21:20:46 +0300:
>>
>>> I think Fortran comments that start with C have to have the C in
>>> character position 0 (or 1 in Fortran I guess ;-). So your comment  
>>> rule
>>> can be predicated by checking for line position 0 in ANTLR terms.
>>>
>>> Jim
>>>
>>>> -----Original Message-----
>>>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>>>> bounces at antlr.org] On Behalf Of ??????? ??????
>>>> Sent: Friday, January 15, 2010 9:51 AM
>>>> To: antlr-interest at antlr.org
>>>> Subject: [antlr-interest] Fortran lexer problem
>>>>
>>>> Good day,
>>>>
>>>> I want to handle Fortran 77 comments:
>>>>
>>>> "c xxxxx";
>>>> If the first character on a line is 'c', that line is a comment.
>>>>
>>>> But I also have a NAME token that will conflict with such a COMMENT
>>>> rule ('c' can be a name).
>>>>
>>>> Is it possible to select the rule with my own predicate? Are there any
>>>> other, cleaner solutions to this problem?
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Michael
> -- 
> Best regards,
> Michael

From hikemike at gmail.com  Sun Jan 17 13:04:35 2010
From: hikemike at gmail.com (Michael C. Starkie)
Date: Sun, 17 Jan 2010 16:04:35 -0500
Subject: [antlr-interest] Syntactic Predicates for matching literals within
	char sequences?
Message-ID: <5a5669421001171304va494785n1ee81b35bb153785@mail.gmail.com>

Hi,
I'm new to ANTLR and I'm trying to match the string literal 'DATA_IN',
which appears multiple times in a sequence of ASCII chars.  However,
the parser gets confused when it encounters strings like 'DAP' or
'DA<any char>':

mismatched character 'P' expecting 'T'

lexer:
DATA_IN : 'DATA_IN';
ANY_CHAR : '\u0002'..'\u007F';

parser:
rule: line+ ;
line: data_in_check;
data_in_check
options { backtrack=true; }
: data_in | any_char;

data_in : DATA_IN;
any_char : ANY_CHAR;

Mike

From kferrio at gmail.com  Sun Jan 17 13:10:59 2010
From: kferrio at gmail.com (kferrio at gmail.com)
Date: Sun, 17 Jan 2010 21:10:59 +0000
Subject: [antlr-interest] Fortran lexer problem
In-Reply-To: <op.u6k777vr6gpi6j@win-mupvrp0jyrf>
References: <6482427cf4e64b4f8ad286ed88e1f2c4@temporal-wave.com><op.u6k6vwi66gpi6j@win-mupvrp0jyrf><op.u6k777vr6gpi6j@win-mupvrp0jyrf>
Message-ID: <1278375498-1263762662-cardhu_decombobulator_blackberry.rim.net-263095628-@bda428.bisx.prod.on.blackberry>

Michael...  I feel pity for you if you have to parse F77.  You're going to run into a few problems that are harder to solve or avoid than this one.  So if you're just going to discard comments anyway... I suggest you prefilter your input with a tool like 'sed' to strip fixed-format comments.  Then you can get on with quirky things like Fortran edit descriptors.  :)
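
The same prefilter can be done in-process instead of with sed. The sketch below is only an assumption about what "strip fixed format comments" should mean: it treats a line as a comment when column 1 holds c, C or *, and it blanks the line rather than deleting it so that line numbers in later error messages still match the original file. FixedFormPrefilter is a made-up name, and inline '!' comments are deliberately left alone because '!' can also occur inside character strings.

```java
import java.util.stream.Collectors;

// Hypothetical prefilter: blanks fixed-form comment lines before the text reaches the lexer.
final class FixedFormPrefilter {
    // A fixed-form comment line has c, C or * in the first column.
    static boolean isCommentLine(String line) {
        if (line.isEmpty()) return false;
        char first = line.charAt(0);
        return first == 'c' || first == 'C' || first == '*';
    }

    // Replaces each comment line with an empty line, keeping the line count unchanged.
    static String blankCommentLines(String source) {
        return source.lines()
                .map(line -> isCommentLine(line) ? "" : line)
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        String src = "c header comment\n      x = 1\n* another comment\n      call c(x)";
        System.out.println(blankCommentLines(src));
    }
}
```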

Kyle 

-----Original Message-----
From: Юрушкин Михаил <yurushkin at rambler.ru>
Date: Fri, 15 Jan 2010 21:56:21 
To: Юрушкин Михаил<yurushkin at rambler.ru>; Jim Idle<jimi at temporal-wave.com>; antlr-interest at antlr.org<antlr-interest at antlr.org>
Subject: Re: [antlr-interest] Fortran lexer problem

I have the following term

LINE_COMMENT
     : ({blabla}? ('c' | 'C' | '*') | '!' )  ~('\n')*
         {
             $channel = HIDDEN;
         }
     ;

but it only emits the following code at the end:


            switch (alt31)
             {
         	case 1:
         	    {
         	        if ( !((blabla)) )
         	        {
         	                CONSTRUCTEX();
         	                EXCEPTION->type         =  
ANTLR3_FAILED_PREDICATE_EXCEPTION;
         	                EXCEPTION->message      = (void *)"blabla";
         	                EXCEPTION->ruleName	 = (void *)"LINE_COMMENT";
         	        }


If "blabla" is false, an error is raised, but that is not right.

-- 
Best regards,
Michael

From yurushkin at rambler.ru  Sun Jan 17 13:17:29 2010
From: yurushkin at rambler.ru (=?utf-8?B?0K7RgNGD0YjQutC40L0g0JzQuNGF0LDQuNC7?=)
Date: Mon, 18 Jan 2010 00:17:29 +0300
Subject: [antlr-interest] Fortran lexer problem
In-Reply-To: <1278375498-1263762662-cardhu_decombobulator_blackberry.rim.net-263095628-@bda428.bisx.prod.on.blackberry>
References: <6482427cf4e64b4f8ad286ed88e1f2c4@temporal-wave.com>
	<op.u6k6vwi66gpi6j@win-mupvrp0jyrf>
	<op.u6k777vr6gpi6j@win-mupvrp0jyrf>
	<1278375498-1263762662-cardhu_decombobulator_blackberry.rim.net-263095628-@bda428.bisx.prod.on.blackberry>
Message-ID: <op.u6o33fhb6gpi6j@win-mupvrp0jyrf>

Thank you. I have decided to follow your suggestion :) I was just interested
in finding a 'cleaner' way.


<kferrio at gmail.com> wrote on Mon, 18 Jan 2010 00:10:59 +0300:

> Michael...  I feel pity for you if you have to parse F77.  You're going to
> run into a few problems that are harder to solve or avoid than this one.
> So if you're just going to discard comments anyway... I suggest you
> prefilter your input with a tool like 'sed' to strip fixed-format
> comments.  Then you can get on with quirky things like Fortran edit
> descriptors.  :)
>
> Kyle


-- 
Best regards,
Michael

From kferrio at gmail.com  Sun Jan 17 13:20:14 2010
From: kferrio at gmail.com (kferrio at gmail.com)
Date: Sun, 17 Jan 2010 21:20:14 +0000
Subject: [antlr-interest] Syntactic Predicates for matching literals
	withinchar sequences?
Message-ID: <907896710-1263763215-cardhu_decombobulator_blackberry.rim.net-1973379806-@bda428.bisx.prod.on.blackberry>

You probably want to make ANY_CHAR a lexer fragment so that it does not consume input except as part of a larger rule which calls it.  Or maybe I missed your intent.
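
For reference, "mismatched character" is reported by the lexer, so the parser-level backtrack option never gets a chance to help; it looks as if the lexer commits to DATA_IN once it has seen 'DA'. The tokenization that seems to be wanted is "try the whole literal first, otherwise emit one character". The sketch below shows that behaviour in plain Java; LiteralScanDemo and the token labels are invented for illustration and are not ANTLR API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scanner: full-literal lookahead with a single-character fallback.
final class LiteralScanDemo {
    static final String LITERAL = "DATA_IN";

    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            if (input.startsWith(LITERAL, i)) {
                // The whole literal is present: consume it as one token.
                tokens.add("DATA_IN");
                i += LITERAL.length();
            } else {
                // Anything else, including a partial prefix such as "DA", is consumed one char at a time.
                tokens.add("ANY_CHAR(" + input.charAt(i) + ")");
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("DAPDATA_IN")); // [ANY_CHAR(D), ANY_CHAR(A), ANY_CHAR(P), DATA_IN]
    }
}
```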

Kyle
------Original Message------
From: Michael C. Starkie
Sender: ANTLR
To: antlr-interest at antlr.org
Subject: [antlr-interest] Syntactic Predicates for matching literals withinchar sequences?
Sent: Jan 17, 2010 2:04 PM

Hi,
I'm new to ANTLR and I'm trying to match the string literal 'DATA_IN',
which appears multiple times in a sequence of ASCII chars.  However,
the parser gets confused when it encounters strings like 'DAP' or
'DA<any char>':

mismatched character 'P' expecting 'T'

lexer:
DATA_IN : 'DATA_IN';
ANY_CHAR : '\u0002'..'\u007F';

parser:
rule: line+ ;
line: data_in_check;
data_in_check
options { backtrack=true; }
: data_in | any_char;

data_in : DATA_IN;
any_char : ANY_CHAR;

Mike


From frogery at voila.fr  Mon Jan 18 00:02:45 2010
From: frogery at voila.fr (frogery at voila.fr)
Date: Mon, 18 Jan 2010 09:02:45 +0100 (CET)
Subject: [antlr-interest] Overriding the emit function to use
	custom	tokens
Message-ID: <19590761.1131651263801765450.JavaMail.www@wwinf4603>

Jim,

Indeed, I want to use the custom pointer defined in ANTLR3_COMMON_TOKEN_struct. I have done this:

@init 
{
    double* pCustom = ANTLR3_MALLOC(sizeof(double));
    *pCustom = 0;
    CUSTOM = (void *)pCustom;   /* keep the full pointer; a 32-bit integer cast can truncate it on 64-bit builds */
}

My problem is that I have not found any way to set the freeCustom pointer (that is the pointer to a function that knows how to free the custom structure when the token is destroyed). I have probably missed something but the only way I have found to set this freeCustom pointer was to override the emit function. Is there another way to do it?

Thanks,
Yann


> Message of 15/01/10 at 18:50
> From: "Jim Idle" <jimi at temporal-wave.com>
> To: "antlr-interest at antlr.org" <antlr-interest at antlr.org>
> Cc: 
> Subject: Re: [antlr-interest] Overriding the emit function to use custom	tokens
> 
> No, you have to override nextToken too; it calls emit directly for performance reasons. 
> 
> However, no one really needs to do this. There is a user defined pointer built in to every token and a function pointer that is called when the token is released (if it is not NULL). So you can just add your custom token stuff there and rely on the default runtime.
> 
> Jim
> 
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> > bounces at antlr.org] On Behalf Of frogery at voila.fr
> > Sent: Friday, January 15, 2010 5:53 AM
> > To: antlr-interest at antlr.org
> > Subject: [antlr-interest] Overriding the emit function to use custom
> > tokens
> > 
> > Hello,
> > 
> > I wanted to create a custom token object, so I have seen in the FAQ
> > that I had to "override" the lexer emit function. So I did that this
> > way:
> > 
> > ...
> >         pLexer = antlrLexerNew(pInput);
> >         pLexer->pLexer->emit = customEmit;
> > ...
> > 
> > but it was not working.
> > 
> > The customEmit function was never called. So I have debugged and I
> > think there is a bug in antlr3lexer.c. In the nextTokenStr function,
> > shouldn't "emit(lexer)" be replaced by "lexer->emit(lexer);"? What do
> > you think?
> > 
> > Thanks,
> > Yann


From michnay at gmail.com  Mon Jan 18 06:17:51 2010
From: michnay at gmail.com (=?ISO-8859-1?Q?Michnay_Bal=E1zs?=)
Date: Mon, 18 Jan 2010 15:17:51 +0100
Subject: [antlr-interest] Rule ignored in CSS grammar
Message-ID: <df0726361001180617w7bea943cj8522b97aafbe8e55@mail.gmail.com>

Hi Guys,

The attached grammar is supposed to parse CSS files. I used this as an
initial version:

http://www.antlr.org/grammar/1214945003224/csst3.g

First I tried to disallow "_" characters in property names, so I created a
new lexer rule, "CSSPROPERTYNAME", to enforce this. The "declaration" rule
has been updated accordingly. The funny thing is that now the "selector"
rule fails to recognize tag selectors like:

.class_selector img {
  property: value;
  ...
}

Since my update should only affect property names and not selectors, I
really do not understand what the problem is. I tried to define lexer rules
as both fragments and literal values, no luck.

I used ANTLRWorks to debug this and have noticed that in the "selector" rule
"selectorOperation" is ignored:

selector
  : elem selectorOperation* attrib* pseudo? ->  elem selectorOperation* attrib* pseudo*
  ;

Any ideas?

Thanks.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: css.g
Type: application/octet-stream
Size: 2973 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20100118/51689444/attachment.obj 

From m.y.speyer at inter.nl.net  Mon Jan 18 08:39:00 2010
From: m.y.speyer at inter.nl.net (Marc Speyer)
Date: Mon, 18 Jan 2010 17:39:00 +0100
Subject: [antlr-interest] Tree pattern maching using the C# (was C)
	target
In-Reply-To: <20100115125833.242280@gmx.net>
References: <000901ca95d9$df6dd740$9e4985c0$@y.speyer@inter.nl.net>
	<20100115125833.242280@gmx.net>
Message-ID: <002401ca985c$bd00f450$3702dcf0$@y.speyer@inter.nl.net>

Hi Johannes,

I tried the version that you mentioned by downloading it from
antlr:/runtime/CSharp2 in the Fisheye code repository and then tried to
compile it using VS2008. This didn't work because VS2008 reported a missing
file, "TokenConstants.cs", and gave me compilation errors. I managed to
get a version from the CSharp3 repository and, after making one change, I
could compile it. I noticed that the Downup method is part of the TreeFilter
class, which inherits from the TreeParser class. The grammar for the tree
parser from the example has the following header:

// START: header
tree grammar DefRef;
options {
  tokenVocab = Cymbol;
  ASTLabelType = CommonTree;
  filter = true;
  language=CSharp2;
}
@members {
    SymbolTable symtab;
    Scope currentScope;
    public DefRef(ITreeNodeStream input, SymbolTable symtab) 
    	: this(input) 
    {
        this.symtab = symtab;
        currentScope = symtab.globals;
    }
}
// END: header

Generating the tree parser gives DefRef.cs with the DefRef class declared
as:

public partial class DefRef : TreeParser


Now I can cast this into the TreeFilter class but to test things quickly I
changed the above line in the DefRef.cs into:

public partial class DefRef : TreeFilter


In the calling program I use:

DefRef def = new DefRef(nodes, symtab); // use custom constructor
def.Downup(t); // trigger symtab actions upon certain subtrees

When I run this, nothing happens, even though I have grammar rules and
actions like:

exitBlock
    :   BLOCK
        {
        Console.WriteLine("locals: "+currentScope);
        currentScope = currentScope.getEnclosingScope();    // pop scope
        }
    ;

I have not figured out yet why this doesn't work. The example is a
one-to-one port of the Java example for Pattern 17, Symbol Table for Nested
Scopes, from the Language Implementation Patterns book.
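
For comparison, this is roughly what the downup strategy does in the Java runtime as far as I understand it: each node gets the grammar's topdown rule applied on the way down and its bottomup rule on the way up, so a rule such as exitBlock only fires if it is reachable from one of those two entry rules. The sketch below is self-contained Java with made-up names (DownUpDemo, Node, trace); it does not use the ANTLR runtime and says nothing about how the C# port wires this up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration of a down-then-up tree walk with two callbacks.
final class DownUpDemo {
    static final class Node {
        final String type;
        final List<Node> children;
        Node(String type, Node... children) {
            this.type = type;
            this.children = Arrays.asList(children);
        }
    }

    // Applies topdown before visiting the children and bottomup after them.
    static void downup(Node t, Consumer<Node> topdown, Consumer<Node> bottomup) {
        topdown.accept(t);
        for (Node child : t.children) downup(child, topdown, bottomup);
        bottomup.accept(t);
    }

    // Records when a BLOCK is entered and left, like enterBlock/exitBlock actions would.
    static List<String> trace(Node root) {
        List<String> log = new ArrayList<>();
        downup(root,
                n -> { if (n.type.equals("BLOCK")) log.add("enterBlock"); },
                n -> { if (n.type.equals("BLOCK")) log.add("exitBlock"); });
        return log;
    }

    public static void main(String[] args) {
        Node tree = new Node("METHOD_DECL",
                new Node("BLOCK", new Node("VAR_DECL"), new Node("BLOCK")));
        System.out.println(trace(tree)); // [enterBlock, enterBlock, exitBlock, exitBlock]
    }
}
```

If no action fires at all in the real parser, one thing worth checking is whether the two entry rules are actually being invoked by the walk.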

Any idea?

Thanks,

Marc
>-----Original Message-----
>From: Johannes Luber [mailto:JALuber at gmx.de]
>Sent: Friday, January 15, 2010 1:59 PM
>To: Marc Speyer; antlr-interest at antlr.org
>Subject: Re: [antlr-interest] Tree pattern maching using the C target
>
>> Hi all,
>>
>> I have a similar issue using the C# target. Using the Cymbol.g example of
>> pattern 17 Symbol Table for Nested Scopes of the Language Implementation
>> Patterns book I could not get it to work because there is no downup
>> method.
>> According to the documentation this method walks the AST using
>> ANTLR's built-in downup( ) strategy.
>>
>> Am I correct assuming that this has not been implemented yet for the C#
>> target (as Jim implies in his response). Is it difficult to implement it
>> myself? I guess it would involve implementing the tree pattern matching
>> stuff.
>>
>> Marc
>
>You are correct - there is no official version yet that implements tree
>pattern matching. I haven't gotten around to the API changes yet (will work
>on that next week), though I have checked in some untested changes. It
>would be easiest if you based your own code on that for now.
>
>Johannes
>
>> P.S. Hope this email files under the proper subject thread, and apologies
>> in
>> advance if it isn't (Just subscribed to the mailing list but I could not
>> find out how to get previous posts from it)
>>
>> > Pattern matcher or normal tree walker? The pattern stuff is not
>> implemented in the C target yet.
>> >
>> > Jim
>> >
>> >> -----Original Message-----
>> >> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>> >> bounces at antlr.org] On Behalf Of Heiko Folkerts
>> >> Sent: Thursday, January 14, 2010 5:01 AM
>> >> To: antlr-interest at antlr.org
>> >> Subject: [antlr-interest] Tree pattern maching using the C target
>> >>
>> >> Hi all,
>> >> I wrote a little tree pattern matcher for a specific validation we need
>> >> in our grammar. ANTLR and the C compiler compile it all well, but there
>> >> is no "downup" method for running the matcher. Instead I only see our
>> >> own rules in the generated parser. So, is the method to run when using
>> >> a tree pattern matcher in the C target different from "downup"? How do I
>> >> run the matcher?
>> >>
>> >> I tried to find an answer in the C examples but there was only a
>> >> treeparser and no tree pattern matcher.
>> >>
>> >> Thx+
>> >> Heiko
>> >>
>> >>


From jimi at temporal-wave.com  Mon Jan 18 09:01:45 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Mon, 18 Jan 2010 09:01:45 -0800
Subject: [antlr-interest] Rule ignored in CSS grammar
In-Reply-To: <df0726361001180617w7bea943cj8522b97aafbe8e55@mail.gmail.com>
Message-ID: <3212d1f335ec784fa51b4f69cb159efc@temporal-wave.com>

You might start with this one 

http://www.antlr.org/grammar/1240941192304/css21.g


(CSS 2.1 grammar that I contributed) and just upgrade it to CSS 3, which is not much different to be honest. The main difficulties are properly lexing the input and the example you quote does not do the lexing correctly. Adding to the parsing constructs is trivial.

I am not sure why you would try to prevent the use of '_' as that is part of the spec. However the way you are doing it will not work anyway because the lexer is context free and just produces the tokens it sees (it is not driven by the parser). 
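
To illustrate the "context free" point, here is a toy lexer in plain Java. The two rule patterns are guesses (the attached css.g is not visible here), and ContextFreeLexerDemo is a made-up name. Each word is classified by the first rule that matches it, with no knowledge of whether the parser is currently inside a selector or a declaration, so a tag name like img comes out as CSSPROPERTYNAME even in selector position.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical lexer: classifies words purely by their text, never by parser context.
final class ContextFreeLexerDemo {
    // Assumed rule order and patterns; when two rules match the same text, the first wins here.
    static final String[][] RULES = {
        {"CSSPROPERTYNAME", "[a-z-]+"},
        {"IDENT", "[a-z_][a-z0-9_-]*"},
    };

    static String classify(String word) {
        for (String[] rule : RULES) {
            if (word.matches(rule[1])) return rule[0];
        }
        return "UNKNOWN";
    }

    // Splits on whitespace and CSS punctuation, then labels every remaining word.
    static List<String> lex(String input) {
        List<String> out = new ArrayList<>();
        for (String word : input.trim().split("[\\s.{}:;]+")) {
            if (!word.isEmpty()) out.add(classify(word) + ":" + word);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(lex(".class_selector img { property: value; }"));
    }
}
```

Under these assumptions only class_selector is labelled IDENT (because of the underscore); img, property and value all become CSSPROPERTYNAME, which would explain a selector rule that expects IDENT no longer matching.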

Jim


> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Michnay Bal?zs
> Sent: Monday, January 18, 2010 6:18 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Rule ignored in CSS grammar
> 
> Hi Guys,
> 
> The attached grammar is supposed to parse CSS files. I used this as an
> initial version:
> 
> http://www.antlr.org/grammar/1214945003224/csst3.g
> 
> First I tried to add a functionality to prevent "_" chars for property
> names, so I created a new lexer rule "CSSPROPERTYNAME" to ensure this.
> The "declaration" rule has been updated accordingly. The funny thing is
> that now the "selector" rule fails to recognize tag selectors like:
> 
> .class_selector img {
>   property: value;
>   ...
> }
> 
> Since my update should only affect property names and not selectors, I
> really do not understand what the problem is. I tried to define lexer
> rules as both fragments and literal values, no luck.
> 
> I used ANTLRWorks to debug this and have noticed that in the "selector"
> rule "selectorOperation" is ignored:
> 
> selector
>   : elem selectorOperation* attrib* pseudo? ->  elem selectorOperation*
> attrib* pseudo*
>   ;
> 
> Any ideas?
> 
> Thanks.


From Jim.Mayer at xerox.com  Mon Jan 18 10:38:00 2010
From: Jim.Mayer at xerox.com (Mayer, Jim)
Date: Mon, 18 Jan 2010 10:38:00 -0800
Subject: [antlr-interest] ANTLRWorks user registration and firewalls
Message-ID: <80EA5989D3149B42B9816C8BE2BADD230E709BA2@USA7061MS02.na.xerox.net>

Hi,

 
I'm having difficulty getting ANTLRWorks to start up at work.  At home,
the system works fine.  A quick inspection of the code suggests that the
problem is that ANTLRWorks tracks usage statistics and insists upon
getting an "ID" from a site at antlr.org as part of its initial startup
(this happens even if you ask it to not send information during the
"Welcome to ANTLRWorks" dialog).

 
Has anyone else run into this problem?  I did some web searches and
didn't see any.

 
In addition, I am uncomfortable that the package collects usage
statistics (even innocuous ones) without announcing the fact or
requesting permission.  I would prefer that ANTLRWorks use an "opt in"
mechanism, and that if users decline to register that the package have
no communication off box.

 
Thanks.

 
-- Jim Mayer


From David.Grieve at Sun.COM  Mon Jan 18 12:29:17 2010
From: David.Grieve at Sun.COM (David Grieve)
Date: Mon, 18 Jan 2010 15:29:17 -0500
Subject: [antlr-interest] Detecting a space as a token
Message-ID: <9C71029D-1ED3-4C32-9E08-7BA4C8C40B92@Sun.com>

In CSS, a selector is (roughly) a sequence of simple selectors joined by a combinator. http://www.antlr.org/grammar/1240941192304/css21.g has the following rules which correspond to this. 

combinator
	: PLUS
	| GREATER
	|
	;
	
selector
	: simpleSelector (combinator simpleSelector)*
	;

The issue I'm having is how to handle the combinator which is a space in the selector rule. Specifically, I should be able to parse 
	
	A .b

as two simple selectors: A and .b. However, since whitespace is ignored, this is getting parsed as one selector. The following parses as desired: 

	A *.b

Using the universal selector as part of the second simple selector is a workaround that I shouldn't have to employ. 

How can I parse "A<space>.b" such that the space is recognized as a combinator? Thanks in advance for any help!
David Grieve 
Sun Microsystems, Inc.


From jimi at temporal-wave.com  Mon Jan 18 14:45:29 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Mon, 18 Jan 2010 14:45:29 -0800
Subject: [antlr-interest] Detecting a space as a token
In-Reply-To: <9C71029D-1ED3-4C32-9E08-7BA4C8C40B92@Sun.com>
Message-ID: <3cf75046cfa0ba49b72c4c110e903cd6@temporal-wave.com>

All you need do is use a predicate at the DOT, which is where the esPred rule is. You can change the syntactic predicate to a semantic predicate and check for #, ., [ and : via input.LA(1), but you can also look at the previous token, even if it is off channel, to make sure it is not a space:

simpleSelector
	: elementName 
		({ mySemPred() }?=>elementSubsequent)*
		
	| ({ mySemPred() }?=>elementSubsequent)+
	;

@parser::members {

boolean mySemPred() {
    switch (input.LA(1)) {
    case DOT:
        // Only if no preceding spaces (but is that correct for CSS?).
        // This needs WS on a hidden channel; skipped tokens never reach the buffer.
        return input.index() == 0
            || input.get(input.index() - 1).getType() != WS;
    case HASH:
    case LBRACKET:
    case COLON:
        return true;
    default:
        return false;
    }
}
}
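The lookback idea can be tried out without the ANTLR runtime. What follows is only a minimal sketch in plain Java: the class name SelectorLookback, the token-type constants, and the method continuesSimpleSelector are hypothetical stand-ins for the generated parser, its token vocabulary, and mySemPred(). The list plays the role of the full token buffer, including whitespace tokens that the parser itself would never see.

```java
import java.util.Arrays;
import java.util.List;

final class SelectorLookback {
    // Hypothetical token types standing in for the grammar's token vocabulary.
    static final int IDENT = 1, DOT = 2, HASH = 3, LBRACKET = 4, COLON = 5, WS = 6;

    /**
     * Decides whether the token at position i continues the current simple
     * selector. The list holds every token type in input order, including
     * whitespace that a real parser would skip over on a hidden channel.
     */
    static boolean continuesSimpleSelector(List<Integer> types, int i) {
        switch (types.get(i)) {
            case DOT:
                // A class selector only attaches when no whitespace precedes it.
                return i == 0 || types.get(i - 1) != WS;
            case HASH:
            case LBRACKET:
            case COLON:
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        List<Integer> attached = Arrays.asList(IDENT, DOT, IDENT);        // A.b
        List<Integer> descendant = Arrays.asList(IDENT, WS, DOT, IDENT);  // A .b
        System.out.println(continuesSimpleSelector(attached, 1));   // true
        System.out.println(continuesSimpleSelector(descendant, 2)); // false
    }
}
```

With "A.b" the DOT follows an identifier directly, so it stays in the same simple selector; with "A .b" the preceding whitespace makes the predicate fail, which leaves the (empty) combinator alternative to start a new simple selector.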

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of David Grieve
> Sent: Monday, January 18, 2010 12:29 PM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Detecting a space as a token
> 
> In CSS, a selector is (roughly) a sequence of simple selectors joined
> by a combinator. http://www.antlr.org/grammar/1240941192304/css21.g has
> the following rules which correspond to this.
> 
> combinator
> 	: PLUS
> 	| GREATER
> 	|
> 	;
> 
> selector
> 	: simpleSelector (combinator simpleSelector)*
> 	;
> 
> The issue I'm having is how to handle the combinator which is a space
> in the selector rule. Specifically, I should be able to parse
> 
> 	A .b
> 
> as two simple selectors: A and .b. However, since whitespace is
> ignored, this is getting parsed as one selector. The following parses
> as desired:
> 
> 	A *.b
> 
> Using the universal selector as part of the second simple selector is a
> workaround that I shouldn't have to employ.
> 
> How can I parse "A<space>.b" such that the space is recognized as a
> combinator? Thanks in advance for any help!
> David Grieve
> Sun Microsystems, Inc.
> 
> 
> 
> 
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
> email-address


From pcc482719 at gmail.com  Tue Jan 19 08:47:52 2010
From: pcc482719 at gmail.com (Peter C. Chapin)
Date: Tue, 19 Jan 2010 11:47:52 -0500
Subject: [antlr-interest] v3.2 C# runtime?
Message-ID: <4B55E238.1080408@gmail.com>

I'm looking for the ANTLR v3.2 C# runtime support assemblies. I must be
missing something because I'm having no luck finding them. The page here

    http://www.antlr.org/download/CSharp

does not include them. I thought, "Oh, okay... I'll download the source
code and compile it myself." However the file antlr-3.2.tar.gz pointed
at by the "ANTLR 3.2 source distribution" link on
http://www.antlr.org/download.html seems to only contain the source for
the Java runtime. I attempted a Google search but it only turned up the
C# runtime for v3.1.3.

Peter


From parrt at antlr.org  Tue Jan 19 12:30:20 2010
From: parrt at antlr.org (Terence Parr)
Date: Tue, 19 Jan 2010 12:30:20 -0800
Subject: [antlr-interest] ANTLR v4 planning stages
Message-ID: <6B4FBB03-3E1A-4DC9-9788-20787BF2A94F@antlr.org>

hiya. I'm now ready to embark on ANTLR (and ANTLRWorks) code development after 2 years in book-writing mode. I've come to the conclusion that we need to completely rebuild ANTLR v3, yielding v4. After I finish, I'll update The Definitive ANTLR Reference for v4.

how this came about: I must reimplement ANTLR v3 in v3, just like I did recently for ST (yielding ST v4). Besides the v2 dependency being untidy, important projects like Eclipse cannot include ANTLR at the moment due to license restrictions on that dependency. After discussions with the other developers, I've come to the conclusion that it would be best to rewrite the tool from scratch. I'm talking about the tool itself here; the runtime should remain the same, although I hope to optimize the generated code quite a bit.

here is the planning page:

http://www.antlr.org/wiki/display/~admin/ANTLR+v4+plans

no doubt there will be bug fix releases for v3 as we go along.

While there was a huge discontinuity between v2 and v3, that was because of the completely new approach. v3 to v4 should be backward compatible for most grammars. The rest should only require a few tweaks. My goal is simply to reimplement existing functionality first and then consider a number of improvements (such as the cool new expression grammar notation). Consider v4.0 a giant refactoring pass on the internals.

Ter

From parrt at antlr.org  Tue Jan 19 12:30:50 2010
From: parrt at antlr.org (Terence Parr)
Date: Tue, 19 Jan 2010 12:30:50 -0800
Subject: [antlr-interest] ANTLR v4 lexer thoughts
Message-ID: <32CAC08E-830F-4738-8AEC-74CD5CA8C7C0@antlr.org>

In the realm of future improvements, I'm thinking about changing the generated code for lexer grammars. My thoughts are here:

http://www.antlr.org/wiki/display/~admin/2010/01/19/ANTLR+v4+lexers

Ter

From scott at javadude.com  Tue Jan 19 12:49:59 2010
From: scott at javadude.com (Scott Stanchfield)
Date: Tue, 19 Jan 2010 15:49:59 -0500
Subject: [antlr-interest] ANTLR v4 planning stages
In-Reply-To: <6B4FBB03-3E1A-4DC9-9788-20787BF2A94F@antlr.org>
References: <6B4FBB03-3E1A-4DC9-9788-20787BF2A94F@antlr.org>
Message-ID: <d19d16481001191249w22e918e2q7ee3c7b76ce5ab0d@mail.gmail.com>

RE Language-agnostic actions - if you treat this as a strategy pattern
(like I seem to recall you did in the antlr 2 code base) this could
work really well. What would be really cool IMNSHO:

   grammar Foo;
    foo : xxxxxx  {@doX(...); }  ;
    fee : xxxxxx  {@doY(...); }  ;

and the generators could generate a spec/interface/abstract class for
the action methods, like in Java:

  public interface FooActionStrategy {
      void doX(...);
      void doY(...);
  }

and generate

   setActionStrategy(FooActionStrategy x) {...}

that would be used in the code. All that's needed is an implementation.

If all of the action code were simple action-strategy calls, this
should be generatable in pretty much any target language. (Of course I
haven't given this much thought, but it feels pretty good OTTOMH)
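To make the shape of this concrete, here is a minimal sketch in plain Java, with nothing generated by ANTLR. The names FooParser and RecordingActions, and the single String parameter on doX/doY, are assumptions made for illustration only (the parameter lists above are left open as "..."). FooParser stands in for what a generator might emit; RecordingActions is the part a user would write by hand.

```java
import java.util.ArrayList;
import java.util.List;

final class ActionStrategyDemo {
    /** What a generator might emit for the action names used in the grammar. */
    interface FooActionStrategy {
        void doX(String text);
        void doY(String text);
    }

    /** Stand-in for the generated parser; its rules only call the strategy. */
    static final class FooParser {
        private FooActionStrategy actions;

        void setActionStrategy(FooActionStrategy actions) {
            this.actions = actions;
        }

        // Each "rule" skips real matching here and just fires its action.
        void foo(String matched) { actions.doX(matched); }
        void fee(String matched) { actions.doY(matched); }
    }

    /** Hand-written implementation holding all target-language logic. */
    static final class RecordingActions implements FooActionStrategy {
        final List<String> log = new ArrayList<>();

        @Override public void doX(String text) { log.add("X:" + text); }
        @Override public void doY(String text) { log.add("Y:" + text); }
    }

    public static void main(String[] args) {
        FooParser parser = new FooParser();
        RecordingActions actions = new RecordingActions();
        parser.setActionStrategy(actions);
        parser.foo("a");
        parser.fee("b");
        System.out.println(actions.log); // [X:a, Y:b]
    }
}
```

The grammar side stays free of target-language code; swapping RecordingActions for another implementation changes behavior without regenerating the parser.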

-- Scott

----------------------------------------
Scott Stanchfield
http://javadude.com

From scott at javadude.com  Tue Jan 19 12:50:19 2010
From: scott at javadude.com (Scott Stanchfield)
Date: Tue, 19 Jan 2010 15:50:19 -0500
Subject: [antlr-interest] ANTLR v4 planning stages
In-Reply-To: <6B4FBB03-3E1A-4DC9-9788-20787BF2A94F@antlr.org>
References: <6B4FBB03-3E1A-4DC9-9788-20787BF2A94F@antlr.org>
Message-ID: <d19d16481001191250k6ec6659eu2b3d228fd0eb7325@mail.gmail.com>

A little thing to add to the todo list if possible:

I've been looking into debugging support in eclipse. When generating
code, can you add in source-grammar-line/col-matchup comments a bit
more often? in particular, having them appear just before any action
code that's dropped into the generated code would be cool.

Even better: if the comments could also appear before/after attribute
expansion that would help as well.

My goal is to be able to use the target-language debugger and map the
current code position back to the grammar. This allows walking the
grammar while being able to use all of the features of the
target-language debugger (like inspecting variables and such).

I know how to set this up for Java (I did it in ANTLR 2 using Java
SMAPs and it worked well), and I suspect other target languages could
do something similar with a bit more information.
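As a rough illustration of the mapping step, the sketch below scans generated source for position comments and records, for every generated line, the most recent grammar position seen. It is plain Java and makes assumptions throughout: the comment shape "// Foo.g:3:7:" is modeled on the position comments the ANTLR 3 Java target emits, and GrammarPositionMap and map are hypothetical names, not part of any ANTLR or Eclipse API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GrammarPositionMap {
    // Assumed comment shape: "// <grammar file>.g:<line>:<column>:"
    private static final Pattern POSITION =
            Pattern.compile("//\\s*(\\S+\\.g):(\\d+):(\\d+):");

    /**
     * For each generated line, returns the most recent grammar position seen
     * at or before it, as "file:line:col", or null if none has been seen yet.
     */
    static List<String> map(List<String> generatedLines) {
        List<String> positions = new ArrayList<>();
        String current = null;
        for (String line : generatedLines) {
            Matcher m = POSITION.matcher(line);
            if (m.find()) {
                current = m.group(1) + ":" + m.group(2) + ":" + m.group(3);
            }
            positions.add(current);
        }
        return positions;
    }

    public static void main(String[] args) {
        List<String> generated = Arrays.asList(
                "public void foo() {",
                "    // Foo.g:3:7: xxxxxx",
                "    match(input, XXXXXX);",
                "    // Foo.g:3:15: action",
                "    doX();",
                "}");
        List<String> positions = map(generated);
        for (int i = 0; i < generated.size(); i++) {
            System.out.println((i + 1) + " -> " + positions.get(i));
        }
    }
}
```

The denser the position comments, the finer the mapping: with a comment just before each action, a debugger stopped on the doX() line resolves to the action's own grammar position rather than to the enclosing alternative.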


BTW: +1 for $FIRST/$FOLLOW!

-- Scott

----------------------------------------
Scott Stanchfield
http://javadude.com

From scott at javadude.com  Tue Jan 19 12:52:34 2010
From: scott at javadude.com (Scott Stanchfield)
Date: Tue, 19 Jan 2010 15:52:34 -0500
Subject: [antlr-interest] ANTLR v4 planning stages
In-Reply-To: <d19d16481001191249w22e918e2q7ee3c7b76ce5ab0d@mail.gmail.com>
References: <6B4FBB03-3E1A-4DC9-9788-20787BF2A94F@antlr.org>
	<d19d16481001191249w22e918e2q7ee3c7b76ce5ab0d@mail.gmail.com>
Message-ID: <d19d16481001191252t5193408mc7254ebc899b0ba2@mail.gmail.com>

oh - just noticed the language-agnostic symbol-table management -
might be able to do something similar using a strategy pattern.
Perhaps use something like IDL (semi-gack) to specify scopes?
-- Scott

----------------------------------------
Scott Stanchfield
http://javadude.com

From parrt at cs.usfca.edu  Tue Jan 19 12:55:08 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Tue, 19 Jan 2010 12:55:08 -0800
Subject: [antlr-interest] ANTLR v4 planning stages
In-Reply-To: <d19d16481001191249w22e918e2q7ee3c7b76ce5ab0d@mail.gmail.com>
References: <6B4FBB03-3E1A-4DC9-9788-20787BF2A94F@antlr.org>
	<d19d16481001191249w22e918e2q7ee3c7b76ce5ab0d@mail.gmail.com>
Message-ID: <9FBCAA66-66F1-4BE0-A482-62FBA7268FC4@cs.usfca.edu>

yeah, I was wondering how we would integrate a generic language (NIL? neutral imperative language? chuckle) with the surrounding code in whatever language. Named method calls like this could work well.
Ter
On Jan 19, 2010, at 12:49 PM, Scott Stanchfield wrote:

> RE Language-agnostic actions - if you treat this as a strategy pattern
> (like I seem to recall you did in the antlr 2 code base) this could
> work really well. What would be really cool IMNSHO:
> 
>   grammar Foo;
>    foo : xxxxxx  {@doX(...); }  ;
>    fee : xxxxxx  {@doY(...); }  ;
> 
> and the generators could generate a spec/interface/abstract class for
> the action methods, like in Java:
> 
>  public interface FooActionStrategy {
>      void doX(...);
>      void doY(...);
>  }
> 
> and generate
> 
>   setActionStrategy(FooActionStrategy x) {...}
> 
> that would be used in the code. All that's needed is an implementation.
> 
> If all of the action code were simple action-strategy calls, this
> should be generatable in pretty much any target language. (Of course I
> haven't given this much thought, but it feels pretty good OTTOMH)
> 
> -- Scott
> 
> ----------------------------------------
> Scott Stanchfield
> http://javadude.com
> 


From parrt at antlr.org  Tue Jan 19 12:56:31 2010
From: parrt at antlr.org (Terence Parr)
Date: Tue, 19 Jan 2010 12:56:31 -0800
Subject: [antlr-interest] ANTLR v4 planning stages
In-Reply-To: <d19d16481001191250k6ec6659eu2b3d228fd0eb7325@mail.gmail.com>
References: <6B4FBB03-3E1A-4DC9-9788-20787BF2A94F@antlr.org>
	<d19d16481001191250k6ec6659eu2b3d228fd0eb7325@mail.gmail.com>
Message-ID: <C7C1EE44-B046-4CBD-A8FE-6918B89F34F7@antlr.org>


On Jan 19, 2010, at 12:50 PM, Scott Stanchfield wrote:

> A little thing to add to the todo list if possible:
> 
> I've been looking into debugging support in eclipse. When generating
> code, can you add in source-grammar-line/col-matchup comments a bit
> more often? in particular, having them appear just before any action
> code that's dropped into the generated code would be cool.

yeah, easy to do

> Even better: if the comments could also appear before/after attribute
> expansion that would help as well.

 that could work although it might cloud the output a little bit.

> My goal is to be able to use the target-language debugger and map the
> current code position back to the grammar. This allows walking the
> grammar while being able to use all of the features of the
> target-language debugger (like inspecting variables and such).

yeah, it's amazing how much those comments help even as they are.

> I know how to set this up for Java (I did it in ANTLR 2 using Java
> SMAPs and it worked well), and I suspect other target languages could
> do something similar with a bit more information.
> 
> BTW: +1 for $FIRST/$FOLLOW!

yep, long overdue
T

From scott at javadude.com  Tue Jan 19 13:03:05 2010
From: scott at javadude.com (Scott Stanchfield)
Date: Tue, 19 Jan 2010 16:03:05 -0500
Subject: [antlr-interest] ANTLR v4 planning stages
In-Reply-To: <9FBCAA66-66F1-4BE0-A482-62FBA7268FC4@cs.usfca.edu>
References: <6B4FBB03-3E1A-4DC9-9788-20787BF2A94F@antlr.org>
	<d19d16481001191249w22e918e2q7ee3c7b76ce5ab0d@mail.gmail.com>
	<9FBCAA66-66F1-4BE0-A482-62FBA7268FC4@cs.usfca.edu>
Message-ID: <d19d16481001191303q45c6288dx526541d2ebada29e@mail.gmail.com>

NIL ;) Ahhh, gotta love the TLAs...

If you keep the actions to strictly method calls passing
attribute-expressions as values (and don't allow anything else) I'd
think it would keep things simple for use and code generation.

I assume you'd still allow target-language-specific actions, too, eh?
Perhaps an option to specify NIL actions (wow - that might be a
confusing name ;) or target-language actions - might be best not to
mix 'em...

-- Scott

----------------------------------------
Scott Stanchfield
http://javadude.com

From parrt at cs.usfca.edu  Tue Jan 19 13:07:04 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Tue, 19 Jan 2010 13:07:04 -0800
Subject: [antlr-interest] ANTLR v4 planning stages
In-Reply-To: <d19d16481001191303q45c6288dx526541d2ebada29e@mail.gmail.com>
References: <6B4FBB03-3E1A-4DC9-9788-20787BF2A94F@antlr.org>
	<d19d16481001191249w22e918e2q7ee3c7b76ce5ab0d@mail.gmail.com>
	<9FBCAA66-66F1-4BE0-A482-62FBA7268FC4@cs.usfca.edu>
	<d19d16481001191303q45c6288dx526541d2ebada29e@mail.gmail.com>
Message-ID: <455CF418-6925-460F-A6AB-875F40DD6783@cs.usfca.edu>


On Jan 19, 2010, at 1:03 PM, Scott Stanchfield wrote:

> NIL ;) Ahhh, gotta love the TLAs...
> 
> If you keep the actions to strictly method calls passing
> attribute-expressions as values (and don't allow anything else) I'd
> think it would keep things simple for use and code generation.

i was thinking something like  arbitrary code in some simple imperative language and then any call to @foo() or whatever would call foo in the target language.

> I assume you'd still allow target-language-specific actions, too, eh?
> Perhaps an option to specify NIL actions (wow - that might be a
> confusing name ;) or target-language actions - might be best not to
> mix 'em...

we'd use language=Java for the target language as we do now and then add perhaps actions=NIL to specify what the actions look like. The default would be actions=language.

Perhaps ALE or AIL=ANTLR imperative language? :)
T

From scott at javadude.com  Tue Jan 19 13:19:22 2010
From: scott at javadude.com (Scott Stanchfield)
Date: Tue, 19 Jan 2010 16:19:22 -0500
Subject: [antlr-interest] ANTLR v4 planning stages
In-Reply-To: <455CF418-6925-460F-A6AB-875F40DD6783@cs.usfca.edu>
References: <6B4FBB03-3E1A-4DC9-9788-20787BF2A94F@antlr.org>
	<d19d16481001191249w22e918e2q7ee3c7b76ce5ab0d@mail.gmail.com>
	<9FBCAA66-66F1-4BE0-A482-62FBA7268FC4@cs.usfca.edu>
	<d19d16481001191303q45c6288dx526541d2ebada29e@mail.gmail.com>
	<455CF418-6925-460F-A6AB-875F40DD6783@cs.usfca.edu>
Message-ID: <d19d16481001191319l129dcedfr45bd4190523459ff@mail.gmail.com>

>> If you keep the actions to strictly method calls passing
>> attribute-expressions as values (and don't allow anything else) I'd
>> think it would keep things simple for use and code generation.
>
> i was thinking something like arbitrary code in some simple imperative language and then any call to @foo() or whatever would call foo in the target language.

The thing I'd worry about would be feature creep in that language.
Everyone would want "just one more feature" so it could better support
their target language. You'd need to nail down that simple language so
the generators for it could be written - if any new features were
added all generators would be hit.

If you kept it to simple method calls, they could do whatever logic
they want inside the called method. This would force them to keep the
grammar cleaner as well, actions just being calls to the strategy.

Anyway, that's my 3c. I know you like writing languages ;) but my
recommendation would be keep it simple and small, and anything more
complex can be done inside the called methods.

Chew on it a bit and see if anything interesting gets spit up or swallowed...

>> I assume you'd still allow target-language-specific actions, too, eh?
>> Perhaps an option to specify NIL actions (wow - that might be a
>> confusing name ;) or target-language actions - might be best not to
>> mix 'em...
>
> we'd use language=Java for the target language as we do now and then add perhaps actions=NIL to specify what the actions look like. The default would be actions=language.

Cool - best of both worlds.

> Perhaps ALE or AIL=ANTLR imperative language? :)

Not bad... (though "AIL" conjures images of sickness)

ANTLRScript?  (shudder)

-- Scott

From parrt at cs.usfca.edu  Tue Jan 19 16:33:20 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Tue, 19 Jan 2010 16:33:20 -0800
Subject: [antlr-interest] ANTLR v4 planning stages
In-Reply-To: <d19d16481001191319l129dcedfr45bd4190523459ff@mail.gmail.com>
References: <6B4FBB03-3E1A-4DC9-9788-20787BF2A94F@antlr.org>
	<d19d16481001191249w22e918e2q7ee3c7b76ce5ab0d@mail.gmail.com>
	<9FBCAA66-66F1-4BE0-A482-62FBA7268FC4@cs.usfca.edu>
	<d19d16481001191303q45c6288dx526541d2ebada29e@mail.gmail.com>
	<455CF418-6925-460F-A6AB-875F40DD6783@cs.usfca.edu>
	<d19d16481001191319l129dcedfr45bd4190523459ff@mail.gmail.com>
Message-ID: <8928CE11-E3D6-4647-81F9-66F810759FE1@cs.usfca.edu>


On Jan 19, 2010, at 1:19 PM, Scott Stanchfield wrote:
>> i was thinking something like  arbitrary code in some simple imperative language and then any call to @foo() or whatever would call foo in the target language.
> 
> The thing I'd worry about would be feature creep in that language.
> Everyone would want "just one more feature" so it could better support
> their target language. You'd need to nail down that simple language so
> the generators for it could be written - if any new features were
> added all generators would be hit.

yeah, a real danger.

> ANTLRScript?  (shudder)

i like! good idea.

Ter

From JALuber at gmx.de  Tue Jan 19 16:42:02 2010
From: JALuber at gmx.de (Johannes Luber)
Date: Wed, 20 Jan 2010 01:42:02 +0100
Subject: [antlr-interest] Tree pattern maching using the C# (was
	C)	target
In-Reply-To: <002401ca985c$bd00f450$3702dcf0$@y.speyer@inter.nl.net>
References: <000901ca95d9$df6dd740$9e4985c0$@y.speyer@inter.nl.net>
	<20100115125833.242280@gmx.net>
	<002401ca985c$bd00f450$3702dcf0$@y.speyer@inter.nl.net>
Message-ID: <20100120004202.274260@gmx.net>

> Hi Johannes,
> 
> I tried the version that you mentioned by downloading it from
> antlr:/runtime/CSharp2 in the Fisheye code repository and then tried to
> compile it using VS2008. This didn't work because VS2008 reported a
> missing file, "TokenConstants.cs", and gave me compilation errors. I
> managed to get a version from the CSharp3 repository and after making
> one change I could compile.

Oops - I thought that I had checked in that file already. Can you send both TokenConstants.cs (for comparing with my own version) and the modified grammar file to the list? I'm not sure where the error can be, as I lifted more than a few files from the CSharp3 target.

Sam, can you check if the grammar works with CSharp3 target? It would be helpful to narrow down the cause.

Johannes

> I noticed that the Downup method is part of the TreeFilter
> class which inherits from the TreeParser class. The grammar for the tree
> parser from the example has the following header:
> 
> // START: header
> tree grammar DefRef;
> options {
>   tokenVocab = Cymbol;
>   ASTLabelType = CommonTree;
>   filter = true;
>   language=CSharp2;
> }
> @members {
>     SymbolTable symtab;
>     Scope currentScope;
>     public DefRef(ITreeNodeStream input, SymbolTable symtab) 
>     	: this(input) 
>     {
>         this.symtab = symtab;
>         currentScope = symtab.globals;
>     }
> }
> // END: header
> 
> Generating the tree parser gives DefRef.cs with the DefRef class declared
> as:
> 
> public partial class DefRef : TreeParser
> 
> 
> Now I can cast this into the TreeFilter class but to test things quickly I
> changed the above line in the DefRef.cs into:
> 
> public partial class DefRef : TreeFilter
> 
> 
> In the calling program I use:
> 
> DefRef def = new DefRef(nodes, symtab); // use custom constructor
> def.Downup(t); // trigger symtab actions upon certain subtrees
> 
> When I run this nothing happens, even though I have grammar rules and
> actions like:
> 
> exitBlock
>     :   BLOCK
>         {
>         Console.WriteLine("locals: "+currentScope);
>         currentScope = currentScope.getEnclosingScope();    // pop scope
>         }
>     ;
> 
> I have not figured out yet why this doesn't work. The example is a
> one-to-one port of the Java example for Pattern 17, Symbol Table for
> Nested Scopes, from the Language Implementation Patterns book.
> 
> Any idea?
> 
> Thanks,
> 
> Marc
> >-----Original Message-----
> >From: Johannes Luber [mailto:JALuber at gmx.de]
> >Sent: Friday, January 15, 2010 1:59 PM
> >To: Marc Speyer; antlr-interest at antlr.org
> >Subject: Re: [antlr-interest] Tree pattern maching using the C target
> >
> >> Hi all,
> >>
> >> I have a similar issue using the C# target. Using the Cymbol.g example
> >> of Pattern 17, Symbol Table for Nested Scopes, from the Language
> >> Implementation Patterns book, I could not get it to work because there
> >> is no downup method. According to the documentation this method walks
> >> the AST using ANTLR's built-in downup() strategy.
> >>
> >> Am I correct in assuming that this has not been implemented yet for the
> >> C# target (as Jim implies in his response)? Is it difficult to implement
> >> it myself? I guess it would involve implementing the tree pattern
> >> matching stuff.
> >>
> >> Marc
> >
> >You are correct - there is no official version yet that implements tree
> >pattern matching. I haven't gotten around to the API changes yet (will
> >work on that next week), though I have checked in some untested changes.
> >It would be easiest if you based your own code on those for now.
> >
> >Johannes
> >
> >> P.S. Hope this email files under the proper subject thread, and
> >> apologies in advance if it doesn't (I just subscribed to the mailing
> >> list but could not find out how to get previous posts from it)
> >>
> >> > Pattern matcher or normal tree walker? The pattern stuff is not
> >> > implemented in the C target yet.
> >> >
> >> > Jim
> >> >
> >> >> -----Original Message-----
> >> >> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> >> >> bounces at antlr.org] On Behalf Of Heiko Folkerts
> >> >> Sent: Thursday, January 14, 2010 5:01 AM
> >> >> To: antlr-interest at antlr.org
> >> >> Subject: [antlr-interest] Tree pattern maching using the C target
> >> >>
> >> >> Hi all,
> >> >> I wrote a little tree pattern matcher for a specific validation we
> >> >> need in our grammar. ANTLR and the C compiler compile it all well but
> >> >> there is no "downup" method for running the matcher. Instead I only
> >> >> see our own rules in the generated parser. So, is the method to run
> >> >> when using a tree pattern matcher in the C target different from
> >> >> "downup"? How do I run the matcher?
> >> >>
> >> >> I tried to find an answer in the C examples but there was only a
> >> >> tree parser and no tree pattern matcher.
> >> >>
> >> >> Thx+
> >> >> Heiko
> >> >>
> >> >>


From JALuber at gmx.de  Tue Jan 19 16:46:34 2010
From: JALuber at gmx.de (Johannes Luber)
Date: Wed, 20 Jan 2010 01:46:34 +0100
Subject: [antlr-interest] v3.2 C# runtime?
In-Reply-To: <4B55E238.1080408@gmail.com>
References: <4B55E238.1080408@gmail.com>
Message-ID: <20100120004634.274280@gmx.net>

> I'm looking for the ANTLR v3.2 C# runtime support assemblies. I must be
> missing something because I'm having no luck finding it. The page here
> 
>     http://www.antlr.org/download/CSharp
> 
> does not include it. I thought, "Oh, okay... I'll download the source
> code and compile it myself." However the file antlr-3.2.tar.gz pointed
> at by the "ANTLR 3.2 source distribution" link on
> http://www.antlr.org/download.html seems to only contain the source for
> the Java runtime. I attempted a Google search but it only turned up the
> C# runtime for v3.1.3.
> 
> Peter

I'm still working on the 3.2 version. Unless you need the tree pattern matching feature, you can stick to the latest released version. Otherwise you'd need the repository version, for which at least one missing file and a bug have already been reported.

Johannes

From sharwell at pixelminegames.com  Tue Jan 19 17:30:43 2010
From: sharwell at pixelminegames.com (Sam Harwell)
Date: Tue, 19 Jan 2010 19:30:43 -0600
Subject: [antlr-interest] ANTLR v4 planning stages
References: <6B4FBB03-3E1A-4DC9-9788-20787BF2A94F@antlr.org>
	<d19d16481001191250k6ec6659eu2b3d228fd0eb7325@mail.gmail.com>
Message-ID: <DD5A5D428FE040429CCDF377FAA892840152DE82@martini.ironwillgames.com>

Would it work for you if this information was placed in an xml file next
to the generated code?

Sam

-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Scott Stanchfield
Sent: Tuesday, January 19, 2010 2:50 PM
To: Terence Parr
Cc: antlr-interest at antlr.org interest
Subject: Re: [antlr-interest] ANTLR v4 planning stages

A little thing to add to the todo list if possible:

I've been looking into debugging support in eclipse. When generating
code, can you add in source-grammar-line/col-matchup comments a bit
more often? in particular, having them appear just before any action
code that's dropped into the generated code would be cool.

Even better: if the comments could also appear before/after attribute
expansion that would help as well.

My goal is to be able to use the target-language debugger and map the
current code position back to the grammar. This allows walking the
grammar while being able to use all of the features of the
target-language debugger (like inspecting variables and such).

I know how to set this up for Java (I did it in ANTLR 2 using Java
SMAPs and it worked well), and I suspect other target languages could
do something similar with a bit more information.


BTW: +1 for $FIRST/$FOLLOW!

-- Scott

----------------------------------------
Scott Stanchfield
http://javadude.com


From sharwell at pixelminegames.com  Tue Jan 19 17:31:16 2010
From: sharwell at pixelminegames.com (Sam Harwell)
Date: Tue, 19 Jan 2010 19:31:16 -0600
Subject: [antlr-interest] Expression parsing ideas for ANTLR v4
Message-ID: <DD5A5D428FE040429CCDF377FAA892840152DE83@martini.ironwillgames.com>

Several expression parsers are limited to handling the binary operator
portion of the expression. Besides the obvious limitations, this poses
a problem for languages like C++, where the assignment operators are
split (in precedence) from the rest of the binary operators by the
ternary operator (?:). My most complicated production
completely new expression parser that offers a great deal more
flexibility than the previous approaches I tried. I don't think it's the
end-all solution for integrating expression parsing into ANTLR for v4,
but I believe it's a worthwhile example to show what's possible. Here
are some pros and cons of the implementation:

 
Pros:

* The source code declaring the operator precedence and associativity
  is very clean (see reference to UnrealScriptParserHelper.cs below)
* Very fast execution
* Supports a great deal more operations than simply binary operators
* Supports operator precedence and associativity in groups
* Directly supports changing the token type during AST generation -
  for example if the token '-' is named MINUS, you could produce an
  AST with AST_SUBTRACT when it appears as a binary operator and
  AST_NEGATE when it appears as a unary operator.

Cons:

* Not currently integrated into the ANTLR language (executes in code)
* No compile-time detection of ambiguous operator rules
* Not implemented as fully as is possible

 
General idea: Parse every component of an expression into a list - this
includes all operators and "atoms". The list is then passed to a
"precedence processor" to produce a tree for that expression.
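
As a rough illustration of the general idea (and not the attached C#
implementation), a "precedence processor" over a flat list could look
like the Python sketch below. The operator tables, the tuple-based tree
shape, and the names BINARY, PREFIX, BINARY_AST and build_tree are all
assumptions made for this example; it covers only prefix and binary
operators, using precedence climbing.

```python
# Hypothetical operator tables (not taken from UnrealScriptParserHelper.cs).
# Binary operators map to (precedence, associativity); higher binds tighter.
BINARY = {
    "=": (1, "right"),
    "+": (2, "left"),
    "-": (2, "left"),
    "*": (3, "left"),
    "/": (3, "left"),
}
# Prefix operators map to (minimum precedence of their operand, AST node name).
PREFIX = {"-": (4, "NEGATE")}
# AST node names for binary use: the same '-' token becomes SUBTRACT or NEGATE.
BINARY_AST = {"=": "ASSIGN", "+": "ADD", "-": "SUBTRACT",
              "*": "MULTIPLY", "/": "DIVIDE"}


def build_tree(items):
    """Turn a flat list of atoms and operator tokens into a nested tuple tree."""
    pos = 0

    def parse_operand():
        nonlocal pos
        if pos >= len(items):
            raise ValueError("expression ends where an operand was expected")
        tok = items[pos]
        pos += 1
        if tok in PREFIX:
            prec, name = PREFIX[tok]
            return (name, parse_expr(prec))
        if tok in BINARY:
            raise ValueError(f"unexpected operator {tok!r}")
        return tok  # anything else is treated as an atom

    def parse_expr(min_prec):
        nonlocal pos
        left = parse_operand()
        while pos < len(items) and items[pos] in BINARY:
            op = items[pos]
            prec, assoc = BINARY[op]
            if prec < min_prec:
                break
            pos += 1
            # Left-associative operators require a strictly tighter right side.
            next_min = prec + 1 if assoc == "left" else prec
            right = parse_expr(next_min)
            left = (BINARY_AST[op], left, right)
        return left

    tree = parse_expr(0)
    if pos != len(items):
        raise ValueError(f"unexpected token {items[pos]!r}")
    return tree
```

Calling build_tree(["a", "=", "-", "b", "*", "c"]) groups the negation
under the multiplication and the multiplication under the assignment.
Ternary and grouping operators would need further cases in
parse_operand and parse_expr.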

 
Operator categories: This parser was built with the following categories
in mind, but the grouping operators are not implemented at this point.
With this as a starting place, it's clear how the list might be expanded
in the future:

 
*         Unary operator: either prefix or postfix

*         Binary operator

*         Ternary operator

*         Grouping operator: for example, the ( and ) in (expression)

*         Postfix grouping operator: for example, the ( and ) in
methodName(args) or the [ and ] in var[index].

*         Prefix grouping operator: for example, the ( and ) in
(TargetType)objectToCast.

 
Attached is:

 
*         UnrealScriptParserHelper.cs: The complete code for declaring a
working precedence parser for UnrealScript.

*         Antlr.Runtime.Expressions.zip: The current implementation of
this feature.

 
I'm very interested in any feedback y'all may have on this.

 
Thank you,

Sam Harwell

 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: UnrealScriptParserHelper.cs
Type: application/octet-stream
Size: 5227 bytes
Desc: UnrealScriptParserHelper.cs
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20100119/97cae7d9/attachment.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Antlr.Runtime.Expressions.zip
Type: application/x-zip-compressed
Size: 6152 bytes
Desc: Antlr.Runtime.Expressions.zip
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20100119/97cae7d9/attachment.bin 

From pcc482719 at gmail.com  Tue Jan 19 18:48:07 2010
From: pcc482719 at gmail.com (Peter C. Chapin)
Date: Tue, 19 Jan 2010 21:48:07 -0500
Subject: [antlr-interest] v3.2 C# runtime?
In-Reply-To: <20100120004634.274280@gmx.net>
References: <4B55E238.1080408@gmail.com> <20100120004634.274280@gmx.net>
Message-ID: <4B566EE7.4040601@gmail.com>

On 2010-01-19 19:46, Johannes Luber wrote:

> I'm still working on the 3.2 version. Unless you need the tree pattern
> matching feature you can stick to the latest version. Otherwise you'd
> need the repository version, where at least one missing file and a bug
> have been reported already.

Okay, that's good to know! At least I'm not crazy for not finding it on
the regular download page. Anyway thanks for the update. I'll continue
with 3.1.3 for now.

Peter


From gustaf.j at gmail.com  Wed Jan 20 01:49:14 2010
From: gustaf.j at gmail.com (Gustaf Johansson)
Date: Wed, 20 Jan 2010 10:49:14 +0100
Subject: [antlr-interest] Implicit imports
Message-ID: <5f59a7211001200149i7c7ad186k50a1589d4862906b@mail.gmail.com>

Hi,

I have a grammar in which a few definitions can be imported
implicitly. For example:

module A {
  enum myEnumA { A1, A2, A3 }
}

module B {
  import module A;
  function myFuncB (int, myEnumA) {
    ...
  }
}


module Prog {
  import B;
  myFuncB (1, A2);  *
}

*Here A2 is implicitly known to be of type myEnumA, since the
definition of myFuncB is in B and B imports A.

The problem I have is that my parser reports A2 as unknown.
I have not come up with a good and simple solution to this.
I have been thinking along the lines of:
Check the definition of myFuncB and, if it takes an enum as argument,
check the local module's imports for the definition of that enum.

Any help is really appreciated.

Best Regards Gustaf

From linlin.xie at siemens.com  Wed Jan 20 03:31:44 2010
From: linlin.xie at siemens.com (Xie, Linlin)
Date: Wed, 20 Jan 2010 12:31:44 +0100
Subject: [antlr-interest] UTF-8 input?
Message-ID: <79118B9FE8CE8E49B0D71964A79CB647033CA2D5@dekomplm002.net.plm.eds.com>

Can anyone tell me whether a parser generated by ANTLR 3.1.3 works with
UTF-8 input? If it does, how should I configure this in the grammar? I
noticed there are two macros, ANTLR3_INLINE_INPUT_ASCII and
ANTLR3_INLINE_INPUT_UTF16, but no UTF-8 one.

 
Many thanks!

Linlin


From arne.schroeder at gmail.com  Wed Jan 20 03:58:27 2010
From: arne.schroeder at gmail.com (Arne Schröder)
Date: Wed, 20 Jan 2010 12:58:27 +0100
Subject: [antlr-interest] Missing error when tokens are left to parse
In-Reply-To: <4b504d15.2508c00a.6ff6.4932SMTPIN_ADDED@mx.google.com>
References: <d972facc1001150057q23140056s6453d145a0763817@mail.gmail.com> 
	<1ec078df1001150127r753cb368p3e70c1039d59101d@mail.gmail.com> 
	<d972facc1001150143m51bb493fi6c8ff8a58fe745fa@mail.gmail.com> 
	<4b504d15.2508c00a.6ff6.4932SMTPIN_ADDED@mx.google.com>
Message-ID: <d972facc1001200358q4255b1f7l2d34486c255926d7@mail.gmail.com>

Thank you for your help.

It now works insofar as the parser reports an error when it does not
encounter EOF after all rules are finished.

On Fri, Jan 15, 2010 at 12:10 PM, Gavin Lambert <antlr at mirality.co.nz> wrote:

> At 22:43 15/01/2010, Arne Schröder wrote:
> >file    : section1 section2?
> >        ;
> [...]
>
> >If I now try to parse "Section1 bla()) Section2" something similar
> >happens:
> >It parses up to the second ")" and then decides to skip the rest.
> >And I definitely do not want the second ")" to be there i.e. want
> >it to throw a recognition-error and recover itself.
>
> Try adding EOF to the end of your top-level rule.  Without that, ANTLR
> assumes that it is not required to parse all the input, so if it
> successfully parses a section1 it will just decide that the section2 has
> been omitted (since it's optional).
>
>

From JALuber at gmx.de  Wed Jan 20 04:08:14 2010
From: JALuber at gmx.de (Johannes Luber)
Date: Wed, 20 Jan 2010 13:08:14 +0100
Subject: [antlr-interest] Expression parsing ideas for ANTLR v4
In-Reply-To: <DD5A5D428FE040429CCDF377FAA892840152DE83@martini.ironwillgames.com>
References: <DD5A5D428FE040429CCDF377FAA892840152DE83@martini.ironwillgames.com>
Message-ID: <20100120120814.15250@gmx.net>

> Several expression parsers are limited to handling the binary operator
> portion of the expression. In addition to the obvious limitations, it
> poses an additional problem for languages like C++ where the assignment
> operators are split (in precedence) from the rest of the binary
> operators by the ternary operator (?:). My most complicated production
> ANTLR grammar (parses the UnrealScript language) currently uses a
> completely new expression parser that offers a great deal more
> flexibility than the previous approaches I tried. I don't think it's the
> end-all solution for integrating expression parsing into ANTLR for v4,
> but I believe it's a worthwhile example to show what's possible. Here
> are some pros and cons of the implementation:
> 
...
> 
> I'm very interested in any feedback y'all may have on this.
> 

As a layman in expression parsing I don't feel qualified to comment on whether your solution lacks certain features, but the way you define the operators looks clean to me. One knows immediately how operators work in a given language. The only non-obvious thing is whether the precedence is ascending or descending. I guess ascending, from my knowledge of C#.

BTW, which tokens are encoded as CATEQ and CAT2EQ?

Johannes

From linlin.xie at siemens.com  Wed Jan 20 04:23:33 2010
From: linlin.xie at siemens.com (Xie, Linlin)
Date: Wed, 20 Jan 2010 13:23:33 +0100
Subject: [antlr-interest] FW:  UTF-8 input?
Message-ID: <79118B9FE8CE8E49B0D71964A79CB647033CA342@dekomplm002.net.plm.eds.com>

Sorry, I meant the ANTLR-generated C parser!

Thanks!

-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Xie, Linlin
Sent: 20 January 2010 11:32
To: antlr-interest at antlr.org
Subject: [antlr-interest] UTF-8 input?

Can anyone tell me if antlr3.1.3 generated parser works with UTF-8
input? If it does, how should I configure in the grammar? I noticed
there are two macros ANTLR3_INLINE_INPUT_ASCII and
ANTLR3_INLINE_INPUT_UTF16, but no UTF-8 one.

 
Many thanks!

Linlin



From sharwell at pixelminegames.com  Wed Jan 20 06:09:08 2010
From: sharwell at pixelminegames.com (Sam Harwell)
Date: Wed, 20 Jan 2010 08:09:08 -0600
Subject: [antlr-interest] Expression parsing ideas for ANTLR v4
References: <DD5A5D428FE040429CCDF377FAA892840152DE83@martini.ironwillgames.com>
	<20100120120814.15250@gmx.net>
Message-ID: <DD5A5D428FE040429CCDF377FAA892840152DE86@martini.ironwillgames.com>

UnrealScript uses $ and @ for two types of string concatenation. It also has $= and @= to match.


-----Original Message-----
From: Johannes Luber [mailto:JALuber at gmx.de] 
Sent: Wednesday, January 20, 2010 6:08 AM
To: Sam Harwell; antlr-interest at antlr.org
Subject: Re: [antlr-interest] Expression parsing ideas for ANTLR v4

> Several expression parsers are limited to handling the binary operator
> portion of the expression. In addition to the obvious limitations, it
> poses an additional problem for languages like C++ where the assignment
> operators are split (in precedence) from the rest of the binary
> operators by the ternary operator (?:). My most complicated production
> ANTLR grammar (parses the UnrealScript language) currently uses a
> completely new expression parser that offers a great deal more
> flexibility than the previous approaches I tried. I don't think it's the
> end-all solution for integrating expression parsing into ANTLR for v4,
> but I believe it's a worthwhile example to show what's possible. Here
> are some pros and cons of the implementation:
> 
...
> 
> I'm very interested in any feedback y'all may have on this.
> 

As a layman in expression parsing I don't feel qualified to comment on whether your solution lacks certain features, but the way you define the operators looks clean to me. One knows immediately how operators work in a given language. The only non-obvious thing is whether the precedence is ascending or descending. I guess ascending, from my knowledge of C#.

BTW, which tokens are encoded as CATEQ and CAT2EQ?

Johannes

From jimi at temporal-wave.com  Wed Jan 20 08:30:47 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Wed, 20 Jan 2010 08:30:47 -0800
Subject: [antlr-interest] UTF-8 input?
In-Reply-To: <79118B9FE8CE8E49B0D71964A79CB647033CA2D5@dekomplm002.net.plm.eds.com>
Message-ID: <b4a29e8bfd66e445960573c91dcd6f93@temporal-wave.com>

You need to remember to state which target you are talking about.

I have written a new universal input stream for the next version of the C runtime. It takes 8-bit, 16-bit, UTF-8, UTF-16, UCS2, UTF-32 and EBCDIC input (code generation will change slightly to support this). It is not well tested right now but will be available shortly as a snapshot 3.3 release on the downloads page.

In the meantime, the easiest thing to do is to convert to UCS2 using the converter supplied in the current runtime. This will not work with surrogate pairs in UTF-16, but most people do not need that.

If you really need UTF-8 without conversion, then it is easy enough to write, or you can just steal the code from my check-in in about 10 minutes. Note that while the streams work, I have not yet provided ANTLR3_STRING support for UTF-8 and so on, so getting $text from such a stream may or may not work.
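
To illustrate the surrogate-pair caveat, here is a small Python sketch. It does not use the C runtime's converter at all; the function names utf8_to_utf16_units and fits_in_ucs2 are made up for this example. It shows that UTF-8 text converts cleanly to 16-bit units only when every character lies in the Basic Multilingual Plane.

```python
def utf8_to_utf16_units(data: bytes) -> list:
    """Decode UTF-8 bytes and return the UTF-16 code units as integers."""
    text = data.decode("utf-8")
    encoded = text.encode("utf-16-le")  # little-endian, no byte-order mark
    return [int.from_bytes(encoded[i:i + 2], "little")
            for i in range(0, len(encoded), 2)]


def fits_in_ucs2(data: bytes) -> bool:
    """True if no code unit is a surrogate, i.e. one 16-bit unit per character."""
    return not any(0xD800 <= unit <= 0xDFFF
                   for unit in utf8_to_utf16_units(data))
```

Input for which fits_in_ucs2 returns False contains characters above U+FFFF, which a UCS2-only lexer would see as two separate 16-bit values.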

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Xie, Linlin
> Sent: Wednesday, January 20, 2010 3:32 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] UTF-8 input?
> 
> Can anyone tell me if antlr3.1.3 generated parser works with UTF-8
> input? If it does, how should I configure in the grammar? I noticed
> there are two macros ANTLR3_INLINE_INPUT_ASCII and
> ANTLR3_INLINE_INPUT_UTF16, but no UTF-8 one.
> 
> 
> 
> Many thanks!
> 
> Linlin
> 
> 


From aph at redhat.com  Wed Jan 20 10:57:43 2010
From: aph at redhat.com (Andrew Haley)
Date: Wed, 20 Jan 2010 18:57:43 +0000
Subject: [antlr-interest] java.g does not compile
Message-ID: <4B575227.5050904@redhat.com>

I just downloaded java.g from
http://openjdk.java.net/projects/compiler-grammar/antlrworks/Java.g
and

~ $ java -jar Downloads/antlr-3.2.jar java.g
warning(209): java.g:1771:1: Multiple token rules can match input such as "'*'": STAR, STAREQ

As a result, token(s) STAREQ were disabled for that input
warning(209): java.g:1811:1: Multiple token rules can match input such as "'i'": IF, IMPLEMENTS, IMPORT, INSTANCEOF, INT, INTERFACE, IDENTIFIER

...

error(208): java.g:1799:1: The following token definitions can never be matched because prior tokens match the same input: INTLITERAL,DOUBLELITERAL,LINE_COMMENT,ASSERT,BREAK,BYTE,CATCH,CHAR,CLASS,CONST,CONTINUE,DO,DOUBLE,ENUM,EXTENDS,FINALLY,FLOAT,FOR,IMPLEMENTS,IMPORT,INSTANCEOF,INT,INTERFACE,NEW,PRIVATE,PROTECTED,PUBLIC,STATIC,STRICTFP,SUPER,SWITCH,SYNCHRONIZED,THROW,THROWS,TRANSIENT,TRY,VOLATILE,TRUE,FALSE,NULL,DOT,ELLIPSIS,EQEQ,PLUS,SUB,SLASH,AMP,BAR,PLUSEQ,SUBEQ,STAREQ,SLASHEQ,AMPEQ,BAREQ,CARETEQ,PERCENTEQ,BANGEQ

This seems very odd.  Any ideas?  It's claimed to be a grammar for
ANTLR v3.

Andrew.

From jimi at temporal-wave.com  Wed Jan 20 11:31:45 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Wed, 20 Jan 2010 11:31:45 -0800
Subject: [antlr-interest] java.g does not compile
In-Reply-To: <4B575227.5050904@redhat.com>
Message-ID: <3feca76b9c5b0547bf1582becc8ac1c3@temporal-wave.com>

Sounds like your machine is pretty slow and the default conversion timeout is therefore not enough.

Use the -Xconversiontimeout 30000 option to increase the elapsed time it will spend on it.
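
For anyone driving the tool from a build script, a minimal Python sketch of assembling that command line follows. The helper name antlr_command is hypothetical, and placing the option before the grammar file is an assumption; the jar and grammar paths are simply whatever you pass in.

```python
def antlr_command(jar_path, grammar_path, timeout_ms=30000):
    """Build the argument list for running the ANTLR tool with a longer timeout."""
    if timeout_ms < 0:
        raise ValueError("timeout must not be negative")
    return [
        "java", "-jar", jar_path,
        "-Xconversiontimeout", str(timeout_ms),  # milliseconds per decision
        grammar_path,
    ]
```

The resulting list can be handed to subprocess.run(cmd, check=True).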

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Andrew Haley
> Sent: Wednesday, January 20, 2010 10:58 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] java.g does not compile
> 
> I just downloaded java.g from
> http://openjdk.java.net/projects/compiler-grammar/antlrworks/Java.g
> and
> 
> ~ $ java -jar Downloads/antlr-3.2.jar java.g
> warning(209): java.g:1771:1: Multiple token rules can match input such
> as "'*'": STAR, STAREQ
> 
> As a result, token(s) STAREQ were disabled for that input
> warning(209): java.g:1811:1: Multiple token rules can match input such
> as "'i'": IF, IMPLEMENTS, IMPORT, INSTANCEOF, INT, INTERFACE,
> IDENTIFIER
> 
> ...
> 
> error(208): java.g:1799:1: The following token definitions can never be
> matched because prior tokens match the same input:
> INTLITERAL,DOUBLELITERAL,LINE_COMMENT,ASSERT,BREAK,BYTE,CATCH,CHAR,CLAS
> S,CONST,CONTINUE,DO,DOUBLE,ENUM,EXTENDS,FINALLY,FLOAT,FOR,IMPLEMENTS,IM
> PORT,INSTANCEOF,INT,INTERFACE,NEW,PRIVATE,PROTECTED,PUBLIC,STATIC,STRIC
> TFP,SUPER,SWITCH,SYNCHRONIZED,THROW,THROWS,TRANSIENT,TRY,VOLATILE,TRUE,
> FALSE,NULL,DOT,ELLIPSIS,EQEQ,PLUS,SUB,SLASH,AMP,BAR,PLUSEQ,SUBEQ,STAREQ
> ,SLASHEQ,AMPEQ,BAREQ,CARETEQ,PERCENTEQ,BANGEQ
> 
> This seems very odd.  Any ideas?  It's claimed to be a grammar for
> ANTLR v3.
> 
> Andrew.
> 


From wclodius at los-alamos.net  Wed Jan 20 19:21:30 2010
From: wclodius at los-alamos.net (William B. Clodius)
Date: Wed, 20 Jan 2010 20:21:30 -0700
Subject: [antlr-interest] Implicit imports
In-Reply-To: <5f59a7211001200149i7c7ad186k50a1589d4862906b@mail.gmail.com>
References: <5f59a7211001200149i7c7ad186k50a1589d4862906b@mail.gmail.com>
Message-ID: <214750A6-A234-4B88-BBBF-CA67F62C3B64@los-alamos.net>

First, terminology: this sort of analysis is not done as part of parsing, but as part of semantic analysis.

You need to develop a simplified representation of the important semantic information, i.e., the names of public entities and their types, and store that for comparison. Typically the modules can be in separate files, and to minimize processing it is useful to create a separate file for each module containing the information. The file should be much smaller than a typical source code file, and the contents should have a structure as close as possible to the internal representation used for the data. However, it is also useful to have additional information such as a time stamp and a version number for the code that generated the summary, so that you can identify whether the contents are out of date compared either with the contents of the module or with the code of your compiler/interpreter. Typically the "summary" file is a text file so that problems can be visually identified, but a binary form can be more compact and faster to process.
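
A minimal Python sketch of such a per-module summary, applied to the example from the original question, might look like this. The class ModuleSummary, its fields, and the functions visible_modules and resolve_enum_argument are all hypothetical names for illustration, and following imports transitively is an assumption about the language's rules.

```python
from dataclasses import dataclass, field


@dataclass
class ModuleSummary:
    """Simplified public interface of one module (hypothetical layout)."""
    name: str
    imports: list = field(default_factory=list)    # names of imported modules
    enums: dict = field(default_factory=dict)      # enum name -> list of literals
    functions: dict = field(default_factory=dict)  # function name -> parameter types


def visible_modules(start, summaries):
    """Return `start` plus every module reachable through its imports."""
    seen, todo = [], [start]
    while todo:
        name = todo.pop()
        if name not in seen:
            seen.append(name)
            todo.extend(summaries[name].imports)
    return seen


def resolve_enum_argument(caller, func, arg_index, literal, summaries):
    """Return the enum type of `literal` when used as argument `arg_index` of `func`."""
    for mod_name in visible_modules(caller, summaries):
        module = summaries[mod_name]
        if func not in module.functions:
            continue
        param_type = module.functions[func][arg_index]
        # Search the modules visible from the one that defines the function.
        for dep_name in visible_modules(mod_name, summaries):
            if literal in summaries[dep_name].enums.get(param_type, []):
                return param_type
    return None
```

With summaries for A, B and Prog filled in from the example, resolving argument 1 of myFuncB for the literal A2 yields myEnumA, while an unknown literal yields None and can be reported as an error during semantic analysis.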

On Jan 20, 2010, at 2:49 AM, Gustaf Johansson wrote:

> Hi,
> 
> I have a grammar in which there can be implicit imports of a few
> definitions example:
> 
> module A {
>  enum myEnumA { A1, A2, A3 }
> }
> 
> module B {
>  import module A;
>  function myFuncB (int, myEnumA) {
>    ...
>  }
> }
> 
> 
> module Prog {
>  import B;
>  myFuncB (1, A2);  *
> }
> 
> *Here A2 is implicitly known to be of type myEnumA, since the
> definition of myFuncB is in B and B imports A.
> 
> The problem i have is that my parser reports A2 as unknown.
> I have not come up with a good and simple solution to this.
> I have been thinking along the lines of:
> Check definition of myFuncB and if it takes a enum as argument, check
> the local module's imports for the definition of that enum.
> 
> Any help is really appreciated.
> 
> Best Regards Gustaf
> 


From aph at redhat.com  Thu Jan 21 01:49:19 2010
From: aph at redhat.com (Andrew Haley)
Date: Thu, 21 Jan 2010 09:49:19 +0000
Subject: [antlr-interest] java.g does not compile
In-Reply-To: <3feca76b9c5b0547bf1582becc8ac1c3@temporal-wave.com>
References: <3feca76b9c5b0547bf1582becc8ac1c3@temporal-wave.com>
Message-ID: <4B58231F.3070701@redhat.com>

On 01/20/2010 07:31 PM, Jim Idle wrote:
> Sounds like your machine is pretty slow and the default conversion timeout is therefore not enough.
> 
> Use the -Xconversiontimeout 30000 option to increase the elapsed time it will spend on it.

Thank you, that worked.

However, this is a fast machine: a four-core Nehalem-based Xeon system.
There are faster machines available, but not many.  :-)

Andrew.

From iwm at doc.ic.ac.uk  Thu Jan 21 02:22:33 2010
From: iwm at doc.ic.ac.uk (Ian Moor)
Date: Thu, 21 Jan 2010 10:22:33 +0000
Subject: [antlr-interest] gunit problem
Message-ID: <4B582AE9.4030403@doc.ic.ac.uk>

I am using the gunit that is provided with ANTLR 3.2, and
I am trying to test parts of a tree, for example
   statement walks statements:
    "x=1" -> "ok"

I expect an error message saying that the output written to System.out
is not "ok", but gunit prints no output and stops with a nonzero return
value.

I also have a couple of simple "program walks program" tests (where
program is the complete program), and when the example above is
commented out gunit gives correct test results.

If I use -o, gunit hangs, and when I stop it
the JUnit file has code for all of the tests, looking as if
they can be run.

Is there a way of finding out what is happening, or is there a later gunit?
     Ian Moor


From yurushkin at rambler.ru  Thu Jan 21 03:21:43 2010
From: yurushkin at rambler.ru (Юрушкин Михаил)
Date: Thu, 21 Jan 2010 14:21:43 +0300
Subject: [antlr-interest] [C target] Duplicating tree error
Message-ID: <op.u6vq6hzyt3jqlu@win-mupvrp0jyrf>

Good day,

I have the following rewrite rule:

type_declaration_stmt
   : label? declaration_type_spec ( (T_COMMA  attr_spec )* T_COLON_COLON )?
     entity_decl (T_COMMA entity_decl)* end_of_stmt
     -> ^(T_TYPE_DECLARATION_STMT declaration_type_spec attr_spec* entity_decl)+
   ;

and this is the corresponding piece of the tree parser grammar:

type_declaration_stmt
   :  ^(T_TYPE_DECLARATION_STMT declaration_type_spec attr_spec* entity_decl)
   ;


When I give "integer a, b, c" as input, three sequential
T_TYPE_DECLARATION_STMT trees are generated, which is right.
BUT the declaration_type_spec subtree isn't duplicated (only the root
of the subtree is).

Where is the mistake?
Thanks


-- 
Best regards,
Michael

From antlr at mirality.co.nz  Thu Jan 21 03:42:01 2010
From: antlr at mirality.co.nz (Gavin Lambert)
Date: Fri, 22 Jan 2010 00:42:01 +1300
Subject: [antlr-interest] [C target] Duplicating tree error
In-Reply-To: <op.u6vq6hzyt3jqlu@win-mupvrp0jyrf>
References: <op.u6vq6hzyt3jqlu@win-mupvrp0jyrf>
Message-ID: <20100121114217.994F13418424@www.antlr.org>

At 00:21 22/01/2010, Юрушкин Михаил wrote:
 >type_declaration_stmt
 >   : label? declaration_type_spec ( (T_COMMA  attr_spec )*
 >T_COLON_COLON )?
 >     entity_decl (T_COMMA entity_decl)* end_of_stmt
 >     -> ^(T_TYPE_DECLARATION_STMT declaration_type_spec attr_spec*
 >entity_decl)+
 >   ;
[...]
 >BUT declaration_type_spec subtree isn't duplicated (only the root
 >of subtree).
 >
 >Where is mistake?

IIRC, when you use a rule name in a rewrite rule, it represents 
"the first unused instance of this rule in the input" (which is 
why entity_decl is doing what it is).  So the second and 
subsequent times it appears (during the + loop) the value is empty 
since it didn't occur any more times in the input.  To duplicate 
nodes you need to use a label.


From yurushkin at rambler.ru  Thu Jan 21 03:56:53 2010
From: yurushkin at rambler.ru (Юрушкин Михаил)
Date: Thu, 21 Jan 2010 14:56:53 +0300
Subject: [antlr-interest] [C target] Duplicating tree error
In-Reply-To: <20100121114219.75CF337588D@mx5.rambler.ru>
References: <op.u6vq6hzyt3jqlu@win-mupvrp0jyrf>
	<20100121114219.75CF337588D@mx5.rambler.ru>
Message-ID: <op.u6vss3pst3jqlu@win-mupvrp0jyrf>

Excuse me, what do you mean by "you need to use a label"? Could you send
me an example?

And, currently, I haven't seen problems with duplicating the "entity_decl"
tree.
I have a fault with the copying of the "declaration_type_spec" tree.


Gavin Lambert <antlr at mirality.co.nz> wrote in his message of Thu, 21 Jan
2010 14:42:01 +0300:

> At 00:21 22/01/2010, Юрушкин Михаил wrote:
>  >type_declaration_stmt
>  >   : label? declaration_type_spec ( (T_COMMA  attr_spec )*
>  >T_COLON_COLON )?
>  >     entity_decl (T_COMMA entity_decl)* end_of_stmt    	
>  >     -> ^(T_TYPE_DECLARATION_STMT declaration_type_spec attr_spec*
>  >entity_decl)+
>  >   ;
> [...]
>  >BUT declaration_type_spec subtree isn't duplicated (only the root
>  >of subtree).
>  >
>  >Where is mistake?
>
> IIRC, when you use a rule name in a rewrite rule, it represents "the  
> first unused instance of this rule in the input" (which is why  
> entity_decl is doing what it is).  So the second and subsequent times it  
> appears (during the + loop) the value is empty since it didn't occur any  
> more times in the input.  To duplicate nodes you need to use a label.
>
>


-- 
Best regards,
Michael

From jimi at temporal-wave.com  Thu Jan 21 06:40:58 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Thu, 21 Jan 2010 06:40:58 -0800
Subject: [antlr-interest] java.g does not compile
In-Reply-To: <4B58231F.3070701@redhat.com>
Message-ID: <47ff671de7f8524dbbca4695f1f41700@temporal-wave.com>

You are probably right on the limit of the default 10000, or perhaps you are not compiling the exact original? Try the one in the examples zip and see if there are any differences. However, Xeons are not as fast as you think on a single thread, which is what the analysis phase runs on by default.

Jim

> -----Original Message-----
> From: Andrew Haley [mailto:aph at redhat.com]
> Sent: Thursday, January 21, 2010 1:49 AM
> To: Jim Idle
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] java.g does not compile
> 
> On 01/20/2010 07:31 PM, Jim Idle wrote:
> > Sounds like your machine is pretty slow and the default conversion
> timeout is therefore not enough.
> >
> > Use the -Xconversiontimeout 30000 option to increase the elapsed time
> it will spend on it.
> 
> Thank you, that worked.
> 
> However, this is a fast machine: a four-core Nehalem-based Xeon system.
> There are faster machines available, but not many.  :-)
> 
> Andrew.


From jimi at temporal-wave.com  Thu Jan 21 06:47:57 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Thu, 21 Jan 2010 06:47:57 -0800
Subject: [antlr-interest] [C target] Duplicating tree error
In-Reply-To: <op.u6vss3pst3jqlu@win-mupvrp0jyrf>
Message-ID: <5cd0ce1546bc114c83d6a37e06a0f390@temporal-wave.com>

Well, you are rewriting the tree with ^(....)+ but the tree grammar only walks one declaration ^(...). Unless you are using the + higher up the rule chain.

Example is:

... e+=entity_decl (COMMA e+=entity_decl)* ...
 -> ^(X .... $e)+

Also, your rewrite rule loses the label. 

It is generally a good idea to break components up into separate rules when rewriting, as the token boundaries of nodes are then set correctly. So here, you would move label? into a higher rule, for instance.
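
To make the intended result concrete, here is a small Python sketch of the tree shape the original poster is after for "integer a, b, c": one declaration tree per entity, each carrying its own full copy of the type subtree. This is only an illustration of the desired output, not of how ANTLR's rewrite streams work internally; the function split_declaration and the nested-list tree representation are assumptions made for the example.

```python
import copy


def split_declaration(type_spec, attr_specs, entity_decls):
    """Build one declaration tree per entity, each with its own copy of the shared parts."""
    trees = []
    for entity in entity_decls:
        trees.append([
            "T_TYPE_DECLARATION_STMT",
            copy.deepcopy(type_spec),    # the whole subtree, not just its root
            *copy.deepcopy(attr_specs),  # zero or more attribute subtrees
            entity,
        ])
    return trees
```

Because each tree gets a deep copy, a later pass can modify one declaration's type subtree without affecting the others.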

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Юрушкин Михаил
> Sent: Thursday, January 21, 2010 3:57 AM
> To: Gavin Lambert; antlr-interest at antlr.org
> Subject: Re: [antlr-interest] [C target] Duplicating tree error
> 
> Excuse me, what do you mean by "you need to use a label"? Could you
> send me an example?
> 
> And, currently, I haven't seen problems with duplicating the
> "entity_decl" tree.
> I have a fault with the copying of the "declaration_type_spec" tree.
> 
> 
> Gavin Lambert <antlr at mirality.co.nz> wrote in his message of Thu, 21
> Jan 2010 14:42:01 +0300:
> 
> > At 00:21 22/01/2010, Юрушкин Михаил wrote:
> >  >type_declaration_stmt
> >  >   : label? declaration_type_spec ( (T_COMMA  attr_spec )*
> >  >T_COLON_COLON )?
> >  >     entity_decl (T_COMMA entity_decl)* end_of_stmt
> >  >     -> ^(T_TYPE_DECLARATION_STMT declaration_type_spec attr_spec*
> >  >entity_decl)+
> >  >   ;
> > [...]
> >  >BUT declaration_type_spec subtree isn't duplicated (only the root
> >  >of subtree).
> >  >
> >  >Where is mistake?
> >
> > IIRC, when you use a rule name in a rewrite rule, it represents "the
> > first unused instance of this rule in the input" (which is why
> > entity_decl is doing what it is).  So the second and subsequent times
> it
> > appears (during the + loop) the value is empty since it didn't occur
> any
> > more times in the input.  To duplicate nodes you need to use a label.
> >
> >
> 
> 
> --
> Best regards,
> Michael
> 


From aph at redhat.com  Thu Jan 21 06:50:57 2010
From: aph at redhat.com (Andrew Haley)
Date: Thu, 21 Jan 2010 14:50:57 +0000
Subject: [antlr-interest] java.g does not compile
In-Reply-To: <47ff671de7f8524dbbca4695f1f41700@temporal-wave.com>
References: <47ff671de7f8524dbbca4695f1f41700@temporal-wave.com>
Message-ID: <4B5869D1.1090707@redhat.com>

On 01/21/2010 02:40 PM, Jim Idle wrote:

> You are probably right on the limit of the default 10000, or perhaps
> you are not compiling the exact original?

I haven't touched it.  Honestly!

Besides, the default seems to be 1000, not 10000.  

 $ java -jar Downloads/antlr-3.2.jar -X
  -Xconversiontimeout t   set NFA conversion timeout (ms) for each decision          [1000]

I changed it to 10000, and all is fine:

--- antlr-3.2/tool/src/main/java/org/antlr/analysis/DFA.java~   2009-09-23 19:36:06.000000000 +0100
+++ antlr-3.2/tool/src/main/java/org/antlr/analysis/DFA.java    2010-01-21 13:08:32.625782840 +0000
@@ -53,7 +53,7 @@
         */
 
        /** Set to 0 to not terminate early (time in ms) */
-       public static int MAX_TIME_PER_DFA_CREATION = 1*1000;
+       public static int MAX_TIME_PER_DFA_CREATION = 10*1000;
 
        /** How many edges can each DFA state have before a "special" state
         *  is created that uses IF expressions instead of a table?

> Try the one in the examples zip and see if there are any
> differences. However, Xeons are not as fast as you think on a
> single thread, which is what the analysis phase runs on by default.

Err, how on Earth do you know how fast I think Xeons are?  :-)
But anyway, most users aren't likely to have anything hugely faster.

Andrew.

From jimi at temporal-wave.com  Thu Jan 21 07:00:30 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Thu, 21 Jan 2010 07:00:30 -0800
Subject: [antlr-interest] java.g does not compile
In-Reply-To: <4B5869D1.1090707@redhat.com>
Message-ID: <6653941088acac4fb5b7cb8c45a7a0ac@temporal-wave.com>

I wouldn't change the default timeout, as then your project depends on a custom version of ANTLR for no good reason. The 10000 was just my 6:40AM typo, of course :-)

I have a QX9450 and some i7s. I think that the Xeon server versions of 9450 etc might be slower on a single thread. I think a lot of the i7s are faster than Xeon? However I haven't bothered with Xeon myself. But, it depends what you are measuring. Most of the published benchmark programs are worthless.

Jim

> -----Original Message-----
> From: Andrew Haley [mailto:aph at redhat.com]
> Sent: Thursday, January 21, 2010 6:51 AM
> To: Jim Idle
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] java.g does not compile
> 
> On 01/21/2010 02:40 PM, Jim Idle wrote:
> 
> > You are probably right on the limit of the default 10000, or perhaps
> > you are not compiling the exact original?
> 
> I haven't touched it.  Honestly!
> 
> Besides, the default seems to be 1000, not 10000.
> 
>  $ java -jar Downloads/antlr-3.2.jar -X
>   -Xconversiontimeout t   set NFA conversion timeout (ms) for each
> decision          [1000]
> 
> I changed it to 10000, and all is fine:
> 
> --- antlr-3.2/tool/src/main/java/org/antlr/analysis/DFA.java~   2009-
> 09-23 19:36:06.000000000 +0100
> +++ antlr-3.2/tool/src/main/java/org/antlr/analysis/DFA.java    2010-
> 01-21 13:08:32.625782840 +0000
> @@ -53,7 +53,7 @@
>          */
> 
>         /** Set to 0 to not terminate early (time in ms) */
> -       public static int MAX_TIME_PER_DFA_CREATION = 1*1000;
> +       public static int MAX_TIME_PER_DFA_CREATION = 10*1000;
> 
>         /** How many edges can each DFA state have before a "special"
> state
>          *  is created that uses IF expressions instead of a table?
> 
> > Try the one in the examples zip and see if there are any
> > differences. However, Xeons are not as fast as you think on a
> > single thread, which is what the analysis phase runs on by default.
> 
> Err, how on Earth do you know how fast I think Xeons are?  :-)
> But anyway, most users aren't likely to have anything hugely faster.
> 
> Andrew.


From aph at redhat.com  Thu Jan 21 07:22:39 2010
From: aph at redhat.com (Andrew Haley)
Date: Thu, 21 Jan 2010 15:22:39 +0000
Subject: [antlr-interest] java.g does not compile
In-Reply-To: <6653941088acac4fb5b7cb8c45a7a0ac@temporal-wave.com>
References: <6653941088acac4fb5b7cb8c45a7a0ac@temporal-wave.com>
Message-ID: <4B58713F.2070009@redhat.com>

On 01/21/2010 03:00 PM, Jim Idle wrote:

> I wouldn't change the default time out as then your project depends
> on a custom version of ANTLR for no good reason. That was just my
> 6:40AM typo of course :-)

I'm using antlrworks, and I can't find any other way to change the
default.

> I have a QX9450 and some i7s. I think that the Xeon server versions
> of 9450 etc might be slower on a single thread.

I'm sure they would be, but I'm talking about a Nehalem-based Xeon: it
*is* an i7, not a Core 2 anything.  The Xeon 35xx and Core i7-9xx are
more or less the same thing.

So, I think that almost everyone would have the same problem I'm having.

Andrew.

From jimi at temporal-wave.com  Thu Jan 21 07:41:24 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Thu, 21 Jan 2010 07:41:24 -0800
Subject: [antlr-interest] java.g does not compile
In-Reply-To: <4B58713F.2070009@redhat.com>
Message-ID: <57cab6502a087249bff2330f62dab5f0@temporal-wave.com>

They probably would, unless they read the comments at the start of the .g file, where it says:

*  NOTE: If you try to compile this file from command line and Antlr gives an exception 
*    like error message while compiling, add option 
*    -Xconversiontimeout 100000
*    to the command line.  

Sorry - missed the Nehalem comment. However, my 3Ghz QX9650 (forgot what CPU I had in this thing) running Vista 64 and Sun's 64 bit JRE deals with it just fine.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Andrew Haley
> Sent: Thursday, January 21, 2010 7:23 AM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] java.g does not compile
> 
> On 01/21/2010 03:00 PM, Jim Idle wrote:
> 
> > I wouldn't change the default time out as then your project depends
> > on a custom version of ANTLR for no good reason. That was just my
> > 6:40AM typo of course :-)
> 
> I'm using antlrworks, and I can't find any other way to change the
> default.
> 
> > I have a QX9450 and some i7s. I think that the Xeon server versions
> > of 9450 etc might be slower on a single thread.
> 
> I'm sure they would be, but I'm talking about a Nehalem-based Xeon: it
> *is* an i7, not a Core 2 anything.  The Xeon 35xx and Core i7-9xx are
> more or less the same thing.
> 
> So, I think that almost everyone would have the same problem I'm
> having.
> 
> Andrew.
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
> email-address


From aph at redhat.com  Thu Jan 21 07:51:50 2010
From: aph at redhat.com (Andrew Haley)
Date: Thu, 21 Jan 2010 15:51:50 +0000
Subject: [antlr-interest] java.g does not compile
In-Reply-To: <57cab6502a087249bff2330f62dab5f0@temporal-wave.com>
References: <57cab6502a087249bff2330f62dab5f0@temporal-wave.com>
Message-ID: <4B587816.7090103@redhat.com>

On 01/21/2010 03:41 PM, Jim Idle wrote:
> They probably would, unless they read the comments at the start of the .g file, where it says:
> 
> *  NOTE: If you try to compile this file from command line and Antlr gives an exception 
> *    like error message while compiling, add option 
> *    -Xconversiontimeout 100000
> *    to the command line.  
> 

Hah!  I even did a web search for other people having the same
problem, and never saw that comment in the file.  :-)

> Sorry - missed the Nehalem comment. However, my 3Ghz QX9650 (forgot
> what CPU I had in this thing) running Vista 64 and Sun's 64 bit JRE
> deals with it just fine.

Me too, kinda sorta (OpenJDK64 on Linux).  Weird.

Thanks again,
Andrew.

From jimi at temporal-wave.com  Thu Jan 21 08:49:25 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Thu, 21 Jan 2010 08:49:25 -0800
Subject: [antlr-interest] java.g does not compile
In-Reply-To: <4B587816.7090103@redhat.com>
Message-ID: <5b21023602fded4c83de5774d2c76cee@temporal-wave.com>

I have found OpenJDK to be less than reliable to be honest, though many say it is fine for them. It might be the 64 bit version that always seemed to let me down in Fedora. Once I changed to Sun JDK/JRE then all my issues went away.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Andrew Haley
> Sent: Thursday, January 21, 2010 7:52 AM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] java.g does not compile
> 
> On 01/21/2010 03:41 PM, Jim Idle wrote:
> > They probably would, unless they read the comments at the start of
> the .g file, where it says:
> >
> > *  NOTE: If you try to compile this file from command line and Antlr
> gives an exception
> > *    like error message while compiling, add option
> > *    -Xconversiontimeout 100000
> > *    to the command line.
> >
> 
> Hah!  I even did a web search for other people having the same
> problem, and never saw that comment in the file.  :-)
> 
> > Sorry - missed the Nehalem comment. However, my 3Ghz QX9650 (forgot
> > what CPU I had in this thing) running Vista 64 and Sun's 64 bit JRE
> > deals with it just fine.
> 
> Me too, kinda sorta (OpenJDK64 on Linux).  Weird.
> 
> Thanks again,
> Andrew.
> 


From aph at redhat.com  Thu Jan 21 10:05:51 2010
From: aph at redhat.com (Andrew Haley)
Date: Thu, 21 Jan 2010 18:05:51 +0000
Subject: [antlr-interest] java.g does not compile
In-Reply-To: <5b21023602fded4c83de5774d2c76cee@temporal-wave.com>
References: <5b21023602fded4c83de5774d2c76cee@temporal-wave.com>
Message-ID: <4B58977F.80600@redhat.com>

On 01/21/2010 04:49 PM, Jim Idle wrote:

> I have found OpenJDK to be less than reliable to be honest, though
> many say it is fine for them. It might be the 64 bit version that
> always seemed to let me down in Fedora. Once I changed to Sun
> JDK/JRE then all my issues went away.

Hmm, that's not good.  There shouldn't really be any difference, given
that OpenJDK is built from a very similar codebase and runs all the
same compatibility tests.  We need all the feedback about failures we
can get, with test cases if possible.

Andrew.

From antlr at mirality.co.nz  Thu Jan 21 11:08:58 2010
From: antlr at mirality.co.nz (Gavin Lambert)
Date: Fri, 22 Jan 2010 08:08:58 +1300
Subject: [antlr-interest] [C target] Duplicating tree error
In-Reply-To: <op.u6vss3pst3jqlu@win-mupvrp0jyrf>
References: <op.u6vq6hzyt3jqlu@win-mupvrp0jyrf>
	<20100121114219.75CF337588D@mx5.rambler.ru>
	<op.u6vss3pst3jqlu@win-mupvrp0jyrf>
Message-ID: <20100121190918.E09123418423@www.antlr.org>

At 00:56 22/01/2010, Юрушкин Михаил wrote:
 >Excuse me, what do you mean by "you need to use a label"? Could
 >you send me an example?

[...] t=declaration_type_spec [...]
   -> ^(T_TYPE_DECLARATION_STMT $t [...]

 >And, currently, I haven't seen problems with duplicating the
 >"entity_decl" tree.
 >I have a fault when copying the "declaration_type_spec" tree.

That's because you're only using one entity_decl at a time.  My 
point is that they're doing the same thing -- the first time 
around the loop it uses the first encountered entity_decl, the 
second time it uses the second, etc.  It's behaving exactly the 
same with the declaration_type_spec; only there's just the one of 
those in the input.


From greneche.hugo at gmail.com  Thu Jan 21 11:20:46 2010
From: greneche.hugo at gmail.com (Hugo)
Date: Thu, 21 Jan 2010 20:20:46 +0100
Subject: [antlr-interest] newbie needs help
Message-ID: <4B58A90E.5020401@gmail.com>

I started using ANTLR to parse a specific file format.
The problem is that I don't know how to write my grammar correctly.

The file has the following format.
It contains multiple lines, and each line can take one of the following forms:

One or more hexadecimal bytes, with or without spaces between them
ex: A0 A4 B5 77
or: A0

A variable identifier of the form VAR_XXX
ex: VAR_MY_VARIABLE

Or a combination of the two previous forms
ex:
A0 A4B5 VAR_MY_VARIABLE 77 98 VAR_MY_VARIABLE2
or
VAR_MY_VARIABLE AA BB
or
AA BB VAR_MY_VARIABLE


What I want to do is build an AST.

The problem is that I don't know how to do this with ANTLR. The tool
always tells me that multiple rules can be applied with my grammar.

Please help me solve my problem.

Here is my grammar:

stmts               : bytes+ ;


bytes : multiple_byte bytes? -> ^(EXPR_DEF multiple_byte  bytes? )

| define_expression bytes? -> ^(EXPR_DEF define_expression bytes? )

| NEWLINE ;

define_expression : define_var -> ^(DEFINE_VAR_DEF define_var) ;

define_var : DEFINE_VARIABLE ;
multiple_byte : single_byte (single_byte)+ -> ^(MULTIPLE_BYTES_DEF
single_byte single_byte+) ;


single_byte : byte_digit -> ^(BYTES_DEF byte_digit) ;

byte_digit : BYTE_DIGIT ;

DEFINE_VARIABLE :
'VAR_'('a'..'z'|'A'..'Z'|'_')('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;

BYTE_DIGIT :('0'..'9'| 'A'..'F'|'a'..'f')('0'..'9'| 'A'..'F'|'a'..'f') ;

// Ignore whitespace, tab and escape sequence
WS : (' '|'\t'|'\\\r\n')+ {$channel = HIDDEN;} ;

// a new line
NEWLINE : '\r'? '\n' ;

thanks a lot


From jbb at acm.org  Thu Jan 21 13:25:24 2010
From: jbb at acm.org (John B. Brodie)
Date: Thu, 21 Jan 2010 16:25:24 -0500
Subject: [antlr-interest] newbie needs help
In-Reply-To: <4B58A90E.5020401@gmail.com>
References: <4B58A90E.5020401@gmail.com>
Message-ID: <1264109124.9363.10.camel@gecko.home.org>

Greetings!

On Thu, 2010-01-21 at 20:20 +0100, Hugo wrote:
> I started using antlr to parse a specific file format.
> The problem is that i don't know how to write correctly my grammar.
> 
> The file have the following format.
> It contains multiple lines and each can have the following format:
> 
> Only one or multilple hexadecimal caracter with space or not
> ex: A0 A4 B5 77
> or: A0
> 
> Only variable identifier with the format VAR_XXX
> ex: VAR_MY_VARIABLE
> 
> Or the combinaison of the two previous format
> ex:
> A0 A4B5 VAR_MY_VARIABLE 77 98 VAR_MY_VARIABLE2
> or
> VAR_MY_VARIABLE AA BB
> or
> AA BB VAR_MY_VARIABLE
> 
> 
> what i want to do is to build a AST tree

attached please find a grammar file that is *almost* what I think you
are trying to do.

It does not have a MULTIPLE_BYTES_DEF node because the grouping of a
collection of single_byte instances into a multibyte is ambiguous.
Consider

11 22 33 44 55 66 77 88

is this 8 single bytes? 1 single byte and 7-long multi? is it 4 multi
pairs? a triple, a single and a quad?

i kinda expect you want it to be a single 8-long multi, e.g. any run of
single bytes becomes a multi. But that is a semantic of your language
and getting a parser to do semantics isn't always possible....

if you really need the MULTIPLE_BYTES_DEF node, you might be best served
by parsing using something like my code (e.g. the parser produces only
BYTES_DEF nodes) and then writing a tree-walker that transforms the AST
resulting from the parse into a new AST that contains the requisite
MULTIPLE_BYTES_DEF nodes. e.g. scan for and collapse sequences of
consecutive EXPR_DEF nodes that have BYTES_DEF children into a single
EXPR_DEF node containing a single MULTIPLE_BYTES_DEF child.
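
The collapsing pass described above could be sketched roughly as follows.
This is only an illustration under simplifying assumptions: it uses a tiny
stand-in Node class instead of ANTLR's CommonTree, and it works on a flat
list of per-line items rather than the nested EXPR_DEF chain the attached
grammar builds. The names CollapseByteRuns, Node, byteNode and varNode are
made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class CollapseByteRuns {
    /** Minimal stand-in for an AST node (an assumption, not ANTLR's CommonTree). */
    static final class Node {
        final String type;
        final String text;                          // only set on leaf nodes
        final List<Node> children = new ArrayList<>();

        Node(String type, String text) { this.type = type; this.text = text; }

        Node add(Node child) { children.add(child); return this; }

        @Override public String toString() {
            if (children.isEmpty()) return text == null ? type : text;
            StringBuilder sb = new StringBuilder("(").append(type);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    /** Replace each run of two or more adjacent BYTES_DEF nodes with one MULTIPLE_BYTES_DEF node. */
    static List<Node> collapse(List<Node> items) {
        List<Node> out = new ArrayList<>();
        int i = 0;
        while (i < items.size()) {
            int j = i;
            // Find the end of the run of byte nodes starting at i (may be empty).
            while (j < items.size() && items.get(j).type.equals("BYTES_DEF")) j++;
            if (j - i >= 2) {
                Node multi = new Node("MULTIPLE_BYTES_DEF", null);
                for (int k = i; k < j; k++) multi.add(items.get(k));
                out.add(multi);
                i = j;
            } else {
                // A lone byte or a non-byte node is copied through unchanged.
                out.add(items.get(i));
                i++;
            }
        }
        return out;
    }

    static Node byteNode(String hex) {
        return new Node("BYTES_DEF", null).add(new Node("BYTE_DIGIT", hex));
    }

    static Node varNode(String name) {
        return new Node("DEFINE_VAR_DEF", null).add(new Node("DEFINE_VARIABLE", name));
    }

    public static void main(String[] args) {
        List<Node> line = List.of(
            byteNode("A0"), byteNode("A4"), varNode("VAR_MY_VARIABLE"), byteNode("77"));
        System.out.println(collapse(line));
    }
}
```

For the sample line this prints
[(MULTIPLE_BYTES_DEF (BYTES_DEF A0) (BYTES_DEF A4)), (DEFINE_VAR_DEF VAR_MY_VARIABLE), (BYTES_DEF 77)].
The greedy "any run of two or more bytes becomes one multi" rule is just one
of the possible readings of the ambiguity discussed earlier; a real pass
would apply whatever grouping the language actually intends.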

> 
> And the problem is that i don't know how to do this with antlr. the tool
> always tell me that multiple rule can be applies with my grammar.
> 
> please help me to solve my problem. 
> 
> Here is my grammar:
> 
> stmts               : bytes+ ;
> 
> 
> bytes : multiple_byte bytes? -> ^(EXPR_DEF multiple_byte  bytes? )
> 
> | define_expression bytes? -> ^(EXPR_DEF define_expression bytes? )
> 
> | NEWLINE ;
> 
> define_expression : define_var -> ^(DEFINE_VAR_DEF define_var) ;
> 
> define_var : DEFINE_VARIABLE ;
> multiple_byte : single_byte (single_byte)+ -> ^(MULTIPLE_BYTES_DEF
> single_byte single_byte+) ;
> 
> 
> single_byte : byte_digit -> ^(BYTES_DEF byte_digit) ;
> 
> byte_digit : BYTE_DIGIT ;
> 
> DEFINE_VARIABLE :
> 'VAR_'('a'..'z'|'A'..'Z'|'_')('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
> 
> BYTE_DIGIT :('0'..'9'| 'A'..'F'|'a'..'f')('0'..'9'| 'A'..'F'|'a'..'f') ;
> 
> // Ignore whitespace, tab and escape sequence
> WS : (' '|'\t'|'\\\r\n')+ {$channel = HIDDEN;} ;
> 
> // a new line
> NEWLINE : '\r'? '\n' ;
> 
> thanks a lot

hope this helps...
   -jbb

-------------- next part --------------
grammar Test;

options {
   output = AST;
   ASTLabelType = CommonTree;
}

tokens {
   EXPR_DEF;
   DEFINE_VAR_DEF;
   BYTES_DEF;
}

@members {
   private static final String [] x = new String[]{
      "A0\n",
      "A0 A4 B5 77\n",
      "VAR_MY_VARIABLE\n",
      "A0 A4B5 VAR_MY_VARIABLE 77 98 VAR_MY_VARIABLE2\n",
      "VAR_MY_VARIABLE AA BB\n",
      "AA BB VAR_MY_VARIABLE\n"
   };

   public static void main(String [] args) {
      for( int i = 0; i < x.length; ++i ) {
         try {
            System.out.println("about to parse:`"+x[i]+"`");
            TestLexer lexer = new TestLexer(new ANTLRStringStream(x[i]));
            CommonTokenStream tokens = new CommonTokenStream(lexer);

            TestParser parser = new TestParser(tokens);
            TestParser.stmts_return p_result = parser.stmts();

            CommonTree ast = p_result.tree;
            if( ast == null ) {
               System.out.println("resultant tree: is NULL");
            } else {
               System.out.println("resultant tree: " + ast.toStringTree());
            }
            System.out.println();
         } catch(Exception e) {
            e.printStackTrace();
         }
      }
   }
}

stmts : bytes+ EOF!;

bytes
   : ( b=BYTE_DIGIT t=bytes -> ^(EXPR_DEF ^(BYTES_DEF $b) $t) )
   | ( d=DEFINE_VARIABLE t=bytes -> ^(EXPR_DEF ^(DEFINE_VAR_DEF $d) $t) )
   | NEWLINE ;

fragment LETTER :  'a' .. 'z' | 'A' .. 'Z' ;
fragment DIGIT : '0'.. '9' ;
DEFINE_VARIABLE : 'VAR_' (LETTER|'_') (LETTER | DIGIT | '_')*;

fragment HEXIT : '0'..'9' | 'A'..'F' | 'a'..'f' ;
BYTE_DIGIT : HEXIT HEXIT ;

// Ignore whitespace, tab and escape sequence
WS : (' '|'\t'|'\\\r\n')+ {$channel = HIDDEN;} ;

// a new line
NEWLINE : '\r'? '\n' ;

From michael.scholz at gmail.com  Thu Jan 21 14:20:55 2010
From: michael.scholz at gmail.com (Michael Scholz)
Date: Thu, 21 Jan 2010 14:20:55 -0800
Subject: [antlr-interest] Problem with rewrite rule: DebugTokenStream cannot
	be cast to TokenRewriteStream
Message-ID: <61e8cbbd1001211420u3829eb86jdd930c64c5f564d0@mail.gmail.com>

I have hit the issue described
here: http://www.antlr.org/pipermail/antlr-interest/2009-January/032284.html
and here: http://www.antlr.org/jira/browse/AW-242

Since it's not a new problem, is there a known fix/workaround/patch?

Thanks

From m.y.speyer at inter.nl.net  Thu Jan 21 16:55:05 2010
From: m.y.speyer at inter.nl.net (Marc Speyer)
Date: Fri, 22 Jan 2010 01:55:05 +0100
Subject: [antlr-interest] Tree pattern maching using the C# (was
	C)	target
In-Reply-To: <20100120004202.274260@gmx.net>
References: <000901ca95d9$df6dd740$9e4985c0$@y.speyer@inter.nl.net>	<20100115125833.242280@gmx.net>
	<002401ca985c$bd00f450$3702dcf0$@y.speyer@inter.nl.net>
	<20100120004202.274260@gmx.net>
Message-ID: <003501ca9afd$898245e0$9c86d1a0$@y.speyer@inter.nl.net>

Hi Johannes,

Please find the file attached. I can get it compiled with this file, but when
I then run the grammar nothing happens, even though I have grammar rules and
actions (see my previous post).

I have not tested the CSharp3 target myself yet because I could not compile
the source for it either, but I only spent a little bit of time on it since I
cannot find anything about the status of the CSharp3 target.

Any help would be much appreciated.

Thanks,
Marc

>-----Original Message-----
>From: Johannes Luber [mailto:JALuber at gmx.de]
>Sent: Wednesday, January 20, 2010 1:42 AM
>To: Marc Speyer; antlr-interest at antlr.org
>Subject: Re: [antlr-interest] Tree pattern maching using the C# (was C)
>target
>
>> Hi Johannes,
>>
>> I tried the version that you mentioned by downloading it from
>> antlr:/runtime/CSharp2 in the Fisheye code repository and then tried to
>> compile it using VS2008. This didn't work because a file
>> "TokenConstants.cs"
>> was reported missing by VS2008 and gave me compilation errors. I managed
>> to
>> get a version from the CSharp3 repository and after making one change I
>> could compile.
>
>Oops - I thought that I had checked in that file already. Can you send both
>TokenConstants.cs (for comparing with my own version) and the modified
>grammar file to the list? I'm not sure where the error can be as I lifted
>more than a few file from the CSharp3 target.
>
>Sam, can you check if the grammar works with CSharp3 target? It would be
>helpful to narrow down the cause.
>
>Johannes
>
>> I noticed that the Downup method is part of the Treefilter
>> class which inherits from the TreeParser class. The grammar for the tree
>> parser from the example has the following header:
>>
>> // START: header
>> tree grammar DefRef;
>> options {
>>   tokenVocab = Cymbol;
>>   ASTLabelType = CommonTree;
>>   filter = true;
>>   language=CSharp2;
>> }
>> @members {
>>     SymbolTable symtab;
>>     Scope currentScope;
>>     public DefRef(ITreeNodeStream input, SymbolTable symtab)
>>     	: this(input)
>>     {
>>         this.symtab = symtab;
>>         currentScope = symtab.globals;
>>     }
>> }
>> // END: header
>>
>> Generating the tree parser gives DefRef.cs with the DefRef class declared
>> as:
>>
>> public partial class DefRef : TreeParser
>>
>>
>> Now I can cast this into the TreeFilter class but to test things quickly
>I
>> changed the above line in the DefRef.cs into:
>>
>> public partial class DefRef : TreeFilter
>>
>>
>> In the calling program I use:
>>
>> DefRef def = new DefRef(nodes, symtab); // use custom constructor
>> def.Downup(t); // trigger symtab actions upon certain subtrees
>>
>> When I run this nothings happens whereas I have grammar rules and actions
>> like:
>>
>> exitBlock
>>     :   BLOCK
>>         {
>>         Console.WriteLine("locals: "+currentScope);
>>         currentScope = currentScope.getEnclosingScope();    // pop scope
>>         }
>>     ;
>>
>> I have not figured out yet why this doesn't work. The examples is a
>> one-to-one port of the Java example of pattern 17 Symbol Table for Nested
>> Scopes of the Language Implementation Patterns.
>>
>> Any idea?
>>
>> Thanks,
>>
>> Marc
>> >-----Original Message-----
>> >From: Johannes Luber [mailto:JALuber at gmx.de]
>> >Sent: Friday, January 15, 2010 1:59 PM
>> >To: Marc Speyer; antlr-interest at antlr.org
>> >Subject: Re: [antlr-interest] Tree pattern maching using the C target
>> >
>> >> Hi all,
>> >>
>> >> I have a similar issue using the C# target. Using the Cymbol.g example
>> of
>> >> pattern 17 Symbol Table for Nested Scopes of the Language
>> Implementation
>> >> Patterns book I could not get it to work because there is now downup
>> >> method.
>> >> According to the documentation this method walks the AST code using
>> >> ANTLR's
>> >> built-in downup( ) strategy.
>> >>
>> >> Am I correct assuming that this has not been implemented yet for the
>C#
>> >> target (as Jim implies in his response). Is it difficult to implement
>> it
>> >> myself? I guess it would involve implementing the tree pattern
>matching
>> >> stuff.
>> >>
>> >> Marc
>> >
>> >You are correct - there is no official version yet, which implements
>tree
>> >pattern matching. I haven't gotten around to the API changes yet (will
>> work
>> >on that next week), though I have checked in some untested changes. It
>> >would be the easieast if you'd base your own code on that for now.
>> >
>> >Johannes
>> >
>> >> P.S. Hope this email files under the proper subject thread, and
>> apologies
>> >> in
>> >> advance if it isn't (Just subscribed to the mailing list but I could
>> not
>> >> find out how to get previous posts from it)
>> >>
>> >> > Pattern matcher or normal tree walker? The pattern stuff is not
>> >> implemented in the C target yet.
>> >> >
>> >> > Jim
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>> >> >> bounces at antlr.org] On Behalf Of Heiko Folkerts
>> >> >> Sent: Thursday, January 14, 2010 5:01 AM
>> >> >> To: antlr-interest at antlr.org
>> >> >> Subject: [antlr-interest] Tree pattern maching using the C target
>> >> >>
>> >> >> Hi all,
>> >> >> I wrote al litle tree pattern matcher for a specific validation we
>> >need
>> >> >> in our grammar. ANTLR and the C compiler compile it all well but
>> there
>> >> >> is now "downup" mehtod for running the matcher. Instead I only see
>> our
>> >> >> own rules in the generated parser. So, is the method to run when
>> using
>> >> >> a tree pattern macher in the C target different than ^"downup"? How
>> to
>> >> >> run the matcher?
>> >> >>
>> >> >> I tried to find an answer in the C examples but there was only a
>> >> >> treeparser and no tree pattern matcher.
>> >> >>
>> >> >> Thx+
>> >> >> Heiko
>> >> >>
>> >> >>
>> >> >> --
>> >>
>> >>
>> >>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: TokenConstants.cs
Url: http://www.antlr.org/pipermail/antlr-interest/attachments/20100122/b71db7bf/attachment.pl 

From michael.scholz at gmail.com  Thu Jan 21 18:23:44 2010
From: michael.scholz at gmail.com (Michael Scholz)
Date: Thu, 21 Jan 2010 18:23:44 -0800
Subject: [antlr-interest] multiple command queues to get multiple rewrites
	from a single pass
Message-ID: <61e8cbbd1001211823mf618eccs2275e0b1c2d0b7f9@mail.gmail.com>

Referring to The Definitive ANTLR Reference, page 220

"You can also have multiple command queues to get multiple rewrites from a
single pass over the input such as generating both a C file and its header
file (see the TokenRewriteStream Javadoc for an example)"

said example:
/*  You can also have multiple "instruction streams" and get multiple
 *  rewrites from a single pass over the input.  Just name the instruction
 *  streams and use that name again when printing the buffer.  This could be
 *  useful for generating a C file and also its header file--all from the
 *  same buffer:
 *
 *      tokens.insertAfter("pass1", t, "text to put after t");}
 *         tokens.insertAfter("pass2", u, "text after u");}
 *         System.out.println(tokens.toString("pass1"));
 *         System.out.println(tokens.toString("pass2"));
 *
 *  If you don't use named rewrite streams, a "default" stream is used as
 *  the first example shows.
 */

I don't see how to apply this in the context of the CMinus.g 1pass rewriter.
This example uses inline template definitions to rewrite, and the syntax for
doing that:
... -> template-name(<<attribute-assignment-list>>)
doesn't have any obvious way to specify the non-default instruction
stream... to generate a replace(String,...) instead of replace(...) in the
parser file.

If this functionality is enabled, the documentation for getting to it is not
obvious. Is this a V2/V3 issue? The tweak example seems like it might be
closer than 1pass rewriter, but it doesn't look consistent with the book's
techniques.

From linlin.xie at siemens.com  Fri Jan 22 04:57:40 2010
From: linlin.xie at siemens.com (Xie, Linlin)
Date: Fri, 22 Jan 2010 13:57:40 +0100
Subject: [antlr-interest] UTF-8 input?
In-Reply-To: <b4a29e8bfd66e445960573c91dcd6f93@temporal-wave.com>
References: <79118B9FE8CE8E49B0D71964A79CB647033CA2D5@dekomplm002.net.plm.eds.com>
	<b4a29e8bfd66e445960573c91dcd6f93@temporal-wave.com>
Message-ID: <79118B9FE8CE8E49B0D71964A79CB647033CABB3@dekomplm002.net.plm.eds.com>

Hi Jim,

Thanks for the reply. You said I can convert my UTF-8 input "to UCS2
using the supplied converter in the current runtime", but I can't find
any such converter in the ANTLR C runtime. Can you suggest which API I
should use? By the way, I searched the archive and saw that the person who
had a problem similar to mine used the iconv library on Linux.

Thanks in advance!
Linlin


-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jim Idle
Sent: 20 January 2010 16:31
To: antlr-interest at antlr.org
Subject: Re: [antlr-interest] UTF-8 input?

You need to remember to state which target you are talking about.

I have written a new universal input stream for the next version of the
C runtime. It takes 8bit, 16 bit, UTF-8, UTF-16, UCS2, UTF32 and EBCDIC
(code gen will change slightly to support this). It is not well tested
right now but will be available as a snapshot 3.3 release shortly in the
downloads page.

In the meantime the easiest thing to do is to convert to UCS2 using the
supplied converter in the current runtime. This will not work with
surrogate pairs in UTF-16, though most people do not need that.
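
To illustrate the surrogate-pair caveat, here is a small Java sketch (Java
strings are UTF-16 internally). It is not the C runtime's converter; the
class and method names are made up for the example. It decodes UTF-8 bytes
and checks whether the result would also be valid as plain UCS-2, meaning no
character needs two 16-bit units.

```java
import java.nio.charset.StandardCharsets;

public class Utf8ToUtf16Check {
    /** Decode UTF-8 bytes into Java's UTF-16 string representation. */
    static String decodeUtf8(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    /** True if no 16-bit unit is a surrogate, i.e. the text is also valid UCS-2. */
    static boolean fitsInUcs2(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isSurrogate(s.charAt(i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Text inside the Basic Multilingual Plane: one 16-bit unit per character.
        byte[] bmp = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        // U+1D11E (musical G clef) lies outside the BMP: four bytes in UTF-8,
        // and a surrogate pair (two 16-bit units) in UTF-16.
        byte[] astral = {(byte) 0xF0, (byte) 0x9D, (byte) 0x84, (byte) 0x9E};

        System.out.println("BMP text fits in UCS-2: " + fitsInUcs2(decodeUtf8(bmp)));
        System.out.println("Astral text fits in UCS-2: " + fitsInUcs2(decodeUtf8(astral)));
    }
}
```

Input that only ever contains characters of the first kind should be safe
for a UCS-2 based lexer; input of the second kind is the case that needs
real UTF-16 or UTF-32 handling.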

If you really need UTF-8 without conversion then it is easy enough to
write, or you can just steal the code from my check in of the code in
about 10 minutes. Note that while the streams work, I have not provided
ANTLR3_STRING support for UTF-8 and so on yet and so getting $text from
such a stream may or may not work.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Xie, Linlin
> Sent: Wednesday, January 20, 2010 3:32 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] UTF-8 input?
> 
> Can anyone tell me if antlr3.1.3 generated parser works with UTF-8
> input? If it does, how should I configure in the grammar? I noticed
> there are two macros ANTLR3_INLINE_INPUT_ASCII and
> ANTLR3_INLINE_INPUT_UTF16, but no UTF-8 one.
> 
> 
> 
> Many thanks!
> 
> Linlin
> 
> 



From jamesdcarrollml at verizon.net  Fri Jan 22 07:20:48 2010
From: jamesdcarrollml at verizon.net (James Carroll)
Date: Fri, 22 Jan 2010 10:20:48 -0500
Subject: [antlr-interest] Agglutinative language
Message-ID: <1264173648.24829.53.camel@Cheyenne>

I'm only starting with ANTLR and thinking about 'language' for its own
sake and was wondering.... are there any agglutinative programming
languages? 

The Decorator pattern would seem to lend itself to it. From wikipedia
(javascript example):
function Coffee() {this.cost = function(){return 1;};
} 
// Decorator A
function Milk(coffee) {this.cost = function() {return coffee.cost() + 0.5;};	
}
// Decorator B
function Whip(coffee) {this.cost = function() {return coffee.cost() + 0.7;};
}
 
// Decorator C
function Sprinkles(coffee) {this.cost = function() {return coffee.cost() + 0.2;};
}
var coffee = new Coffee();
coffee = new Sprinkles(coffee);
coffee = new Whip(coffee);
coffee = new Milk(coffee);

But what if I wanted to do this:

entity Coffee() {this.cost = function(){return 1;};
}
entity Espresso() is Coffee {this.cost = function(){return 1.5;};
}
// Decorator A
feature Milked(coffee) {this.cost = function() {return coffee.cost() + 0.5;};	
}
// Decorator B
feature Whipped(coffee) {this.cost = function() {return coffee.cost() + 0.7;};
}
 
// Decorator C
feature Sprinkled(coffee) {this.cost = function() {return coffee.cost() + 0.2;};
}
var coffee1 = new Coffee();
var coffee2 = new WhippedCoffee();
var coffee3 = new SprinkledMilkedCoffee();
var coffee4 = new SprinkledEspresso();
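
One possible reading of names like SprinkledMilkedCoffee is that the
language runtime peels known feature prefixes off the front of the name
until a base entity remains. The Java sketch below is only a guess at such
semantics, not an existing language feature; the class name, the two lookup
tables and the cost method are invented for the illustration, with the costs
taken from the example above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AgglutinativeNames {
    // Base entities and their costs (values from the example above).
    static final Map<String, Double> ENTITIES = new LinkedHashMap<>();
    // Feature prefixes and the surcharge each one adds.
    static final Map<String, Double> FEATURES = new LinkedHashMap<>();
    static {
        ENTITIES.put("Coffee", 1.0);
        ENTITIES.put("Espresso", 1.5);
        FEATURES.put("Milked", 0.5);
        FEATURES.put("Whipped", 0.7);
        FEATURES.put("Sprinkled", 0.2);
    }

    /** Peel feature prefixes off the name until only a base entity remains, summing costs. */
    static double cost(String name) {
        double extra = 0.0;
        String rest = name;
        while (!ENTITIES.containsKey(rest)) {
            String matched = null;
            for (String feature : FEATURES.keySet()) {
                if (rest.startsWith(feature)) { matched = feature; break; }
            }
            if (matched == null) {
                throw new IllegalArgumentException("cannot resolve: " + name);
            }
            extra += FEATURES.get(matched);
            rest = rest.substring(matched.length());   // strictly shorter, so the loop ends
        }
        return ENTITIES.get(rest) + extra;
    }

    public static void main(String[] args) {
        String[] names = {"Coffee", "WhippedCoffee", "SprinkledMilkedCoffee", "SprinkledEspresso"};
        for (String n : names) {
            System.out.println(n + " costs " + cost(n));
        }
    }
}
```

A real language would have to settle what happens when a feature name is a
prefix of another feature or of an entity name; the sketch simply takes the
first matching prefix.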

Just curious.


From kaleb.pederson at gmail.com  Fri Jan 22 07:46:43 2010
From: kaleb.pederson at gmail.com (Kaleb Pederson)
Date: Fri, 22 Jan 2010 07:46:43 -0800
Subject: [antlr-interest] gunit problem
In-Reply-To: <4B582AE9.4030403@doc.ic.ac.uk>
References: <4B582AE9.4030403@doc.ic.ac.uk>
Message-ID: <f14c01621001220746l5fbb7ed3s9ec167863ab50cce@mail.gmail.com>

On Thu, Jan 21, 2010 at 2:22 AM, Ian Moor <iwm at doc.ic.ac.uk> wrote:
> I am using the gunit which is provided with antlr 3.2 and
> I am trying to test parts of a tree, for example
>   statement walks statements:
>     "x=1" -> "ok"
>
> I expect an error message saying the code produced to System.out is
> not "ok", but gunit prints no output, and stops with a non-zero return
> value.

Although it takes some work, you can debug gunit.  You'll need to grab
the source and then set appropriate breakpoints as one normally would
in working with a debugger, but it's possible.

> Is there a way finding what is happening, or a later gunit ?

A patched version of gunit is available that includes better support
for custom ASTs:

http://www.antlr.org/wiki/pages/viewpageattachments.action?pageId=3244061&metadataLink=true

I know I found source for it somewhere and was able to debug another
problem, so hopefully you can do the same.

--
Kaleb Pederson

Blog - http://kalebpederson.com
Twitter - http://twitter.com/kalebpederson

From parrt at cs.usfca.edu  Fri Jan 22 09:32:53 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Fri, 22 Jan 2010 09:32:53 -0800
Subject: [antlr-interest] Call for Papers and Tool Demo Proposals - SCAM 2010
Message-ID: <6A6EDA85-EA68-44DA-B84B-1548B55FDAB4@cs.usfca.edu>

hiya.  Anybody got a cool product / tool to show off?  Here's a conference.
Ter
---------------------
Tenth IEEE International Working Conference
on Source Code Analysis and Manipulation

12th-13th September 2010,
Timisoara, Romania,
Co-located with ICSM 2010

http://www2010.ieee-scam.org/

Sponsored by IEEE CS (pending)
In cooperation with:
- Semantic Designs Inc., Austin, TX, USA
- Univ. "Politehnica" Timisoara, Romania
- Centre for Research in Evolution, Search and Testing (CREST), King's College London, UK

----------------
Conference aims:
----------------
The aim of this working conference is to bring together researchers and
practitioners working on theory, techniques and applications which
concern analysis and/or manipulation of the source code of computer
systems. While much attention in the wider software engineering
community is properly directed towards other aspects of systems
development and evolution, such as specification, design and
requirements engineering, it is the source code that contains the only
precise description of the behaviour of the system. The analysis and
manipulation of source code thus remains a pressing concern.

---------
Keynotes:
---------
This year SCAM will feature two outstanding keynotes:
- Mark Harman, King's College London, UK
- Andreas Zeller, Saarland University, Germany

---------------------------------
Covered topics and paper formats:
---------------------------------
We welcome submission of papers that describe original and significant
work in the field of source code analysis and manipulation. Topics of
interest include, but are not limited to:

    * program transformation
    * abstract interpretation
    * program slicing
    * source level software metrics
    * decompilation
    * source level testing and verification
    * source level optimization
    * program comprehension

Note that SCAM explicitly solicits results from any theoretical or
technological domain that can be applied to these and similar topics.

Submitted papers should not be longer than 10 pages. We also welcome
submission of 2 page proposals for tool demonstrations expected to be
performed live at the conference. All papers submitted should follow
IEEE Computer Society Press Proceedings Author Guidelines. The
papers should be submitted electronically via the conference web site.
Submitted papers should not have been previously published, and should
not have been concurrently submitted elsewhere.

------------
Proceedings:
------------
All accepted papers will appear in the proceedings which will be
published by the IEEE Computer Society Press.

--------------
Special Issue:
--------------
Best papers from SCAM 2010 will be considered for revision, extension,
and publication in a special issue of the Science of Computer
Programming journal edited by Elsevier.

----------------
Important Dates:
----------------
Deadline for submission:
    Abstract due: 23rd April, 2010
    Full paper due: 30th April, 2010
Notification: 7th June, 2010
Working Conference: 12th-13th September 2010

------------------------
Conference Organization:
------------------------

General Chair
Massimiliano Di Penta, Research Centre on Software Technology,
Universita degli Studi del Sannio, Italy


Program Co-Chairs
Jurgen Vinju, Centrum Wiskunde & Informatica, The Netherlands
Cristina Marinescu, Politehnica University of Timisoara, Romania


Publicity Chair
Zheng Li, CREST Centre, Department of Computer Science, King's College
London, UK

Finance Chair
Dave Binkley, Computer Science Department, Loyola College in Maryland, USA

Tool Demonstration Chair
Pascal Cuoq, CEA-Recherche Technologique, France

Local Arrangements Chair
Marius Minea, Politehnica University of Timisoara, Romania


-----------------------------------------
Steering Committee and Program Committee:
-----------------------------------------
See the conference Website


From jimi at temporal-wave.com  Fri Jan 22 12:06:31 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Fri, 22 Jan 2010 12:06:31 -0800
Subject: [antlr-interest] UTF-8 input?
In-Reply-To: <79118B9FE8CE8E49B0D71964A79CB647033CABB3@dekomplm002.net.plm.eds.com>
Message-ID: <52625667efb426469f56bf603d379f7d@temporal-wave.com>

Do you not see the function call:

ConvertUTF8toUTF16() ?

In the file called "antlr3convertutf.c"?

Jim


> -----Original Message-----
> From: Xie, Linlin [mailto:linlin.xie at siemens.com]
> Sent: Friday, January 22, 2010 4:58 AM
> To: Jim Idle; antlr-interest at antlr.org
> Subject: RE: [antlr-interest] UTF-8 input?
> 
> Hi jim,
> 
> Thanks for the reply. You said I can convert my UTF8 input "to UCS2
> using the supplied converter in the current runtime", but I can't find
> any such converter in antlr c runtime. Can you suggest me which API to
> use? Btw, I searched the archive, I can see the person who had similar
> problem as mine used iconv library on linux.
> 
> Thanks in advance!
> Linlin
> 
> 
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jim Idle
> Sent: 20 January 2010 16:31
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] UTF-8 input?
> 
> You need to remember to state which target you are talking about.
> 
> I have written a new universal input stream for the next version of the
> C runtime. It takes 8bit, 16 bit, UTF-8, UTF-16, UCS2, UTF32 and EBCDIC
> (code gen will change slightly to support this). It is not well tested
> right now but will be available as a snapshot 3.3 release shortly in
> the
> downloads page.
> 
> In the meantime the easiest thing to do is to convert to UCS2 using the
> supplied converter in the current runtime. Though this will not work
> with surrogate pairs in UTF-16 though but most people do not need that.
> 
> If you really need UTf-8 without conversion then it is easy enough to
> write, or you can just steal the code from my check in of the code in
> about 10 minutes. Note that while the streams work, I have not provided
> ANTLR3_STRING support for UTF-8 and so on yet and so getting $text from
> such a stream may or may not work,
> 
> Jim
> 
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> > bounces at antlr.org] On Behalf Of Xie, Linlin
> > Sent: Wednesday, January 20, 2010 3:32 AM
> > To: antlr-interest at antlr.org
> > Subject: [antlr-interest] UTF-8 input?
> >
> > Can anyone tell me if antlr3.1.3 generated parser works with UTF-8
> > input? If it does, how should I configure in the grammar? I noticed
> > there are two macros ANTLR3_INLINE_INPUT_ASCII and
> > ANTLR3_INLINE_INPUT_UTF16, but no UTF-8 one.
> >
> >
> >
> > Many thanks!
> >
> > Linlin
> >
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address


From duygu_the_duygu at yahoo.com  Fri Jan 22 17:01:21 2010
From: duygu_the_duygu at yahoo.com (Duygu Altinok)
Date: Fri, 22 Jan 2010 17:01:21 -0800 (PST)
Subject: [antlr-interest] newbie question- tree walker goes into infininte
	loop
Message-ID: <678818.24633.qm@web46002.mail.sp1.yahoo.com>

Hi,
I'm writing a compiler for a C-like language. My tree walker both generates code and does the usual bookkeeping, but it goes into an infinite loop with the grammar below; the problem is in the function part. I really don't understand what I'm doing wrong, so can anybody please help? Thanks in advance.

Parser :

program:

    function_list

{


    #program= #([PROGRAM,"program"],symbol_table, program);
}
;


function_list:
{
    is_in_function_list = true;
}
          (function)+
{
#function_list= #([FUNCTION_LIST, "function_list"], function_list);
}
         ;

function :
{
    String bt;
}
     bt=basic_type!  i:ID!
{
    String identifier = i.getText();

    if  (identifier.length() > 32)
    {
        error(WARN00, i.getLine(), i.getColumn());
        identifier = identifier.substring(0, 32);
    }

    which_function = new String(identifier);
    identifier=identifier + ":" + Integer.toString(i.getLine()) + ":" + Integer.toString(i.getColumn());

}

LPAREN!  parameter_list!  RPAREN!  LCURLY  function_body  RCURLY

{

    symbol_table.addChild(#([SYMBOL_FUNCTION, identifier ], [SYMBOL_TYPE, bt] , symbol_parameters, symbol_locals ));

#function=(#([ID,identifier],function));
}
    ;

function_body:
            declaration_list! statement_list
         ;


Tree walker:

program
    :
    #(PROGRAM symbol_table
    {
        sTable.sort();
assemble.code+=new String("\n\t#Duygu\n\n\n");
assemble.code+=new String("\t.data\n");

    }  function_list
    {
    sTable.prettyPrint();
            try{
        FileWriter file=new FileWriter(new String("output.asm"));
            file.write(assemble.code.toCharArray());
            file.close();
        }catch (Exception e) {
                e.printStackTrace();
                    System.out.println(e);
        }
    }
    )

    ;
function_list
    :
    #(FUNCTION_LIST (function)+)
    ;

function
    :
    #(i:ID
{
    //Parse info
    String identifier;
    int line, column;

    String [] params = new String[3];

    identifier = i.getText();
    params = identifier.split(":");
    identifier = params[0];
    line = Integer.parseInt(params[1]);
    column = Integer.parseInt(params[2]);

    int index;
    int line2=0,column2=0; //line and column info from the symbol table
    index = sTable.getFunctionIndex(identifier);

    if(index != -1)
    {
    line2=((Function)sTable.functions.elementAt(index)).line;
    column2=((Function)sTable.functions.elementAt(index)).column;
    }

    if(index!=-1 && line==line2 && column==column2)
    {
        isFunctionLegal=true;
        currentFunction = (Function) sTable.functions.elementAt(index);
                Vector parameters=currentFunction.parameters;
        int offset=0;
        for(int k=0;k<parameters.size();k++)
  {
        ((Symbol)parameters.elementAt(k)).place.isMemory=true;
    if(((Symbol)parameters.elementAt(k)).type.indexOf('[')!=-1)
        ((Symbol)parameters.elementAt(k)).place.isArray=true;
    else
        ((Symbol)parameters.elementAt(k)).place.isArray=false;
    ((Symbol)parameters.elementAt(k)).place.store=new String(Integer.toString(offset)+"($fp)");
    offset+=4;
        }

        if(identifier.equals(new String("main")))

        {

            firstFunc=false;

            firstName=new String(identifier);

            assemble.code+=new String("\t.text\n");

            assemble.code+=new String("\n\t.globl\tmain\n");

            assemble.code+=new String("main:\n");

        }

        else if (identifier.equals(firstName))

        {

            assemble.code+=new String("\n\t.globl\tmain\n");

            assemble.code+=new String("main:\n");

        }

        else

        {

            assemble.code+=new String("\n\t.globl\t"+currentFunction.name+"\n");

            assemble.code+=new String(currentFunction.name+":\n");

        }

        assemble.code+=new String("#Initialize function\n");

        String reg=new String();

        reg=new String("$fp");

        assemble.code+=new String("#fp\n");

        assemble.code+=Push(reg);

        assemble.code+=new String("\taddi $fp, $sp, 4\n");

        reg=new String("$ra");

        assemble.code+=new String("#ra\n");

        assemble.code+=Push(reg);
   assemble.code+=new String("#push s registers to the stack\n");

        for(int ind=0;ind<8;ind++)

            assemble.code+=Push(new String("$s"+Integer.toString(ind)));


        Vector locals=currentFunction.locals;

        for(int locVar=0;locVar<locals.size();locVar++)

        {

            //reserve space on the stack for the local... by moving sp down.

            String type = ((Symbol)locals.elementAt(locVar)).type;

            assemble.code+=new String("#space for local in the stack\n");

            if(type.trim().equals("int") || type.trim().equals("float"))

            {

                assemble.code+=new String("\taddi $sp, $sp, -4\n");

                currentFunction.ARoffset-=4;

                ((Symbol)locals.elementAt(locVar)).place.isArray = false;

            }

            else if(type.indexOf('[')!=-1)

            {

   ((Symbol)locals.elementAt(locVar)).place.isArray = true;

                int size=GetArrSize(type);

                for(int j=0;j<size;j++)

                {

                    assemble.code+=new String("\taddi $sp,$sp, -4\n");

                    currentFunction.ARoffset-=4;

                }

            }


            ((Symbol)locals.elementAt(locVar)).place.store = new String(Integer.toString(currentFunction.ARoffset)) + "($fp)";

            ((Symbol)locals.elementAt(locVar)).place.isMemory = true;


        }


    }
        else
        isFunctionLegal=false;


}
   function_body

    {


        //Stack Bookkeeping

        if(!currentFunction.ret)

        {

        assemble.code+=new String("#Stack bookkeeping\n");

        assemble.code+=new String("\taddi $sp, $fp, -40\n");

        //restore callee saved regs

        assemble.code+=new String("#pop s registers\n");

        for(int ind=7;ind>=0;ind--)

            assemble.code+=Pop(new String("$s"+Integer.toString(ind)));

        //restore ra

        assemble.code+=new String("#pop ra\n");

        assemble.code+=Pop(new String("$ra"));

        //restore fp

        assemble.code+=new String("#pop fp\n");

        assemble.code+=Pop(new String("$fp"));
        //back to caller fnc

        assemble.code+=new String("\tli $v0, 0\n");

        assemble.code+=new String("\tjr $ra\n");

        }

    }
)
    ;

function_body:
    statement_list
    ;

statement_list:
    (statement)+
    ;

statement:
          |assignment_statement
      |return_statement
      |if_statement
      |while_statement
         | print_statement
          | expression SEMI
          | read_statement
      ;


From michael.scholz at gmail.com  Fri Jan 22 19:24:31 2010
From: michael.scholz at gmail.com (Michael Scholz)
Date: Fri, 22 Jan 2010 19:24:31 -0800
Subject: [antlr-interest] ANTLR
In-Reply-To: <61e8cbbd1001211835g4a148662i116658ad16d4692a@mail.gmail.com>
References: <61e8cbbd1001211835g4a148662i116658ad16d4692a@mail.gmail.com>
Message-ID: <61e8cbbd1001221924h693195faq51a83d24fef119e5@mail.gmail.com>

I'm trying to do something fairly simple. A variation of the tweak example,
based on the 1pass template rewrite concept. I also would like to dup the
whitespace as the attached code attempts to do, but it doesn't work.

Help?
MS
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mytest.zip
Type: application/zip
Size: 2369 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20100122/0b14bce7/attachment.zip 

From michael.scholz at gmail.com  Fri Jan 22 23:39:28 2010
From: michael.scholz at gmail.com (Michael Scholz)
Date: Fri, 22 Jan 2010 23:39:28 -0800
Subject: [antlr-interest] ANTLR
In-Reply-To: <61e8cbbd1001221924h693195faq51a83d24fef119e5@mail.gmail.com>
References: <61e8cbbd1001211835g4a148662i116658ad16d4692a@mail.gmail.com>
	<61e8cbbd1001221924h693195faq51a83d24fef119e5@mail.gmail.com>
Message-ID: <61e8cbbd1001222339i29be92e6pedf86f4da8e46d2c@mail.gmail.com>

So I basically solved what I was after. Code is attached, for your comments.
(and posterity)

MS

On Fri, Jan 22, 2010 at 7:24 PM, Michael Scholz <michael.scholz at gmail.com>wrote:

> I'm trying to do something fairly simple. A variation of the tweak example,
> based on the 1pass template rewrite concept. I also would like to dup the
> whitespace as the attached code attempts to do, but it doesn't work.
>
> Help?
> MS
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mytest.zip
Type: application/zip
Size: 2374 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20100122/fb4756bb/attachment.zip 

From serega.sheypak at gmail.com  Sat Jan 23 03:55:04 2010
From: serega.sheypak at gmail.com (Serega Sheypak)
Date: Sat, 23 Jan 2010 14:55:04 +0300
Subject: [antlr-interest] Typographing text with antlr
Message-ID: <197382531001230355p63c1c275k218afbec0825c4d0@mail.gmail.com>

Hi guys, I'm really impressed with the power of ANTLR.
I need your advice.
I would like to develop a typograph application. I've tried to do it with
regular expressions, but they are extremely hard to maintain. I write a
crazy regex and in a few months I can't figure out how it works :)
That's the typical situation when working with complicated regexes. I've seen
named regexes in Ruby (I use Ruby, Rails, Groovy, Java, JavaFX) but they are
not as clear as ANTLR is.
ANTLR's nice declarative style should help a lot.

I am from Russia and I'm working with web texts in Russian.
Please, see short description of the task.

The ANTLR-based application (target language: Ruby) accepts ordinary text
typed in a browser.
It should apply several character transformation rules and emit
well-typeset text.

Example rules are:
1. "Something in quotes" -> &laquo;Something in quotes &raquo;
2. "Some text goes here "Oh, my something in quotes again!" " -> &laquo;Some
text goes here &bdquo;Oh my something in quotes again!&ldquo; &raquo;
3. (r) -> ®
4. (c) -> ©
5. (tm) -> ™
6. someWord-someOtherWord -> someWord&nbsp;&ndash;&nbsp;someOtherWord

Special rule for first quote
7. "some text goes here... -> <span style="margin-right:0.44em;">
&laquo;</span>some text goes here...

and many other rules.

First I would like to do this for Russian; English will be next.

What do you think: is ANTLR a good fit for such a task, and is it convenient
to solve it using ANTLR?

Thank you for your attention, waiting for your considerations.
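
A minimal sketch of rules 1 to 5, written in plain Java purely for illustration (the post targets Ruby). The class and method names are hypothetical, and the open/close heuristic is an assumption: a straight quote counts as opening when it starts the text or follows whitespace, and as closing otherwise. That means a closing quote preceded by a space, as at the end of rule 2, would be misread. What the sketch shows is the nesting depth that a flat regex has trouble tracking; a grammar would express the same thing with a recursive rule.

```java
final class TypographSketch {

    /** Replaces the ASCII stand-ins from rules 3 to 5 with HTML entities. */
    static String symbols(String text) {
        return text.replace("(tm)", "&trade;")
                   .replace("(c)", "&copy;")
                   .replace("(r)", "&reg;");
    }

    /** Rewrites straight quotes: outer level as laquo/raquo, nested levels as bdquo/ldquo. */
    static String quotes(String text) {
        StringBuilder out = new StringBuilder();
        int depth = 0; // number of quotes currently open
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c != '"') {
                out.append(c);
                continue;
            }
            // Assumed heuristic: a quote at the start or after whitespace opens.
            boolean opens = (i == 0) || Character.isWhitespace(text.charAt(i - 1));
            if (opens) {
                out.append(depth == 0 ? "&laquo;" : "&bdquo;");
                depth++;
            } else if (depth > 0) {
                depth--;
                out.append(depth == 0 ? "&raquo;" : "&ldquo;");
            } else {
                out.append(c); // stray closing quote: leave untouched
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(quotes(symbols("\"Some text \"nested (c) quote\" here\"")));
    }
}
```

Running the symbol pass before the quote pass keeps the two rule sets independent, so further rules can be added as separate passes.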

From stevenraemaekers at gmail.com  Sat Jan 23 08:36:59 2010
From: stevenraemaekers at gmail.com (Steven Raemaekers)
Date: Sat, 23 Jan 2010 17:36:59 +0100
Subject: [antlr-interest] Making a distinction between float and int
	calculation
Message-ID: <46450b021001230836h1966343fpd52991913f3a9913@mail.gmail.com>

Hello,

In my grammar there should be an evaluator for numeric expressions. These
numeric expressions should return an integer, or a float, depending on the
contents of the expression.
For example:

3 + 2.0: should return float
3 + 2: should return integer
2.0 + 3.0: should return float
1 / 3: should return float
4 / 2: should return int

In my grammar there is only one rule for a numeric expression. I do not know
whether I should duplicate the entire operator precedence rules for the
distinction between float and int.
The following statements are part of my grammar:

expression
: list
| quotedword
| booleanexpression
 ;

booleanexpression
: numericexpression (BOOL^ numericexpression)*
 ;

numericexpression
: mult ((PLUS^ | MINUS^) mult)*
 ;

mult
: atom ((MULTIPLY^ | DIVIDE^) atom)*
;

atom
: INT
| FLOAT
| ID
 | LEFTPAREN expression RIGHTPAREN
-> ^(EXPRESSION expression)
;

Does anybody have an idea how I should take care of this distinction between
float and int? Or is this distinction even necessary?

-- 
Regards,

Steven
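
One way to get the results listed above without duplicating the precedence rules is to keep a single set of grammar rules and decide int versus float when each operator is evaluated. The following is only a sketch under that assumption: the class and method names are hypothetical, values are carried as java.lang.Number, and integer overflow is not handled.

```java
final class NumericEval {

    /** True when both operands are integers, so the result may stay an integer. */
    static boolean bothInts(Number a, Number b) {
        return a instanceof Integer && b instanceof Integer;
    }

    /** int + int stays int; any float operand promotes the result to float. */
    static Number add(Number a, Number b) {
        if (bothInts(a, b)) {
            return Integer.valueOf(a.intValue() + b.intValue());
        }
        return Double.valueOf(a.doubleValue() + b.doubleValue());
    }

    /** int / int stays int only when the division is exact; otherwise float. */
    static Number divide(Number a, Number b) {
        if (bothInts(a, b) && b.intValue() != 0
                && a.intValue() % b.intValue() == 0) {
            return Integer.valueOf(a.intValue() / b.intValue());
        }
        return Double.valueOf(a.doubleValue() / b.doubleValue());
    }

    public static void main(String[] args) {
        System.out.println(add(3, 2.0));   // a Double
        System.out.println(add(3, 2));     // an Integer
        System.out.println(add(2.0, 3.0)); // a Double
        System.out.println(divide(1, 3));  // a Double, division is not exact
        System.out.println(divide(4, 2));  // an Integer, division is exact
    }
}
```

Subtraction and multiplication would follow the same pattern as add; a tree walker or rule action could call these helpers at each operator node.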

From endigitalmind at yahoo.co.uk  Sat Jan 23 13:50:57 2010
From: endigitalmind at yahoo.co.uk (Phil Ritchie)
Date: Sat, 23 Jan 2010 13:50:57 -0800 (PST)
Subject: [antlr-interest] Quantifiers
Message-ID: <929543.17622.qm@web23305.mail.ird.yahoo.com>

I think ANTLR might be a quick way for me to build a validating lexer/parser. The file I want to validate is essentially a comma separated values file but the content of individual fields must adhere to content and length restrictions. One field specification I can't seem to find a way of declaring is (in regular expression form): [a-zA-Z]{1,128}.
Is there a way I could approach this?


From oliver.zeigermann at gmail.com  Sun Jan 24 01:03:35 2010
From: oliver.zeigermann at gmail.com (Oliver Zeigermann)
Date: Sun, 24 Jan 2010 10:03:35 +0100
Subject: [antlr-interest] Anyone in the whole world doing multi step tree
	transformation?
Message-ID: <9da4f4521001240103r5505ee05oc3391065be6bdbee@mail.gmail.com>

Folks!

I was just wondering if anyone except me is actually doing tree
transformations using ANTLR. I use the tree transformation feature
introduced in 3.1. While this does work well, it is so very hard to
refactor or extend my tree structures as I have to change all my
transformer stages and have no tool support to find out what to change
and where.

I started using heterogeneous tokens with normalized children to make
use of compiler type checking, which helps but does not completely
solve my issues, as I still have an unchecked children list, which I
need in order to traverse the tree using tree walkers.

I was considering skipping the whole grammar-driven tree
transformation step, but what should I replace it with?

I know of the Xtext approach, which uses non-normalized heterogeneous
tokens generated from a common model shared by all transformation
parts. That seems like a good idea; however, it does not seem to offer
a means powerful enough to do serious tree transformation.

Any experiences? Hints?

Thanks in advance

- Oliver

From e0309169 at student.tuwien.ac.at  Sun Jan 24 04:35:36 2010
From: e0309169 at student.tuwien.ac.at (Mikolaj Koziarkiewicz)
Date: Sun, 24 Jan 2010 13:35:36 +0100
Subject: [antlr-interest] Quantifiers
In-Reply-To: <929543.17622.qm@web23305.mail.ird.yahoo.com>
References: <929543.17622.qm@web23305.mail.ird.yahoo.com>
Message-ID: <4B5C3E98.2050605@student.tuwien.ac.at>

Hi Phil,

could you provide a textual definition of your grammar, and/or your 
ANTLR specification so far?

Cheers,
Nick

> I think ANTLR might be a quick way for me to build a validating lexer/parser. The file I want to validate is essentially a comma separated values file but the content of individual fields must adhere to content and length restrictions. One field specification I can't seem to find a way of declaring is (in regular expression form): [a-zA-Z]{1,128}.
>  
> Is there a way I could approach this?


From greneche.hugo at gmail.com  Sun Jan 24 10:18:56 2010
From: greneche.hugo at gmail.com (Hugo)
Date: Sun, 24 Jan 2010 19:18:56 +0100
Subject: [antlr-interest] newbie needs help
In-Reply-To: <1264109124.9363.10.camel@gecko.home.org>
References: <4B58A90E.5020401@gmail.com>
	<1264109124.9363.10.camel@gecko.home.org>
Message-ID: <4B5C8F10.2080608@gmail.com>

Thank you for everything...
but I have another problem:
my file also contains function calls in the following format:

FUNCTION_A      //without parameters
FUNCTION_B   /opt1 /opt2   //2 parameters
FUNCTION_C A0 B0 %VAR_MYVARIABLE // data with the bytes format

Function names always start with FUNCTION_.

The problem is that wherever a NEWLINE is detected, it is treated as a
"bytes", which is a problem for these functions:
a function like FUNCTION_C is recognized incorrectly.

Could you give me your valuable help?

thanks in advance


John B. Brodie wrote:
> Greetings!
>
> On Thu, 2010-01-21 at 20:20 +0100, Hugo wrote:
>   
>> I started using antlr to parse a specific file format.
>> The problem is that i don't know how to write correctly my grammar.
>>
>> The file have the following format.
>> It contains multiple lines and each can have the following format:
>>
>> Only one or multilple hexadecimal caracter with space or not
>> ex: A0 A4 B5 77
>> or: A0
>>
>> Only variable identifier with the format VAR_XXX
>> ex: VAR_MY_VARIABLE
>>
>> Or the combinaison of the two previous format
>> ex:
>> A0 A4B5 VAR_MY_VARIABLE 77 98 VAR_MY_VARIABLE2
>> or
>> VAR_MY_VARIABLE AA BB
>> or
>> AA BB VAR_MY_VARIABLE
>>
>>
>> what i want to do is to build a AST tree
>>     
>
> attached please find a grammar file that is *almost* what I think you
> are trying to do.
>
> It does not have a MULTIPLE_BYTES_DEF node because the grouping of a
> collection of single_byte instances into a multibyte is ambiguous.
> Consider
>
> 11 22 33 44 55 66 77 88
>
> is this 8 single bytes? 1 single byte and 7-long multi? is it 4 multi
> pairs? a triple, a single and a quad?
>
> i kinda expect you want it to be a single 8-long multi, e.g. any run of
> single bytes becomes a multi. But that is a semantic of your language
> and getting a parser to do semantics isn't always possible....
>
> if you really need the MULTIPLE_BYTE_DEF node, you might be best served
> by parsing using some like my code (e.g. the parser produces only
> BYTE_DEF nodes) and then write a tree-walker that transforms the AST
> resultant from the parse into a new AST that contains the requisite
> MULTIPLE_BYTE_DEF nodes. e.g. scan for and collapse sequences of
> consecutive EXPR_DEF nodes that have BYTE_DEF children into a single
> EXPR_DEF node containing a single MULTIPLE_BYTE_DEF child.
>
>   
>> And the problem is that i don't know how to do this with antlr. the tool
>> always tell me that multiple rule can be applies with my grammar.
>>
>> please help me to solve my problem. 
>>
>> Here is my grammar:
>>
>> stmts               : bytes+ ;
>>
>>
>> bytes : multiple_byte bytes? -> ^(EXPR_DEF multiple_byte  bytes? )
>>
>> | define_expression bytes? -> ^(EXPR_DEF define_expression bytes? )
>>
>> | NEWLINE ;
>>
>> define_expression : define_var -> ^(DEFINE_VAR_DEF define_var) ;
>>
>> define_var : DEFINE_VARIABLE ;
>> multiple_byte : single_byte (single_byte)+ -> ^(MULTIPLE_BYTES_DEF
>> single_byte single_byte+) ;
>>
>>
>> single_byte : byte_digit -> ^(BYTES_DEF byte_digit) ;
>>
>> byte_digit : BYTE_DIGIT ;
>>
>> DEFINE_VARIABLE :
>> 'VAR_'('a'..'z'|'A'..'Z'|'_')('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
>>
>> BYTE_DIGIT :('0'..'9'| 'A'..'F'|'a'..'f')('0'..'9'| 'A'..'F'|'a'..'f') ;
>>
>> // Ignore whitespace, tab and escape sequence
>> WS : (' '|'\t'|'\\\r\n')+ {$channel = HIDDEN;} ;
>>
>> // a new line
>> NEWLINE : '\r'? '\n' ;
>>
>> thanks a lot
>>     
>
> hope this helps...
>    -jbb
>
>   


From parrt at cs.usfca.edu  Sun Jan 24 12:01:00 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Sun, 24 Jan 2010 12:01:00 -0800
Subject: [antlr-interest] org.antlr.v4.*  ???
Message-ID: <D072C04B-45BC-4233-8C96-758BF1B4A708@cs.usfca.edu>

Hi. To avoid clashes where ANTLR v3 and v4 have to coexist in a project, is it ok if I use org.antlr.v4 as the root package?

ST v4 is cool since it's org.stringtemplate.*; the old one was org.antlr.stringtemplate.*

Ter

From scott at javadude.com  Sun Jan 24 12:11:37 2010
From: scott at javadude.com (Scott Stanchfield)
Date: Sun, 24 Jan 2010 15:11:37 -0500
Subject: [antlr-interest] org.antlr.v4.* ???
In-Reply-To: <D072C04B-45BC-4233-8C96-758BF1B4A708@cs.usfca.edu>
References: <D072C04B-45BC-4233-8C96-758BF1B4A708@cs.usfca.edu>
Message-ID: <d19d16481001241211mff5c504s9e0867d972350e14@mail.gmail.com>

Sounds cool, but I'd suggest using a similar convention for both antlr
and stringtemplate so it'll be easier come v5 ;)

-- Scott

----------------------------------------
Scott Stanchfield
http://javadude.com


On Sun, Jan 24, 2010 at 3:01 PM, Terence Parr <parrt at cs.usfca.edu> wrote:
> Hi. to avoid classes where antlr v3 and v4 have to coexist in a project, is it ok if i use org.antlr.v4 as the root package?
>
> ST v4 is cool since it's org.stringtemplate.* old was org.antlr.stringtemplate.*
>
> Ter

From parrt at cs.usfca.edu  Sun Jan 24 12:14:06 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Sun, 24 Jan 2010 12:14:06 -0800
Subject: [antlr-interest] org.antlr.v4.* ???
In-Reply-To: <d19d16481001241211mff5c504s9e0867d972350e14@mail.gmail.com>
References: <D072C04B-45BC-4233-8C96-758BF1B4A708@cs.usfca.edu>
	<d19d16481001241211mff5c504s9e0867d972350e14@mail.gmail.com>
Message-ID: <1801E8A3-536B-4337-B2A0-64F7DA61A13E@cs.usfca.edu>

good point. so org.stringtemplate.v4.* and org.antlr.v4.* right?

Ter
On Jan 24, 2010, at 12:11 PM, Scott Stanchfield wrote:

> Sounds cool, but I'd suggest using a similar convention for both antlr
> and stringtemplate so it'll be easier come v5 ;)
> 
> -- Scott


From scott at javadude.com  Sun Jan 24 12:15:54 2010
From: scott at javadude.com (Scott Stanchfield)
Date: Sun, 24 Jan 2010 15:15:54 -0500
Subject: [antlr-interest] org.antlr.v4.* ???
In-Reply-To: <1801E8A3-536B-4337-B2A0-64F7DA61A13E@cs.usfca.edu>
References: <D072C04B-45BC-4233-8C96-758BF1B4A708@cs.usfca.edu>
	<d19d16481001241211mff5c504s9e0867d972350e14@mail.gmail.com>
	<1801E8A3-536B-4337-B2A0-64F7DA61A13E@cs.usfca.edu>
Message-ID: <d19d16481001241215ybf2b399s24c4f0dcece14806@mail.gmail.com>

Sounds good. Normally I'd say use the same package names as before,
but because you're doing a rewrite having a different package name is
a good idea imho.

-- Scott

----------------------------------------
Scott Stanchfield
http://javadude.com


On Sun, Jan 24, 2010 at 3:14 PM, Terence Parr <parrt at cs.usfca.edu> wrote:
> good point. so org.stringtemplate.v4.* and org.antlr.v4.* right?
>
> Ter
> On Jan 24, 2010, at 12:11 PM, Scott Stanchfield wrote:
>
>> Sounds cool, but I'd suggest using a similar convention for both antlr
>> and stringtemplate so it'll be easier come v5 ;)
>>
>> -- Scott
>
>

From endigitalmind at yahoo.co.uk  Sun Jan 24 12:29:20 2010
From: endigitalmind at yahoo.co.uk (Phil Ritchie)
Date: Sun, 24 Jan 2010 12:29:20 -0800 (PST)
Subject: [antlr-interest] Quantifiers
Message-ID: <698757.2976.qm@web23307.mail.ird.yahoo.com>


Mikolaj

I haven't attempted a grammar yet but below is a textual example:

The file should contain three fields called "jobNo", "description" and "cost".

The fields should adhere to the following specifications (regex in parentheses):
jobNo: must be digits only, maximum 5 - (\d{1,5})
description: any lowercase characters or space up to a maximum of 128 - ([a-z ]{1,128})
cost: positive or negative amount formatted as up to 5 digits before the decimal and 4 afterwards zero padded - (-?\d?\d?\d?\d?\d\.\d\d\d\d)

E.g.

"jobNo","description","cost"
"12345","this record conforms","123.4321"
"987","this RECORD does not conform because of uppercase usage","-22.44"

Phil.
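
ANTLR 3 grammar notation offers *, + and ? but no counted {m,n} repetition, so one possible approach is to let the grammar match the unbounded form of each field and enforce the limits in ordinary Java, for example from a rule action or a pass after parsing. The sketch below assumes the field values have already been split out and unquoted; the class and method names are hypothetical, and the patterns are copied from the specifications above (the cost pattern is written in the equivalent counted form).

```java
import java.util.regex.Pattern;

final class JobRecordValidator {

    // One pattern per field, taken from the field specifications.
    private static final Pattern JOB_NO = Pattern.compile("\\d{1,5}");
    private static final Pattern DESCRIPTION = Pattern.compile("[a-z ]{1,128}");
    private static final Pattern COST = Pattern.compile("-?\\d{1,5}\\.\\d{4}");

    /** Returns true when all three unquoted field values conform. */
    static boolean conforms(String jobNo, String description, String cost) {
        return JOB_NO.matcher(jobNo).matches()
            && DESCRIPTION.matcher(description).matches()
            && COST.matcher(cost).matches();
    }

    public static void main(String[] args) {
        // The two data rows from the example file.
        System.out.println(conforms("12345", "this record conforms", "123.4321"));
        System.out.println(conforms("987",
            "this RECORD does not conform because of uppercase usage", "-22.44"));
    }
}
```

Note that under the stated cost pattern the second row also fails on its cost field, since "-22.44" has only two digits after the decimal point.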

--- On Sun, 24/1/10, Mikolaj Koziarkiewicz <e0309169 at student.tuwien.ac.at> wrote:


From: Mikolaj Koziarkiewicz <e0309169 at student.tuwien.ac.at>
Subject: Re: [antlr-interest] Quantifiers
To: "Phil Ritchie" <endigitalmind at yahoo.co.uk>
Cc: antlr-interest at antlr.org
Date: Sunday, 24 January, 2010, 12:35


Hi Phil,

could you provide a textual definition of your grammar, and/or your ANTLR specification so far?

Cheers,
Nick

> I think ANTLR might be a quick way for me to build a validating lexer/parser. The file I want to validate is essentially a comma separated values file but the content of individual fields must adhere to content and length restrictions. One field specification I can't seem to find a way of declaring is (in regular expression form): [a-zA-Z]{1,128}.
> Is there a way I could approach this?


From sharwell at pixelminegames.com  Sun Jan 24 12:47:22 2010
From: sharwell at pixelminegames.com (Sam Harwell)
Date: Sun, 24 Jan 2010 14:47:22 -0600
Subject: [antlr-interest] org.antlr.v4.*  ???
References: <D072C04B-45BC-4233-8C96-758BF1B4A708@cs.usfca.edu>
Message-ID: <DD5A5D428FE040429CCDF377FAA892840152DE94@martini.ironwillgames.com>

What about org.antlr.compiler.*?

-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Terence Parr
Sent: Sunday, January 24, 2010 2:01 PM
To: antlr-interest at antlr.org interest
Subject: [antlr-interest] org.antlr.v4.* ???

Hi. to avoid classes where antlr v3 and v4 have to coexist in a project,
is it ok if i use org.antlr.v4 as the root package?

ST v4 is cool since it's org.stringtemplate.* old was
org.antlr.stringtemplate.*

Ter


From parrt at cs.usfca.edu  Sun Jan 24 12:49:52 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Sun, 24 Jan 2010 12:49:52 -0800
Subject: [antlr-interest] org.antlr.v4.*  ???
In-Reply-To: <DD5A5D428FE040429CCDF377FAA892840152DE94@martini.ironwillgames.com>
References: <D072C04B-45BC-4233-8C96-758BF1B4A708@cs.usfca.edu>
	<DD5A5D428FE040429CCDF377FAA892840152DE94@martini.ironwillgames.com>
Message-ID: <F0F5E488-DE72-4EDD-91A5-8E58502534E0@cs.usfca.edu>

what does compiler mean here?
Ter
On Jan 24, 2010, at 12:47 PM, Sam Harwell wrote:

> What about org.antlr.compiler.*?
> 
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Terence Parr
> Sent: Sunday, January 24, 2010 2:01 PM
> To: antlr-interest at antlr.org interest
> Subject: [antlr-interest] org.antlr.v4.* ???
> 
> Hi. to avoid classes where antlr v3 and v4 have to coexist in a project,
> is it ok if i use org.antlr.v4 as the root package?
> 
> ST v4 is cool since it's org.stringtemplate.* old was
> org.antlr.stringtemplate.*
> 
> Ter


From sharwell at pixelminegames.com  Sun Jan 24 12:54:47 2010
From: sharwell at pixelminegames.com (Sam Harwell)
Date: Sun, 24 Jan 2010 14:54:47 -0600
Subject: [antlr-interest] org.antlr.v4.*  ???
References: <D072C04B-45BC-4233-8C96-758BF1B4A708@cs.usfca.edu>
	<DD5A5D428FE040429CCDF377FAA892840152DE94@martini.ironwillgames.com>
	<F0F5E488-DE72-4EDD-91A5-8E58502534E0@cs.usfca.edu>
Message-ID: <DD5A5D428FE040429CCDF377FAA892840152DE95@martini.ironwillgames.com>

I guess you could use codegen.* for the tool instead, so you'd end up
with org.antlr.runtime, org.antlr.codegen, etc.

-----Original Message-----
From: Terence Parr [mailto:parrt at cs.usfca.edu] 
Sent: Sunday, January 24, 2010 2:50 PM
To: Sam Harwell
Cc: antlr-interest at antlr.org
Subject: Re: [antlr-interest] org.antlr.v4.* ???

what does compiler mean here?
Ter
On Jan 24, 2010, at 12:47 PM, Sam Harwell wrote:

> What about org.antlr.compiler.*?
> 
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Terence Parr
> Sent: Sunday, January 24, 2010 2:01 PM
> To: antlr-interest at antlr.org interest
> Subject: [antlr-interest] org.antlr.v4.* ???
> 
> Hi. to avoid clashes where antlr v3 and v4 have to coexist in a
> project, is it ok if i use org.antlr.v4 as the root package?
> 
> ST v4 is cool since it's org.stringtemplate.* old was
> org.antlr.stringtemplate.*
> 
> Ter


From oliver.zeigermann at gmail.com  Sun Jan 24 14:29:01 2010
From: oliver.zeigermann at gmail.com (Oliver Zeigermann)
Date: Sun, 24 Jan 2010 23:29:01 +0100
Subject: [antlr-interest] org.antlr.v4.* ???
In-Reply-To: <D072C04B-45BC-4233-8C96-758BF1B4A708@cs.usfca.edu>
References: <D072C04B-45BC-4233-8C96-758BF1B4A708@cs.usfca.edu>
Message-ID: <9da4f4521001241429i22d3da55l5d3cb12532a6bbeb@mail.gmail.com>

Very good idea.

2010/1/24 Terence Parr <parrt at cs.usfca.edu>:
> Hi. to avoid clashes where antlr v3 and v4 have to coexist in a project, is it ok if i use org.antlr.v4 as the root package?
>
> ST v4 is cool since it's org.stringtemplate.* old was org.antlr.stringtemplate.*
>
> Ter

From parrt at cs.usfca.edu  Sun Jan 24 16:40:43 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Sun, 24 Jan 2010 16:40:43 -0800
Subject: [antlr-interest] gunit use in ANTLR v4
Message-ID: <B2386636-89F7-491A-8526-16A88D3C729F@cs.usfca.edu>

Hiya, been using Leon's gunit to test ANTLR's AST builder...works great!  I made a few improvements (for next v3 ANTLR release).  Here are a few samples:

grammarSpec:
    "parser grammar P; a : A;"
    -> (PARSER_GRAMMAR P (RULES (RULE a (BLOCK (ALT A)))))

    <<
    parser grammar P;
    options {k=2; output=AST;}
    scope S {int x}
    tokens { A; B='33'; }
    @header {foo}
    a : A;
    >>
    ->
    (PARSER_GRAMMAR P
    (OPTIONS (= k 2) (= output AST))
    (scope S {int x})
    (tokens { A (= B '33'))
    (@ header {foo})
    (RULES (RULE a (BLOCK (ALT A)))))

block:
	"( ^(A B) | ^(b C) )" -> (BLOCK (ALT ("^(" A B)) (ALT ("^(" b C)))
	
alternative:
	"x+=ID* -> $x*" ->
	    (ALT_REWRITE
		    (ALT (* (BLOCK (ALT (+= x ID)))))
            (-> (ALT (* (BLOCK (ALT x))))))

	"A -> ..." -> (ALT_REWRITE (ALT A) (-> ...))
	"A -> "	   -> (ALT_REWRITE (ALT A) (-> EPSILON))

element:
	"b+"		-> (+ (BLOCK (ALT b)))
	"(b)+"		-> (+ (BLOCK (ALT b)))
	"b?"  		-> (? (BLOCK (ALT b)))
	"(b)?"		-> (? (BLOCK (ALT b)))
	"(b)*"		-> (* (BLOCK (ALT b)))
	"b*"		-> (* (BLOCK (ALT b)))
	"'while'*"	-> (* (BLOCK (ALT 'while')))
	"'a'+"		-> (+ (BLOCK (ALT 'a')))
	"a[3]"		-> (a 3)
	"'a'..'z'+" -> (+ (BLOCK (ALT (.. 'a' 'z'))))

Pretty cool, eh?

Ter

From wclodius at los-alamos.net  Sun Jan 24 19:51:54 2010
From: wclodius at los-alamos.net (William B. Clodius)
Date: Sun, 24 Jan 2010 20:51:54 -0700
Subject: [antlr-interest] Making a distinction between float and int
	calculation
In-Reply-To: <46450b021001231403i3305f8cfsc032169f3dd91658@mail.gmail.com>
References: <46450b021001230836h1966343fpd52991913f3a9913@mail.gmail.com>
	<28F5E254-3E2E-4FC7-A856-F12C7E6EFA76@los-alamos.net>
	<46450b021001231403i3305f8cfsc032169f3dd91658@mail.gmail.com>
Message-ID: <FDA965A2-CAB6-41F6-B420-AB7213C3E90D@los-alamos.net>

Steven:

I should have originally posted my answer to antlr-interest, not directly to you. So far I have only been using ANTLR to test the lexing and parsing of a language I am developing as a hobby. I have read the tree parsing material in the ANTLR reference but have not used it.

Roughly what I would do is have each INT and FLOAT carry two attributes: a type (INT and FLOAT respectively) and a value, determined by the string that represents it. Similarly, an ID and an expression would also have a type and a value associated with them. Since you include boolean expressions, you will have to decide whether you want an explicit boolean type or just treat it as an integer type. For these other entities the types and values will have to be determined as you traverse the tree.

For example, for numericexpression you should examine the types of the two operands. If both have type INT, then the type of the numericexpression is INT and the value is the result of applying the operator to the two values. If both are FLOAT, the corresponding logic is used. If one is INT and the other is FLOAT, then you should include an intermediate step that converts the INT to a FLOAT, both in type and value, and then perform the numeric operation. Read Chapter 6 of the reference if you have it.
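
[Editor's sketch] The type-and-value scheme described above could look roughly like the following in Java. This is only an illustration under stated assumptions: TypedValue and all of its members are hypothetical names, not part of ANTLR or of Steven's grammar. The division rule (4 / 2 stays INT, 1 / 3 becomes FLOAT) is taken from the examples in the original question rather than from the advice above, and integer overflow is not checked.

```java
// Sketch only: TypedValue and its members are hypothetical names,
// not part of ANTLR or of the grammar discussed in this thread.
final class TypedValue {
    enum Type { INT, FLOAT }

    final Type type;
    private final long intValue;      // meaningful when type == INT
    private final double floatValue;  // meaningful when type == FLOAT

    private TypedValue(Type type, long i, double f) {
        this.type = type;
        this.intValue = i;
        this.floatValue = f;
    }

    static TypedValue ofInt(long v)     { return new TypedValue(Type.INT, v, 0.0); }
    static TypedValue ofFloat(double v) { return new TypedValue(Type.FLOAT, 0L, v); }

    // Promotion step: an INT is converted to its floating-point value.
    double asDouble() {
        return type == Type.INT ? (double) intValue : floatValue;
    }

    // Applies a binary operator; the result is INT only when both operands are INT.
    static TypedValue apply(char op, TypedValue a, TypedValue b) {
        if (a.type == Type.INT && b.type == Type.INT) {
            long x = a.intValue, y = b.intValue;
            switch (op) {
                case '+': return ofInt(x + y);
                case '-': return ofInt(x - y);
                case '*': return ofInt(x * y);
                case '/':
                    // Assumption from the question's examples:
                    // an exact quotient stays INT, otherwise the result is FLOAT.
                    if (y != 0 && x % y == 0) return ofInt(x / y);
                    return ofFloat((double) x / (double) y);
                default:
                    throw new IllegalArgumentException("unknown operator: " + op);
            }
        }
        // Mixed or all-FLOAT operands: promote both sides, then compute.
        double x = a.asDouble(), y = b.asDouble();
        switch (op) {
            case '+': return ofFloat(x + y);
            case '-': return ofFloat(x - y);
            case '*': return ofFloat(x * y);
            case '/': return ofFloat(x / y);
            default:
                throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    @Override
    public String toString() {
        return type == Type.INT ? "INT " + intValue : "FLOAT " + floatValue;
    }

    public static void main(String[] args) {
        System.out.println(apply('+', ofInt(3), ofFloat(2.0)));
        System.out.println(apply('/', ofInt(4), ofInt(2)));
        System.out.println(apply('/', ofInt(1), ofInt(3)));
    }
}
```

In a tree grammar, each alternative of numericexpression would presumably return such an object instead of a bare int, with the action calling apply on the values returned by the two child rules.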

On Jan 23, 2010, at 3:03 PM, Steven Raemaekers wrote:

> Hi William,
> 
> How should i do this exactly in ANTLR? Should I test for this in my Tree walker? I do not have a clue where to start, when I make my numericexpression like this:
> 
> numericexpression returns [int value]
> 	: ^(PLUS mult1 = mult mult2 = mult) { $value = 20; }
> 	| ^(MINUS mult1 = mult mult2 = mult) { $value = 20; }
> 	| ^(MULTIPLY mult1 = mult mult2 = mult) { $value = 20; }
> 	| ^(DIVIDE mult1 = mult mult2 = mult) { $value = 20; }
> 	;
> 
> What value should it return, if all mults can be either floats or integers? 
> 
> Thanks,
> 
> Steven
> 
> On Sat, Jan 23, 2010 at 8:19 PM, William B. Clodius <wclodius at los-alamos.net> wrote:
> This is normally done as part of semantic evaluation, not as part of parsing. If and when you start including named entities you will normally be unable to make this distinction using syntax (unless you require integers and floats to have special categories of names). Putting it off until semantic analysis also allows better error reporting if, say, you should make assignment and comparison equalities both valid expressions.
> 
> On Jan 23, 2010, at 9:36 AM, Steven Raemaekers wrote:
> 
> > Hello,
> >
> > In my grammar there should be an evaluator for numeric expressions. These
> > numeric expressions should return an integer, or a float, depending on the
> > contents of the expression.
> > For example:
> >
> > 3 + 2.0: should return float
> > 3 + 2: should return integer
> > 2.0 + 3.0: should return float
> > 1 / 3: should return float
> > 4 / 2: should return int
> >
> > In my grammar there is only one rule for a numeric expression. I do not know
> > whether I should duplicate the entire operator precedence rules for the
> > distinction between float and int.
> > The following statements are part of my grammar:
> >
> > expression
> > : list
> > | quotedword
> > | booleanexpression
> > ;
> >
> > booleanexpression
> > : numericexpression (BOOL^ numericexpression)*
> > ;
> >
> > numericexpression
> > : mult ((PLUS^ | MINUS^) mult)*
> > ;
> >
> > mult
> > : atom ((MULTIPLY^ | DIVIDE^) atom)*
> > ;
> >
> > atom
> > : INT
> > | FLOAT
> > | ID
> > | LEFTPAREN expression RIGHTPAREN
> > -> ^(EXPRESSION expression)
> > ;
> >
> > Does anybody have an idea how I should take care of this distinction between
> > float and int? Or is this distinction even necessary?
> >
> > --
> > Regards,
> >
> > Steven


From lgcraymer at yahoo.com  Sun Jan 24 22:25:58 2010
From: lgcraymer at yahoo.com (Loring Craymer)
Date: Sun, 24 Jan 2010 22:25:58 -0800 (PST)
Subject: [antlr-interest] Anyone in the whole world doing multi step
	tree transformation?
In-Reply-To: <9da4f4521001240103r5505ee05oc3391065be6bdbee@mail.gmail.com>
References: <9da4f4521001240103r5505ee05oc3391065be6bdbee@mail.gmail.com>
Message-ID: <72961.38549.qm@web55905.mail.re3.yahoo.com>

Oliver--

Some key points:
1.)  Capture semantics rather than designing tree structures.  
2.)  Preserve grammar structure--that is, rule a in pass n becomes rule a in pass n+1 unless there is reason to do otherwise.
3.)  Avoid cluttering your grammars with action code.
4.)  Separate analysis passes from transformation passes.

Follow those principles, and you'll find that rippling changes across grammars is tedious, but not a real problem.

--Loring


----- Original Message ----
> From: Oliver Zeigermann <oliver.zeigermann at gmail.com>
> To: antlr-interest Interest <antlr-interest at antlr.org>
> Sent: Sun, January 24, 2010 1:03:35 AM
> Subject: [antlr-interest] Anyone in the whole world doing multi step tree transformation?
> 
> Folks!
> 
> I was just wondering if anyone except me is actually doing tree
> transformations using ANTLR. I use the tree transformation feature
> introduced in 3.1. While this does work well, it is so very hard to
> refactor or extend my tree structures as I have to change all my
> transformer stages and have no tool support to find out what to change
> and where.
> 
> I started using heterogenous tokens with normalized children to make
> use of compiler type checking which helps, but does not completely
> solve my issues as I still have an unchecked children list - which I
> need to traverse the tree using tree walkers.
> 
> I was considering skipping the whole grammar driving tree
> transformation step, but what should I replace it with?
> 
> I know of the xtext approach that uses non normalized heterogenous
> tokens generated from a common model shared by all transformation
> parts. Which seems like a good idea, however, does not seem to have a
> means powerful enough to do serious tree transformation.
> 
> Any experiences? Hints?
> 
> Thanks in advance
> 
> - Oliver


From oliver.zeigermann at gmail.com  Mon Jan 25 01:00:27 2010
From: oliver.zeigermann at gmail.com (Oliver Zeigermann)
Date: Mon, 25 Jan 2010 10:00:27 +0100
Subject: [antlr-interest] Anyone in the whole world doing multi step
	tree transformation?
In-Reply-To: <72961.38549.qm@web55905.mail.re3.yahoo.com>
References: <9da4f4521001240103r5505ee05oc3391065be6bdbee@mail.gmail.com>
	<72961.38549.qm@web55905.mail.re3.yahoo.com>
Message-ID: <9da4f4521001250100w175aeb6bwfd9bf43d7b5551c4@mail.gmail.com>

Hey, Loring!

Thanks for your help. Two questions:
1.) This does not help when you do major refactorings, does it ;)
2.) Where do you store the information you gather in analysis? I stick
it back into the tree or put it into (symbol-)tables. If you do so
as well: What do you do if the tree data in tables has to be processed
in subsequent tree transformation steps? How do you pass in the data?

Any thoughts?

- Oliver

2010/1/25 Loring Craymer <lgcraymer at yahoo.com>:
> Oliver--
>
> Some key points:
> 1.)  Capture semantics rather than designing tree structures.
> 2.)  Preserve grammar structure--that is, rule a in pass n becomes rule a in pass n+1 unless there is reason to do otherwise.
> 3.)  Avoid cluttering your grammars with action code.
> 4.)  Separate analysis passes from transformation passes.
>
> Follow those principles, and you'll find that rippling changes across grammars is tedious, but not a real problem.
>
> --Loring
>
>
>
>
> ----- Original Message ----
>> From: Oliver Zeigermann <oliver.zeigermann at gmail.com>
>> To: antlr-interest Interest <antlr-interest at antlr.org>
>> Sent: Sun, January 24, 2010 1:03:35 AM
>> Subject: [antlr-interest] Anyone in the whole world doing multi step tree transformation?
>>
>> Folks!
>>
>> I was just wondering if anyone except me is actually doing tree
>> transformations using ANTLR. I use the tree transformation feature
>> introduced in 3.1. While this does work well, it is so very hard to
>> refactor or extend my tree structures as I have to change all my
>> transformer stages and have no tool support to find out what to change
>> and where.
>>
>> I started using heterogenous tokens with normalized children to make
>> use of compiler type checking which helps, but does not completely
>> solve my issues as I still have an unchecked children list - which I
>> need to traverse the tree using tree walkers.
>>
>> I was considering skipping the whole grammar driving tree
>> transformation step, but what should I replace it with?
>>
>> I know of the xtext approach that uses non normalized heterogenous
>> tokens generated from a common model shared by all transformation
>> parts. Which seems like a good idea, however, does not seem to have a
>> means powerful enough to do serious tree transformation.
>>
>> Any experiences? Hints?
>>
>> Thanks in advance
>>
>> - Oliver

From lgcraymer at yahoo.com  Mon Jan 25 02:22:16 2010
From: lgcraymer at yahoo.com (Loring Craymer)
Date: Mon, 25 Jan 2010 02:22:16 -0800 (PST)
Subject: [antlr-interest] Anyone in the whole world doing multi step
	tree transformation?
In-Reply-To: <9da4f4521001250100w175aeb6bwfd9bf43d7b5551c4@mail.gmail.com>
References: <9da4f4521001240103r5505ee05oc3391065be6bdbee@mail.gmail.com>
	<72961.38549.qm@web55905.mail.re3.yahoo.com>
	<9da4f4521001250100w175aeb6bwfd9bf43d7b5551c4@mail.gmail.com>
Message-ID: <745659.10881.qm@web55907.mail.re3.yahoo.com>

Oliver--

1.)  There are languages where semantically equivalent data--argument lists and the like--appears in different places with different syntax.  In those cases you can end up with tree grammars that radically differ from the parser grammar, but then the tree grammars can usually be kept reasonably consistent.  I happen to believe that the refactoring should be done with tool assistance (a refactoring editor) so that it might be possible to reconstruct the refactorings.  Without tool assistance, though, the consolation in these cases is that the tree grammars end up being simpler than the parser grammar.

2.)  In Yggdrasil, data structures that need to be preserved across passes are declared as "public" attributes of the grammars and propagated in the target language wrapper code that invokes each pass.

--Loring
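
[Editor's sketch] The idea in point 2.) above, data gathered by an analysis pass being handed to later passes by the wrapper code that invokes them, can be illustrated in plain Java. This is not Yggdrasil or ANTLR code: PassPipeline, SymbolTable, analyze and annotate are hypothetical names, and string arrays stand in for the trees so that the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: all names here are hypothetical. String arrays stand in
// for the trees that real passes would walk.
final class PassPipeline {

    /** Data that must outlive a single pass. */
    static final class SymbolTable {
        private final Map<String, String> types = new HashMap<>();

        void declare(String name, String type) { types.put(name, type); }

        /** Returns the declared type, or null if the name was never declared. */
        String typeOf(String name) { return types.get(name); }
    }

    /** Analysis pass: only collects information from "name:type" entries. */
    static void analyze(String[] declarations, SymbolTable symbols) {
        for (String d : declarations) {
            String[] parts = d.split(":", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("expected name:type but got " + d);
            }
            symbols.declare(parts[0], parts[1]);
        }
    }

    /** Transformation pass: rewrites its input using data from the earlier pass. */
    static String[] annotate(String[] identifiers, SymbolTable symbols) {
        String[] out = new String[identifiers.length];
        for (int i = 0; i < identifiers.length; i++) {
            String type = symbols.typeOf(identifiers[i]);
            out[i] = identifiers[i] + "<" + (type == null ? "?" : type) + ">";
        }
        return out;
    }

    /** Wrapper code: owns the shared table and passes it to each pass in turn. */
    public static void main(String[] args) {
        SymbolTable symbols = new SymbolTable();
        analyze(new String[] {"x:int", "y:float"}, symbols);
        for (String s : annotate(new String[] {"x", "y", "z"}, symbols)) {
            System.out.println(s);
        }
    }
}
```

The point of the design is that neither pass knows about the other; only the wrapper decides which data crosses pass boundaries.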


----- Original Message ----
> From: Oliver Zeigermann <oliver.zeigermann at gmail.com>
> To: Loring Craymer <lgcraymer at yahoo.com>
> Cc: antlr-interest Interest <antlr-interest at antlr.org>
> Sent: Mon, January 25, 2010 1:00:27 AM
> Subject: Re: [antlr-interest] Anyone in the whole world doing multi step tree  transformation?
> 
> Hey, Loring!
> 
> Thanks for your help. Two questions:
> 1.) This does not help when you do major refactorings, does it ;)
> 2.) Where do you store the information you gather in analysis? I stick
> it back into the tree or put it into (symbol-)tables. If you do so
> as well: What do you do if the tree data in tables has to be processed
> in subsequent tree transformation steps? How do you pass in the data?
> 
> Any thoughts?
> 
> - Oliver
> 
> 2010/1/25 Loring Craymer :
> > Oliver--
> >
> > Some key points:
> > 1.)  Capture semantics rather than designing tree structures.
> > 2.)  Preserve grammar structure--that is, rule a in pass n becomes rule a in pass n+1 unless there is reason to do otherwise.
> > 3.)  Avoid cluttering your grammars with action code.
> > 4.)  Separate analysis passes from transformation passes.
> >
> > Follow those principles, and you'll find that rippling changes across grammars is tedious, but not a real problem.
> >
> > --Loring
> >
> >
> >
> >
> > ----- Original Message ----
> >> From: Oliver Zeigermann 
> >> To: antlr-interest Interest 
> >> Sent: Sun, January 24, 2010 1:03:35 AM
> >> Subject: [antlr-interest] Anyone in the whole world doing multi step tree transformation?
> >>
> >> Folks!
> >>
> >> I was just wondering if anyone except me is actually doing tree
> >> transformations using ANTLR. I use the tree transformation feature
> >> introduced in 3.1. While this does work well, it is so very hard to
> >> refactor or extend my tree structures as I have to change all my
> >> transformer stages and have no tool support to find out what to change
> >> and where.
> >>
> >> I started using heterogenous tokens with normalized children to make
> >> use of compiler type checking which helps, but does not completely
> >> solve my issues as I still have an unchecked children list - which I
> >> need to traverse the tree using tree walkers.
> >>
> >> I was considering skipping the whole grammar driving tree
> >> transformation step, but what should I replace it with?
> >>
> >> I know of the xtext approach that uses non normalized heterogenous
> >> tokens generated from a common model shared by all transformation
> >> parts. Which seems like a good idea, however, does not seem to have a
> >> means powerful enough to do serious tree transformation.
> >>
> >> Any experiences? Hints?
> >>
> >> Thanks in advance
> >>
> >> - Oliver


From ranco.marcus at epirion.nl  Mon Jan 25 02:35:56 2010
From: ranco.marcus at epirion.nl (Ranco Marcus)
Date: Mon, 25 Jan 2010 10:35:56 +0000
Subject: [antlr-interest] Antlr does not generate Lexer from a
	composite	grammar
In-Reply-To: <93BD0000E4D72D458F0E8CDE6BA971A80EBFECBD@CINMLVEM11.e2k.ad.ge.com>
References: <93BD0000E4D72D458F0E8CDE6BA971A80EBFECBD@CINMLVEM11.e2k.ad.ge.com>
Message-ID: <2B65C901391C804DBB9CF9E6FE30C6F914976940@sun.epirion.local>

I experienced the same problem and did not find a proper solution for it. As a work-around, I have found that adding a dummy lexer rule to the composite grammar causes the lexer to be generated.

grammar C ;
import L, P2 ;

stuff : ( letters spaces )+ ;
dummy : 'DUMMY';


In general, I would expect that no parser or lexer rule is required in the composite grammar. This way, we can use the composite grammar only as a way to glue things together and specify generation options for a particular use.

I hope this is of some help to you.

Best regards,

Ranco Marcus


From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Stevenson, Todd (GE Healthcare, consultant)
Sent: Tuesday, 1 December 2009 19:09
To: antlr-interest at antlr.org
Subject: [antlr-interest] Antlr does not generate Lexer from a composite grammar

I built the grammars described at the bottom of the Composite Grammars page of the Antlr documentation (i.e., L, P1, P2, and C).  When I run Antlr with no command line arguments on the combined grammar 'C', it generates C_P1.java, C_P2_P1.java, and CParser.java, but does not generate CLexer.java.  Is this correct behavior?  If so, how do I call the lexer from my Java source program?

When I build the other grammars on that page (Root and Delegate), and run Antlr on 'Root', it generates RootParser.java Root_Delegate.java and RootLexer.java.

thanks.

I tried it with Antlr 3.2 and Antlr 3.1.3.


From candide at palacehotel.org  Mon Jan 25 04:17:30 2010
From: candide at palacehotel.org (Candide Kemmler)
Date: Mon, 25 Jan 2010 13:17:30 +0100
Subject: [antlr-interest] antlrworks interpreter like serialized parse tree
Message-ID: <74DD45A5-DD1E-41C3-819D-2032293EF2A9@palacehotel.org>

Hi,

I'm very happy with my antlr results so far, and next step is to use antlr's output to add a code-completion like feature to my application.

I love the parse tree representation that antlrWorks presents and getting such a structure would be ideal for my use case. However I can't seem to find a way to create a similar representation of the parse tree using the API.

Any ideas?

Candide

From candide at palacehotel.org  Mon Jan 25 04:48:24 2010
From: candide at palacehotel.org (Candide Kemmler)
Date: Mon, 25 Jan 2010 13:48:24 +0100
Subject: [antlr-interest] antlrworks interpreter like serialized parse
	tree
In-Reply-To: <d19d16481001250423x559e084cj40ddfbe5e0215af7@mail.gmail.com>
References: <74DD45A5-DD1E-41C3-819D-2032293EF2A9@palacehotel.org>
	<d19d16481001250423x559e084cj40ddfbe5e0215af7@mail.gmail.com>
Message-ID: <ABA4BF89-6853-41F9-ABCD-A6C6A258F1E3@palacehotel.org>

That's very interesting. I don't want to create an image, no: only a structured data representation (XML or JSON).
Can you elaborate a little bit on how to enable the debug option ("debug = true" is not working for me) and then how to listen to the debugging events?

Thanks a lot for your quick and enlightening answer :-)

Candide
On 25 Jan 2010, at 13:23, Scott Stanchfield wrote:

> It's captured using the debugging API. ANTLRWorks listens to debugging
> events from your parser (when it's generated with the debug option)
> and hears when rules are entered and exited.
> 
> You could use these events to build a tree (I'm working on an
> AST-diagram generator for eclipse using the debug API, using Eclipse's
> Zest framework for the diagram).
> 
> If you just want images, I would recommend that you use the debugging
> api to capture the enters/exits and then create a GraphViz dot file.
> Check out http://www.graphviz.org. You can use it to generate many
> graphics file formats.
> -- Scott
> 
> ----------------------------------------
> Scott Stanchfield
> http://javadude.com
> 
> 
> 
> On Mon, Jan 25, 2010 at 7:17 AM, Candide Kemmler
> <candide at palacehotel.org> wrote:
>> Hi,
>> 
>> I'm very happy with my antlr results so far, and next step is to use antlr's output to add a code-completion like feature to my application.
>> 
>> I love the parse tree representation that antlrWorks presents and getting such a structure would be ideal for my use case. However I can't seem to find a way to create a similar representation of the parse tree using the API.
>> 
>> Any ideas?
>> 
>> Candide
>> 
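
[Editor's sketch] The approach quoted above, listening for rule enter and exit events and building a tree from them, can be sketched independently of the ANTLR runtime. Everything below is an assumption made for illustration: RuleTreeRecorder and its methods enterRule, exitRule and consumeToken are hypothetical names chosen to mirror the events described in the thread, the rule and token names in main are invented, and the wiring to a real parser generated with the debug option is not shown. The JSON writer escapes only backslashes and double quotes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch only: a stack-based recorder that turns enter/exit/token events
// into a rule hierarchy. All names are hypothetical, not ANTLR API.
final class RuleTreeRecorder {

    static final class Node {
        final String label;
        final boolean isToken;
        final List<Node> children = new ArrayList<>();

        Node(String label, boolean isToken) {
            this.label = label;
            this.isToken = isToken;
        }
    }

    private final Node root = new Node("<root>", false);
    private final Deque<Node> stack = new ArrayDeque<>();

    RuleTreeRecorder() {
        stack.push(root);
    }

    /** A rule was entered: open a new subtree under the current rule. */
    void enterRule(String ruleName) {
        Node node = new Node(ruleName, false);
        stack.peek().children.add(node);
        stack.push(node);
    }

    /** A rule was exited: close the subtree opened by the matching enterRule. */
    void exitRule(String ruleName) {
        if (stack.size() <= 1 || !stack.peek().label.equals(ruleName)) {
            throw new IllegalStateException("unbalanced exit of rule " + ruleName);
        }
        stack.pop();
    }

    /** A token was consumed: attach it as a leaf of the current rule. */
    void consumeToken(String text) {
        stack.peek().children.add(new Node(text, true));
    }

    String toJson() {
        StringBuilder sb = new StringBuilder();
        write(root, sb);
        return sb.toString();
    }

    private static void write(Node node, StringBuilder sb) {
        if (node.isToken) {
            sb.append("{\"token\":").append(quote(node.label)).append('}');
            return;
        }
        sb.append("{\"rule\":").append(quote(node.label)).append(",\"children\":[");
        for (int i = 0; i < node.children.size(); i++) {
            if (i > 0) sb.append(',');
            write(node.children.get(i), sb);
        }
        sb.append("]}");
    }

    // Minimal escaping: backslash and double quote only.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        RuleTreeRecorder recorder = new RuleTreeRecorder();
        recorder.enterRule("sentence");
        recorder.enterRule("where");
        recorder.consumeToken("in");
        recorder.consumeToken("Paris");
        recorder.exitRule("where");
        recorder.exitRule("sentence");
        System.out.println(recorder.toJson());
    }
}
```

With a real ANTLR 3 parser, these three methods would be called from a debug event listener attached to the generated parser. The runtime's org.antlr.runtime.debug package is the place to look for the listener interface and for a ready-made parse tree builder; the exact class names and signatures should be checked against the runtime version in use.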


From antlr at mirality.co.nz  Mon Jan 25 12:14:59 2010
From: antlr at mirality.co.nz (Gavin Lambert)
Date: Tue, 26 Jan 2010 09:14:59 +1300
Subject: [antlr-interest] antlrworks interpreter like serialized parse
 tree
In-Reply-To: <74DD45A5-DD1E-41C3-819D-2032293EF2A9@palacehotel.org>
References: <74DD45A5-DD1E-41C3-819D-2032293EF2A9@palacehotel.org>
Message-ID: <20100125201513.24B0A341840F@www.antlr.org>

At 01:17 26/01/2010, Candide Kemmler wrote:
 >I love the parse tree representation that antlrWorks presents and
 >getting such a structure would be ideal for my use case. However I
 >can't seem to find a way to create a similar representation of the
 >parse tree using the API.

Normally you don't really want to generate the parse tree as shown 
in ANTLRworks -- that's purely a debugging aid.  For production 
use you're better off generating an AST instead; this way you have 
more control over the output and you can (among other things) 
refactor your parser without altering the output if you want to.

See the output=AST option and the various example grammars, wiki 
pages, and book chapters about AST construction.


From candide at palacehotel.org  Mon Jan 25 12:42:39 2010
From: candide at palacehotel.org (Candide Kemmler)
Date: Mon, 25 Jan 2010 21:42:39 +0100
Subject: [antlr-interest] antlrworks interpreter like serialized parse
	tree
In-Reply-To: <20100125201514.DB2F5952049@ns1.jwhosting.eu>
References: <74DD45A5-DD1E-41C3-819D-2032293EF2A9@palacehotel.org>
	<20100125201514.DB2F5952049@ns1.jwhosting.eu>
Message-ID: <4DC81062-2BE2-4D24-95E5-86F21821FF2F@palacehotel.org>

Yes that's already what I'm doing but the AST (in the form of a CommonTree) is only really giving me the leaf tokens without the intermediary branches corresponding to the rules that "recognized" my programs.

I have attached an example of a test grammar to illustrate what I mean. The sample sentence where multiple rules are fired ("location", "when", "where",...) are shown in a nice hierarchy in AntlrWorks whereas in the debugger in Eclipse I can only see a flat structure where the root tree has a boring set of 6 children each corresponding to the final tokens of my sentence.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: astantlrworks.png
Type: image/png
Size: 11893 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20100125/9a892bf1/attachment.png 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: astdebugger.png
Type: image/png
Size: 25082 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20100125/9a892bf1/attachment-0001.png 
-------------- next part --------------


Or maybe I am completely missing the point here...

On 25 Jan 2010, at 21:14, Gavin Lambert wrote:

> At 01:17 26/01/2010, Candide Kemmler wrote:
> >I love the parse tree representation that antlrWorks presents and
> >getting such a structure would be ideal for my use case. However I
> >can't seem to find a way to create a similar representation of the
> >parse tree using the API.
> 
> Normally you don't really want to generate the parse tree as shown in ANTLRworks -- that's purely a debugging aid.  For production use you're better off generating an AST instead; this way you have more control over the output and you can (among other things) refactor your parser without altering the output if you want to.
> 
> See the output=AST option and the various example grammars, wiki pages, and book chapters about AST construction.
> 


From killebrew.daniel at gmail.com  Mon Jan 25 17:52:16 2010
From: killebrew.daniel at gmail.com (Daniel Killebrew)
Date: Mon, 25 Jan 2010 17:52:16 -0800
Subject: [antlr-interest] two ways to match nothing
In-Reply-To: <4B318151.20107@gmail.com>
References: <4B318151.20107@gmail.com>
Message-ID: <4B5E4AD0.5050000@gmail.com>

Antlr doesn't like it when there are multiple ways to match nothing. It 
says there's an error in my grammar because the second "alternative" 
(which is another way to match nothing) will never match.
Antlr can enter the optional (...)? element and match nothing, or skip 
the optional element, thus matching nothing.

example:

naughty_rule
     :    Start (A? List*)? End
     ;
Start    :    'start';
A    :    'aaa';
End    :    'end';
List    :    'list';

Rewritten so Antlr is happy
good_rule
     :    Start End
     |    Start A List* End
     |    Start List+ End
     ;

While I can rewrite my grammar easily enough, it seems odd that Antlr 
doesn't recognize that it's trying to match nothing in two different 
ways, so who cares if it can't match the second alternative. That 
shouldn't be an error. If it's a warning, I could understand that. 
Making the user rewrite their code into something less legible seems to 
be the opposite of the usual 'Antlr way'. Although I guess this would 
require making the code a little more complicated to detect this special 
case, so perhaps this was already considered.

Cheers
Daniel

From killebrew.daniel at gmail.com  Mon Jan 25 18:10:43 2010
From: killebrew.daniel at gmail.com (Daniel Killebrew)
Date: Mon, 25 Jan 2010 18:10:43 -0800
Subject: [antlr-interest] two ways to match nothing
In-Reply-To: <4B5E4D75.7020305@kjchome.homeip.net>
References: <4B318151.20107@gmail.com> <4B5E4AD0.5050000@gmail.com>
	<4B5E4D75.7020305@kjchome.homeip.net>
Message-ID: <4B5E4F23.5080103@gmail.com>

Doh, thanks for pointing that out Kevin. Ignore my silliness, everyone 
:)  I got caught up transcribing a parser into Antlr and overlooked this 
simple, obvious transformation.

Daniel

On 1/25/2010 6:03 PM, Kevin J. Cummings wrote:
> On 01/25/2010 08:52 PM, Daniel Killebrew wrote:
>    
>> Antlr doesn't like it when there are multiple ways to match nothing. It
>> says there's an error in my grammar because the second "alternative"
>> (which is another way to match nothing) will never match.
>> Antlr can enter the optional (...)? element and match nothing, or skip
>> the optional element, thus matching nothing.
>>
>> example:
>>
>> naughty_rule
>>       :    Start (A? List*)? End
>>       ;
>>      
> Why can't you just rewrite naughty_rule as:
>
> good_rule
> 	: Start A? List* End
> 	;
>
> I think the outer ()? is what was confusing antlr....
>
>    
>> Start    :    'start';
>> A    :    'aaa';
>> End    :    'end';
>> List    :    'list';
>>
>> Rewritten so Antlr is happy
>> good_rule
>>       :    Start End
>>       |    Start A List* End
>>       |    Start List+ End
>>       ;
>>
>> While I can rewrite my grammar easily enough, it seems odd that Antlr
>> doesn't recognize that it's trying to match nothing in two different
>> ways, so who cares if it can't match the second alternative. That
>> shouldn't be an error. If it's a warning, I could understand that.
>> Making the user rewrite their code into something less legible seems to
>> be the opposite of the usual 'Antlr way'. Although I guess this would
>> require making the code a little more complicated to detect this special
>> case, so perhaps this was already considered.
>>
>> Cheers
>> Daniel
>
>    
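
[Editor's sketch] The three formulations in this thread (the original naughty_rule, the three-alternative rewrite, and Kevin's simplification) should accept exactly the same token sequences. The following brute-force check illustrates that claim with regular expressions. It makes two assumptions: each token is mapped to a single letter (Start = S, A = A, List = L, End = E), and only sequences up to a fixed length are tried. It says nothing about how ANTLR's grammar analysis treats each form; it compares only the languages they accept. MatchNothingCheck and its members are hypothetical names.

```java
import java.util.regex.Pattern;

// Sketch only: compares the three rule formulations as regular expressions
// over one-letter stand-ins for the tokens Start, A, List and End.
final class MatchNothingCheck {
    static final Pattern NAUGHTY    = Pattern.compile("S(A?L*)?E");     // Start (A? List*)? End
    static final Pattern EXPANDED   = Pattern.compile("SE|SAL*E|SL+E"); // three alternatives
    static final Pattern SIMPLIFIED = Pattern.compile("SA?L*E");        // Start A? List* End

    static final char[] ALPHABET = {'S', 'A', 'L', 'E'};

    /** Counts sequences of length 1..maxLen on which the three forms disagree. */
    static int countDisagreements(int maxLen) {
        return walk("", maxLen);
    }

    private static int walk(String prefix, int remaining) {
        int bad = 0;
        if (!prefix.isEmpty()) {
            boolean n = NAUGHTY.matcher(prefix).matches();
            boolean e = EXPANDED.matcher(prefix).matches();
            boolean s = SIMPLIFIED.matcher(prefix).matches();
            if (n != e || n != s) {
                bad++;
            }
        }
        if (remaining > 0) {
            for (char c : ALPHABET) {
                bad += walk(prefix + c, remaining - 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        System.out.println("disagreements: " + countDisagreements(6));
    }
}
```

A count of zero supports using the simplified rule: dropping the outer optional block removes the second way of matching nothing without changing what the rule accepts.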

From hvmedhaj at gmail.com  Mon Jan 25 20:47:05 2010
From: hvmedhaj at gmail.com (venkat medhaj)
Date: Mon, 25 Jan 2010 23:47:05 -0500
Subject: [antlr-interest] how to construct an AST ?
Message-ID: <f25256041001252047t265abe7ep13b242398d5c83b0@mail.gmail.com>

Hi, I am a newbie to ANTLR and I am learning to use Antlr lately. I want to
generate an AST for the .g file i.e the grammar file available, the target
language being Java 1.6.
Can anyone please tell me how to proceed ? I find it a bit confusing.

Thanks,
-V

From jeff.wilcox at mac.com  Tue Jan 26 06:52:18 2010
From: jeff.wilcox at mac.com (Jeff Wilcox)
Date: Tue, 26 Jan 2010 06:52:18 -0800
Subject: [antlr-interest] Disabling rules in the lexer
Message-ID: <33AFEA62-2C42-4174-B149-06D2025628F9@mac.com>

Hi,

I have a special area in this language that has symbols within a table structure that are normally used in other tokens in other areas of the language (like a couple digits, a couple letters and a couple symbols).  So I am trying to set up the lexer to accept these table tokens only when in a table.  Based on what I have been able to dig up, I believe gated semantic predicates are a valid way to disable rules in the lexer.  However, I am seeing issues with this with ANTLR 3.2 and the Java language target.

So I expected a lexer rule like this to do the trick:

Level0       : {inTable}?=> '0';

But that actually creates a very strange loop when inTable is false.  It basically throws a FailedPredicateException (which I would not have expected for a gated predicate) and then retries the same token with the same rule, obviously resulting in an infinite loop.

Can someone clarify whether this is allowed and if so whether there is some trick to using it?  I am stumped.  

Thanks
Jeff


From csp7kk3 at cs.ucy.ac.cy  Tue Jan 26 07:34:33 2010
From: csp7kk3 at cs.ucy.ac.cy (Konstantinos Kakousis)
Date: Tue, 26 Jan 2010 17:34:33 +0200
Subject: [antlr-interest] Test class for tree grammars
Message-ID: <4B5F0B89.8080206@cs.ucy.ac.cy>

Hello,

I have my grammar and tree grammars working exactly as expected in
ANTLRWorks.
Now I am trying to run the same grammar from the console or Eclipse using
the following Test.java class:

import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.RecognitionException;
import org.antlr.runtime.tree.CommonTree;
import org.antlr.runtime.tree.CommonTreeNodeStream;
import org.antlr.runtime.tree.Tree;

public class Test {

    public static void main(String[] args) {
        try {
            String in = "5+6*7";
            ANTLRStringStream input = new ANTLRStringStream(in);
            UtilityLexer lexer = new UtilityLexer(input);
            CommonTokenStream tokens = new CommonTokenStream(lexer);
            UtilityParser parser = new UtilityParser(tokens);
            UtilityParser.prog_return r = parser.prog();
            CommonTree t = (CommonTree) r.getTree(); // get tree from parser
            System.out.println("Parse Tree:" + t.toStringTree());

            CommonTreeNodeStream nodes = new CommonTreeNodeStream(t);
            System.out.println("Here1");

            nodes.setTokenStream(tokens);
            System.out.println("Here2");

            UtilTree walker = new UtilTree(nodes);
            System.out.println("Here3");
            walker.prog();
            System.out.println("Here4");
        } catch (RecognitionException e) {
            e.printStackTrace();
        }
    }
}

 From the output:
Parse Tree:(+ 5 (* 6 7))
Here1
Here2

It seems that the program hangs on the following statement:
   UtilTree walker = new UtilTree(nodes);
Is there a standard Test.java class somewhere for running the generated
grammars?
Is there something wrong with the above class?

BR,

-- 
Konstantinos Kakousis
Research Associate

Department of Computer Science
University of Cyprus

Address: P.O. Box 20537, CY-1678, Nicosia, Cyprus
Tel:     +357 22892684
Fax:     +357 22892701
Webpage: http://www.cs.ucy.ac.cy/~csp7kk3
Email:   mailto://kakousis at cs.ucy.ac.cy
Skype:   callto://costas.kakousis


From kfeuerherm at wlu.ca  Tue Jan 26 09:51:54 2010
From: kfeuerherm at wlu.ca (Karljurgen Feuerherm)
Date: Tue, 26 Jan 2010 12:51:54 -0500
Subject: [antlr-interest] Running ANTLRWorks 1.3.1 -- javac error
Message-ID: <4B5EE56A020000CC0001CF38@wlgw07.wlu.ca>

Hello,
 
I'm new to this product (and to modern products of this type
generally... was a B programmer in the early 80s and trying to get
updated!)
 
I'm on Windows XP, and have run the JAR file to invoke ANTLRWorks.
 
I'm trying out the Expression Evaluator Tutorial. The interpreter works
fine, but invoking the debugger gets me

"java.io.IOException: Cannot run program "javac": CreateProcess
error=2, the system cannot find the file specified"
 
(Oddly, after a while, trying it again got me a different error about
timeout, even though I'd changed nothing [Sure. Famous Last Words,
eh?].)
 
Not sure where to go from here... By all means be pedantic in a
response :)
 
Thanks!

K
 
Karljürgen G. Feuerherm, PhD
Department of Archaeology and Classical Studies
Wilfrid Laurier University
75 University Avenue West
Waterloo, Ontario N2L 3C5
Tel. (519) 884-1970 x3193
Fax (519) 883-0991 (ATTN Arch. & Classics)

From bkiers at gmail.com  Tue Jan 26 10:19:59 2010
From: bkiers at gmail.com (Bart Kiers)
Date: Tue, 26 Jan 2010 19:19:59 +0100
Subject: [antlr-interest] Running ANTLRWorks 1.3.1 -- javac error
In-Reply-To: <4B5EE56A020000CC0001CF38@wlgw07.wlu.ca>
References: <4B5EE56A020000CC0001CF38@wlgw07.wlu.ca>
Message-ID: <af443a971001261019g6e82a1f2tb8449c321249aa8a@mail.gmail.com>

Karljürgen,

In order to run ANTLRWorks, you do not need 'javac', but 'java'.

'javac' is the compiler that will compile java source files into byte codes
that the JRE (Java Runtime Environment) interprets/executes.

'java' is the application that executes the byte codes produced by 'javac'.
Since ANTLRWorks is already compiled, you only need 'java'.

So, on the command line, give the following command:

java -jar antlrworks-1.3.1.jar

If the above does not work, please post the exact error message(s) on the
list.

Thanks.

Bart.



From bkiers at gmail.com  Tue Jan 26 10:40:52 2010
From: bkiers at gmail.com (Bart Kiers)
Date: Tue, 26 Jan 2010 19:40:52 +0100
Subject: [antlr-interest] Running ANTLRWorks 1.3.1 -- javac error
In-Reply-To: <1332b72e1001261037w341273e4kdedd7de7ccc1317@mail.gmail.com>
References: <4B5EE56A020000CC0001CF38@wlgw07.wlu.ca>
	<af443a971001261019g6e82a1f2tb8449c321249aa8a@mail.gmail.com>
	<1332b72e1001261037w341273e4kdedd7de7ccc1317@mail.gmail.com>
Message-ID: <af443a971001261040t797e8c36rd30cd30ed868bdc6@mail.gmail.com>

On Tue, Jan 26, 2010 at 7:37 PM, Andreas Stefik <stefika at gmail.com> wrote:

> I think he's asking why the debugger throws errors with javac, not how to
> start antlrworks.
>
> The error you are seeing is because the antlrworks debugger, far as I
> understand it, needs a java compiler to actually debug a grammar. As such,
> you need to put the path to your javac compiler in the path field in
> antlrworks. This is straightforward to do:
>
> 1. Open up the options window. I'm on mac at the moment, which is in
> preferences, but on windows it is similar.
> 2. Go to the tab labeled compiler and look for where it says javac.
> 3. Check path under javac, then click browse and a window should appear.
> 4. Browse to where javac is located.
>
> As I'm on mac, the paths are different, but if I recall correctly, on
> windows javac is in program files, so it would be something "like"
>
> c:\program files\Java\bin\javac.exe
>
> That path might not be correct, but I don't have a windows box on me to
> give it to you exactly. Should be close though and if you browse around you
> should find it.
>
> The last detail is that, if you can't find javac, you may not have the JDK
> installed (java.sun.com), so you'll need to do that. It's just a little
> installer, so there's nothing fancy to do. You can know for sure whether you
> have it by going to the command line and typing:
>
> javac
>
> if it throws an error, you need the JDK. If it's there, you will see a
> bunch of information put out to the terminal.
>
> Hope that helps,
>
> Andreas Stefik, Ph.D.
> Assistant Professor
> Department of Computer Science
> Southern Illinois University Edwardsville
>
>

From stefika at gmail.com  Tue Jan 26 13:46:46 2010
From: stefika at gmail.com (Andreas Stefik)
Date: Tue, 26 Jan 2010 15:46:46 -0600
Subject: [antlr-interest] Running ANTLRWorks 1.3.1 -- javac error
In-Reply-To: <af443a971001261040t797e8c36rd30cd30ed868bdc6@mail.gmail.com>
References: <4B5EE56A020000CC0001CF38@wlgw07.wlu.ca>
	<af443a971001261019g6e82a1f2tb8449c321249aa8a@mail.gmail.com>
	<1332b72e1001261037w341273e4kdedd7de7ccc1317@mail.gmail.com>
	<af443a971001261040t797e8c36rd30cd30ed868bdc6@mail.gmail.com>
Message-ID: <1332b72e1001261346l46901804u60c4d70933cd7995@mail.gmail.com>

JDK means the same thing as SDK does for Java (Java Development Kit). JDK 6
U18 is the correct thing to have downloaded, so you are on the right track.

After installing it, you definitely should have javac and it should be in
your environment variables. Might be obvious, but have you tried restarting?
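
For anyone hitting the same error, here is a rough way to check from Java itself whether javac is reachable. This is only a sketch: the class name JavacCheck and the helper findOnPath are made up for illustration, and it simply walks a PATH-style string looking for a file named javac or javac.exe. It does not look at the javac path configured inside ANTLRWorks.

```java
import java.io.File;
import java.util.regex.Pattern;

class JavacCheck {
    // Returns the first javac executable found in the given PATH-style string,
    // or null if none of the listed directories contains one.
    static File findOnPath(String path, String separator) {
        if (path == null || path.isEmpty()) {
            return null;
        }
        for (String dir : path.split(Pattern.quote(separator))) {
            if (dir.isEmpty()) {
                continue;
            }
            for (String name : new String[] {"javac", "javac.exe"}) {
                File candidate = new File(dir, name);
                if (candidate.isFile()) {
                    return candidate;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        File javac = findOnPath(System.getenv("PATH"), File.pathSeparator);
        if (javac == null) {
            System.out.println("javac not found on PATH");
        } else {
            System.out.println("javac found at " + javac.getAbsolutePath());
        }
    }
}
```

If this reports that javac is not on the PATH even though the JDK is installed, pointing ANTLRWorks at the JDK's bin directory in its compiler preferences is the other way around the problem.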

Andreas Stefik, Ph.D.
Assistant Professor
Department of Computer Science
Southern Illinois University Edwardsville


On Tue, Jan 26, 2010 at 3:41 PM, Karljurgen Feuerherm <kfeuerherm at wlu.ca>wrote:

>  Hi
>
> Thanks, that makes a lot more sense.
>
> Now, oddly, I found and downloaded JDK 6 U18, and installed it... and javac
> is still not found. I'm hunting around for documentation on the site, but
> there are so many different options...
>
> Maybe I need an SDK?
>
> K
>
> Karljürgen G. Feuerherm, PhD
> Department of Archaeology and Classical Studies
> Wilfrid Laurier University
> 75 University Avenue West
> Waterloo, Ontario N2L 3C5
> Tel. (519) 884-1970 x3193
> Fax (519) 883-0991 (ATTN Arch. & Classics)

From stefika at gmail.com  Tue Jan 26 14:35:56 2010
From: stefika at gmail.com (Andreas Stefik)
Date: Tue, 26 Jan 2010 16:35:56 -0600
Subject: [antlr-interest] Got it!
In-Reply-To: <4B5F26C1020000CC0001D02D@wlgw07.wlu.ca>
References: <4B5EE56A020000CC0001CF38@wlgw07.wlu.ca>
	<af443a971001261019g6e82a1f2tb8449c321249aa8a@mail.gmail.com>
	<1332b72e1001261037w341273e4kdedd7de7ccc1317@mail.gmail.com>
	<af443a971001261040t797e8c36rd30cd30ed868bdc6@mail.gmail.com>
	<1332b72e1001261346l46901804u60c4d70933cd7995@mail.gmail.com>
	<4B5F26C1020000CC0001D02D@wlgw07.wlu.ca>
Message-ID: <1332b72e1001261435x69aa6b26qa8ee8dbdbeada49e@mail.gmail.com>

No problem and best of luck.

ANTLR is a great parsing tool. In my view, much easier to use than many of
the alternatives, so hopefully you have a good time hacking away.


Andreas Stefik, Ph.D.
Assistant Professor
Department of Computer Science
Southern Illinois University Edwardsville


On Tue, Jan 26, 2010 at 4:30 PM, Karljurgen Feuerherm <kfeuerherm at wlu.ca>wrote:

>  hi
>
> thanks for your patience :)
>
> the second install did create the directory. a reboot didn't change the fact
> that the environment variable is not set globally... however, following your
> instructions I was able to set the path to C:\Program
> Files\Java\jdk1.6.0_18\bin and now it seems to work.
>
> i have no idea why the installation didn't work properly the first time...
> maybe all that fiddling trying to change the options fixed it, who knows. in
> any case, one problem down.
>
> i appreciate your help! now let's see whether i can come up with some REAL
> problems...!
>
> Best
>
> K
>
> Karljürgen G. Feuerherm, PhD
> Department of Archaeology and Classical Studies
> Wilfrid Laurier University
> 75 University Avenue West
> Waterloo, Ontario N2L 3C5
> Tel. (519) 884-1970 x3193
> Fax (519) 883-0991 (ATTN Arch. & Classics)

From parrt at cs.usfca.edu  Tue Jan 26 14:57:40 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Tue, 26 Jan 2010 14:57:40 -0800
Subject: [antlr-interest] better error messages in tree parsers
Message-ID: <604A33CC-D57B-4C48-ADDE-75331132609D@cs.usfca.edu>

Hi, a reminder that debugging tree grammars can be a bitch.  I like to override standard messaging to spew lots of stuff.  E.g., I like this kind of thing:

ASTVerifier.g: node from after line 150:17 [grammarSpec, rules, rule, altListAsBlock, altList, alternative, elements, element, ebnf, block, altList, alternative]  no viable alt; token=[@-1,0:0='ALT',<84>,0:-1] (decision=24 state 3) decision=<<>>
context=...DOWN BLOCK DOWN >>>ALT<<< DOWN DOC_COMMENT...

Here's my code:

    public String getErrorMessage(RecognitionException e,
                                  String[] tokenNames)
    {
        List stack = getRuleInvocationStack(e, this.getClass().getName());
        String msg = null;
        String inputContext =
            ((Tree)input.LT(-3)).getText()+" "+
            ((Tree)input.LT(-2)).getText()+" "+
            ((Tree)input.LT(-1)).getText()+" >>>"+
            ((Tree)input.LT(1)).getText()+"<<< "+
            ((Tree)input.LT(2)).getText()+" "+
            ((Tree)input.LT(3)).getText();
        if ( e instanceof NoViableAltException ) {
           NoViableAltException nvae = (NoViableAltException)e;
           msg = " no viable alt; token="+e.token+
              " (decision="+nvae.decisionNumber+
              " state "+nvae.stateNumber+")"+
              " decision=<<"+nvae.grammarDecisionDescription+">>";
        }
        else {
           msg = super.getErrorMessage(e, tokenNames);
        }
        return stack+" "+msg+" context=..."+inputContext+"...";
    }
    public String getTokenErrorDisplay(Token t) {
        return t.toString();
    }
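
The ">>>ALT<<<" context window in that message can also be built with explicit bounds checks. The following is a minimal sketch that is independent of the ANTLR runtime: it assumes the node texts have already been collected into a plain list, and the class ContextWindow and its method around are hypothetical names, not part of ANTLR. It is not a drop-in replacement for the override above.

```java
import java.util.List;

class ContextWindow {
    // Builds a string such as "a b c >>>d<<< e f" around index pos,
    // silently skipping positions that fall outside the list.
    static String around(List<String> nodes, int pos, int radius) {
        StringBuilder sb = new StringBuilder();
        for (int i = pos - radius; i <= pos + radius; i++) {
            if (i < 0 || i >= nodes.size()) {
                continue; // nothing to show before the start or after the end
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            if (i == pos) {
                sb.append(">>>").append(nodes.get(i)).append("<<<");
            } else {
                sb.append(nodes.get(i));
            }
        }
        return sb.toString();
    }
}
```

The point of the bounds check is that an error at the very first or last node still produces a usable message instead of failing inside the error reporter.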

Ter

From kferrio at gmail.com  Tue Jan 26 18:00:51 2010
From: kferrio at gmail.com (kferrio at gmail.com)
Date: Wed, 27 Jan 2010 02:00:51 +0000
Subject: [antlr-interest] better error messages in tree parsers
In-Reply-To: <604A33CC-D57B-4C48-ADDE-75331132609D@cs.usfca.edu>
References: <604A33CC-D57B-4C48-ADDE-75331132609D@cs.usfca.edu>
Message-ID: <177143290-1264557652-cardhu_decombobulator_blackberry.rim.net-951491048-@bda428.bisx.prod.on.blackberry>

ROTFL!  Thanks for calling it as you see it.  I feel a little less naïve now, knowing that you have "issues" with debugging.  Thanks for the nice example too!

Kyle 


From wclodius at los-alamos.net  Tue Jan 26 19:58:17 2010
From: wclodius at los-alamos.net (William B. Clodius)
Date: Tue, 26 Jan 2010 20:58:17 -0700
Subject: [antlr-interest] Disabling rules in the lexer
In-Reply-To: <33AFEA62-2C42-4174-B149-06D2025628F9@mac.com>
References: <33AFEA62-2C42-4174-B149-06D2025628F9@mac.com>
Message-ID: <0CB6E93D-F815-47C4-A870-8A2C9C0E83A7@los-alamos.net>

Generally, don't try to be too restrictive with your lexer and parser. This sort of context dependence is more naturally handled in the semantic analysis. In particular, error reporting is much better if the lexer and parser accept things that are ultimately illegal and the semantic analysis decides whether they are illegal. Instead of a minimal message such as "Illegal token" you can report "Illegal token for the table structure see constraint # in the language definition", or "Token is not one of the set of ..."
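
A minimal sketch of that approach in Java, not tied to the ANTLR runtime. It assumes the lexer has accepted every symbol everywhere and that each token records whether it appeared inside a table; a later pass then rejects table-only tokens found elsewhere, with a descriptive message. The names TableCheck, Sym, inTable and TABLE_ONLY are hypothetical and only serve the illustration ("Level0" is borrowed from the rule in the question).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TableCheck {
    // Hypothetical token record: token name, source line, and table context.
    static class Sym {
        final String name;
        final int line;
        final boolean inTable;

        Sym(String name, int line, boolean inTable) {
            this.name = name;
            this.line = line;
            this.inTable = inTable;
        }
    }

    // Hypothetical set of token names that are only legal inside a table.
    static final Set<String> TABLE_ONLY =
        new HashSet<String>(Arrays.asList("Level0", "Level1"));

    // Semantic pass: collect one descriptive error per misplaced token.
    static List<String> check(List<Sym> syms) {
        List<String> errors = new ArrayList<String>();
        for (Sym s : syms) {
            if (TABLE_ONLY.contains(s.name) && !s.inTable) {
                errors.add("line " + s.line + ": '" + s.name
                    + "' is only legal inside a table");
            }
        }
        return errors;
    }
}
```

This only covers the error-reporting side; it does not by itself resolve the case where the same characters must lex as different tokens inside and outside a table.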


From C.P.T.de.Gouw at cwi.nl  Wed Jan 27 01:19:19 2010
From: C.P.T.de.Gouw at cwi.nl (Stijn de Gouw)
Date: Wed, 27 Jan 2010 10:19:19 +0100
Subject: [antlr-interest] Parsing a sequence of objects
Message-ID: <4B600517.3040102@cwi.nl>

Given an attribute grammar (with probably only synthesized attributes), 
instead of parsing a sequence of terminal strings, I want to parse a 
sequence (array) of (Java) Objects. Each object o has 3 fields:
(1) String name
(2) Object[] p
(3) String c
The terminals in the grammar correspond exactly to the name field of an 
object (each o.name is a terminal), so parsing decisions should be done 
based on this field (perhaps no lexer is needed?).

In the attribute grammar the other two fields of the object must be used 
as attributes of the terminal (note that the values of these attributes 
are NOT given by a production in the grammar!! but instead are given 
(before parsing) in each object), and it must be possible to define the 
(synthesized) attributes of non-terminals in terms of the attributes of 
the terminals (namely, the o.p and o.c fields). To make it more clear, 
consider the following example (I will denote each object as a triple 
(name, p, c)):


Given a sequence of objects
    ("first", p1, "z"), ("first", p2, "y"), ("last", p3, "z"), ("last", p4, "x")

and the attribute grammar
  S ::=   FIRST LAST { $cSet = createset($FIRST.c, $LAST.c); }
        | FIRST S1=S LAST { $cSet = union($S1.cSet, createset($FIRST.c, $LAST.c)); }
where cSet is an attribute of type 'set of Strings', createset creates a
new set containing its parameters as elements of the set, and union(a,b)
returns the union of the sets a and b

the parsing of the sequence of objects produces:

                        S.cSet = {"x","y","z"}
                       /          |           \
                      /           |            \
                     /            |             \
                    /             |              \
                   /              |               \
FIRST = ("first",p1,"z")   S.cSet = {"y", "z"}    LAST = ("last", p4, "x")
                              /       \
                             /         \
                            /           \
                           /             \
     FIRST = ("first", p2, "y")         LAST = ("last", p3, "z")


What would be the best way to implement this? Perhaps subclass the
antlr.Token class to add the Object[] p and String c fields (if so, what
would be the best way to create a token stream from the given sequence of
objects)? My current approach, which works but is not very elegant, is to
1) Concatenate all name attributes from the objects in the sequence to 
create a single string S
2) Add the array storing the sequence of objects as a @members variable 
to the grammar (let's call this array a).
3) In the attribute grammar, one can refer to the terminal attribute 
Object[] p of "first" by writing 'a[$FIRST.getTokenIndex()].p' where 
FIRST is a terminal defined in the lexer as FIRST: 'first';.
4) Call the parser with the string S formed in step 1 as input.
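
For comparison, the attribute computation itself does not depend on how the objects become tokens. The following is a minimal sketch in plain Java, written without ANTLR, of a hand-written recursive-descent evaluator for the grammar above. The names Obj, ObjectSequenceParser, parse and parseS are made up for illustration, and the p field is only carried along, since the example grammar never reads it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** One input object, written (name, p, c) in the text. */
final class Obj {
    final String name;   // "first" or "last": plays the role of the token type
    final Object[] p;    // terminal attribute, supplied before parsing
    final String c;      // terminal attribute, supplied before parsing

    Obj(String name, Object[] p, String c) {
        this.name = name;
        this.p = p;
        this.c = c;
    }
}

/** Hypothetical hand-written evaluator for S ::= FIRST LAST | FIRST S LAST. */
final class ObjectSequenceParser {
    private final List<Obj> input;
    private int pos = 0;

    ObjectSequenceParser(List<Obj> input) {
        this.input = input;
    }

    /** Parses the whole sequence and returns the synthesized attribute S.cSet. */
    static Set<String> parse(List<Obj> input) {
        ObjectSequenceParser parser = new ObjectSequenceParser(input);
        Set<String> cSet = parser.parseS();
        if (parser.pos != input.size()) {
            throw new IllegalArgumentException("unexpected object at index " + parser.pos);
        }
        return cSet;
    }

    private Set<String> parseS() {
        Obj first = match("first");
        Set<String> cSet = new TreeSet<>();
        // A second "first" means the alternative FIRST S LAST applies.
        if (pos < input.size() && input.get(pos).name.equals("first")) {
            cSet.addAll(parseS());          // union($S1.cSet, ...)
        }
        Obj last = match("last");
        cSet.add(first.c);                  // createset($FIRST.c, $LAST.c)
        cSet.add(last.c);
        return cSet;
    }

    private Obj match(String expectedName) {
        if (pos >= input.size() || !input.get(pos).name.equals(expectedName)) {
            throw new IllegalArgumentException(
                "expected \"" + expectedName + "\" at index " + pos);
        }
        return input.get(pos++);
    }

    public static void main(String[] args) {
        List<Obj> objects = Arrays.asList(
            new Obj("first", null, "z"), new Obj("first", null, "y"),
            new Obj("last", null, "z"), new Obj("last", null, "x"));
        System.out.println(parse(objects));  // prints [x, y, z]
    }
}
```

The terminal attributes live on the objects themselves, so no concatenated string and no index lookup into a side array is needed; whether the same shape can be had from an ANTLR-generated parser is exactly the open question above.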

From andre.rutti at gmail.com  Wed Jan 27 07:36:02 2010
From: andre.rutti at gmail.com (andre rutti)
Date: Wed, 27 Jan 2010 16:36:02 +0100
Subject: [antlr-interest] Python RuntimeError
Message-ID: <2132cf931001270736p28eb80bfx432bc5ac38f479dd@mail.gmail.com>

Hi,

I'm using antlr\antlrworks-1.3 to generate lexer and parser for Python.

Using the examples from
http://www.antlr.org/wiki/display/ANTLR3/Antlr3PythonTarget

When I run Test.py, I get

RuntimeError: ANTLR version mismatch: The recognizer has been generated by
V3.2 Sep 23, 2009 12:02:23, but this runtime is V3.1.2. Please use the
V3.2 Sep 23, 2009 12:02:23 runtime or higher.

Is the Python runtime for V3.2 available ?

I tried with antlrworks-1.2.2, but then, I got errors for Eval.g

[15:44:31] error(10):  internal error: eval tree parse error : <AST>:0:0:
unexpected AST node:
org.antlr.stringtemplate.language.ActionEvaluator.expr(Unknown Source)
org.antlr.stringtemplate.language.ActionEvaluator.action(Unknown Source)
org.antlr.stringtemplate.language.ASTExpr.evaluateExpression(Unknown Source)
org.antlr.stringtemplate.language.ASTExpr.handleExprOptions(Unknown Source)


Thanks and regards,
Andre

From alexander.herz at mytum.de  Wed Jan 27 08:23:55 2010
From: alexander.herz at mytum.de (Alexander Herz)
Date: Wed, 27 Jan 2010 17:23:55 +0100
Subject: [antlr-interest] antlr grammar+missing symbol
Message-ID: <4B60689B.6030805@mytum.de>

Hi,

I'm trying to debug the python2.5 grammar from the antlr homepage.
Compiling it gives an error that "token" is not recognized as a symbol.
Where/how should it be defined?
Generally, is there documentation where I can look up which symbols
are provided for the generated classes (so that I can refer to them
from inside the grammar)?

Thx,
Alex

-- 
-------------------------------------------------------
Lehrstuhl I2 Seidl
Sprachen und Beschreibungsstrukturen der Informatik
Institut fuer Informatik
Technische Universitaet Muenchen

Boltzmannstrasse 3  85748 Garching
http://www2.in.tum.de

Telefon: +89 289 181806
Fax: +89 289 18161
------------------------------------------------------- 


From parrt at cs.usfca.edu  Wed Jan 27 11:03:18 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Wed, 27 Jan 2010 11:03:18 -0800
Subject: [antlr-interest] better error messages in tree parsers
In-Reply-To: <177143290-1264557652-cardhu_decombobulator_blackberry.rim.net-951491048-@bda428.bisx.prod.on.blackberry>
References: <604A33CC-D57B-4C48-ADDE-75331132609D@cs.usfca.edu>
	<177143290-1264557652-cardhu_decombobulator_blackberry.rim.net-951491048-@bda428.bisx.prod.on.blackberry>
Message-ID: <4ECD2BE4-A90F-42FC-A348-5F38904338CD@cs.usfca.edu>


On Jan 26, 2010, at 6:00 PM, kferrio at gmail.com wrote:

> ROTFL!  Thanks for calling it as you see it.  I feel a little less naïve now, knowing that you have "issues" with debugging.  Thanks for the nice example too!

:) Added a faq entry.  yeah, it's tough for me right now because I'm debugging a tree grammar parsing an AST representing an ANTLR tree grammar. my brain hurts.

Ter
> 
> Kyle 
> 
> Sent from my Verizon Wireless BlackBerry
> 
> -----Original Message-----
> From: Terence Parr <parrt at cs.usfca.edu>
> Date: Tue, 26 Jan 2010 14:57:40 
> To: antlr-interest at antlr.org interest<antlr-interest at antlr.org>
> Subject: [antlr-interest] better error messages in tree parsers
> 
> Hi, a reminder that debugging tree grammars can be a bitch.  I like to override standard messaging to spew lots of stuff.  E.g., i like this kind of thing:
> 
> ASTVerifier.g: node from after line 150:17 [grammarSpec, rules, rule, altListAsBlock, altList, alternative, elements, element, ebnf, block, altList, alternative]  no viable alt; token=[@-1,0:0='ALT',<84>,0:-1] (decision=24 state 3) decision=<<>>
> context=...DOWN BLOCK DOWN >>>ALT<<< DOWN DOC_COMMENT...
> 
> Here's my code:
> 
>    public String getErrorMessage(RecognitionException e,
>                                  String[] tokenNames)
>    {
>        List stack = getRuleInvocationStack(e, this.getClass().getName());
>        String msg = null;
>        String inputContext =
>            ((Tree)input.LT(-3)).getText()+" "+
>            ((Tree)input.LT(-2)).getText()+" "+
>            ((Tree)input.LT(-1)).getText()+" >>>"+
>            ((Tree)input.LT(1)).getText()+"<<< "+
>            ((Tree)input.LT(2)).getText()+" "+
>            ((Tree)input.LT(3)).getText();
>        if ( e instanceof NoViableAltException ) {
>           NoViableAltException nvae = (NoViableAltException)e;
>           msg = " no viable alt; token="+e.token+
>              " (decision="+nvae.decisionNumber+
>              " state "+nvae.stateNumber+")"+
>              " decision=<<"+nvae.grammarDecisionDescription+">>";
>        }
>        else {
>           msg = super.getErrorMessage(e, tokenNames);
>        }
>        return stack+" "+msg+" context=..."+inputContext+"...";
>    }
>    public String getTokenErrorDisplay(Token t) {
>        return t.toString();
>    }
> 
> Ter
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address


From parrt at cs.usfca.edu  Wed Jan 27 11:21:46 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Wed, 27 Jan 2010 11:21:46 -0800
Subject: [antlr-interest] better error messages in tree parsers
In-Reply-To: <177143290-1264557652-cardhu_decombobulator_blackberry.rim.net-951491048-@bda428.bisx.prod.on.blackberry>
References: <604A33CC-D57B-4C48-ADDE-75331132609D@cs.usfca.edu>
	<177143290-1264557652-cardhu_decombobulator_blackberry.rim.net-951491048-@bda428.bisx.prod.on.blackberry>
Message-ID: <2DAFE674-88D1-4A7F-BAC4-846AA42597E9@cs.usfca.edu>

Also note that you can use the decision number (24 here) by running antlr with the -dfa option and then loading your grammar-dec-24.dot into Graphviz. Look at state 3 and you'll see that the token ALT (in this case) has no path to take.
Ter
> ASTVerifier.g: node from after line 150:17 [grammarSpec, rules, rule, altListAsBlock, altList, alternative, elements, element, ebnf, block, altList, alternative]  no viable alt; token=[@-1,0:0='ALT',<84>,0:-1] (decision=24 state 3) decision=<<>> 
> context=...DOWN BLOCK DOWN >>>ALT<<< DOWN DOC_COMMENT...
> 
> Here's my code:
> 
>    public String getErrorMessage(RecognitionException e,
>                                  String[] tokenNames)
>    {
>        List stack = getRuleInvocationStack(e, this.getClass().getName());
>        String msg = null;
>        String inputContext =
>            ((Tree)input.LT(-3)).getText()+" "+
>            ((Tree)input.LT(-2)).getText()+" "+
>            ((Tree)input.LT(-1)).getText()+" >>>"+
>            ((Tree)input.LT(1)).getText()+"<<< "+
>            ((Tree)input.LT(2)).getText()+" "+
>            ((Tree)input.LT(3)).getText();
>        if ( e instanceof NoViableAltException ) {
>           NoViableAltException nvae = (NoViableAltException)e;
>           msg = " no viable alt; token="+e.token+
>              " (decision="+nvae.decisionNumber+
>              " state "+nvae.stateNumber+")"+
>              " decision=<<"+nvae.grammarDecisionDescription+">>";
>        }
>        else {
>           msg = super.getErrorMessage(e, tokenNames);
>        }
>        return stack+" "+msg+" context=..."+inputContext+"...";
>    }
>    public String getTokenErrorDisplay(Token t) {
>        return t.toString();
>    }
> 
> Ter
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address


From gabriel_erzse at yahoo.com  Wed Jan 27 12:36:34 2010
From: gabriel_erzse at yahoo.com (Gabriel Erzse)
Date: Wed, 27 Jan 2010 12:36:34 -0800 (PST)
Subject: [antlr-interest] detecting rubbish data at end of input
Message-ID: <596411.1651.qm@web51302.mail.re2.yahoo.com>

Hello,

When using grammars written in ANTLR, the parser correctly recognizes 
data from an input stream, but if I have some rubbish text at the end of 
the input (which rubbish text is not supposed to be parsed by the grammar) 
the parser does not complain.

I guess this behavior is all right (I mean the parser did its job and parsed 
whatever I said it should parse), but is there any trick to detect when there 
is any data left in the input after the parser has done its job? 

Thanks,
Gabi.

From scott at javadude.com  Wed Jan 27 12:38:15 2010
From: scott at javadude.com (Scott Stanchfield)
Date: Wed, 27 Jan 2010 15:38:15 -0500
Subject: [antlr-interest] detecting rubbish data at end of input
In-Reply-To: <596411.1651.qm@web51302.mail.re2.yahoo.com>
References: <596411.1651.qm@web51302.mail.re2.yahoo.com>
Message-ID: <d19d16481001271238k2bbce057t6f3c7a2d7d837f4d@mail.gmail.com>

Add an EOF token to the end of your start rule
-- Scott

----------------------------------------
Scott Stanchfield
http://javadude.com


On Wed, Jan 27, 2010 at 3:36 PM, Gabriel Erzse <gabriel_erzse at yahoo.com> wrote:
> Hello,
>
> When using grammars written in ANTLR, the parser correctly recognizes
> data from an input stream, but if I have some rubbish text at the end of
> the input (which rubbish text is not supposed to be parsed by the grammar)
> the parser does not complain.
>
> I guess this behavior is all right (I mean the parser did its job and parsed
> whatever I said it should parse), but is there any trick to detect when there
> is any data left in the input after the parser has done its job?
>
> Thanks,
> Gabi.
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>

From jeff.wilcox at mac.com  Wed Jan 27 13:22:16 2010
From: jeff.wilcox at mac.com (Jeff Wilcox)
Date: Wed, 27 Jan 2010 13:22:16 -0800
Subject: [antlr-interest] Disabling rules in the lexer
Message-ID: <29B7ADE0-F06D-4A60-822A-A16E906610E5@mac.com>

Yes, I agree with you, and in general this is how my parsers have worked.  But there are a couple of cases where disabling lexer rules is useful and/or necessary.  One example is disabling keywords that exist only in newer versions of the language and could be identifiers in older versions; there are other semi-tedious ways around that with predicates, but it should not be necessary.

This case though involves a table section of characters, symbols and numbers.  So an N-column row of N discrete symbols could otherwise be a single number, a single identifier, a number plus an identifier, etc.  So without special casing the lexer, the easiest thing was to accept possible candidates, suck it all into a string and re-parse in the semantic analyzer.  But that feels like the wrong solution.

In general though, it seems like there is a bug in ANTLR's treatment of gated semantic predicates in the lexer.  It does not work unless there are other alternatives in the rule.

Is there any other way to completely turn off a rule in the lexer (without throwing an FPE)?

Thanks,
Jeff


On Jan 26, 2010, at 8:58 PM, William B. Clodius wrote:
> Generally don't try to be too restrictive with your lexer and parser. This sort of context dependence is more naturally handled in the semantic analysis. In particular, error reporting is much better if you accept things that are ultimately illegal in the lexer and parser and determine whether they are illegal in the semantic analysis. Instead of a minimal message such as "Illegal token" you can report "Illegal token for the table structure see constraint # in the language definition", or "Token is not one of the set of ..."
> 
> On Jan 26, 2010, at 7:52 AM, Jeff Wilcox wrote:
> 
>> Hi,
>> 
>> I have a special area in this language that has symbols within a table structure that are normally used in other tokens in other areas of the language (like a couple digits, a couple letters and a couple symbols).  So I am trying to set up the lexer to accept these table tokens only when in a table.  Based on what I have been able to dig up, I believe gated semantic predicates are a valid way to disable rules in the lexer.  However, I am seeing issues with this with ANTLR 3.2 and the Java language target.
>> 
>> So I expected a lexer rule like this to do the trick:
>> 
>> Level0       : {inTable}?=> '0';
>> 
>> But that actually creates a very strange loop when inTable is false.  It basically throws a FailedPredicateException (which I would not have expected for a gated predicate) and then retries the same token with the same rule, obviously resulting in an infinite loop.
>> 
>> Can someone clarify whether this is allowed and if so whether there is some trick to using it?  I am stumped.  
>> 
>> Thanks
>> Jeff

From parrt at cs.usfca.edu  Wed Jan 27 21:21:48 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Wed, 27 Jan 2010 21:21:48 -0800
Subject: [antlr-interest] DSL discussion at javaranch.com
Message-ID: <93526F1F-A27D-4315-BA68-DD0FE1B3A430@cs.usfca.edu>

Hiya. In case you're interested, we're having some interesting
discussions about DSL terminology, design, and implementation:

http://www.coderanch.com/forums/f-12/IDEs-Version-Control-other-tools

Ter

From C.P.T.de.Gouw at cwi.nl  Fri Jan 29 00:19:18 2010
From: C.P.T.de.Gouw at cwi.nl (C.P.T.de.Gouw at cwi.nl)
Date: Fri, 29 Jan 2010 09:19:18 +0100 (CET)
Subject: [antlr-interest] Parsing a sequence of objects
In-Reply-To: <4B600517.3040102@cwi.nl>
References: <4B600517.3040102@cwi.nl>
Message-ID: <38399.132.229.128.127.1264753158.squirrel@webmail.cwi.nl>

> Given an attribute grammar (with probably only synthesized attributes),
> instead of parsing a sequence of terminal strings, I want to parse a
> sequence (array) of (Java) Objects.

I just noted an old antlr2 blog post, that I think describes exactly what I
want: http://www.antlr2.org/blog/antlr3/lexical.tml. The feature I'm
interested in is
"the parser grammar (or combined grammar) can specify the extra fields for a
token, which results in a grammar specific token. Tokens may also have a
generate attributes table for dynamically setting attributes, thus, avoiding
creation of a million token subclasses."

Has this been added to antlr v3?

From scott.oakes63 at googlemail.com  Fri Jan 29 09:42:53 2010
From: scott.oakes63 at googlemail.com (Scott Oakes)
Date: Fri, 29 Jan 2010 17:42:53 +0000
Subject: [antlr-interest] Lexer for floating point numbers + field access
	syntax with '.'
Message-ID: <6e75196e1001290942t546f22b6lafdb030ca239c76@mail.gmail.com>

Hi, hoping for some help trying to write a lexer that allows you to
recognise floating point literals (2.3) as well as field accesses of the
form x.y; see grammar below. The trouble is that an input like

  3.fieldAccess

Produces two tokens, FLOAT and ID, rather than the desired three, INT, DOT
and ID.

Pointers would be much appreciated!

-------------------

grammar test;

top: expr EOF;

expr: (INT | FLOAT | ID | '(' expr ')') (DOT ID)*;

ID  :    ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
    ;

INT :    '0'..'9'+
    ;

DOT: '.';

FLOAT
    :   ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
    |   '.' ('0'..'9')+ EXPONENT?
    |   ('0'..'9')+ EXPONENT
    ;

WS  :   ( ' '
        | '\t'
        | '\r'
        | '\n'
        ) {$channel=HIDDEN;}
    ;

fragment
EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;

From jimi at temporal-wave.com  Fri Jan 29 10:02:17 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Fri, 29 Jan 2010 10:02:17 -0800
Subject: [antlr-interest] Lexer for floating point numbers + field
	access syntax with '.'
In-Reply-To: <6e75196e1001290942t546f22b6lafdb030ca239c76@mail.gmail.com>
Message-ID: <ec2699b69483924aa24351e1f8d656bc@temporal-wave.com>

Please see the FAQ and complete grammar at:

http://antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point%2C+dot%2C+range%2C+time+specs


All you need do is add to the predicate here:

                |   // We can of course have 0.nnnnn
                    //
                    { input.LA(2) != '.'}?=> '.'

To check :

{ input.LA(2) != '.' && input.LA(2) >= '0' && input.LA(2) <= '9' }?=> '.'

Then remove the empty alt there that allows number forms like 8.
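
The decision that predicate encodes (treat '.' as part of a number only when a digit follows it) can be sketched outside ANTLR. What follows is a minimal hand-written lexer in plain Java, not the FAQ grammar and not ANTLR-generated code; the class name DotLexerSketch and the TYPE(text) rendering of tokens are made up for illustration, and exponents and leading-dot floats such as .5 are left out to keep it short.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone lexer illustrating the INT / DOT / FLOAT decision. */
final class DotLexerSketch {

    /** Returns tokens rendered as TYPE(text), e.g. INT(3). */
    static List<String> tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char ch = s.charAt(i);
            int start = i;
            if (Character.isWhitespace(ch)) {
                i++;                                   // whitespace is skipped
            } else if (isDigit(ch)) {
                while (i < s.length() && isDigit(s.charAt(i))) i++;
                // Treat '.' as part of the number only when a digit follows it.
                if (i + 1 < s.length() && s.charAt(i) == '.' && isDigit(s.charAt(i + 1))) {
                    i++;                               // consume '.'
                    while (i < s.length() && isDigit(s.charAt(i))) i++;
                    tokens.add("FLOAT(" + s.substring(start, i) + ")");
                } else {
                    tokens.add("INT(" + s.substring(start, i) + ")");
                }
            } else if (ch == '.') {
                i++;
                tokens.add("DOT(.)");
            } else if (isIdStart(ch)) {
                while (i < s.length() && (isIdStart(s.charAt(i)) || isDigit(s.charAt(i)))) i++;
                tokens.add("ID(" + s.substring(start, i) + ")");
            } else {
                throw new IllegalArgumentException("unexpected character '" + ch + "' at " + i);
            }
        }
        return tokens;
    }

    private static boolean isDigit(char ch) {
        return ch >= '0' && ch <= '9';
    }

    private static boolean isIdStart(char ch) {
        return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || ch == '_';
    }

    public static void main(String[] args) {
        System.out.println(tokenize("3.fieldAccess")); // [INT(3), DOT(.), ID(fieldAccess)]
        System.out.println(tokenize("2.3"));           // [FLOAT(2.3)]
    }
}
```

Note that the digits are consumed once, before the choice between INT and FLOAT is made; that is the same left-factoring the FAQ grammar relies on.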

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Scott Oakes
> Sent: Friday, January 29, 2010 9:43 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Lexer for floating point numbers + field
> access syntax with '.'
> 
> Hi, hoping for some help trying to write a lexer that allows you to
> recognise floating point literals (2.3) as well as field accesses of
> the
> form x.y; see grammar below. The trouble is that an input like
> 
>   3.fieldAccess
> 
> Produces two tokens, FLOAT and ID, rather than the desired three, INT,
> DOT
> and ID.
> 
> Pointers would be much appreciated!
> 
> -------------------
> 
> grammar test;
> 
> top: expr EOF;
> 
> expr: (INT | FLOAT | ID | '(' expr ')') (DOT ID)*;
> 
> ID  :    ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
>     ;
> 
> INT :    '0'..'9'+
>     ;
> 
> DOT: '.';
> 
> FLOAT
>     :   ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
>     |   '.' ('0'..'9')+ EXPONENT?
>     |   ('0'..'9')+ EXPONENT
>     ;
> 
> WS  :   ( ' '
>         | '\t'
>         | '\r'
>         | '\n'
>         ) {$channel=HIDDEN;}
>     ;
> 
> fragment
> EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
> email-address


From scott.oakes63 at googlemail.com  Fri Jan 29 10:30:09 2010
From: scott.oakes63 at googlemail.com (Scott Oakes)
Date: Fri, 29 Jan 2010 18:30:09 +0000
Subject: [antlr-interest] Lexer for floating point numbers + field
	access syntax with '.'
In-Reply-To: <ec2699b69483924aa24351e1f8d656bc@temporal-wave.com>
References: <6e75196e1001290942t546f22b6lafdb030ca239c76@mail.gmail.com>
	<ec2699b69483924aa24351e1f8d656bc@temporal-wave.com>
Message-ID: <6e75196e1001291030w43480359xc73f9e04d3c5225c@mail.gmail.com>

Thanks Jim, the link looks very useful, albeit a bit daunting. I tried
amending my FLOAT to:

FLOAT
    :   ('0'..'9')+ ({input.LA(2) >= '0' && input.LA(2) <= '9'}?=>'.') ('0'..'9')+ EXPONENT?
    |   '.' ('0'..'9')+ EXPONENT?
    |   ('0'..'9')+ EXPONENT
    ;

Unfortunately I get a "rule FLOAT failed predicate" error.

On Fri, Jan 29, 2010 at 6:02 PM, Jim Idle <jimi at temporal-wave.com> wrote:

> Please see the FAQ and complete grammar at:
>
>
> http://antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point%2C+dot%2C+range%2C+time+specs
>
>
> All you need do is add to the predicate here:
>
>                |   // We can of course have 0.nnnnn
>                    //
>                    { input.LA(2) != '.'}?=> '.'
>
> To check :
>
> { input.LA(2) != '.' && input.LA(2) >= '0' && input.LA(2) <= '9' }?=> '.'
>
> Then remove the empty alt there that allows number forms like 8.
>
> Jim
>
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> > bounces at antlr.org] On Behalf Of Scott Oakes
> > Sent: Friday, January 29, 2010 9:43 AM
> > To: antlr-interest at antlr.org
> > Subject: [antlr-interest] Lexer for floating point numbers + field
> > access syntax with '.'
> >
> > Hi, hoping for some help trying to write a lexer that allows you to
> > recognise floating point literals (2.3) as well as field accesses of
> > the
> > form x.y; see grammar below. The trouble is that an input like
> >
> >   3.fieldAccess
> >
> > Produces two tokens, FLOAT and ID, rather than the desired three, INT,
> > DOT
> > and ID.
> >
> > Pointers would be much appreciated!
> >
> > -------------------
> >
> > grammar test;
> >
> > top: expr EOF;
> >
> > expr: (INT | FLOAT | ID | '(' expr ')') (DOT ID)*;
> >
> > ID  :    ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
> >     ;
> >
> > INT :    '0'..'9'+
> >     ;
> >
> > DOT: '.';
> >
> > FLOAT
> >     :   ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
> >     |   '.' ('0'..'9')+ EXPONENT?
> >     |   ('0'..'9')+ EXPONENT
> >     ;
> >
> > WS  :   ( ' '
> >         | '\t'
> >         | '\r'
> >         | '\n'
> >         ) {$channel=HIDDEN;}
> >     ;
> >
> > fragment
> > EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
>

From jimi at temporal-wave.com  Fri Jan 29 10:37:44 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Fri, 29 Jan 2010 10:37:44 -0800
Subject: [antlr-interest] Lexer for floating point numbers + field
	access syntax with '.'
In-Reply-To: <6e75196e1001291030w43480359xc73f9e04d3c5225c@mail.gmail.com>
Message-ID: <b65a53910ddbbc4099f3ea90e1650aac@temporal-wave.com>

Yes, you need to follow the method in the example - what you are trying to do will not work until you left factor it.
 
Jim
 
From: Scott Oakes [mailto:scott.oakes63 at googlemail.com] 
Sent: Friday, January 29, 2010 10:30 AM
To: Jim Idle
Cc: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Lexer for floating point numbers + field access syntax with '.'
 
Thanks Jim, the link looks very useful, albeit a bit daunting. I tried amending my FLOAT to:

FLOAT
    :   ('0'..'9')+ ({input.LA(2) >= '0' && input.LA(2) <= '9'}?=>'.') ('0'..'9')+ EXPONENT?
    |   '.' ('0'..'9')+ EXPONENT?
    |   ('0'..'9')+ EXPONENT
    ;

Unfortunately I get a "rule FLOAT failed predicate" error.
On Fri, Jan 29, 2010 at 6:02 PM, Jim Idle <jimi at temporal-wave.com> wrote:
Please see the FAQ and complete grammar at:

http://antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point%2C+dot%2C+range%2C+time+specs


All you need do is add to the predicate here:

               |   // We can of course have 0.nnnnn
                   //
                   { input.LA(2) != '.'}?=> '.'

To check :

{ input.LA(2) != '.' && input.LA(2) >= '0' && input.LA(2) <= '9' }?=> '.'

Then remove the empty alt there that allows number forms like 8.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Scott Oakes
> Sent: Friday, January 29, 2010 9:43 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Lexer for floating point numbers + field
> access syntax with '.'
>
> Hi, hoping for some help trying to write a lexer that allows you to
> recognise floating point literals (2.3) as well as field accesses of
> the
> form x.y; see grammar below. The trouble is that an input like
>
>   3.fieldAccess
>
> Produces two tokens, FLOAT and ID, rather than the desired three, INT,
> DOT
> and ID.
>
> Pointers would be much appreciated!
>
> -------------------
>
> grammar test;
>
> top: expr EOF;
>
> expr: (INT | FLOAT | ID | '(' expr ')') (DOT ID)*;
>
> ID  :    ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
>     ;
>
> INT :    '0'..'9'+
>     ;
>
> DOT: '.';
>
> FLOAT
>     :   ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
>     |   '.' ('0'..'9')+ EXPONENT?
>     |   ('0'..'9')+ EXPONENT
>     ;
>
> WS  :   ( ' '
>         | '\t'
>         | '\r'
>         | '\n'
>         ) {$channel=HIDDEN;}
>     ;
>
> fragment
> EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
 

From ron.hunter-duvar at oracle.com  Fri Jan 29 20:51:40 2010
From: ron.hunter-duvar at oracle.com (Ron Hunter-Duvar)
Date: Fri, 29 Jan 2010 21:51:40 -0700
Subject: [antlr-interest] ANTLR running out of memory during generation
Message-ID: <4B63BADC.7000005@oracle.com>

I'm having a strange problem with ANTLR. I'm building a grammar for a 
language with a huge number (hundreds) of non-reserved keywords. I'm 
using the approach of having the lexer return a different token type for 
each keyword, and then having a parser rule of the form:

    id : ( ID | QUOTED_ID | KW_A | KW_B | ... | KW_ZZZ );

This was working great until today. In fact, ANTLR 3.2 generates 
surprisingly clever code for this - all the keywords are assigned 
consecutive token numbers, and generated code just says:

    if ( (input.LA(1)>=KW_A && input.LA(1)<=KW_ZZZ)||(input.LA(1)>=ID && input.LA(1)<=QUOTED_ID) ) {
        input.consume();
        ...

This works all the way up to 631 keywords. ANTLR runs in about 20 
seconds, and never uses more than 269MB of memory. When I add a 632nd 
keyword (doesn't matter what the keyword is), and change nothing else, 
ANTLR runs for 2 minutes and runs out of heap space. I kept bumping the 
max space up, but even going to 2GB doesn't make any difference.

What's really interesting is that I was using ANTLR 3.1 until now. When 
I ran into this I upgraded to 3.2, but both of them fail at exactly the 
same spot, 632 keywords. Not surprisingly, the stack trace varies from 
one run to the next, depending on the exact point it runs out of memory, 
but it always has deeply nested calls to these and other methods:

    org.antlr.stringtemplate.language.ASTExpr.writeTemplate(ASTExpr.java:750)
    org.antlr.stringtemplate.language.ASTExpr.write(ASTExpr.java:680)
    org.antlr.stringtemplate.language.ASTExpr.writeAttribute(ASTExpr.java:660)
    org.antlr.stringtemplate.language.ActionEvaluator.action(ActionEvaluator.java:86)
    org.antlr.stringtemplate.language.ASTExpr.write(ASTExpr.java:149)
    org.antlr.stringtemplate.StringTemplate.write(StringTemplate.java:705)

I don't know if it makes a difference, but I'm using backtracking 
(otherwise, this approach to non-reserved keywords doesn't work without 
a lot of synpreds), and outputting ASTs.

Since this is size related, it's hard to narrow it down to a simple 
example. I could try to duplicate it with just the id rule and nothing else.

Any ideas what might be happening here, and whether a fix might be possible?

Thanks,
Ron

-- 
Ron Hunter-Duvar | Software Developer V | 403-272-6580
Oracle Service Engineering
Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5

All opinions expressed here are mine, and do not necessarily represent
those of my employer.


From oliver.zeigermann at gmail.com  Sat Jan 30 03:48:21 2010
From: oliver.zeigermann at gmail.com (Oliver Zeigermann)
Date: Sat, 30 Jan 2010 12:48:21 +0100
Subject: [antlr-interest] better error messages in tree parsers
In-Reply-To: <604A33CC-D57B-4C48-ADDE-75331132609D@cs.usfca.edu>
References: <604A33CC-D57B-4C48-ADDE-75331132609D@cs.usfca.edu>
Message-ID: <9da4f4521001300348t3be6ac97t44963f4b351d423e@mail.gmail.com>

As input.LT seems to return null values in case we are at the very
start/end of the node stream, I added this check which does the job
for me

           // each ?: is parenthesized because + binds tighter than ?:
           (input.LT(-3) == null ? "" : ((Tree)input.LT(-3)).getText())+" "+
           (input.LT(-2) == null ? "" : ((Tree)input.LT(-2)).getText())+" "+
           (input.LT(-1) == null ? "" : ((Tree)input.LT(-1)).getText())+" >>>"+
           (input.LT(1) == null ? "" : ((Tree)input.LT(1)).getText())+"<<< "+
           (input.LT(2) == null ? "" : ((Tree)input.LT(2)).getText())+" "+
           (input.LT(3) == null ? "" : ((Tree)input.LT(3)).getText());
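
Since the same null test is repeated six times, it can also be pulled into a helper. Below is a minimal sketch in plain Java that uses a list of node texts and a cursor as a stand-in for the tree node stream; ContextWindowSketch, lt, text and inputContext are hypothetical names for illustration, not ANTLR API. A helper also avoids any question of how ?: and + group in one long expression.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a tree node stream: a list of node texts plus a cursor. */
final class ContextWindowSketch {

    /**
     * Mimics LT(k) on a list: k = 1 is the node at the cursor, k = -1 the one before it.
     * Returns null when the requested position falls outside the list.
     * k = 0 is not meaningful here, as with a lookahead stream.
     */
    static String lt(List<String> nodes, int cursor, int k) {
        int index = (k > 0) ? cursor + k - 1 : cursor + k;
        return (index >= 0 && index < nodes.size()) ? nodes.get(index) : null;
    }

    /** Null-safe text lookup, so the caller never dereferences a missing node. */
    static String text(List<String> nodes, int cursor, int k) {
        String node = lt(nodes, cursor, k);
        return node == null ? "" : node;
    }

    /** Builds the three-before, three-after context string around the cursor. */
    static String inputContext(List<String> nodes, int cursor) {
        return text(nodes, cursor, -3) + " "
             + text(nodes, cursor, -2) + " "
             + text(nodes, cursor, -1) + " >>>"
             + text(nodes, cursor, 1) + "<<< "
             + text(nodes, cursor, 2) + " "
             + text(nodes, cursor, 3);
    }

    public static void main(String[] args) {
        List<String> nodes =
            Arrays.asList("DOWN", "BLOCK", "DOWN", "ALT", "DOWN", "DOC_COMMENT");
        System.out.println(inputContext(nodes, 3)); // DOWN BLOCK DOWN >>>ALT<<< DOWN DOC_COMMENT
        System.out.println(inputContext(nodes, 0)); // no nodes before the cursor, no exception
    }
}
```

In the tree parser itself the helper would wrap ((Tree)input.LT(k)).getText() in the same way.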


2010/1/26 Terence Parr <parrt at cs.usfca.edu>:
> Hi, a reminder that debugging tree grammars can be a bitch.  I like to override standard messaging to spew lots of stuff.  E.g., i like this kind of thing:
>
> ASTVerifier.g: node from after line 150:17 [grammarSpec, rules, rule, altListAsBlock, altList, alternative, elements, element, ebnf, block, altList, alternative]  no viable alt; token=[@-1,0:0='ALT',<84>,0:-1] (decision=24 state 3) decision=<<>>
> context=...DOWN BLOCK DOWN >>>ALT<<< DOWN DOC_COMMENT...
>
> Here's my code:
>
>    public String getErrorMessage(RecognitionException e,
>                                  String[] tokenNames)
>    {
>        List stack = getRuleInvocationStack(e, this.getClass().getName());
>        String msg = null;
>        String inputContext =
>            ((Tree)input.LT(-3)).getText()+" "+
>            ((Tree)input.LT(-2)).getText()+" "+
>            ((Tree)input.LT(-1)).getText()+" >>>"+
>            ((Tree)input.LT(1)).getText()+"<<< "+
>            ((Tree)input.LT(2)).getText()+" "+
>            ((Tree)input.LT(3)).getText();
>        if ( e instanceof NoViableAltException ) {
>           NoViableAltException nvae = (NoViableAltException)e;
>           msg = " no viable alt; token="+e.token+
>              " (decision="+nvae.decisionNumber+
>              " state "+nvae.stateNumber+")"+
>              " decision=<<"+nvae.grammarDecisionDescription+">>";
>        }
>        else {
>           msg = super.getErrorMessage(e, tokenNames);
>        }
>        return stack+" "+msg+" context=..."+inputContext+"...";
>    }
>    public String getTokenErrorDisplay(Token t) {
>        return t.toString();
>    }
>
> Ter
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>

From scott.oakes63 at googlemail.com  Sat Jan 30 05:30:50 2010
From: scott.oakes63 at googlemail.com (Scott Oakes)
Date: Sat, 30 Jan 2010 13:30:50 +0000
Subject: [antlr-interest] Lexer for floating point numbers + field
	access syntax with '.'
In-Reply-To: <b65a53910ddbbc4099f3ea90e1650aac@temporal-wave.com>
References: <6e75196e1001291030w43480359xc73f9e04d3c5225c@mail.gmail.com>
	<b65a53910ddbbc4099f3ea90e1650aac@temporal-wave.com>
Message-ID: <6e75196e1001300530o40b7a224l2b01c6a4eaeedb39@mail.gmail.com>

> On Fri, Jan 29, 2010 at 6:37 PM, Jim Idle <jimi at temporal-wave.com> wrote:
> Yes, you need to follow the method in the example - what you are trying to do will not work until you left factor it.

OK, I've attempted to merge the INT, DOT and FLOAT rules together and
manually set the token types at various branch points in the rules.
I'm still not having much luck with it, I'm afraid, but here's my
grammar to date:

grammar test;

fragment INT:;
fragment DOT:;

top: expr EOF;

expr: (INT | FLOAT | ID | '(' expr ')') (DOT ID)*;

ID  :	('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
    ;


FLOAT
    :   ('0'..'9')+ (
    			{input.LA(2) >= '0' && input.LA(2) <= '9'}?=>
    			      '.' ('0'..'9')+ EXPONENT? {$type = FLOAT;}
                     | {$type = INT;} (
                           '.' {$type = DOT;}
                       )
                     	
                    )

     | '.' {$type = DOT;}

    ;

WS  :   ( ' '
        | '\t'
        | '\r'
        | '\n'
        ) {$channel=HIDDEN;}
    ;

fragment
EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;

From jimi at temporal-wave.com  Sat Jan 30 11:42:12 2010
From: jimi at temporal-wave.com (Jim Idle)
Date: Sat, 30 Jan 2010 11:42:12 -0800
Subject: [antlr-interest] ANTLR running out of memory during generation
In-Reply-To: <4B63BADC.7000005@oracle.com>
Message-ID: <7277098525d5fb4685c662b1fba4f4e2@temporal-wave.com>

Ron,

First you really need to switch off backtracking unless the objective of your parser is to analyze SQL (you gave it away when you mentioned 632 keywords that can be identifiers). There are not as many predicates required as you think so long as you left factor everything.

Your tokens should be consecutive so long as you list them that way in the lexer. 

The problem might well be that although SQL sort of allows all keywords to be identifiers, it does not allow all of them, because some would be too ambiguous even for a syntax-directed, hand-crafted parser. If you turn on backtracking and then try to allow one of these reserved words to be an identifier, you will probably mask the issue because all warnings and errors are turned off.

It is entirely feasible to create a full SQL parser without backtracking, with very little lookahead and few predicates (all of the one- or two-token lookahead type). I have an online demo of T-SQL, for instance, on my web site at www.temporal-wave.com (select the 'online demos' link), and Oracle SQL/PLSQL will be up there before long too.

So, I think you will need to do the following to have a chance of generating the code:

1) Use -Xconversiontimeout 10000
2) Cause switches to be generated rather than ifs: -Xmaxswitchcaselabels 32000 -Xminswitchalts 1 -Xmaxinlinedfastates 65534
3) Use -Xmx2G when invoking the java command (assuming your jvm allows that)
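
Put together, the three steps amount to one command line. The Java sketch below only assembles the argument list so the pieces are easy to see; it assumes the ANTLR 3 jar is already on the classpath, that org.antlr.Tool is the entry point, and that the last -X option is spelled -Xmaxinlinedfastates as in the tool's own help text. The class name and the grammar file name are placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the java command suggested in steps 1) to 3).
class AntlrInvocationSketch {
    static List<String> command(String grammarFile) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Xmx2G");          // step 3: JVM heap (a JVM option, so it precedes the class name)
        cmd.add("org.antlr.Tool");  // assumed ANTLR 3 entry point, classpath set elsewhere
        cmd.addAll(Arrays.asList("-Xconversiontimeout", "10000"));   // step 1
        cmd.addAll(Arrays.asList("-Xmaxswitchcaselabels", "32000")); // step 2
        cmd.addAll(Arrays.asList("-Xminswitchalts", "1"));
        cmd.addAll(Arrays.asList("-Xmaxinlinedfastates", "65534"));
        cmd.add(grammarFile);       // placeholder grammar file name
        return cmd;
    }
}
```

The resulting list can be passed to java.lang.ProcessBuilder or simply joined with spaces and pasted into a shell.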

But if you cannot get it going that way, then basically you are masking a bigger problem in your grammar that you are not seeing because of global backtracking. 

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Ron Hunter-Duvar
> Sent: Friday, January 29, 2010 8:52 PM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] ANTLR running out of memory during generation
> 
> I'm having a strange problem with ANTLR. I'm building a grammar for a
> language with a huge number (hundreds) of non-reserved keywords. I'm
> using the approach of having the lexer return a different token type for
> each keyword, and then having a parser rule of the form:
> 
>     id : ( ID | QUOTED_ID | KW_A | KW_B | ... | KW_ZZZ );
> 
> This was working great until today. In fact, ANTLR 3.2 generates
> surprisingly clever code for this - all the keywords are assigned
> consecutive token numbers, and generated code just says:
> 
>     if ( (input.LA(1)>=KW_A && input.LA(1)<=KW_ZZZ)||
>          (input.LA(1)>=ID && input.LA(1)<=QUOTED_ID) ) {
>         input.consume();
>         ...
> 
> This works all the way up to 631 keywords. ANTLR runs in about 20
> seconds, and never uses more than 269MB of memory. When I add a 632nd
> keyword (doesn't matter what the keyword is), and change nothing else,
> ANTLR runs for 2 minutes and runs out of heap space. I kept bumping the
> max space up, but even going to 2GB doesn't make any difference.
> 
> What's really interesting is that I was using ANTLR 3.1 until now. When
> I ran into this I upgraded to 3.2, but both of them fail at exactly the
> same spot, 632 keywords. Not surprisingly, the stack trace varies from
> one run to the next, depending on the exact point it runs out of memory,
> but it always has deeply nested calls to these and other methods:
> 
> 
>     org.antlr.stringtemplate.language.ASTExpr.writeTemplate(ASTExpr.java:750)
>     org.antlr.stringtemplate.language.ASTExpr.write(ASTExpr.java:680)
>     org.antlr.stringtemplate.language.ASTExpr.writeAttribute(ASTExpr.java:660)
>     org.antlr.stringtemplate.language.ActionEvaluator.action(ActionEvaluator.java:86)
>     org.antlr.stringtemplate.language.ASTExpr.write(ASTExpr.java:149)
>     org.antlr.stringtemplate.StringTemplate.write(StringTemplate.java:705)
> 
> I don't know if it makes a difference, but I'm using backtracking
> (otherwise, this approach to non-reserved keywords doesn't work without
> a lot of synpreds), and outputting ASTs.
> 
> Since this is size related, it's hard to narrow it down to a simple
> example. I could try to duplicate it with just the id rule and nothing
> else.
> 
> Any ideas what might be happening here, and whether a fix might be
> possible?
> 
> Thanks,
> Ron
> 
> --
> Ron Hunter-Duvar | Software Developer V | 403-272-6580
> Oracle Service Engineering
> Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5
> 
> All opinions expressed here are mine, and do not necessarily represent
> those of my employer.
> 
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address


From parrt at cs.usfca.edu  Sat Jan 30 11:48:18 2010
From: parrt at cs.usfca.edu (Terence Parr)
Date: Sat, 30 Jan 2010 11:48:18 -0800
Subject: [antlr-interest] better error messages in tree parsers
In-Reply-To: <9da4f4521001300348t3be6ac97t44963f4b351d423e@mail.gmail.com>
References: <604A33CC-D57B-4C48-ADDE-75331132609D@cs.usfca.edu>
	<9da4f4521001300348t3be6ac97t44963f4b351d423e@mail.gmail.com>
Message-ID: <B1FAEAC3-778F-4D74-8167-622922E60383@cs.usfca.edu>


On Jan 30, 2010, at 3:48 AM, Oliver Zeigermann wrote:

> As input.LT seems to return null values in case we are at the very
> start/end of the node stream, I added this check which does the job
> for me
> 
>           (input.LT(-3) == null ? "" : ((Tree)input.LT(-3)).getText()+" ")+
>           (input.LT(-2) == null ? "" : ((Tree)input.LT(-2)).getText()+" ")+
>           (input.LT(-1) == null ? "" : ((Tree)input.LT(-1)).getText()+" >>>")+
>           (input.LT(1) == null ? "" : ((Tree)input.LT(1)).getText()+"<<< ")+
>           (input.LT(2) == null ? "" : ((Tree)input.LT(2)).getText()+" ")+
>           (input.LT(3) == null ? "" : ((Tree)input.LT(3)).getText());

oh. right. start is a problem. end is EOF so no problem.

can u update the faq too? ;)
Ter
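
For the FAQ, the same null check can be pulled into a small helper so each lookahead position is handled once. This is only a sketch that runs without the ANTLR runtime: the Lookahead interface is a made-up stand-in for ((Tree)input.LT(k)).getText(), returning null where input.LT(k) would, and the class and method names are hypothetical.

```java
import java.util.List;

// Hypothetical, runtime-free model of the null-safe context string from the quoted code.
class ContextSketch {
    /** Stand-in for the node stream: text of LT(k), or null when off the stream. */
    interface Lookahead { String lt(int k); }

    /** Text of LT(k) plus suffix, or "" when there is no node at that offset. */
    static String part(Lookahead in, int k, String suffix) {
        String t = in.lt(k);
        return t == null ? "" : t + suffix;
    }

    /** Three nodes of context on either side, with the current node marked >>>like this<<<. */
    static String context(Lookahead in) {
        return part(in, -3, " ") + part(in, -2, " ") + part(in, -1, " >>>")
             + part(in, 1, "<<< ") + part(in, 2, " ") + part(in, 3, "");
    }

    /** Lookahead over a plain list: LT(1) is nodes.get(p), LT(-1) is nodes.get(p - 1). */
    static Lookahead over(List<String> nodes, int p) {
        return k -> {
            int idx = k > 0 ? p + k - 1 : p + k;
            return (k == 0 || idx < 0 || idx >= nodes.size()) ? null : nodes.get(idx);
        };
    }
}
```

As in the quoted code, the ">>>" marker travels with LT(-1), so it disappears at the very start of the stream; whether that is acceptable is a formatting choice.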

From duygu_the_duygu at yahoo.com  Tue Jan 19 13:20:57 2010
From: duygu_the_duygu at yahoo.com (Duygu Altinok)
Date: Tue, 19 Jan 2010 13:20:57 -0800 (PST)
Subject: [antlr-interest] infinite recursion on tree parser
Message-ID: <611145.64998.qm@web46001.mail.sp1.yahoo.com>

I'm writing a compiler for a C-like language for my language processors course. I defined a rule, compound_expr, which represents nested blocks enclosed in curly braces. The parser compiles fine, but the tree parser gives an error. Can anybody please help?

f3.g:1014:10: infinite recursion to rule statement from rule statement_list
f3.g:1030:20: infinite recursion to rule statement from rule ca
f3.g:1029:26: infinite recursion to rule statement from rule compound_expr
f3.g:1023:13: infinite recursion to rule statement from rule statement
f3.g:1014:10: infinite recursion to rule statement from rule statement_list
f3.g:1014:10: infinite recursion to rule statement from rule statement_list
f3.g:1014:10: infinite recursion to rule statement from rule statement_list
f3.g:1014: warning:nondeterminism upon
f3.g:1014:     k==1:NULL_TREE_LOOKAHEAD,NUM,ID,PLUS,MINUS,MULT,DIV,LT,LEQ,EQ,NEQ,ISEQ,OUTPUTT,INPUTT,PARANTEZLISIN,IF,"return","while"
f3.g:1014:     between alt 1 and exit branch of block
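
One likely reading of these messages: in the tree grammar further down, ca is just (statement_list)? with no token in front, so statement can reach itself through compound_expr, ca and statement_list without matching any tree node. The Java sketch below models only that left edge of the rules (the map contents are my reading of the posted grammar, other alternatives of statement are left out because they start with a tree root, and the class and method names are hypothetical) and checks whether a rule can reach itself.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model: rule -> rules it can invoke before matching any token or tree root.
class LeftRecursionSketch {
    static boolean reachesItself(Map<String, List<String>> leftEdge, String start) {
        Deque<String> work = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        work.push(start);
        while (!work.isEmpty()) {
            String rule = work.pop();
            for (String next : leftEdge.getOrDefault(rule, Collections.<String>emptyList())) {
                if (next.equals(start)) return true; // back at the start without consuming input
                if (seen.add(next)) work.push(next);
            }
        }
        return false;
    }

    /** Left edges as they appear in the posted tree grammar (simplified). */
    static Map<String, List<String>> asPosted() {
        Map<String, List<String>> g = new HashMap<>();
        g.put("statement_list", Arrays.asList("statement"));
        g.put("statement", Arrays.asList("compound_expr"));
        g.put("compound_expr", Arrays.asList("ca"));
        g.put("ca", Arrays.asList("statement_list"));
        return g;
    }

    /** Same rules, but ca now starts by matching a root node, so it has no left edge. */
    static Map<String, List<String>> withBlockRoot() {
        Map<String, List<String>> g = asPosted();
        g.put("ca", Collections.<String>emptyList());
        return g;
    }
}
```

If that reading is right, one way out is to have the parser build a root node for each block, for example with an imaginary token (BLOCK would be a made-up name), so that the tree rule for ca matches #(BLOCK ...) before it descends into statement_list.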


Here's my code for compound_expr in the parser and the tree parser :

parser :

program:

    function_list  

{


    #program= #([PROGRAM,"program"],symbol_table, program);
}
;


function_list:
{
    is_in_function_list = true;
}
          (function)+
{
#function_list= #([FUNCTION_LIST, "function_list"], function_list);
}
         ;

function:
{
    String bt;
}
     bt=basic_type! "func"! i:ID! 
{
    String identifier = i.getText();
    
    if  (identifier.length() > 32)
    {
        error(WARN00, i.getLine(), i.getColumn());
        identifier = identifier.substring(0, 32);
    }
    
    which_function = new String(identifier);

    identifier=identifier + ":" + Integer.toString(i.getLine()) + ":" + Integer.toString(i.getColumn());
    
}
    
     parameter_list!  function_body 

{
    
    symbol_table.addChild(#([SYMBOL_FUNCTION, identifier ], [SYMBOL_TYPE, bt] , symbol_parameters, symbol_locals ));
    
#function=(#([ID,identifier],function));
}
    ;

function_body:
         LCURLY  declaration_list! statement_list RCURLY
         ;
declaration_list:
{
if (is_in_function_list)
    symbol_locals = (CommonAST) astFactory.create(SYMBOL_LOCALS, "symbol_locals");
}
         (declaration! SEMI!)*
        ;  
        
declaration:
{ String t  = new String("");
  String t2 = new String("");  } 
    t=basic_type i:ID t2=array_extension  {

 String identifier = i.getText();
                if  (identifier.length() > 32)
                {
                        error(WARN00, i.getLine(), i.getColumn());
                        identifier = identifier.substring(0, 32);
                }
t  += t2;
 identifier=identifier + ":" + Integer.toString(i.getLine()) + ":" + Integer.toString(i.getColumn());
                if (is_in_function_list && is_in_parameter)
                        symbol_parameters.addChild(#([SYMBOL_PARAMETER, identifier], [SYMBOL_TYPE, t], [SYMBOL_FUNCTION_SCOPE, which_function]));

        else if (is_in_function_list && ! is_in_parameter)
                        symbol_locals.addChild(#([SYMBOL_LOCAL, identifier], [SYMBOL_TYPE, t], [SYMBOL_FUNCTION_SCOPE, which_function]));
}
       ;


paramdecl :
{ String t  = new String("");
  String t2 = new String("");  } 
    t=basic_type i:ID t2=array_extension  {

 String identifier = i.getText();
                if  (identifier.length() > 32)
                {
                        error(WARN00, i.getLine(), i.getColumn());
                        identifier = identifier.substring(0, 32);
                }
t  += t2;
 identifier=identifier + ":" + Integer.toString(i.getLine()) + ":" + Integer.toString(i.getColumn());
                if (is_in_function_list && is_in_parameter)
                        symbol_parameters.addChild(#([SYMBOL_PARAMETER, identifier], [SYMBOL_TYPE, t], [SYMBOL_FUNCTION_SCOPE, which_function]));

        else if (is_in_function_list && ! is_in_parameter)
                        symbol_locals.addChild(#([SYMBOL_LOCAL, identifier], [SYMBOL_TYPE, t], [SYMBOL_FUNCTION_SCOPE, which_function]));
} ;


parameter_list:
{
    symbol_parameters = (CommonAST) astFactory.create(SYMBOL_PARAMETERS, "symbol_parameters");
    is_in_parameter = true;
}
 LPAREN

          ( variable_list {is_in_parameter = false;}
           | {is_in_parameter = false;} ) RPAREN
           ;
         

variable_list : LPAREN  (paramdecl (COMMA  paramdecl)*)?  RPAREN ;
        

type returns [String t]
{
    String bt=new String();
     String ar=new String();
    t = new String();
}
     :
     bt=basic_type ar=array_extension
     {
         t = bt + ar;
     }
    ;
   
basic_type returns [String bt]
{
    bt = new String();
}       
       :
       "int" {
               bt = new String("int");
        }
       |"float" {
               bt = new String("float");
        }
       
      ;


array_extension returns [String ar]
{
    ar = new String();
}
        :
            lbracket:LBRAC{ar = new String("[");} (n:NUM {ar += n.getText();}) RBRAC{ar += "]" ;}
        | 
          {
            ar = new String("");
          }
           ;

array_extension2 returns [String ar]
{
    ar = new String();
}
        :
            lbracket:LBRAC{ar = new String("[");}  RBRAC{ar += "]" ;}
        | 
          {
            ar = new String("");
          }
           ;
                              

statement_list : (statement)+; 
statement:
       (assignment_statement)=>assignment_statement
          | read_statement
      |return_statement
      |if_statement
      |while_statement
          | compound_expr
          | print_statement
          | expression SEMI
      ;
      

return_statement : "return"^ expression SEMI;
assignment_statement:
             variable EQ^ expression SEMI!
            ;


compound_expr   :        ca;
ca      :       LCURLY!   (statement_list)?  RCURLY!;


variable:
     i:ID! (LBRAC expression RBRAC)? 
     {
         String identifier=i.getText()+":"+i.getLine()+":"+i.getColumn();
         #variable = #([ID,identifier], variable);
    }
    
    ;

expression:
        simple_expression ( (LEQ^ |NEQ^|LT^ |ISEQ^) simple_expression)*
       ;

simple_expression:
          term ((PLUS^|MINUS^) term)*
          ;           


term :
    factor ( (MULT | DIV ) factor)*
    ;
    
    
factor :
    i:ID! (LBRAC expression RBRAC | LPAREN argument_list RPAREN)? 
    {
         String identifier=i.getText()+":"+i.getLine()+":"+i.getColumn();
         #factor = #([ID,identifier], factor);
    }
    | j: NUM!
    {
         String identifier=j.getText()+":"+j.getLine()+":"+j.getColumn();
         #factor = #([NUM,identifier], factor);
    }
    | LPAREN! expression RPAREN! { #factor = #([PARANTEZLISIN, "parantezli"], factor);}
    ;


read_statement : read_item EQ! INPUTT^ LPAREN! (STRING)? RPAREN! ;
read_item : variable; 
print_statement : OUTPUTT^  LPAREN! (STRING COMMA)? print_item RPAREN!  SEMI!;
print_item: variable;


function_call : ID LPAREN argument_list RPAREN SEMI;

expression_list :
    expression (COMMA! expression)*
    ;

argument_list :
    expression_list 
    | 
    ;

    
text_character :
    CHARLIT
    | special_character
    ;

special_character :
    "\n"
    ;


if_statement : if_part then_part else_part
{
#if_statement = #([IF , "if"], #if_statement); };

if_part :  IFF! LPAREN! expression RPAREN! ;
then_part : statement;
else_part  : (ELSE^  statement)? ENDIF! SEMI! ;

while_statement : WHILE^  LPAREN! expression  RPAREN! statement;
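
A side note on the parser rules above: they pack each identifier's position into the node text as name:line:column, and the tree parser later has to split it apart again in many places. A small helper keeps that format in one spot. This is only a sketch; the class and method names are made up and nothing in it depends on ANTLR.

```java
// Hypothetical helper for the "name:line:column" node text used by the grammar.
class PackedId {
    final String name;
    final int line;
    final int column;

    PackedId(String name, int line, int column) {
        this.name = name;
        this.line = line;
        this.column = column;
    }

    /** Builds the node text the way the parser actions do. */
    static String pack(String name, int line, int column) {
        return name + ":" + line + ":" + column;
    }

    /** Splits node text back into its three parts; rejects anything malformed. */
    static PackedId unpack(String text) {
        String[] parts = text.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected name:line:column, got: " + text);
        }
        return new PackedId(parts[0], Integer.parseInt(parts[1]), Integer.parseInt(parts[2]));
    }
}
```

A tree parser action could then call PackedId.unpack(i.getText()) once instead of repeating the split and parseInt calls in every rule.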
    

The parser above is OK; it gives no compilation errors. Here's the erroneous part, the tree parser:


program
    :
    #(PROGRAM symbol_table 
    {
        sTable.sort();
    }  function_list 
    {
    sTable.prettyPrint();
    }
    )

    ;

symbol_table
    :
    #(SYMBOL_TABLE  
    (
    #(i:SYMBOL_FUNCTION 
    j:SYMBOL_TYPE 
    {
        //Parse info
        String identifier;
        int line, column;
        String [] params = new String[3];
        
        identifier = i.getText();
        params = identifier.split(":");
        identifier = params[0];
        line = Integer.parseInt(params[1]);
        column = Integer.parseInt(params[2]);
        
        Function newFunction=new Function(identifier,j.getText(), line, column); 
        
        
        //add Function to Symbol Table
        int index;
         if ((index = sTable.searchFunction(newFunction.name)) != -1)
            {
                error(ERR01, line, column, ((Function)sTable.functions.elementAt(index)).line, ((Function)sTable.functions.elementAt(index)).column);
                isFunctionLegal = false;
            }
        else
            {
                sTable.addFunction(newFunction);
                int last=sTable.functions.size()-1;
                currentFunction =(Function) sTable.functions.elementAt(last);
                isFunctionLegal = true;
                
            }
            
    } 
    symbol_parameters symbol_locals)

    
    )*
    )
    ;

symbol_parameters
    :
    #(SYMBOL_PARAMETERS 
    (
    #(i:SYMBOL_PARAMETER j:SYMBOL_TYPE 
    
{
        //Parse info
        String identifier;
        int line, column;
        String [] params = new String[3];
        
        identifier = i.getText();
        params = identifier.split(":");
        identifier = params[0];
        line = Integer.parseInt(params[1]);
        column = Integer.parseInt(params[2]);

 
    if (isFunctionLegal)
    {
        
        int last=sTable.functions.size()-1;         
        Symbol newSymbol = new Symbol(identifier,j.getText(),line,column);
        if (currentFunction.searchParameter(newSymbol.name) != -1){
            error(FuncErr+currentFunction.name,line,column);
            error(ERR03, line, column);
        }
        else 
            currentFunction.addParameter(newSymbol);
    }
} 
    
    SYMBOL_FUNCTION_SCOPE))*
    )
    ;
    
symbol_locals
    :
    #(SYMBOL_LOCALS (#(i:SYMBOL_LOCAL j:SYMBOL_TYPE
{ 
        //Parse info
        String identifier;
        int line, column;
        String [] params = new String[3];
        
        identifier = i.getText();
        params = identifier.split(":");
        identifier = params[0];
        line = Integer.parseInt(params[1]);
        column = Integer.parseInt(params[2]);

    if (isFunctionLegal)
    {
        int index;
        int last=sTable.functions.size()-1;         
        Symbol newSymbol = new Symbol(identifier,j.getText(),line, column);
        if ((index = currentFunction.searchParameter(newSymbol.name)) != -1){
            error(FuncErr+currentFunction.name,line,column);
            error(ERR04, line, column, ((Symbol)currentFunction.parameters.elementAt(index)).line,
            ((Symbol)currentFunction.parameters.elementAt(index)).column);
        
        }
        else if ((index = currentFunction.searchLocal(newSymbol.name)) != -1){
            error(FuncErr+currentFunction.name,line,column);
            error(ERR05, line, column, ((Symbol)currentFunction.locals.elementAt(index)).line,            ((Symbol)currentFunction.locals.elementAt(index)).column);
        }
        else if(j.getText().indexOf("[]")==-1)
        {
            currentFunction.addLocal(newSymbol);
        }
        else
            error(ERR06, line, column);    

    }
}  
    SYMBOL_FUNCTION_SCOPE))*)
    ;
                

function_list
    :
    #(FUNCTION_LIST (function)+)
    ;

function
    :
    #(i:ID
{
    //Parse info
    String identifier;
    int line, column;
    
    String [] params = new String[3];
        
    identifier = i.getText();
    params = identifier.split(":");
    identifier = params[0];
    line = Integer.parseInt(params[1]);
    column = Integer.parseInt(params[2]);
    
    int index;
    int line2=0,column2=0; //line and column info from the symbol table
    index = sTable.getFunctionIndex(identifier);
    if(index != -1)
    {
    line2=((Function)sTable.functions.elementAt(index)).line;
    column2=((Function)sTable.functions.elementAt(index)).column;
    }
    if(index!=-1 && line==line2 && column==column2)
    {    
        isFunctionLegal=true;
        currentFunction = (Function) sTable.functions.elementAt(index);
    }
    else
        isFunctionLegal=false; 
        
    
}
     function_body)
    ;

function_body:
    statement_list
    ;
    
statement_list:
    (statement)+
    ;

statement:
          (assignment_statement)=>assignment_statement
          | read_statement
      |return_statement
      |if_statement
      |while_statement
          | compound_expr
          | print_statement
          | expression SEMI
      ;


compound_expr   :        ca;
ca      :         (statement_list)?  ;
      
assignment_statement
{
    Symbol retType ;
    String type = new String("");
    int index;
    String exType = new String("");
}    
    :
#(EQ retType=variable exType=expression)
{    
if(isFunctionLegal)
{
    if (exType.startsWith("float"))
    {
        index=currentFunction.getParameterIndex(retType.name);
            if(index!=-1)
            {
                type=((Symbol)currentFunction.parameters.elementAt(index)).type;
            }
            else{
                index = currentFunction.getLocalIndex(retType.name);
                if(index!=-1){
                    type=((Symbol)currentFunction.locals.elementAt(index)).type;
                }

            }
        if (type.startsWith(new String("int")))
        {
            error(FuncErr+currentFunction.name,retType.line,retType.column);
            error(ERR12,retType.line,retType.column);
                
        }
    }
}
}
        
    ;
    
return_statement
{
    String exType;
    String type;
}
    :
    #("return" exType=expression
    {
    if(isFunctionLegal)
    {
        type=currentFunction.returntype;
        if(exType.startsWith("float") && type.startsWith("int"))
        {
            String identifier;
            int line, column;
            String [] params = new String[3];
        
            identifier = exType;
            params = identifier.split(":");
            identifier = params[0];
            line = Integer.parseInt(params[1]);
            column = Integer.parseInt(params[2]);
            
            error(FuncErr+currentFunction.name,line,column);
            error(ERR13,line,column);    
        }
    }
    }
    )
    ;

print_statement
    :
    #(OUTPUTT print_item)
    ;


print_item :
    variable 
    ;

read_statement
    :
#(INPUTT  read_item)
    ;


read_item 
{
    Symbol retType;
}:
     retType=variable 
{
if(isFunctionLegal)
{
    int index;
    String type=new String("");
    int line,column;

    index=currentFunction.getParameterIndex(retType.name);
    if(index!=-1)
    {
        type=((Symbol)currentFunction.parameters.elementAt(index)).type;
    }
    else{
        index = currentFunction.getLocalIndex(retType.name);
        if(index!=-1){
            type=((Symbol)currentFunction.locals.elementAt(index)).type;
        }
        else 
        {
                index=sTable.getFunctionIndex(retType.name);
                if(index!=-1)
                {
                    line=((Function)sTable.functions.elementAt(index)).line;
                    column=((Function)sTable.functions.elementAt(index)).column;
            
                    error(FuncErr+currentFunction.name,retType.line,retType.column);
                    error(ERR07,retType.line,retType.column,line,column);
                }
        }
        }
        
        if ( !(type.startsWith(new String("int"))) && !(type.equals("")))
        {
            error(FuncErr+currentFunction.name,retType.line,retType.column);
            error(ERR09,retType.line,retType.column);
        }
     

}
}
;    
            
    
if_statement:
    #(IF if_part then_part else_part)
    ;
    
if_part:
    expression
    ;
    
then_part :
    #("then" statement)
    ;
    
else_part :
    #("else" statement)
    |
    ;
    
while_statement 
    :
    #("while" expression statement)
    ;

variable returns [Symbol v]{
    v =new Symbol(new String(""),new String(""),0,0);
    String exType;
    boolean isArray=false;
}
    :
#(i:ID (LBRAC exType=expression RBRAC 
{
if(isFunctionLegal)
{
    if(exType.startsWith(new String("float"))) 
    {    
            //Parse info
            String identifier;
            int line, column;
            String [] params = new String[3];
        
            identifier = exType;
            params = identifier.split(":");
            identifier = params[0];
            line= Integer.parseInt(params[1]);
            column= Integer.parseInt(params[2]);
            
            error(FuncErr+currentFunction.name,line,column);
            error(ERR11,line,column);        
    }

    isArray=true;
}
}
)? )
{    
if(isFunctionLegal)
{
    //Parse info
        String identifier;
        int line, column;
        String [] params = new String[3];
        
        identifier = i.getText();
        params = identifier.split(":");
        identifier = params[0];
        line = Integer.parseInt(params[1]);
        column = Integer.parseInt(params[2]);

        v = new Symbol(identifier,"", line, column);
        int index;
        String type=new String("");
        index=currentFunction.getParameterIndex(identifier);
        if(index!=-1)
        {
            type=((Symbol)currentFunction.parameters.elementAt(index)).type;
        }
        else{
            index = currentFunction.getLocalIndex(identifier);
            if(index!=-1){
                type=((Symbol)currentFunction.locals.elementAt(index)).type;
            }
        }
            
        if(!type.equals("") && type.indexOf("[")!=-1 && !isArray)    
        {
                error(FuncErr+currentFunction.name,line,column);
                error(ERR15,line,column);
        }
        if(!type.equals("") && type.indexOf("[")==-1 && isArray)    
        {
                error(FuncErr+currentFunction.name,line,column);
                error(ERR16,line,column);
        }
        if(isArray){
                isArray=false;
                
        }
}
}
    ;
    
expression returns [String exType]
{
    String sType;
    exType=new String(""); 
}
    
    :
         (#(ISEQ expression simple_expression)) => #(ISEQ exType=expression sType=simple_expression)
        {
        if(isFunctionLegal)
        {
        if (sType.startsWith(new String("float")) && !exType.startsWith(new String("float"))) 
        {    
            exType=sType;
        }    
        
            }
        }
        | (#(NEQ expression simple_expression)) => #(NEQ exType=expression sType=simple_expression)
        {
        if(isFunctionLegal)
        {
        if (sType.startsWith(new String("float")) && !exType.startsWith(new String("float"))) 
        {    
            exType=sType;
        }    
        
        }
        }
        |( #(LT expression simple_expression) ) => #(LT exType=expression sType=simple_expression)
        {
        if(isFunctionLegal)
        {
        if (sType.startsWith(new String("float")) && !exType.startsWith(new String("float"))) 
        {    
            exType=sType;
        }    
        
        }
        }
        |( #(LEQ expression simple_expression) ) => #(LEQ exType=expression sType=simple_expression)
        {
        if(isFunctionLegal)
        {
        if (sType.startsWith(new String("float")) && !exType.startsWith(new String("float"))) 
        {    
            exType=sType;
        }    
        
        }
        }
        
        |exType=simple_expression
        ;

simple_expression returns [String sType]
{
    String tType;
    sType=new String();
}
    :
      (#(PLUS simple_expression term))=>#(PLUS sType=simple_expression tType=term)
      {
      if(isFunctionLegal)
      {
        if (tType.startsWith(new String("float")) && !sType.startsWith(new String("float"))) 
        {    
            sType=tType;
        }    
        
      }
      }
      | (#(MINUS simple_expression term))=>#(MINUS sType=simple_expression tType=term)
      {
      if(isFunctionLegal)
      {
        if (tType.startsWith(new String("float")) && !sType.startsWith(new String("float"))) 
        {    
            sType=tType;
        }    
      } 
      }
      | sType=term
          ;           

term returns [String tType]
{
    Symbol retType=new Symbol(new String(""),new String(""),0,0);
    tType=new String("");
}
    :
    (#( MULT term factor))=>#( MULT tType=term retType=factor
    {
    if(isFunctionLegal)
    {
        int index;
        String type=new String("");
        int line,column;
        
        //Control whether it is a number or an identifier
        if(!retType.name.equals("") && new Character(retType.name.charAt(0))<=new Character('9') && new Character(retType.name.charAt(0))>=new Character('0'))
        {
            if(retType.name.indexOf('.')!=-1 || retType.name.indexOf('E')!=-1 || retType.name.indexOf('e')!=-1)
                type="float";
            else
                type="int";

        }
        else{
            index=currentFunction.getParameterIndex(retType.name);
            if(index!=-1)
            {
                type=((Symbol)currentFunction.parameters.elementAt(index)).type;
            }
            else{
                index = currentFunction.getLocalIndex(retType.name);
                if(index!=-1){
                    type=((Symbol)currentFunction.locals.elementAt(index)).type;
                }
                else 
                {
                    error(FuncErr+currentFunction.name,retType.line,retType.column);
                    error(ERR08,retType.line,retType.column);
                        }
                    }
        }
    
    if(type.startsWith(new String("float")) && !tType.startsWith(new String("float")))
            tType=type+new String(":")+new Integer(retType.line)+new String(":")+new Integer(retType.column);

    }
    }
    )
    |(#( DIV term factor))=>#( DIV tType=term retType=factor
    {
    if(isFunctionLegal)
    {
        int index;
        String type=new String("");
        int line,column;
        
        //Control whether it is a number or an identifier
        if(!retType.name.equals("") && new Character(retType.name.charAt(0))<=new Character('9') && new Character(retType.name.charAt(0))>=new Character('0'))
        {
            if(retType.name.indexOf('.')!=-1 || retType.name.indexOf('E')!=-1 || retType.name.indexOf('e')!=-1)
                type="float";
            else
                type="int";

        }
        else{
            index=currentFunction.getParameterIndex(retType.name);
            if(index!=-1)
            {
                type=((Symbol)currentFunction.parameters.elementAt(index)).type;
            }
            else{
                index = currentFunction.getLocalIndex(retType.name);
                if(index!=-1){
                    type=((Symbol)currentFunction.locals.elementAt(index)).type;
                }
                else 
                {
                        index=sTable.getFunctionIndex(retType.name);
                        if(index!=-1)
                        {
                            type=((Function)sTable.functions.elementAt(index)).returntype;    
                        }
                }
            }    
        }
    
        if(type.startsWith(new String("float")) && !tType.startsWith(new String("float")))
            tType=type+new String(":")+new Integer(retType.line)+new String(":")+new Integer(retType.column);
    }
    }
    )
           | retType=factor
    {
    if(isFunctionLegal)
    {    
        int index;
        String type=new String("");
        int line,column;
        
        //Control whether it is a number or an identifier
        if(!retType.name.equals("") && new Character(retType.name.charAt(0))<=new Character('9') && new Character(retType.name.charAt(0))>=new Character('0'))
        {
            if(retType.name.indexOf('.')!=-1 || retType.name.indexOf('E')!=-1 || retType.name.indexOf('e')!=-1)
                type="float";
            else
                type="int";

        }
        else{
            index=currentFunction.getParameterIndex(retType.name);
            if(index!=-1)
            {
                type=((Symbol)currentFunction.parameters.elementAt(index)).type;
            }
            else{
                index = currentFunction.getLocalIndex(retType.name);
                if(index!=-1){
                    type=((Symbol)currentFunction.locals.elementAt(index)).type;
                }
                else 
                {
                        index=sTable.getFunctionIndex(retType.name);
                        if(index!=-1)
                        {
                            type=((Function)sTable.functions.elementAt(index)).returntype;    
                        }    
                }
            }    
        }
    
        tType=type+new String(":")+new Integer(retType.line)+new String(":")+new Integer(retType.column);
    }
    }
    ;


factor returns [Symbol v]

{

    v=new Symbol(new String(""),new String(""),0,0);

    String exType;

    String errorStr;

    Vector argsVec;

    boolean isFunction=false;

    boolean isArray=false;

}    

    :

    (#(ID (LPAREN argument_list RPAREN))) => #(i:ID 

    {

    if(isFunctionLegal)

    {

        //Parse info

        String identifier;

        int line, column;

        String [] params = new String[3];

        
        identifier = i.getText();

        params = identifier.split(":");

        identifier = params[0];

        line = Integer.parseInt(params[1]);

        column = Integer.parseInt(params[2]);


        v = new Symbol(identifier, "", line, column);                            

                
    }

    }

    (LPAREN argsVec=argument_list RPAREN
    {
    if(isFunctionLegal)
    {
        isFunction=true;
        String identifier;
        int line, column;
        String [] params = new String[3];

        identifier = i.getText();
        params = identifier.split(":");
        identifier = params[0];
        line = Integer.parseInt(params[1]);
        column = Integer.parseInt(params[2]);

        boolean errorVr=false;
        int index;
        index=sTable.getFunctionIndex(identifier);
        if(index!=-1)
        {
            currentCalledFunc=(Function)sTable.functions.elementAt(index);
            isCalledFunctionLegal=true;
        }
        else
        {
            isCalledFunctionLegal=false;
            error(FuncErr+currentFunction.name,line,column);
            error(ERR14,line,column);
        }

        if(isCalledFunctionLegal)
        {
            if(argsVec.size()!= currentCalledFunc.parameters.size())
                errorVr=true;
            else
            for(int a=0;a<currentCalledFunc.parameters.size();a++)
            {
                // TO BE CHANGED (translated from the Turkish note "DEGISECEK")
                if(((String)argsVec.elementAt(a)).indexOf("[")!=-1)
                {
                    if(((Symbol)currentCalledFunc.parameters.elementAt(a)).type.indexOf("[")==-1)
                    {
                        errorVr=true;
                        break;
                    }

                    if((((String)argsVec.elementAt(a)).startsWith("float") &&
                    !((Symbol)currentCalledFunc.parameters.elementAt(a)).type.startsWith("float"))
                    || (((String)argsVec.elementAt(a)).startsWith("int") &&
                    !((Symbol)currentCalledFunc.parameters.elementAt(a)).type.startsWith("int")) )
                    {
                        errorVr=true;
                        break;
                    }
                }
                else{
                    if(((Symbol)currentCalledFunc.parameters.elementAt(a)).type.indexOf("[")!=-1)
                    {
                        errorVr=true;
                        break;
                    }

                    if((((String)argsVec.elementAt(a)).startsWith("float") &&
                    !((Symbol)currentCalledFunc.parameters.elementAt(a)).type.startsWith("float"))
                    || (((String)argsVec.elementAt(a)).startsWith("int") &&
                    !((Symbol)currentCalledFunc.parameters.elementAt(a)).type.startsWith("int")) )
                    {
                        errorVr=true;
                        break;
                    }
                }
            }

            if(errorVr)
            {
                //error(FuncErr+currentFunction.name,line,column);
                String protoFung,UseFung;
                String paramsStr=new String("");
                String arguments=new String("");

                for(int a=0;a<currentCalledFunc.parameters.size();a++)
                {
                    paramsStr+=((Symbol)currentCalledFunc.parameters.elementAt(a)).type;
                    if(a!=currentCalledFunc.parameters.size()-1)
                    paramsStr+=", ";
                }
                for(int a=0;a<argsVec.size();a++)
                {
                    arguments+=((String)argsVec.elementAt(a));
                    if(a!=argsVec.size()-1)
                    arguments+=", ";
                }

                errorStr=new String("Error 17: "+currentCalledFunc.name+"("+paramsStr+")"
                +" cannot be called with "+"("+arguments+")");

                error(FuncErr+currentFunction.name,line,column);
                error(errorStr,line,column);
            }
        }

    }
    }))


    |l:NUM
    {
    if(isFunctionLegal)
    {
        //Parse info
        String identifier;
        int line, column;
        String [] params = new String[3];

        identifier = l.getText();
        params = identifier.split(":");
        identifier = params[0];
        line = Integer.parseInt(params[1]);
        column = Integer.parseInt(params[2]);

        v = new Symbol(identifier, "", line, column);
    }
    }

    |( #(ID (LBRAC expression RBRAC)?))=>#(j:ID
    {
    if(isFunctionLegal)
    {
        //Parse info
        String identifier;
        int line, column;
        String [] params = new String[3];

        identifier = j.getText();
        params = identifier.split(":");
        identifier = params[0];
        line = Integer.parseInt(params[1]);
        column = Integer.parseInt(params[2]);

        v = new Symbol(identifier, "", line, column);
    }
    }(LBRAC exType=expression RBRAC
    {
    if(isFunctionLegal)
    {
        isArray=true;
        if(exType.startsWith(new String("float")))
        {
            //Parse info
            String identifier;
            int line, column;
            String [] params = new String[3];

            identifier = exType;
            params = identifier.split(":");
            identifier = params[0];
            line = Integer.parseInt(params[1]);
            column = Integer.parseInt(params[2]);

            error(FuncErr+currentFunction.name,line,column);
            error(ERR11,line,column);
        }
    }
    }
    )?
    {
    if(isFunctionLegal)
    {
            String identifier;
            int line, column;
            String [] params = new String[3];

            identifier = j.getText();
            params = identifier.split(":");
            identifier = params[0];
            line = Integer.parseInt(params[1]);
            column = Integer.parseInt(params[2]);

            int index;
            String type=new String("");
            index=currentFunction.getParameterIndex(identifier);
            if(index!=-1)
            {
                type=((Symbol)currentFunction.parameters.elementAt(index)).type;
            }
            else{
                index = currentFunction.getLocalIndex(identifier);
                if(index!=-1){
                    type=((Symbol)currentFunction.locals.elementAt(index)).type;
                }
                else
                {
                        error(FuncErr+currentFunction.name,line,column);
                        error(ERR08,line,column);
                }
            }

            if(!type.equals("") && type.indexOf("[")!=-1 && !isArray)
            {
                error(FuncErr+currentFunction.name,line,column);
                error(ERR15,line,column);
            }
            if(!type.equals("") && type.indexOf("[")==-1 && isArray)
            {
                error(FuncErr+currentFunction.name,line,column);
                error(ERR16,line,column);
            }
            if(isArray)
                isArray=false;
    }
    }
    )

    | #(PARANTEZLISIN exType=expression
    {
    if(isFunctionLegal)
    {
        String identifier;
        int line, column;
        String [] params = new String[3];

        identifier = exType;
        params = identifier.split(":");
        identifier = params[0];
        line = Integer.parseInt(params[1]);
        column = Integer.parseInt(params[2]);

        v = new Symbol(identifier, "", line, column);
    }
    }
        )
    ;


expression_list returns [Vector args]
{
    args=new Vector();
    String argType;
}
    :
    (argType=expression
    {    
    if(isFunctionLegal)
    {
        String identifier;
        int line, column;
        String [] params = new String[3];
        
        identifier = argType;
        params = identifier.split(":");
        identifier = params[0];
        line = Integer.parseInt(params[1]);
        column = Integer.parseInt(params[2]);
        args.add(identifier);
    }
    }
    )+
    ;

argument_list returns [Vector args]
{
    args=new Vector();

} 
    :
    args=expression_list 
    |
    
    ;
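
The actions in this grammar repeatedly unpack a "text:line:column" string
with split(":") and two parseInt calls. Below is a minimal sketch of how
that unpacking could be factored into one helper. TaggedText and parse are
hypothetical names, not part of the posted grammar. Unlike the inline
split(":"), this version splits from the right, so the text part may itself
contain a colon; that is a design choice of the sketch, not the original
behavior.

```java
/** Hypothetical helper: holds the three parts of a "text:line:column" string. */
final class TaggedText {
    final String text;
    final int line;
    final int column;

    TaggedText(String text, int line, int column) {
        this.text = text;
        this.line = line;
        this.column = column;
    }

    /**
     * Parses "text:line:column". The last two colon-separated fields are
     * taken as line and column; everything before them is the text.
     */
    static TaggedText parse(String tagged) {
        int lastColon = tagged.lastIndexOf(':');
        int prevColon = tagged.lastIndexOf(':', lastColon - 1);
        if (lastColon < 0 || prevColon < 0) {
            throw new IllegalArgumentException(
                "expected text:line:column, got: " + tagged);
        }
        String text = tagged.substring(0, prevColon);
        int line = Integer.parseInt(tagged.substring(prevColon + 1, lastColon));
        int column = Integer.parseInt(tagged.substring(lastColon + 1));
        return new TaggedText(text, line, column);
    }

    public static void main(String[] args) {
        TaggedText t = TaggedText.parse("float[]:12:5");
        System.out.println(t.text + " at line " + t.line + ", column " + t.column);
    }
}
```

With such a helper, each action would shrink to a single parse call followed
by the Symbol construction or error report.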


From ron.hunter-duvar at oracle.com  Sat Jan 30 21:18:27 2010
From: ron.hunter-duvar at oracle.com (Ron Hunter-Duvar)
Date: Sat, 30 Jan 2010 22:18:27 -0700
Subject: [antlr-interest] ANTLR running out of memory during generation
In-Reply-To: <7277098525d5fb4685c662b1fba4f4e2@temporal-wave.com>
References: <7277098525d5fb4685c662b1fba4f4e2@temporal-wave.com>
Message-ID: <4B6512A3.9020304@oracle.com>

Jim,

Thanks for the response. Yeah, the target language is kind of obvious,
isn't it? What else could have that many keywords?

I might try turning off backtracking later on and see what all I have to 
fix. Right now it's turning out to be a lot easier, and hasn't created 
any performance problems. Also, I'm not concerned with rejecting invalid 
code, only with successfully parsing all valid code, which simplifies 
things.

But the problem I'm having doesn't relate to any specific keyword. I 
even tried inserting garbage keywords, with the same result. To me, the
fact that it runs perfectly fine (and fast) with 631, and apparently
hits some endless loop/recursion at 632 that makes it run 10x longer and
run out of memory, indicates a bug or implementation limitation. The fact
that 3.1 and 3.2 behave exactly the same way indicates it's code that 
hasn't changed in the latest release. Unfortunately, I don't know enough 
of ANTLR's internals to be able to track it down, and don't have the 
time now to learn what I need to.

I have run it with 2G heap space. I bumped it up from 512M to 1G then 
2G, and all it accomplished was to make it run a few seconds longer 
before running out of memory. A clear symptom of endless loop/recursion. 
There shouldn't be anything I can do in my grammar that would cause 
ANTLR to act this way.

I'll try those switches and see if they help. For the moment I've been 
able to sidestep the problem by cutting it down to the set of keywords
for currently implemented parts of the language, bringing it down to 
about 150 (I had started with the full keyword list that's available, 
and then kept adding all the omissions from that list, of which there 
are many). But ultimately I'll have to find a way to deal with it. I'm 
hoping maybe Terry will have a bug fix for me before that 8^).

Ron


Jim Idle wrote:
> Ron,
>
> First you really need to switch off backtracking unless the objective of your parser is to analyze SQL (you gave it away when you mentioned 632 keywords that can be identifiers). There are not as many predicates required as you think so long as you left factor everything.
>
> Your tokens should be consecutive so long as you list them that way in the lexer. 
>
> The problem might well be that although SQL sort of allows all keywords to be identifiers, it does not allow all because some of them would be too ambiguous even for a syntax-directed, hand-crafted parser. If you turn on backtracking and then try to allow one of these reserved words to be an identifier, you will probably mask the issue because all warnings and errors are turned off.
>
> It is entirely feasible to create a full SQL parser without backtracking, very little look ahead and few predicates (all of the one or two token lookahead type). I have an online demo of T-SQL for instance on my web site at www.temporal-wave.com  (select 'online demos' link), and Oracle SQL/PLSQL will be up there before long too.
>
> So, I think you will need to do the following to have a chance of generating the code:
>
> 1) Use -Xconversiontimeout 10000
> 2) Cause switches to be generated rather than ifs: -Xmaxswitchcaselabels 32000 -Xminswitchalts 1 -Xmaxinlinedfastates 65534
> 3) Use -Xmx2G when invoking the java command (assuming your jvm allows that)
>
> But if you cannot get it going that way, then basically you are masking a bigger problem in your grammar that you are not seeing because of global backtracking. 
>
> Jim
>
>   
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Ron Hunter-Duvar
>> Sent: Friday, January 29, 2010 8:52 PM
>> To: antlr-interest at antlr.org
>> Subject: [antlr-interest] ANTLR running out of memory during generation
>>
>> I'm having a strange problem with ANTLR. I'm building a grammar for a
>> language with a huge number (hundreds) of non-reserved keywords. I'm
>> using the approach of having the lexer return a different token type
>> for each keyword, and then having a parser rule of the form:
>>
>>     id : ( ID | QUOTED_ID | KW_A | KW_B | ... | KW_ZZZ );
>>
>> This was working great until today. In fact, ANTLR 3.2 generates
>> surprisingly clever code for this - all the keywords are assigned
>> consecutive token numbers, and generated code just says:
>>
>>     if ( (input.LA(1)>=KW_A && input.LA(1)<=KW_ZZZ)||(input.LA(1)>=ID && input.LA(1)<=QUOTED_ID) ) {
>>         input.consume();
>>         ...
>>
>> This works all the way up to 631 keywords. ANTLR runs in about 20
>> seconds, and never uses more than 269MB of memory. When I add a 632nd
>> keyword (doesn't matter what the keyword is), and change nothing else,
>> ANTLR runs for 2 minutes and runs out of heap space. I kept bumping the
>> max space up, but even going to 2GB doesn't make any difference.
>>
>> What's really interesting is that I was using ANTLR 3.1 until now. When
>> I ran into this I upgraded to 3.2, but both of them fail at exactly the
>> same spot, 632 keywords. Not surprisingly, the stack trace varies from
>> one run to the next, depending on the exact point it runs out of
>> memory, but it always has deeply nested calls to these and other methods:
>>
>>     org.antlr.stringtemplate.language.ASTExpr.writeTemplate(ASTExpr.java:750)
>>     org.antlr.stringtemplate.language.ASTExpr.write(ASTExpr.java:680)
>>     org.antlr.stringtemplate.language.ASTExpr.writeAttribute(ASTExpr.java:660)
>>     org.antlr.stringtemplate.language.ActionEvaluator.action(ActionEvaluator.java:86)
>>     org.antlr.stringtemplate.language.ASTExpr.write(ASTExpr.java:149)
>>     org.antlr.stringtemplate.StringTemplate.write(StringTemplate.java:705)
>>
>> I don't know if it makes a difference, but I'm using backtracking
>> (otherwise, this approach to non-reserved keywords doesn't work without
>> a lot of synpreds), and outputting ASTs.
>>
>> Since this is size related, it's hard to narrow it down to a simple
>> example. I could try to duplicate it with just the id rule and nothing
>> else.
>>
>> Any ideas what might be happening here, and whether a fix might be
>> possible?
>>
>> Thanks,
>> Ron
>>
>> --
>> Ron Hunter-Duvar | Software Developer V | 403-272-6580
>> Oracle Service Engineering
>> Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5
>>
>> All opinions expressed here are mine, and do not necessarily represent
>> those of my employer.
>>
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address

-- 
Ron Hunter-Duvar | Software Developer V | 403-272-6580
Oracle Service Engineering
Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5

All opinions expressed here are mine, and do not necessarily represent
those of my employer.
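
The range test quoted above works because the keyword token types are
numbered consecutively, so membership in the id rule reduces to two
interval checks. Here is a minimal Java sketch of that idea. The token
numbers are made up for illustration (real values come from the generated
.tokens file), and IdRangeCheck and isId are illustrative names, not ANTLR
output.

```java
/** Sketch of the range test ANTLR can emit when token types are consecutive. */
final class IdRangeCheck {
    // Assumed numbering, for illustration only.
    static final int ID = 4;
    static final int QUOTED_ID = 5;
    static final int KW_A = 6;
    static final int KW_ZZZ = 637; // 632 consecutive keyword types: 6..637

    /** True if the lookahead token type can start the id rule. */
    static boolean isId(int la1) {
        return (la1 >= KW_A && la1 <= KW_ZZZ)
            || (la1 >= ID && la1 <= QUOTED_ID);
    }

    public static void main(String[] args) {
        System.out.println("ID accepted: " + isId(ID));
        System.out.println("first keyword accepted: " + isId(KW_A));
        System.out.println("last keyword accepted: " + isId(KW_ZZZ));
        System.out.println("type 3 accepted: " + isId(3));
    }
}
```

The check stays the same size however many keywords are added, as long as
they remain consecutive, which is why the cost reported at the 632nd keyword
is surprising.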


From khamenya at gmail.com  Sun Jan 31 14:46:15 2010
From: khamenya at gmail.com (Valery Khamenya)
Date: Sun, 31 Jan 2010 23:46:15 +0100
Subject: [antlr-interest] "prog : .+ ;
	" ==> "no viable alternative at character" (antlr-3.1.2)
In-Reply-To: <84fecab1001311440r7aa627e2t9318591653225c42@mail.gmail.com>
References: <84fecab1001311440r7aa627e2t9318591653225c42@mail.gmail.com>
Message-ID: <84fecab1001311446l74172614v2c51a31b3c2284bb@mail.gmail.com>

Hi,

what's wrong with the following trivial lexer grammar?

  grammar Grammar;
  options {
    language=Python;
    output=AST;
    ASTLabelType=CommonTree;
  }
  prog : .+ ;

I am getting "no viable alternative at character ..." at every character of
input stream.

antlr-3.1.2

Of course I don't really need a 1-char chopping lexer. It is just a relevant
extraction from a real case grammar.

Comments and hints are welcome!

Best regards
--
Valery

From kirby.bohling at gmail.com  Sun Jan 31 15:44:38 2010
From: kirby.bohling at gmail.com (Kirby Bohling)
Date: Sun, 31 Jan 2010 17:44:38 -0600
Subject: [antlr-interest] "prog : .+ ;
	" ==> "no viable alternative at 	character" (antlr-3.1.2)
In-Reply-To: <84fecab1001311446l74172614v2c51a31b3c2284bb@mail.gmail.com>
References: <84fecab1001311440r7aa627e2t9318591653225c42@mail.gmail.com> 
	<84fecab1001311446l74172614v2c51a31b3c2284bb@mail.gmail.com>
Message-ID: <3cac8fdf1001311544x1a3f9bceyb6891c7e30c66b16@mail.gmail.com>

On Sun, Jan 31, 2010 at 4:46 PM, Valery Khamenya <khamenya at gmail.com> wrote:
> Hi,
>
> what's wrong with the following trivial lexer grammar?
>
>   grammar Grammar;
>   options {
>     language=Python;
>     output=AST;
>     ASTLabelType=CommonTree;
>   }
>   prog : .+ ;
>
> I am getting "no viable alternative at character ..." at every character of
> input stream.

In this case, I'm pretty sure it's because you don't have a lexer rule...

Just as an aside, I'm pretty sure this is a combined grammar, as you
didn't declare it as a lexer grammar.

Rename prog to PROG (rule names that start with an uppercase letter are
lexer rules), and it should generate exactly one token. You'll probably
want to add a parser rule if you make that change; otherwise it will
lex, but not parse.

Kirby

>
> antlr-3.1.2
>
> Of course I don't really need a 1-char chopping lexer. It is just a relevant
> extraction from a real case grammar.
>
> Comments and hints are welcome!
>
> Best regards
> --
> Valery