[antlr-interest] java.lang.OutOfMemoryError: Java heap space

Wincent Colaiuta win at wincent.com
Tue Jun 5 14:07:51 PDT 2007


On 5/6/2007, at 22:05, Jim Idle wrote:

> I think that this may well be coincidence. The previous poster has
> written a parser entirely as a lexer, and that is why it is taking so
> long to produce the output. As another poster said, if you take out
> one or two of the lexer fragment rules, then the lexer generation can
> breathe and doesn't take so long. It looks to me like that is the
> issue with the other grammar.
>
> While I cannot be certain, from what you said you were trying to do, I
> think you were seeing a similar problem, with trying to make the lexer
> too complicated and having non-fragment lexer rules embedded in other
> rules and so on. Hence you do not see these issues until you try to
> generate the lexer. If ANTLRWorks were to try to find out whether this
> would happen, it would, guess what, have to pretty much generate the
> lexer.
>
> If you are trying to specify things that look suspiciously like syntax
> in the lexer, then you are basically doing it in the wrong place. Just
> list all the things that can be tokenized, then tell the parser what
> is and is not a valid order.
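
If I understand that advice correctly, the split would look something
like this (a rough sketch with invented rule names, not my actual
grammar):

    grammar UriSketch;

    // parser rules impose the ordering...
    uri    : scheme COLON SLASH SLASH rest ;
    scheme : LETTERS ;
    rest   : (LETTERS | DIGITS | DOT | SLASH)+ ;

    // ...while the lexer just enumerates flat token types; helper
    // rules are marked "fragment" so they never become tokens in
    // their own right
    LETTERS : LETTER+ ;
    DIGITS  : ('0'..'9')+ ;
    DOT     : '.' ;
    COLON   : ':' ;
    SLASH   : '/' ;
    fragment LETTER : 'a'..'z' | 'A'..'Z' ;

That would keep all of the URI structure out of the lexer entirely.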

The reason I approached it this way is as I said in my original post:

> The internal structure of the URI isn't of any interest to me; I
> just want to get a token for each URI, so I'm doing all this in the
> lexer.

But I can certainly try doing less in the lexer and more in the  
parser if it will solve these problems.

One of the reasons I wanted URIs to be returned as a single token is
that it would make the rest of the parser/lexer simpler... the sample
grammar I posted was just for testing purposes, recognizing URIs in
isolation, but this is all destined for a wiki markup translator, in
which recognizing URIs will be just one small part...
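
If it came to that, I suppose the rest of the grammar could still
treat a URI as a single unit by funnelling everything through one
parser rule and grabbing its matched text, something along these
lines (again invented names, untested):

    // the wiki-level rules only ever see "uri" as one unit, even
    // though it is assembled from several small tokens
    inline_element : uri | bold | italic | plain_text ;

    // $text hands back everything the rule matched as one string
    uri returns [String value]
        : scheme COLON SLASH SLASH rest { $value = $text; }
        ;

So perhaps not much simplicity would be lost after all.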

Cheers,
Wincent
