[antlr-interest] Re: Short circuit of the lexer

xadeck <decoret at graphics.lcs.mit.edu>
Sat Jan 18 14:40:21 PST 2003


--- In antlr-interest at yahoogroups.com, Terence Parr <parrt at j...> 
> 
> Are you using the latest 2.7.2 stuff or 2.7.1?  I think 2.7.2 is faster 
> :)

Of course I am ;-)

> 
> Also, (INT)* is definitely more efficient than the tail recursion you 
> are using.  just add the action within the loop:
> 
> ( i:INT {result.push_back(atoi(i->getText().c_str()));} )*
> 
> Put that in rule decl instead of referring to values and you should be 
> good to go.  Let me know if this works.  The tail recursion will build 
> a HUGE stack of method invocation records if you have 180k lines...very 
> very inefficient.  Try the loop :)
> 
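
To make the stack argument above concrete, here is a minimal sketch in plain C++. It is not ANTLR-generated code: the names TokenTexts, collectRecursive and collectLoop are hypothetical, and the recursive rule is assumed to have the shape "values : INT values | ;". The recursive version needs one call frame per token unless the compiler happens to optimize the tail call away, while the loop version runs at constant stack depth.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-in for the token stream: each string is the text of one INT token.
using TokenTexts = std::vector<std::string>;

// Recursive shape, assumed to mirror a rule like "values : INT values | ;".
// Every token adds one call frame unless the compiler turns the tail call into a jump.
void collectRecursive(const TokenTexts& tokens, std::size_t pos, std::vector<int>& result) {
    assert(pos <= tokens.size());
    if (pos == tokens.size()) {
        return;  // empty alternative: nothing left to match
    }
    result.push_back(std::atoi(tokens[pos].c_str()));
    collectRecursive(tokens, pos + 1, result);
}

// Loop shape, mirroring "( i:INT { ... } )*": constant stack depth for any input size.
void collectLoop(const TokenTexts& tokens, std::vector<int>& result) {
    for (const std::string& text : tokens) {
        result.push_back(std::atoi(text.c_str()));
    }
}
```

Both functions produce the same vector; only the stack usage differs, which is what matters for an input with 180k lines.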

Well, I switched from tail recursion to the (INT)* loop and it is still
slow. I wrote a dummy version of my grammar keeping only the array stuff
and it is pretty fast (10s for a test file), but when I add more token
rules to the lexer it gets slower (30s), and when I use my full grammar
without any actions (so extra C++ code cannot be involved), it gets
quite slow (1m30 to 2min). It even seems the tail recursion is
faster ?!?
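
Since the slowdown appears as lexer rules are added, one way to narrow it down is to time the lexer alone, without the parser. The following is only a sketch under assumptions: timeLexing is a hypothetical helper, and it takes any callable that returns the next token type, so it does not depend on ANTLR itself.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: pulls token types from nextTokenType() until eofType
// is returned, counts the tokens seen, and returns the elapsed time in seconds.
template <typename NextTokenType>
double timeLexing(NextTokenType nextTokenType, int eofType, std::size_t& tokenCount) {
    const auto start = std::chrono::steady_clock::now();
    tokenCount = 0;
    while (nextTokenType() != eofType) {
        ++tokenCount;
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

With the ANTLR 2 C++ runtime, the callable would presumably wrap something like lexer.nextToken()->getType() and eofType would be antlr::Token::EOF_TYPE; treat those names as assumptions to check against your version. If this loop alone accounts for most of the 1m30, the parser rules are not the problem.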

I am trying to figure out what is going on (I will investigate), because
I know such files can be parsed pretty fast with antlr (I have seen
examples, but I cannot use their grammars). I can send the full grammar,
but it is pretty long (a VRML2 grammar) and you would need the associated
library to compile it. I guess you have better things to do than
debug other people's grammars.

Anyway, the original question still holds, out of curiosity: will
Lexer::LA() be messed up if I tamper with the input stream within
Lexer::nextToken()?

And by the way, thanks for the help.


 
