[antlr-interest] Re: Short circuit of the lexer

Sat Jan 18 17:30:43 PST 2003

>>>>> "xadeck" == xadeck <decoret at graphics.lcs.mit.edu> writes:
[...]

> Well, I switched from tail recursion to a list and it is still slow. I
> wrote a dummy version of my grammar keeping only the array stuff and it
> is pretty fast (10s for a test file)

So, the array recognition stuff is now fast but the rest of your lexer and
parser are slow?  I think the answer is pretty obvious... :-)

> but when I add more recognition tokens to the lexer it gets slower (30s)
> and when I use my full grammar without any action (so C++ extra code
> cannot be involved), it gets quite slow (1m30 to 2min) and it even seems
> the tail recursion is faster ?!?.

I doubt it.

From what little you've said, it seems that your grammar is either for a
really nasty language or your implementation isn't very speedy or both.
Hmm... Ah, VRML2.  Alas, I don't recall much nitty-gritty about VRML (let
alone VRML2) so I don't have any off the cuff answer.

> I am trying to figure out what is going on (-> will investigate) because
> I know such files can be parsed pretty fast with antlr (I have seen
> examples but I cannot use their grammar). I can send the full grammar but
> it is pretty long (VRML2 grammar) and you would need the associated
> library to compile it. I guess you have something else to do than
> debugging other people's grammar.

Administrivia:  Do NOT make large posts to the newsgroup/mailing list.  Put
your stuff up on a web site and send out a message with a link to it so
that people who are up for it may go get it and look at it.

FYI, FWIW, there are a (small) number of people in this forum who offer
professional ANTLR consulting and development so if you really need it to
kick ass and don't have the time to become an ANTLR master yourself....

> Anyway, the original question still holds for curiosity: will
> Lexer::LA() be messed up if I screw up the input stream within
> Lexer::nextToken()?

(A) I would suggest that you spend the time to really analyze your lexer
and parser grammars first.  For example, other places where you're using
calls/recursion rather than looping, abuse of syntactic predicates, etc.

(B) Are you sure that it's actually the lexing that's taking a long time?
Have you actually profiled your lexer and parser to determine that or are
you just guessing?

(C) If you're going to muck with nextToken(), you'll really need to make
sure that you aren't violating the various assumptions that are being made
about the state that things are in at any point.
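
To make (B) concrete, here is a rough sketch of the kind of measurement
meant: time a lexer-only pass separately from the full parse and compare
the two numbers. The names elapsedMs and drainTokens are hypothetical, and
drainTokens is only a stand-in that splits on whitespace. With a real ANTLR
lexer the same loop would call nextToken() until the EOF token comes back.

```cpp
#include <chrono>
#include <cstddef>
#include <sstream>
#include <string>

// Run `work` once and return the elapsed wall-clock time in milliseconds.
template <typename Work>
double elapsedMs(Work work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical stand-in for a lexer-only pass: pull whitespace-separated
// tokens until the input is exhausted, discard them, and return the count.
// This is NOT an ANTLR lexer; it only shows the shape of the loop.
std::size_t drainTokens(const std::string& text) {
    std::istringstream in(text);
    std::string token;
    std::size_t count = 0;
    while (in >> token) {
        ++count;
    }
    return count;
}
```

Called as elapsedMs([&] { n = drainTokens(input); }) for the lexer-only
number, and again around the full parse, this shows whether the time is
going into the lexer or the parser. A real profiler will tell you more, but
even this crude split beats guessing.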

Take care,
	John
