[antlr-interest] Re: Help with Java grammar

Ric Klaren klaren at cs.utwente.nl
Thu Mar 11 07:51:54 PST 2004


Hi,

Hmmmm, nice one, you found a bug. The vanilla Java grammar indeed has the same
problem: SL_COMMENT hangs when it's fed only '//'.

I can offer another solution now: switch to C++ mode, which does the right
thing (it complains that it found an unexpected EOF). The sane fix for
SL_COMMENT that I expected to work also works in C++ mode.

So...
a) Java codegen is generating the wrong bitset.
b) The Java implementation of BitSet is incorrectly saying that EOF value
   65535 is in the bitset of what is supposed to be valid for the
   (~(\n|\r))* part.

Hmmm, writing that down, a little coin dropped...

c) ANTLR should warn if you set the charVocabulary to '\u0003'..'\uFFFF',
   because that range includes EOF (65535), which of course then gets
   automagically added to the set if you use ~ constructs.
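
To make point c) concrete, here is a minimal sketch of why the complement set ends up matching EOF. It is not ANTLR's generated code: java.util.BitSet stands in for ANTLR's own BitSet class, the names ComplementSetDemo, complementOfNewlines and iterationsAtEof are made up for illustration, and EOF is taken to be 65535 as stated above.

```java
import java.util.BitSet;

// Illustrative only: java.util.BitSet stands in for ANTLR's BitSet,
// and every name in this class is hypothetical.
class ComplementSetDemo {
    // EOF as described in the post: the char value 65535.
    static final char EOF_CHAR = (char) -1;

    // Build the set for ~('\n'|'\r') relative to a vocabulary lo..hi (inclusive).
    static BitSet complementOfNewlines(int lo, int hi) {
        BitSet set = new BitSet(hi + 1);
        set.set(lo, hi + 1);   // whole vocabulary; the upper bound of set() is exclusive
        set.clear('\n');
        set.clear('\r');
        return set;
    }

    // Count iterations of a (...)* loop once the input is exhausted and every
    // lookahead is EOF_CHAR. The cap keeps the demo from hanging.
    static int iterationsAtEof(BitSet set, int cap) {
        int n = 0;
        while (set.get(EOF_CHAR) && n < cap) {
            n++;   // a real lexer would consume here and see EOF again
        }
        return n;
    }

    public static void main(String[] args) {
        BitSet wide = complementOfNewlines(0x0003, 0xFFFF);
        BitSet narrow = complementOfNewlines(0x0003, 0x7FFF);
        System.out.println("wide contains EOF:   " + wide.get(EOF_CHAR));
        System.out.println("narrow contains EOF: " + narrow.get(EOF_CHAR));
        System.out.println("iterations, wide:    " + iterationsAtEof(wide, 1000));
        System.out.println("iterations, narrow:  " + iterationsAtEof(narrow, 1000));
    }
}
```

With the wide vocabulary the loop only stops because of the cap, which is the hang in miniature. With the vocabulary limited to '\u7FFF' the EOF value falls outside the set and the loop exits immediately.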

So there's another easy fix: limit the charVocabulary of the lexer
to:

charVocabulary='\u0003'..'\u7FFF';

And change SL_COMMENT to:

SL_COMMENT
	:	"//"
		(~('\n'|'\r'))* ('\n'|'\r'('\n')?)?
		{$setType(Token.SKIP); newline();}
	;

And you should be set.
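
For anyone who wants to see what the revised rule accepts, the following is a hand-written sketch of the same matching logic in plain Java. It is not what ANTLR generates; SlCommentSketch and matchSlComment are hypothetical names, and the input is assumed to be a String rather than a character stream. The point is that the trailing line terminator is optional, so a comment that runs into the end of the input still matches.

```java
// Hypothetical hand-written equivalent of the SL_COMMENT rule above;
// this is not ANTLR-generated code.
class SlCommentSketch {
    // Returns the index just past the comment that starts at pos,
    // or -1 if the input does not have "//" at pos.
    static int matchSlComment(String in, int pos) {
        if (!in.startsWith("//", pos)) {
            return -1;
        }
        int i = pos + 2;
        // (~('\n'|'\r'))* : stop at a line terminator or at end of input
        while (i < in.length() && in.charAt(i) != '\n' && in.charAt(i) != '\r') {
            i++;
        }
        // ('\n' | '\r' ('\n')?)? : the terminator itself is optional
        if (i < in.length()) {
            if (in.charAt(i) == '\n') {
                i++;
            } else {            // must be '\r'
                i++;
                if (i < in.length() && in.charAt(i) == '\n') {
                    i++;
                }
            }
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println(matchSlComment("//", 0));              // comment at end of input
        System.out.println(matchSlComment("// hi\nint x;", 0));   // Unix line ending
        System.out.println(matchSlComment("// hi\r\nint x;", 0)); // Windows line ending
    }
}
```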

Cheers,

Ric

On Thu, Mar 11, 2004 at 02:48:37PM -0000, cliftonccraig wrote:
> I have tried running in my debugger but I couldn't follow all of the
> jumps that were made. It appeared to be jumping back and forth between
> two statements in a switch nested in an infinite while loop. I couldn't
> tell at that point if it was stuck or trying to match some complex
> lexer rule or what. That experience did, however, point me to the fact
> that it was the last single line comment that was hanging everything
> up. I think I noticed it (the single line comment) when I eval'ed a
> buffer or something. I couldn't understand exactly what was happening
> but I knew it had to be at that point in the processing and then when
> I looked back at the input file I then noticed the comment was on the
> EOF line. I recalled reading somewhere that you shouldn't end a source
> file with EOF (probably an article on the ANTLR or JavaCC site) and
> made an educated guess that this is what was causing my problem. I
> confirmed my suspicion when I inserted a newline in the input file and
> all was good. I later (at home) tried the parser out of the box on one
> of the tests that ship out of the box and got the same results after
> putting a single line comment on the last line. That confirmed, for
> me, that it was not the additional logic from the rewrite engine that
> was causing this. (I didn't think it would be but I had to confirm
> it.) I know little of why it happens and I will try to look into it
> again a little later. I'm sure anyone could replicate the problem just
> by downloading the ANTLR package, generating the JavaRecognizer with
> the included grammar and running over any Java file that ends with a
> single line comment. For now I have my work around in place. I'd love
> to improve on it because it always inserts a new line which is carried
> over into the rewritten file. (Ooh, I just thought of a workaround for
> my workaround!) I'd love to take your advice on overriding the
> CharBuffer or whatever but I know little of these classes and have
> only been working with this technology for a matter of days. I'm not
> stupid, I'm just afraid it will take me some time to figure out what
> to put where in the overriding logic. Thank you for all of your help.
> I really appreciate it. You will probably hear back from me on this
> mailing list when I get back to working with this.

--
-----+++++*****************************************************+++++++++-------
    ---- Ric Klaren ----- j.klaren at utwente.nl ----- +31 53 4893722  ----
-----+++++*****************************************************+++++++++-------
   Words fly like arrows
      as if we knew what was right and wrong. --- Chuang Tsu



 