[antlr-interest] whitespaces in middle of rule

John D. Mitchell johnm-antlr at non.net
Thu Jan 23 09:38:22 PST 2003


>>>>> "William" == William Lam <xeenman at yahoo.com> writes:
[...]

> Sorry for this newbie question, but I wish to parse a grammar such as a
> Java import statement i.e.

> import aaaa.bbbb.cccc.*;

So, is the language you want actually Java or just something like Java?


> so I set up a grammar similar to this

> import_statement: "import" IDENT (DOT (IDENT | STAR))* SEMI ;

Actually, that rule accepts invalid import statements.
For example:

	import foo.*.*.bar.* ;

The dot-star construct is allowed only once, at the very end of a package
or type name.

See:
http://java.sun.com/docs/books/jls/second_edition/html/packages.doc.html#70209
for more information from the JLS.
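
As a rough illustration of the tighter shape, IDENT (DOT IDENT)* (DOT STAR)?,
here is a minimal hand-written check in plain Java. It is only a sketch, not
ANTLR output: the class name, the Kind enum and isValidImport are hypothetical
stand-ins for the grammar's token types and rule.

```java
import java.util.Arrays;
import java.util.List;

class ImportShape {
    // Hypothetical token kinds standing in for the grammar's "import", IDENT, DOT, STAR, SEMI.
    enum Kind { IMPORT, IDENT, DOT, STAR, SEMI }

    // Accepts exactly: IMPORT IDENT (DOT IDENT)* (DOT STAR)? SEMI
    static boolean isValidImport(List<Kind> toks) {
        int n = toks.size();
        int i = 0;
        if (i >= n || toks.get(i++) != Kind.IMPORT) return false;
        if (i >= n || toks.get(i++) != Kind.IDENT) return false;
        // Two tokens of lookahead decide between ".name" and the trailing ".*".
        while (i + 1 < n && toks.get(i) == Kind.DOT && toks.get(i + 1) == Kind.IDENT) {
            i += 2;
        }
        // The dot-star may appear at most once, and only here at the end.
        if (i + 1 < n && toks.get(i) == Kind.DOT && toks.get(i + 1) == Kind.STAR) {
            i += 2;
        }
        return i == n - 1 && toks.get(i) == Kind.SEMI;
    }

    public static void main(String[] args) {
        // Token kinds for: import foo.*.*.bar.* ;
        List<Kind> bad = Arrays.asList(
            Kind.IMPORT, Kind.IDENT, Kind.DOT, Kind.STAR, Kind.DOT, Kind.STAR,
            Kind.DOT, Kind.IDENT, Kind.DOT, Kind.STAR, Kind.SEMI);
        System.out.println(isValidImport(bad));
    }
}
```

Note that the loop needs to look two tokens ahead to tell ".name" from ".*",
so a grammar rule written this way will likely want the same lookahead depth.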


> The problem is that the rule will validate statements filled with spaces,
> such as this

> import aaaa . bbbb . cccc . *;

That's actually correct for Java.

Basically, the "input elements" are lexed first and *then* whitespace and
comments are discarded, leaving the program's lexical token stream.

See:
http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html#44591
http://java.sun.com/docs/books/jls/second_edition/html/names.doc.html#31692
http://java.sun.com/docs/books/jls/second_edition/html/names.doc.html#25430
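
To make the lex-then-discard order concrete, here is a small sketch in plain
Java. It assumes a drastically simplified set of input elements (whitespace,
/* */ comments, identifiers, and the three punctuators . * ;), and the class
and method names are hypothetical. It is not the ANTLR lexer, just a model of
why spaced-out and compact imports look identical to the parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DiscardWhitespace {
    // Simplified input elements; real Java has many more (literals, // comments, ...).
    private static final Pattern ELEMENT = Pattern.compile(
        "(?<ws>\\s+)|(?<comment>/\\*.*?\\*/)"
        + "|(?<word>[A-Za-z_$][A-Za-z0-9_$]*)|(?<punct>[.*;])",
        Pattern.DOTALL);

    // Lex every input element, then keep only the ones that are real tokens.
    static List<String> tokens(String src) {
        List<String> out = new ArrayList<>();
        Matcher m = ELEMENT.matcher(src);
        int pos = 0;
        while (pos < src.length()) {
            if (!m.find(pos) || m.start() != pos) {
                throw new IllegalArgumentException("unexpected character at offset " + pos);
            }
            boolean hidden = m.group("ws") != null || m.group("comment") != null;
            if (!hidden) {
                out.add(m.group());
            }
            pos = m.end();
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("import java . io \n\t\t.* ;"));
        System.out.println(tokens("import java./*foo.*/lang.*;"));
    }
}
```

Both spellings of an import produce the same token list, which is all a
parser rule such as import_statement ever gets to see.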


> How do I make sure my import statements do not have any spaces?  I notice
> that the example under antlr/examples/java exhibits this problem.

I don't understand what you mean when you say that the Java examples
exhibit this problem.  That use of whitespace is quite legal in Java.  For
example, all of the following are legal.  Ugly and confusing, perhaps, but
legal:

import java . io . PrintWriter;

import java . io 
		.* ;
import java. /* foo . */ lang.* ;
import java./*foo.*/lang.*;


If you want to restrict the use of whitespace then you'll need to push those
constructs back into the lexer.  I.e., have the lexer recognize the
construct that you want as a single token, with no whitespace inside it.
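
A minimal sketch of that idea, using a plain Java regular expression in place
of an ANTLR lexer rule: the whole dotted name, including an optional trailing
dot-star, is matched as one token, so whitespace can never appear inside it.
The class name, the QUALIFIED_NAME pattern and the ASCII-only identifier
definition are all assumptions made for illustration.

```java
import java.util.regex.Pattern;

class StrictImportLexer {
    // Simplified identifier; real Java identifiers also allow Unicode letters.
    private static final String IDENT = "[A-Za-z_$][A-Za-z0-9_$]*";

    // One lexer-level token: name(.name)* with an optional final ".*".
    private static final Pattern QUALIFIED_NAME =
        Pattern.compile(IDENT + "(?:\\." + IDENT + ")*(?:\\.\\*)?");

    // True when the text could be emitted as a single qualified-name token.
    static boolean isQualifiedNameToken(String text) {
        return QUALIFIED_NAME.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isQualifiedNameToken("aaaa.bbbb.cccc.*"));
        System.out.println(isQualifiedNameToken("aaaa . bbbb . cccc . *"));
    }
}
```

The parser rule then shrinks to something like "import" followed by that one
token and SEMI. The cost is that the resulting language is no longer Java,
since Java itself permits the whitespace.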

Hmm... Now that I think about it for 3 seconds, you might be able to do it
by adding a whitespace token stream and having your parser check it when
you get to the constructs that you want.
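
One way that second idea might look, sketched in plain Java rather than with
ANTLR's own stream classes: the lexer still drops whitespace and comments,
but each token remembers whether anything hidden came just before it, and the
check consults that flag inside the import name. Tok, hiddenBefore, lex and
importNameIsCompact are hypothetical names, and the token set is the same
simplified one assumed earlier (identifiers plus . * ;).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AdjacencyCheck {
    // Hypothetical token: its text, plus whether whitespace/comments preceded it.
    static final class Tok {
        final String text;
        final boolean hiddenBefore;
        Tok(String text, boolean hiddenBefore) {
            this.text = text;
            this.hiddenBefore = hiddenBefore;
        }
    }

    // Group 1: hidden elements (whitespace, /* */ comments). Group 2: real tokens.
    private static final Pattern ELEMENT = Pattern.compile(
        "(\\s+|/\\*.*?\\*/)|([A-Za-z_$][A-Za-z0-9_$]*|[.*;])", Pattern.DOTALL);

    static List<Tok> lex(String src) {
        List<Tok> out = new ArrayList<>();
        Matcher m = ELEMENT.matcher(src);
        boolean hidden = false;
        int pos = 0;
        while (pos < src.length()) {
            if (!m.find(pos) || m.start() != pos) {
                throw new IllegalArgumentException("unexpected character at offset " + pos);
            }
            if (m.group(1) != null) {
                hidden = true;               // remember it instead of forgetting it
            } else {
                out.add(new Tok(m.group(2), hidden));
                hidden = false;
            }
            pos = m.end();
        }
        return out;
    }

    // Checks only the hidden-token side; the grammar still validates the shape.
    // Token 0 is "import", token 1 starts the name, the last token is ";".
    static boolean importNameIsCompact(String src) {
        List<Tok> toks = lex(src);
        if (toks.size() < 3) return false;
        for (int i = 2; i < toks.size() - 1; i++) {
            if (toks.get(i).hiddenBefore) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(importNameIsCompact("import aaaa.bbbb.cccc.*;"));
        System.out.println(importNameIsCompact("import aaaa . bbbb . cccc . *;"));
    }
}
```

This sketch deliberately allows whitespace after the import keyword and
before the semicolon; tightening or loosening that is a matter of which
tokens the loop inspects.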

Take care,
	John


 
