[antlr-interest] grammar notation (every char except...)

bace.spam at gmx.net bace.spam at gmx.net
Fri Apr 20 15:49:01 PDT 2007


> > Hi all,
> > 
> > I am totally new to ANTLR, but I have some practice with other parser
> > generators. I want to recognize something like
> 
> I can only help with parser grammars for v3, which will probably be
> released as a final version next month, so I suggest learning v3 instead.
> You can download the betas in the meantime and use ANTLRWorks. A few
> points of interest are shown here:
> <http://www.antlr.org/wiki/display/ANTLR3/Quick+Starter+on+Parser+Grammars+-+No+Past+Experience+Required>
> If you still prefer 2.7.7, you may get a few pointers nonetheless.

Okay, I will switch ;) I didn't use v3 because there is no Eclipse integration available.

> 
> A general difference between ANTLR 3 and 2.7.7 is that v3 uses '' instead
> of "" as string delimiters.
> > 
> > "// comment/goes^&on //" and
> > "## comment/goes^&on ##"
> > 
> > So I want to allow everything inside, except "//" and "##". It is a
> > principle to keep the tokens as atomic as possible, isn't it? I think
> 
> Do you want to allow '##' in '//' comments and the other way around? It
> looks that way.

Yes, I do. It is similar to a markup language: many tags are allowed within normal text, but the text is not restricted to letters and digits. So I would like to go the other way: allow everything except these 2-3 character tags.
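
To pin down the intended tokenization outside ANTLR, here is a minimal Python sketch. It is not an ANTLR grammar; the names `TOKEN`, `TAG`, `TEXT` and `tokenize` are made up for illustration, and the tag set is assumed to be just "//" and "##" as in the examples above.

```python
import re

# Hypothetical token pattern: either a two-character tag, or a run of text
# that stops right before the next tag. DOTALL lets text span line breaks.
TOKEN = re.compile(r"(?P<TAG>//|##)|(?P<TEXT>(?:(?!//|##).)+)", re.DOTALL)


def tokenize(source):
    """Return (kind, text) pairs that together cover the whole input."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(source)]


tokens = tokenize("// comment/goes^&on //")
```

A single "/" or "#" stays part of the text; only the doubled form is split off as a tag.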

> 
> > TOKEN_COMMENT : "//" .* "//";
> > 
> > is not recommended. Better should be
> > 
> > TOKEN_SLASH : '/';
> > 
> > I could also imagine defining
> > 
> > TOKEN_TAG : "//";
> > 
> > instead of TOKEN_SLASH.
> > 
> > 
> > How can I specify the content (all chars allowed, except "//") in the
> > grammar with ANTLR (I use 2.7.7)?
> > 
> > comment
> >   :  TOKEN_TAG ~("//" | "##")* TOKEN_TAG
> >   ;
> 
> Adapting the ML_COMMENT rule from the tutorial:
> 
> TOKEN_COMMENT : '//' ( options {greedy=false;} : . )* '//' ;
> 
> This matches multiline comments, as . recognizes the '\n'.
> 

Yes, I tried this too, but (as above) I want to allow other recognized tags within the comment, for example:
"// bla ## bla ## bla //"
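
The nesting wanted for this example can be sketched in Python (again not ANTLR; `TAGS` and `parse` are hypothetical names). The sketch assumes that a tag closes the innermost open span when it is the same tag, and opens a nested span otherwise.

```python
import re

TAGS = ("//", "##")  # assumed tag set, taken from the examples in this thread


def parse(source):
    """Build nested lists of the form [tag, child, ...]; a child is str or list."""
    root = ["root"]
    stack = [root]
    # The capturing group makes re.split return the tags as well as the text.
    for piece in re.split(r"(//|##)", source):
        if piece == "":
            continue
        if piece in TAGS:
            if stack[-1][0] == piece:   # same tag as the open span: close it
                stack.pop()
            else:                       # different tag: open a nested span
                node = [piece]
                stack[-1].append(node)
                stack.append(node)
        else:
            stack[-1].append(piece)     # text belongs to the innermost span
    if len(stack) != 1:
        raise ValueError("unclosed tag: " + stack[-1][0])
    return root[1:]


tree = parse("// bla ## bla ## bla //")
# tree == [["//", " bla ", ["##", " bla "], " bla "]]
```

A single non-greedy lexer rule would swallow the inner "##" pair as plain characters, which is why the tags are kept as separate pieces here.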


> > and a lot of other notations like ( . | ~"//" | ~"##" )* are not
> > accepted. Does anyone have an idea how to solve this problem?
> 
> ( . | ~"//" | ~"##" )* would recognize everything. (~( '//' | '##' ))*
> may result in your desired behaviour, though I can't guarantee that ~
> works on strings, too.
>

No, ~ does not work on strings in 2.7.7 ;)
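
What "everything except these strings" would have to mean can be written character by character with a small lookahead. This is a Python sketch under that assumption, not ANTLR syntax; `scan_text` and its parameters are hypothetical.

```python
def scan_text(source, start=0, tags=("//", "##")):
    """Consume characters from `start` until a tag begins.

    Returns (text, next_index); next_index points at the tag or at the end.
    """
    i = start
    # str.startswith accepts a tuple of prefixes and a start offset, so this
    # peeks at the next two characters without consuming them.
    while i < len(source) and not source.startswith(tags, i):
        i += 1
    return source[start:i], i


text, pos = scan_text("a/b#c//rest")
# text == "a/b#c", pos == 5
```

The point is that the decision to stop depends on two characters, not one.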
 
> Best regards,
> Johannes Luber

THANKS,
Markus
-- 
"Feel free" - 10 GB Mailbox, 100 FreeSMS/Monat ...
Jetzt GMX TopMail testen: http://www.gmx.net/de/go/topmail


