[antlr-interest] grammar notation (every char except...)
Johannes Luber
jaluber at gmx.de
Fri Apr 20 15:08:16 PDT 2007
bace.spam at gmx.net wrote:
> Hi all,
>
> I am totally new to antlr, but I have some practice with other parser generators. I want to recognize something like
I can only help with parser grammars for v3, which will probably be
released as final next month, so I suggest learning v3 instead. You
can already download the betas and use ANTLRWorks. A few points of
interest are shown here:
<http://www.antlr.org/wiki/display/ANTLR3/Quick+Starter+on+Parser+Grammars+-+No+Past+Experience+Required>
If you still prefer 2.7.7, you may get a few pointers nonetheless.
One general difference between ANTLR 3 and 2.7.7 is that v3 uses ''
instead of "" as string delimiters.
>
> "// comment/goes^&on //" and
> "## comment/goes^&on ##"
>
> So I want to allow everything inside, except the "//" and except the "##". It is a principle to keep tokens as atomic as possible, isn't it? I think
Do you want to allow '##' in '//' comments and the other way around? It
looks that way.
> TOKEN_COMMENT : "//" .* "//";
>
> is not recommended. Better should be
>
> TOKEN_SLASH : '/';
>
> I could also imagine to define
>
> TOKEN_TAG : "//";
>
> instead of TOKEN_SLASH.
>
>
> How can I specify the content (all chars allowed, except "//") in the grammar with antlr (I use 2.7.7)?
>
> comment
> : TOKEN_TAG ~("//" | "##")* TOKEN_TAG
> ;
Adapting the ML_COMMENT rule from the tutorial:
TOKEN_COMMENT : '//' ( options {greedy=false;} : . )* '//' ;
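The greedy=false option makes the loop stop at the first closing '//'
instead of consuming it. The rule also matches multi-line comments,
since . matches '\n' as well.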
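
To see why the non-greedy loop matters, here is a rough sketch using Python's re module rather than ANTLR itself. It is only an analogy: the pattern names and the sample string are made up for illustration, and re.DOTALL is used so that . also matches '\n', as . does in the lexer rule above.

```python
import re

# Illustration only: Python regexes standing in for the ANTLR lexer rule.
# DOTALL lets '.' match '\n', so comments may span several lines.
NON_GREEDY = re.compile(r"//.*?//", re.DOTALL)  # stops at the first closing '//'
GREEDY = re.compile(r"//.*//", re.DOTALL)       # runs on to the last '//'


def find_comments(pattern, text):
    """Return every substring of text matched by pattern."""
    return pattern.findall(text)


# Hypothetical input: two comments with ordinary text between them.
sample = "// first/one^& // x = 1 // second\nline //"

non_greedy_hits = find_comments(NON_GREEDY, sample)  # two separate comments
greedy_hits = find_comments(GREEDY, sample)          # one match spanning everything
```

With the non-greedy pattern each comment is found separately; the greedy one swallows the text between the two comments as well.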
> and a lot of other further notations like ( . | ~"//" | ~"##" )* are not accepted. Has anyone an idea to get this problem solved?
( . | ~"//" | ~"##" )* would recognize everything, because the .
alternative alone already matches any character. (~( '//' | '##' ))*
may result in your desired behaviour, but I can't guarantee that ~ works
on strings, too.
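
If the goal is "any character, as long as no '//' or '##' starts here", the intent can be sketched outside ANTLR with a negative lookahead in Python's re module. This is an assumption about what you want (neither delimiter allowed inside, same delimiter on both ends), not a statement about what ~ does in ANTLR; the names COMMENT and comment_body are hypothetical.

```python
import re

# Illustration only, not ANTLR syntax.
# (//|##)            opening delimiter, remembered as group 1
# (?:(?!//|##).)*    any characters, none of which starts a '//' or '##'
# \1                 the same delimiter again as the closing one
COMMENT = re.compile(r"(//|##)((?:(?!//|##).)*)\1", re.DOTALL)


def comment_body(text):
    """Return the text between the delimiters, or None if text is not one comment."""
    match = COMMENT.fullmatch(text)
    return match.group(2) if match else None


slash_body = comment_body("// comment/goes^&on //")
hash_body = comment_body("## comment/goes^&on ##")
mixed = comment_body("// a ## b //")  # '##' inside a '//' comment is rejected
```

A single '/' or '#' inside the comment is still accepted; only the two-character delimiters are excluded.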
Best regards,
Johannes Luber