[antlr-interest] All say that literal strings in parser rules are doing harm. Why?

Stefan Mätje Stefan.Maetje at esd-electronics.com
Tue Feb 28 05:37:21 PST 2012


Hi Eric,

thanks for that information. I added my comments below.

But to everyone else: are there further drawbacks to expect when using
literals in parser rules?

Thanks in advance,
	Stefan


On 28.02.2012 13:37:39, Eric wrote:
> Hi Stefan,
> 
> As I only use the tools and do not do formal proofs on them, there may be
> more to this than what I present here.
>
> If you are using string and/or char literals in parser rules, then ANTLR
> must create a new set of lexer rules that include all of the string and/or
> char literals in the parser rules. Remember that the parser can only see
> tokens and not raw text. So string and/or char literals cannot be passed to
> the parser.

That's clear so far.

> To see the new set of lexer rules, run org.antlr.Tool -Xsavelexer and then
> open the generated grammar file. Its name may be something like
> <grammar>__.g. If you have string and/or char literals in your parser rules,
> you will see lexer rules whose names start with T__.

That is a valuable hint for seeing how ANTLR will actually build the lexer.

> The T__ names make it harder to debug because you don't know what they
> mean. 

I have always used the generated *.tokens file to match the T__xxx names to
the strings they stand for. But I have hardly ever needed to do that.
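
A minimal sketch of that lookup in Python, assuming the *.tokens file holds
one NAME=type or 'literal'=type entry per line. The function name
literal_names and the sample contents are made up for illustration and are
not taken from a real grammar:

```python
def literal_names(tokens_text):
    """Map generated token names (T__N) to the literal sharing their type."""
    names = {}     # token type -> generated name such as T__6
    literals = {}  # token type -> quoted literal such as 'begin'
    for line in tokens_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at the last '=' so that a literal like '=' stays intact.
        left, _, right = line.rpartition("=")
        if not right.isdigit():
            continue  # not a NAME=type entry, skip it
        token_type = int(right)
        if left.startswith("'"):
            literals[token_type] = left
        elif left.startswith("T__"):
            names[token_type] = left
    return {name: literals[t] for t, name in names.items() if t in literals}


# Hypothetical file contents (assumed layout, not real tool output).
sample = """\
ID=4
WS=5
T__6=6
T__7=7
'begin'=6
'end'=7
"""

for name, literal in sorted(literal_names(sample).items()):
    print(name, "->", literal)
```

Named tokens such as ID are skipped on purpose, since only the generated
T__ names need translating.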

> Also, because ANTLR adds them at the top, they may cause problems for
> other lexer rules.

Since I only use keywords directly in the parser rules (punctuation symbols
have their own lexer rules), the keywords surprisingly appear in the
generated intermediate lexer grammar exactly where I would have written them
myself.
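
To check where the generated rules actually end up, the rule names can be
listed in order of appearance. This is only a rough sketch in Python. It
assumes that each lexer rule starts on its own line with an upper-case name
followed by a colon. The helper names and the sample grammar text are
hypothetical, and the sample is hand-written, not real output of the tool:

```python
import re

# A lexer rule header: optional 'fragment', an upper-case rule name, then ':'.
RULE_HEADER = re.compile(
    r"^\s*(?:fragment\s+)?([A-Z][A-Za-z0-9_]*)\s*:", re.MULTILINE
)


def lexer_rule_order(grammar_text):
    """Return the lexer rule names in the order they appear in the text."""
    return RULE_HEADER.findall(grammar_text)


def generated_rules_before(grammar_text, rule_name):
    """List the generated T__ rules that appear before the given rule."""
    order = lexer_rule_order(grammar_text)
    if rule_name not in order:
        return []
    position = order.index(rule_name)
    return [name for name in order[:position] if name.startswith("T__")]


# Hand-written sample of a saved lexer grammar (assumed layout).
sample_grammar = """\
lexer grammar Demo;

T__6 : 'begin' ;
T__7 : 'end' ;

ID : ('a'..'z')+ ;
WS : (' ' | '\\t')+ { $channel = HIDDEN; } ;
"""

print(lexer_rule_order(sample_grammar))
print(generated_rules_before(sample_grammar, "ID"))
```

Running it on the real <grammar>__.g file would show which T__ rules precede
a hand-written rule such as an identifier rule.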

> 
> Eric

Thank you so far,
	Stefan



> On Tue, Feb 28, 2012 at 6:08 AM, Stefan Mätje <
> Stefan.Maetje at esd-electronics.com> wrote:
> 
> > Dear list members,
> >
> > I often read on this list that including literal strings in parser rules
> > is not recommended, because doing so would cause problems and make error
> > reporting more difficult.
> >
> > Could somebody explain the possible problems and drawbacks to me? All the
> > postings I have found on the list so far sound a little vague to me.
> >
> > Can somebody please point me to a discussion or an example grammar where
> > the pros and cons are laid out more thoroughly?
> >
> > At the moment I have a somewhat mixed grammar file (around 1800 lines)
> > that partly uses lexer tokens and partly uses string literals in the
> > parser rules. I use string literals especially when a keyword occurs in
> > only a single rule.
> >
> > Regards,
> >        Stefan Mätje
> >
> 