[antlr-interest] ANTLR version 2.X to ANTLR version 3.X (the horror, the horror)

Fri Aug 8 13:07:01 PDT 2008

Ian Kaplan wrote:

> More examples with semantic actions (Java code) would be very helpful.
> It took me some time, for example, to understand that the blocks follow
> each other.
>
>  @init{
>  }
>  @after{
>   }

Yes, it is not obvious whether tags like these go before or after the
colon, or that their order matters so much. The order of the options,
tokens, header, and members tags is also rigid, and the error messages when
they are out of order are confusing: they make it seem as though the tag
itself is invalid.
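
For what it's worth, here is a minimal sketch of a layout that ANTLR v3
should accept, with the grammar-level sections in order and the rule-level
@init/@after blocks placed between the rule name and the colon. The grammar
name Demo, the rule sum, and the seen list are made up for illustration;
they are not from the documentation or from anyone's real grammar.

```antlr
grammar Demo;

// Grammar-level sections: options, then tokens, then the named actions.
options { language = Java; }
tokens  { PLUS = '+'; }

@header  { import java.util.ArrayList; import java.util.List; }
@members { List<String> seen = new ArrayList<String>(); }

// Rule-level actions go after the rule name and before the colon.
sum
@init  { seen.add("enter sum"); }   // placed before the rule body
@after { seen.add("leave sum"); }   // placed after the rule body
    : INT (PLUS INT)*
    ;

INT : '0'..'9'+ ;
WS  : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```

Moving @init or @after to the far side of the colon, or putting tokens ahead
of options, is the kind of change that tends to produce the misleading
errors described above.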

> As noted in the 2.X to 3.X documentation, there's no built in way to
> create case insensitivity without overriding the scanner input stream.

The limitation may be noted, but no good solution is. Case-insensitive
keywords are such a common feature that it's difficult to believe that a
straightforward solution is not provided.
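
One grammar-only workaround, short of overriding the input stream, is to
spell each keyword out of per-letter fragment rules. This is only a sketch
and it gets verbose quickly; the grammar name Keywords, the rule query, and
the two keywords are assumptions chosen for illustration.

```antlr
grammar Keywords;

// Hypothetical statement form, just enough to exercise the keywords.
query : SELECT ID FROM ID ;

// Keywords are spelled letter by letter so any mix of case matches.
// They are listed before ID so the keyword is preferred on an exact match.
SELECT : S E L E C T ;
FROM   : F R O M ;

ID : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '0'..'9' | '_')* ;
WS : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;

// One fragment per letter used above; fragments never become tokens.
fragment C : 'c' | 'C' ;
fragment E : 'e' | 'E' ;
fragment F : 'f' | 'F' ;
fragment L : 'l' | 'L' ;
fragment M : 'm' | 'M' ;
fragment O : 'o' | 'O' ;
fragment R : 'r' | 'R' ;
fragment S : 's' | 'S' ;
fragment T : 't' | 'T' ;
```

The token text keeps whatever case the user typed, which is usually what you
want for identifiers and error messages, but every new keyword has to be
written out this way.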

> The good news is that there's documentation, but for some reason with
> ANTLR there never seems to be enough documentation to make the initial
> learning curve anything but painful.

Exactly.

It's free. It's powerful. There are some great people actively improving the
tool. None of this is taken for granted. But it doesn't change the fact that
even with the DAR (The Definitive ANTLR Reference), getting it to do what you
want is a frustrating experience. It seems fine once you've gone through the
multi-day learning experience. But the syntax is varied enough, and the
methods different enough, that there will still be some struggling.

A v3.1 "gotchas" page, along with an ANTLR Cookbook, would probably go a long
way toward helping those new to v3 specifically, and ANTLR in general.

> I noticed that the person who maintains the 2.X C++ grammar is looking
> for someone to take it over since they don't want to deal with the
> conversion to ANTLR 3.X.  I can't say I blame them.  My grammar is a lot
> smaller and it's going to be at least a two day slog with a fair amount
> of frustration.

Likely more than that.

> In addition to the fact that the 2.X grammar is obsolete, I'm doing the
> conversion because I am hoping that the LL(*) will avoid left factoring
> my grammar into a less clear form.  I hope that I am not disappointed.

I don't know that LL(*) will solve all of your left-factoring woes, but the
backtracking does help you make your grammar more readable (and therefore
maintainable). It adds some parsing overhead, but it's worth it. (How many
times have we wished an LALR parser generator would just "figure it out"?)
And it can be localized to just the rules you need, once you get it working,
by turning off backtracking at the global level and adding it to individual
rules that are ambiguous to the LL(k) algorithm:

stmt  options {backtrack=true; memoize=true;}
      :  expr .
      ;
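
To make that concrete, here is a small self-contained sketch of the same
idea. The grammar name Local and the rules below are invented for
illustration, not taken from a real grammar. Both alternatives of stmt begin
with an arbitrarily nested expr, which is the sort of decision ANTLR v3
reports as non-LL(*) unless the rule is allowed to backtrack.

```antlr
grammar Local;

// Backtracking stays off for the grammar as a whole (also the default).
options { backtrack = false; }

// Only this rule backtracks; memoize caches partial results so the
// speculative parse of expr is not repeated from the same position.
stmt
options { backtrack = true; memoize = true; }
    : expr '!'     // e.g. ((x))!
    | expr '?'     // e.g. ((x))?
    ;

// Recursive rule reachable from both alternatives of stmt.
expr
    : '(' expr ')'
    | ID
    ;

ID : ('a'..'z' | 'A'..'Z')+ ;
WS : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```

Every other rule in the grammar is still analyzed without backtracking, so
the extra parsing cost is confined to stmt.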

Brent
