[antlr-interest] Non-reserved keywords (again)

Wed Sep 14 12:53:13 PDT 2005

On Wed, Aug 10, 2005 at 07:44:07AM -0700, Monty Zukowski wrote:
> Olivier Dragon wrote:
> > I have searched far and wide to see if anyone has had the same problem I'm
> > having with a language that does not reserve keywords, like SQL and in
> > my case Fortran. I have found many people with the problem, yet no
> > useful solutions. The two main ones I found were
> > 
> > http://www.jguru.com/faq/view.jsp?EID=140
> > and
> > http://www.antlr.org/pipermail/antlr-interest/2002-June/001486.html
> > I have no idea how to execute the first one, that is create a custom
> > token class and have ANTLR actually use it.
> 
> Search through the example code, there should be one that does this.  If
> not, I know the C parser does.
>
> You may also get some mileage out of my parser filter approach.  See
> http://www.codetransform.com/filterexample.html

I found the examples you were talking about, but I think my problem is
much larger than that. The first link above and your parser filter
approach won't work. The problem with the first one is that I have many
rules that start with an identifier, which means I end up with the same
hoisting problem as with a wrapper rule.

No matter what I do I end up with a very large number of non-determinism
problems, which can't be resolved simply.

I thought of not testing literals in my identifier lexer rule, and then
for each NAME (my ID rule) have a syntactic predicate to check if the
text of the NAME token is the one I want. But this causes the same
non-determinism issues as having a keyword rule as mentioned in the
second link above.
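
A minimal sketch of that idea, written as plain Python rather than as an
ANTLR grammar. The token pattern, the helper names (tokenize, is_word,
classify) and the two statement shapes are assumptions made up for
illustration. The lexer never consults a keyword table; the parser
compares the text of a NAME token where a keyword could appear.

```python
import re

# Hypothetical token pattern: words and a few punctuation characters.
TOKEN_RE = re.compile(r"\s*(?:(?P<NAME>[A-Za-z][A-Za-z0-9_]*)|(?P<OP>[()=,]))")

def tokenize(text):
    """Split text into (kind, value) pairs; every word is a NAME, never a keyword."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        kind = m.lastgroup
        tokens.append((kind, m.group(kind)))
        pos = m.end()
    return tokens

def is_word(tokens, i, word):
    """Predicate: is token i a NAME whose text equals word (case-insensitive)?"""
    return i < len(tokens) and tokens[i][0] == "NAME" and tokens[i][1].lower() == word

def classify(line):
    """Decide the statement kind by looking at token text in context."""
    tokens = tokenize(line)
    # A declaration is the word REAL followed by another NAME, e.g. "REAL x".
    if is_word(tokens, 0, "real") and len(tokens) > 1 and tokens[1][0] == "NAME":
        return "declaration"
    # NAME '=' ... is an assignment, even when the NAME happens to be "real".
    if len(tokens) > 1 and tokens[0][0] == "NAME" and tokens[1] == ("OP", "="):
        return "assignment"
    return "unknown"
```

The hand-written version works only because the alternatives are tried
in a fixed order with explicit lookahead. In a generated parser, every
rule that starts with NAME competes with these checks, which is where
the non-determinism warnings come from.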

The only way for me to fix this problem "The Right Way(tm)" would be to
use Martin's proposition of a stateful lexer... but I hate to think
about this one :o)

Right now I'm making the assumption on input that the code was written
"sanely" without using reserved keywords as identifiers. The only
problem I've encountered so far is the FORTRAN intrinsic function REAL()
which clashes with the type specification REAL for floating-point values. To
fix this I added a parser rule that was (NAME | "real"). It would be
nice however to be able to resolve this issue entirely without
exceptions.
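
The workaround can be sketched as follows, again in plain Python rather
than ANTLR. The KEYWORDS table, the token kinds and the parse_call helper
are hypothetical names chosen for illustration. The lexer still turns
"real" into a keyword token, and the identifier rule simply accepts that
keyword token wherever a NAME is allowed.

```python
import re

KEYWORDS = {"real"}  # hypothetical literals table consulted by the lexer

LEXEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*|[()=,]")

def tokenize(text):
    """Return (kind, text) pairs; words found in KEYWORDS become REAL tokens."""
    tokens = []
    for lexeme in LEXEME_RE.findall(text):
        if lexeme.lower() in KEYWORDS:
            tokens.append(("REAL", lexeme))
        elif lexeme[0].isalpha():
            tokens.append(("NAME", lexeme))
        else:
            tokens.append((lexeme, lexeme))  # punctuation is its own kind
    return tokens

def identifier(tokens, i):
    """Counterpart of the rule (NAME | "real"): either kind counts as a name."""
    kind, text = tokens[i]
    if kind in ("NAME", "REAL"):
        return text
    raise SyntaxError(f"expected an identifier, got {text!r}")

def parse_call(tokens):
    """Parse identifier '(' identifier ')' and return (function, argument)."""
    if len(tokens) != 4 or tokens[1][0] != "(" or tokens[3][0] != ")":
        raise SyntaxError("expected a call of the form f(x)")
    return identifier(tokens, 0), identifier(tokens, 2)
```

For example, parse_call(tokenize("REAL(x)")) accepts REAL as a function
name even though the lexer tagged it as a keyword. Every additional
keyword that may double as an identifier needs its own alternative in
the identifier rule, which is why this only patches individual
exceptions.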

> ANTLR 3 will be way easier dealing with unreserved keywords.  I doubt if
> the latest preview is stable enough for real use yet.

I don't need production stability, and Terence seems relatively fast at
correcting bugs. I'm using ANTLR for my master's research project. What
would be an issue for me is to have to completely redo a lot of the work
I've done so far (grammar, tree parser, and a fair amount of tree
transformation Java code). Another showstopper is the ANTLRv3
documentation, which appears to be scarce. There are some of Terence's
notes in the wiki, but nothing substantial, and they are interspersed
with a lot of thinking and design speculation.

If ANTLRv3 does indeed make this simple and it's not too much work for
me to port my grammars then I may consider it.

Thanks for your help!

-Olivier

-- 
          __-/|    ? ?     |\-__
     __--/  /  \   (^^)   /  \  \--__
  _-/   /   /  /\ / ( )  /\  \   \   \-_
 /  /   /  /  /  (   ^^ ~  \  \  \   \  \
 / Oli Dragon    ( dragonoe at mcmaster.ca \
/  B.Eng. Sfwr   (     )    \    \  \    \
/  /  /    /__--_ (   ) __--__\    \  \  \
|  /  /  _/        \_ \_       \_  \  \  |
 \/  / _/            \_ \_       \_ \  \/
  \_/ /                -\_\        \ \_/
    \/                    )         \/
                        *~
        ___--<***************>--___
       [http://dragon.homelinux.org]
        ~~~--<***************>--~~~