[antlr-interest] recognizing a function

Sat Jul 26 06:09:43 PDT 2008

Thank you Ana Nelson and John B Brodie, I have learned much from your
assiduous responses.

My goal is to write a program that takes valid Fortran code and outputs the
locations of the functions (and later subroutines and function calls too).

I am still having a hard time figuring out how I can write a grammar that
will match only a certain rule and ignore all other input.

Must I define a full Fortran grammar for that?

On Fri, Jul 25, 2008 at 1:28 AM, John B. Brodie <jbb at acm.org> wrote:

> Greetings!
>
> Guy Kroizman wrote (in part):
> >I have written a grammar that I hoped would find a function definition
> >in a Fortran file.
> >Running it produces nothing. s-:
> >
> >I played with it a lot and debugged it with jdb and ANTLRWorks, but to
> >no avail.
> >I wonder if anybody would be so kind as to point me to the problem with
> >the grammar.
> >
> >grammar fun;
> >
> > root     :
> >     (functionStatement)*
> >     ;
>
> It is that pesky * on your start rule.
>
> You have said that a valid program (i.e. any parsable derivation starting
> from your root rule) may contain ZERO or more functionStatements.
>
> So when you run your parser against the input you supplied in the previous
> message, the parser sees the keyword - er, I mean the NAME - PROGRAM as the
> first token it encounters.  PROGRAM is not a valid starting token for the
> functionStatement rule, so the parser just silently quits without parsing
> anything: it found ZERO functionStatements, and you have said that is an
> okay thing.
>
>
> Suggestions:
>
> 1) I would suggest that you explicitly require an EOF token at the end of
>   any valid input - this will immediately show problems like the one
>   discussed above.  So I would suggest that you change your root rule to:
>
> root : ( functionStatement )* EOF ;
>
>   running your parser with this version of the root rule should produce a
>   syntax error - something similar to "found PROGRAM, expecting FUNCTION"
>
> 2) I would suggest not trying to deal with case insensitivity in your
>   lexer. Rather, I would suggest using the case-insensitive input file
>   stream posted to the antlr-interest mailing list back in December of
>   2006. Ask about it again if you can't find it in the list's archives.
>
> 3) I would not try to recognize keywords using a Parser rule - such as your
>   type rule. Your type rule expects to see each individual letter of the
>   various keywords. However, ANTLR lexers are very greedy, they will
>   consume the longest possible sequence of characters that matches some
>   lexer rule. So your type rule will never see any individual letter
>   because all of the letters will be greedily gobbled up by the NAME
>   rule. Make the type rule be a lexer rule, and see the next suggestion...
>
> 4) You are going to have a devil of a time trying to deal with
>   keywords that may also be identifiers.  I believe there are lots of
>   messages about this in the mailing list archives.
>
> Hope this helps.
>    -jbb
>
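
For anyone following along, the effect of suggestions 1 and 3 can be sketched
outside ANTLR. The following is a minimal Python illustration, not ANTLR
output and not the grammar from this thread: the keyword set, the simplified
functionStatement (just FUNCTION followed by a NAME), and the names tokenize
and parse_root are all assumptions made up for the example. It shows a greedy
lexer that turns a whole word into one token, and a root rule that only
reports the unexpected PROGRAM token once EOF is required.

```python
import re

# Hypothetical, heavily reduced keyword set; real Fortran has many more.
KEYWORDS = {"FUNCTION", "PROGRAM", "END", "INTEGER", "REAL"}


def tokenize(text):
    """Greedy lexer: a run of letters/digits becomes ONE token, never letters."""
    tokens = []
    for lexeme in re.findall(r"[A-Za-z][A-Za-z0-9_]*|\d+|\S", text):
        upper = lexeme.upper()  # case-insensitive keyword lookup
        if upper in KEYWORDS:
            tokens.append((upper, lexeme))
        elif lexeme[0].isalpha():
            tokens.append(("NAME", lexeme))
        elif lexeme[0].isdigit():
            tokens.append(("NUMBER", lexeme))
        else:
            tokens.append((lexeme, lexeme))  # punctuation is its own type
    tokens.append(("EOF", ""))
    return tokens


def parse_root(tokens, require_eof):
    """root : (functionStatement)* [EOF] ;  returns the function names found.

    functionStatement is simplified here to: FUNCTION NAME
    """
    pos = 0
    found = []
    while tokens[pos][0] == "FUNCTION":
        kind, text = tokens[pos + 1]
        if kind != "NAME":
            raise SyntaxError(f"found {kind}, expecting NAME")
        found.append(text)
        pos += 2
    # Without this check, zero matches is a perfectly valid parse.
    if require_eof and tokens[pos][0] != "EOF":
        raise SyntaxError(f"found {tokens[pos][0]}, expecting FUNCTION or EOF")
    return found


if __name__ == "__main__":
    source = "program demo\nfunction f\n"
    print(parse_root(tokenize(source), require_eof=False))
    try:
        parse_root(tokenize(source), require_eof=True)
    except SyntaxError as err:
        print("syntax error:", err)
```

With require_eof=False the parse of an input starting with PROGRAM succeeds
and returns an empty list, which mirrors the "produces nothing" symptom; with
require_eof=True the same input raises an error naming the offending token.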