[antlr-interest] recognizing a function
Guy Kroizman
kroizguy at gmail.com
Sat Jul 26 06:09:43 PDT 2008
Thank you Ana Nelson and John B Brodie, I have learned much from your
assiduous responses.
My goal is to write a program that takes valid Fortran code and outputs the
locations of the functions (later, subroutines and function calls too).
I am still having a hard time figuring out how I can write a grammar that will
match only a certain rule and ignore all other input.
Must I define a full Fortran grammar for that?
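
For illustration only, the kind of output described above (report where
functions are defined, skip everything else) can be sketched outside ANTLR as
a plain line scan. This is a rough sketch, not a Fortran parser: it assumes
fixed-form source, one statement per line, and no continuation lines. The
names FUNCTION_HEADER, find_functions and SAMPLE are made up for the example.

```python
import re

# Assumption: fixed-form source, so statements are indented and a "C", "*" or
# "!" in column 1 marks a comment line. Continuation lines are not handled.
# Pattern: optional type prefix (INTEGER, REAL*8, ...), FUNCTION, a name, "(".
FUNCTION_HEADER = re.compile(
    r"^\s+(?:[A-Z][A-Z0-9*() ]*\s+)?FUNCTION\s+([A-Z][A-Z0-9_]*)\s*\(",
    re.IGNORECASE,
)


def find_functions(source):
    """Return (line_number, name) pairs for lines that look like function headers."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if line[:1] in ("C", "c", "*", "!"):  # fixed-form comment line
            continue
        match = FUNCTION_HEADER.match(line)
        # Guard against "END FUNCTION name" style lines.
        if match and not line.lstrip().upper().startswith("END"):
            hits.append((number, match.group(1)))
    return hits


SAMPLE = """\
      PROGRAM MAIN
      PRINT *, ADD(1, 2)
      END
C     FUNCTION HIDDEN(X) is only mentioned in a comment
      INTEGER FUNCTION ADD(A, B)
      INTEGER A, B
      ADD = A + B
      END
"""

for line_number, name in find_functions(SAMPLE):
    print(f"line {line_number}: function {name}")
```

A scan like this ignores everything it does not recognize, which is the
behaviour asked about, but it will miss headers split across continuation
lines and knows nothing about the rest of the language.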
On Fri, Jul 25, 2008 at 1:28 AM, John B. Brodie <jbb at acm.org> wrote:
> Greetings!
>
> Guy Kroizman wrote (in part):
> >I have written a grammar that I hoped would find a function definition in a
> >Fortran file.
> >Running it produces nothing. s-:
> >
> >I played with it a lot and debugged it with jdb and ANTLRWorks, but to no
> >avail.
> >I wonder if anybody would be so kind as to point me to the problem with the
> >grammar.
> >
> >grammar fun;
> >
> > root :
> > (functionStatement)*
> > ;
>
> It is that pesky * on your start rule.
>
> You have said that a valid program (e.g. any parsable derivation starting
> from your root rule) may contain ZERO or more functionStatements.
>
> So when you run your parser against the input you supplied in the previous
> message, the parser sees the keyword - er I mean the NAME - PROGRAM as the
> first token it encounters. PROGRAM is not a valid starting token for the
> functionStatement rule. So the parser just silently quits, without parsing
> anything, because it found ZERO functionStatements and you have said that
> is an okay thing.
>
>
> Suggestions:
>
> 1) I would suggest that you explicitly require an EOF token at the end of
> any valid input - this will immediately show problems like the one
> discussed above. So I would suggest that you change your root rule to:
>
> root : ( functionStatement )* EOF ;
>
> running your parser with this version of the root rule should produce a
> syntax error - something similar to "found PROGRAM, expecting FUNCTION"
>
> 2) I would suggest not trying to deal with case insensitivity in your
> lexer. Rather I would suggest using the case insensitive input file
> stream posted to the antlr-interest mailing list back in December of
> 2006. Ask about it again if you can't find it in the list's archives.
>
> 3) I would not try to recognize keywords using a Parser rule - such as your
> type rule. Your type rule expects to see each individual letter of the
> various keywords. However, ANTLR lexers are very greedy, they will
> consume the longest possible sequence of characters that matches some
> lexer rule. So your type rule will never see any individual letter
> because all of the letters will be greedily gobbled up by the NAME
> rule. Make the type rule be a lexer rule, and see the next suggestion...
>
> 4) You are going to experience a devil of a time trying to deal with
> keywords that also may be identifiers. I believe there are lots of
> messages about this in the mailing list archives.
>
> Hope this helps.
> -jbb
>
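
Two of the points above, the greedy lexer and the start rule that accepts
zero matches unless EOF is required, can be imitated in a few lines of Python.
This is a toy model and not ANTLR-generated code: TOKEN_RULES, tokenize and
parse_root are hypothetical names, and functionStatement is reduced to the
stand-in "NAME 'FUNCTION' NAME".

```python
import re

# Hypothetical token rules, loosely modelled on the NAME rule discussed above.
# LETTER stands in for a rule that wants to see individual letters.
TOKEN_RULES = [
    ("NAME", re.compile(r"[A-Za-z][A-Za-z0-9_]*")),
    ("LETTER", re.compile(r"[A-Za-z]")),
    ("LPAREN", re.compile(r"\(")),
    ("RPAREN", re.compile(r"\)")),
    ("WS", re.compile(r"\s+")),
]


def tokenize(text):
    """Longest-match lexer: the rule consuming the most characters wins;
    on a tie the rule listed first wins."""
    tokens, pos = [], 0
    while pos < len(text):
        best_kind, best_end = None, pos
        for kind, pattern in TOKEN_RULES:
            m = pattern.match(text, pos)
            if m and m.end() > best_end:
                best_kind, best_end = kind, m.end()
        if best_kind is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if best_kind != "WS":  # whitespace is skipped
            tokens.append((best_kind, text[pos:best_end]))
        pos = best_end
    tokens.append(("EOF", ""))
    return tokens


def parse_root(tokens, require_eof):
    """root : functionStatement* (EOF when require_eof) ;
    functionStatement is simplified to: NAME 'FUNCTION' NAME.
    Returns the number of function statements matched."""
    pos, count = 0, 0
    while (pos + 2 < len(tokens)
           and tokens[pos][0] == "NAME"
           and tokens[pos + 1][1].upper() == "FUNCTION"
           and tokens[pos + 2][0] == "NAME"):
        pos += 3
        count += 1
    if require_eof and tokens[pos][0] != "EOF":
        raise SyntaxError(f"found {tokens[pos][1]!r}, expecting a function header or EOF")
    return count


# The whole word becomes one NAME token; LETTER never gets a chance.
print(tokenize("INTEGER"))

program = tokenize("PROGRAM MAIN")
# Without EOF the rule matches zero statements and stops without complaint.
print(parse_root(program, require_eof=False))
# With EOF the leftover PROGRAM token is reported.
try:
    parse_root(program, require_eof=True)
except SyntaxError as err:
    print("syntax error:", err)
```

The point of the toy is only the control flow: a starred start rule is
satisfied by zero matches, so unconsumed input goes unnoticed until the rule
also demands EOF.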