[antlr-interest] literals versus arbitrary text

Jose San Leandro jose.sanleandro at ventura24.es
Wed Jul 19 02:07:27 PDT 2006


Hi,

I'm having trouble implementing a grammar that processes Java sources only 
far enough to extract the main class or interface declaration.
Previously, I was using a regexp such as:

(.*?)?(public|protected|private)\s+(class|interface)(\s+extends.*?)?
(\s+implements.*?)?(\{.*)

That approach performed poorly, so I chose to define a grammar instead. 
However, I don't know how to distinguish an occurrence of the word 'class' 
inside the copyright header from one in the class declaration itself or one 
inside the class body.
I started by defining a literal for each of the words I care about, but what 
distinguishes these occurrences is not the word itself but its context.
What is the usual approach in cases like this? I mean a grammar that starts 
out treating everything as arbitrary text until a context is found, then 
parses that context, and simply consumes the rest.
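
To make the idea concrete, here is a rough hand-written sketch in plain Java 
(not an ANTLR grammar, and not a tested solution). It treats comments and 
string literals as text to skip, and returns the first header that contains 
'class' or 'interface' as a whole word. The names DeclarationFinder, 
findDeclaration and skipLiteral are made up for illustration, and the sketch 
assumes the declaration is not preceded by an annotation that contains braces.

```java
import java.util.regex.Pattern;

final class DeclarationFinder {
    // Whole-word match, so "subclass" or "classes" do not count as keywords.
    private static final Pattern TYPE_KEYWORD =
            Pattern.compile("\\b(class|interface)\\b");

    /** Returns the first class/interface header (the text before its '{'), or null. */
    static String findDeclaration(String src) {
        StringBuilder header = new StringBuilder(); // code seen since the last ';'
        int i = 0;
        int n = src.length();
        while (i < n) {
            char c = src.charAt(i);
            if (src.startsWith("/*", i)) {          // block comment, e.g. the copyright header
                int end = src.indexOf("*/", i + 2);
                i = (end < 0) ? n : end + 2;
                header.append(' ');                 // keep neighbouring tokens separated
            } else if (src.startsWith("//", i)) {   // line comment: jump to its newline
                int end = src.indexOf('\n', i);
                i = (end < 0) ? n : end;
            } else if (c == '"' || c == '\'') {     // literal: dropped from the header
                i = skipLiteral(src, i);
            } else if (c == ';') {                  // end of a package or import statement
                header.setLength(0);
                i++;
            } else if (c == '{') {                  // a body starts: inspect what preceded it
                String candidate = header.toString().trim().replaceAll("\\s+", " ");
                if (TYPE_KEYWORD.matcher(candidate).find()) {
                    return candidate;               // stop here; the rest is never scanned
                }
                header.setLength(0);
                i++;
            } else {
                header.append(c);
                i++;
            }
        }
        return null;
    }

    /** Returns the index just past the string or char literal starting at start. */
    private static int skipLiteral(String src, int start) {
        char quote = src.charAt(start);
        int i = start + 1;
        while (i < src.length() && src.charAt(i) != quote) {
            if (src.charAt(i) == '\\') {
                i++;                                // skip the escaped character
            }
            i++;
        }
        return Math.min(i + 1, src.length());
    }
}
```

Calling DeclarationFinder.findDeclaration on the contents of a source file 
would give back something like "public class Foo extends Bar", or null if no 
declaration is found. Because scanning stops at the first matching '{', any 
'class' inside the body is never looked at.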

Thank you very much in advance, and thanks to Ter and everyone else for both 
ANTLR and ST.

Jose.

