[antlr-interest] literals versus arbitrary text
Jose San Leandro
jose.sanleandro at ventura24.es
Wed Jul 19 02:07:27 PDT 2006
Hi,
I'm having trouble writing a grammar that processes Java sources only to
extract the main class or interface declaration.
Previously, I was using a regexp such as:
(.*?)?(public|protected|private)\s+(class|interface)(\s+extends.*?)?
(\s+implements.*?)?(\{.*)
That approach performed poorly, so I chose to define a grammar instead.
However, I don't know how to distinguish an occurrence of the word 'class'
in the copyright header from one in the class declaration or one inside the
class body.
I started by defining a literal for each of the words I care about, but what
distinguishes the occurrences is not the word itself but its context.
What is the usual approach in cases like this? I mean a grammar that starts
by treating everything as arbitrary text until a context is found, then
parses that context and consumes the rest.
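
To make the question concrete, here is a minimal sketch of the behaviour I am
after. It is a hand-written scanner in Python, not an ANTLR grammar, and the
names find_main_type, _parse_header and SAMPLE are hypothetical. It treats
comments and string literals as opaque text, skips everything else it does
not recognise, and only starts "parsing" once it meets a real class or
interface keyword. It assumes no generics in the declaration header.

```python
import re

# Only these token kinds are recognised; any other character is skipped
# as arbitrary text by finditer().
_TOKEN = re.compile(
    r"""
      (?P<comment>//[^\n]*|/\*.*?\*/)
    | (?P<string>"(?:\\.|[^"\\\n])*"|'(?:\\.|[^'\\\n])*')
    | (?P<word>(?<![.\w$])[A-Za-z_$][\w$]*(?:\.[A-Za-z_$][\w$]*)*)
    | (?P<brace>\{)
    """,
    re.VERBOSE | re.DOTALL,
)


def _parse_header(kind, tokens):
    """Consume tokens up to the opening brace of the type body."""
    name = None
    section = None
    parents = {"extends": [], "implements": []}
    for m in tokens:
        if m.lastgroup in ("comment", "string"):
            continue
        if m.lastgroup == "brace":
            break  # the body starts here; the rest of the file is ignored
        text = m.group()
        if name is None:
            name = text
        elif text in parents:
            section = text
        elif section is not None:
            parents[section].append(text)
    return {
        "kind": kind,
        "name": name,
        "extends": parents["extends"],
        "implements": parents["implements"],
    }


def find_main_type(source):
    """Return the first class/interface declaration outside comments and strings."""
    tokens = _TOKEN.finditer(source)
    for m in tokens:
        # Dotted names are a single token, so 'Foo.class' never equals 'class'.
        if m.lastgroup == "word" and m.group() in ("class", "interface"):
            return _parse_header(m.group(), tokens)
    return None


SAMPLE = '''\
/*
 * Copyright: this class header mentions "public class Fake {" on purpose.
 */
package demo;

// An interface called Decoy is not declared here.
public class Real extends Base implements java.io.Serializable, Runnable {
    String s = "class Inner {";
    Class<?> c = Real.class;
}
'''

print(find_main_type(SAMPLE))
```

The point of the sketch is that the word 'class' is only significant when the
scanner is not inside a comment or a literal, and that scanning can stop as
soon as the opening brace of the body is reached.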
Thank you very much in advance, and thanks to Ter and the rest for both ANTLR
and ST.
Jose.