[antlr-interest] Suggested enhancement for code generation with imported grammars
Ron Hunter-Duvar
ron.hunter-duvar at oracle.com
Fri Feb 26 15:51:58 PST 2010
Hi,
Splitting my grammar into a main one and imported sub-grammars solved my
Java class size problem, but it created another unexpected one. The
driver logic for the parsers creates a single parser object for a
language and reuses it for parsing multiple files. It's created
initially with a null TokenStream, then for each input file a new
TokenStream is created and set with a call to setTokenStream (as well as
a call to reset). This worked fine when it was all in one grammar. But
each imported grammar creates a separate parser class and the main
parser object gets fields generated for it with delegate sub-parser
objects assigned to them in the constructor. A call to setTokenStream
only sets the token stream for the top level parser object, not for the
delegates. That leaves the delegates still using the TokenStream passed
into the constructor. In my case, that resulted in a null pointer
exception as soon as a rule in a delegate was called, because input was
null.
It would be nice if an overriding setTokenStream method was generated in
the parser class that knew about the delegates and called setTokenStream
on them as well. This is already being done for setTreeAdaptor. So it
should be simple to do it for setTokenStream as well.
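To illustrate the request, here is a minimal sketch of what such a generated override might look like. It is not actual ANTLR output. TokenStream and Parser below are simplified stand-ins for the org.antlr.runtime types so the example compiles on its own, and the names MainParser, Main_Sub and gSub are hypothetical.

```java
// Simplified stand-ins for org.antlr.runtime.TokenStream and
// org.antlr.runtime.Parser (assumptions, not the real classes).
interface TokenStream {}

class Parser {
    protected TokenStream input;

    Parser(TokenStream input) { this.input = input; }

    // The stand-in only stores the stream.
    public void setTokenStream(TokenStream input) { this.input = input; }

    public TokenStream getTokenStream() { return input; }
}

// Hypothetical delegate parser class, one per imported grammar.
class Main_Sub extends Parser {
    Main_Sub(TokenStream input) { super(input); }
}

// Hypothetical main parser holding its delegate in a field that is
// assigned in the constructor.
class MainParser extends Parser {
    public Main_Sub gSub;

    MainParser(TokenStream input) {
        super(input);
        gSub = new Main_Sub(input);
    }

    // The suggested override: update this parser, then every delegate.
    @Override
    public void setTokenStream(TokenStream input) {
        super.setTokenStream(input);
        gSub.setTokenStream(input);
    }
}
```

With an override like this, a driver that constructs MainParser with a null stream and later calls setTokenStream for each file would leave gSub pointing at the current stream rather than at null.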
Of course, if I provided a valid TokenStream when I created the parser
object and didn't try to reuse the parser, I wouldn't run into this.
There probably isn't a good argument for reusing them, as the amount of
extra garbage created by creating new ones each time would be trivial
compared to all the Token and other objects created. But if parser
objects are not intended to be reusable, the reset and setTokenStream
methods in org.antlr.runtime.Parser should be protected rather than
public, as public implies they are there for public use (though I know
public is sometimes necessary when neither package nor protected
access will work). At least it would be good if it were
stated somewhere that reuse of parser objects is not intended/supported.
In my case it would be non-trivial to eliminate all the reuse of parser
objects, so I hacked around it with some ugly introspection code to find
the delegates and call setTokenStream on them too.
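For anyone in the same situation, a workaround along those lines could look roughly like the sketch below. This is not my actual code, just one way to do it. Stream and BaseParser are simplified stand-ins for the ANTLR runtime types, and TopParser, SubParser, gSub, gParent and DelegateStreams are hypothetical names. The sketch assumes delegates are held in instance fields whose type is a parser class, and it tracks visited objects in case a delegate holds a reference back to the parser that owns it.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Simplified stand-ins for the ANTLR runtime types (assumptions).
interface Stream {}

class BaseParser {
    protected Stream input;
    BaseParser(Stream input) { this.input = input; }
    public void setTokenStream(Stream input) { this.input = input; }
    public Stream getTokenStream() { return input; }
}

// Hypothetical delegate that keeps a back-reference to its owner.
class SubParser extends BaseParser {
    public BaseParser gParent;
    SubParser(Stream input, BaseParser parent) {
        super(input);
        this.gParent = parent;
    }
}

// Hypothetical main parser with one delegate field.
class TopParser extends BaseParser {
    public SubParser gSub;
    TopParser(Stream input) {
        super(input);
        gSub = new SubParser(input, this);
    }
}

final class DelegateStreams {
    private DelegateStreams() {}

    // Sets the stream on root and on every parser reachable through its fields.
    static void setTokenStreamDeep(BaseParser root, Stream input) {
        Set<Object> seen =
            Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        visit(root, input, seen);
    }

    private static void visit(BaseParser parser, Stream input, Set<Object> seen) {
        // Skip nulls and parsers already handled (guards against cycles).
        if (parser == null || !seen.add(parser)) return;
        parser.setTokenStream(input);
        for (Class<?> c = parser.getClass(); c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) continue;
                if (!BaseParser.class.isAssignableFrom(f.getType())) continue;
                try {
                    f.setAccessible(true);
                    visit((BaseParser) f.get(parser), input, seen);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("cannot read field " + f, e);
                }
            }
        }
    }
}
```

The driver would then call DelegateStreams.setTokenStreamDeep(parser, stream) wherever it used to call setTokenStream directly.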
Ron
--
Ron Hunter-Duvar | Software Developer V | 403-272-6580
Oracle Service Engineering
Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5
All opinions expressed here are mine, and do not necessarily represent
those of my employer.