[antlr-interest] Suggested enhancement for code generation with imported grammars

Ron Hunter-Duvar ron.hunter-duvar at oracle.com
Fri Feb 26 15:51:58 PST 2010


Hi,

Splitting my grammar into a main one and imported sub-grammars solved my 
Java class size problem, but it created another unexpected one. The 
driver logic for the parsers creates a single parser object for a 
language and reuses it for parsing multiple files. It's created 
initially with a null TokenStream, then for each input file a new 
TokenStream is created and set with a call to setTokenStream (as well as 
a call to reset). This worked fine when it was all in one grammar. But 
each imported grammar creates a separate parser class and the main 
parser object gets fields generated for it with delegate sub-parser 
objects assigned to them in the constructor. A call to setTokenStream 
only sets the token stream for the top level parser object, not for the 
delegates. That leaves the delegates still using the TokenStream passed 
into the constructor. In my case, that resulted in a null pointer 
exception as soon as a rule in a delegate was called, because input was 
null.

It would be nice if an overriding setTokenStream method was generated in 
the parser class that knew about the delegates and called setTokenStream 
on them as well. This is already being done for setTreeAdaptor. So it 
should be simple to do it for setTokenStream as well.

Of course, if I provided a valid TokenStream when I created the parser 
object and didn't try to reuse the parser, I wouldn't run into this. 
There probably isn't a good argument for reusing them, as the amount of 
extra garbage created by creating new ones each time would be trivial 
compared to all the Token and other objects created. But if parser 
objects are not intended to be reusable, the reset and setTokenStream 
methods in org.antlr.runtime.Parser should be protected rather than 
public, as public implies they are there for public use (but I know 
sometimes it becomes necessary for public to be used when neither 
package or protected will work). At least it would be good if it were 
stated somewhere that reuse of parser objects is not intended/supported.

In my case it would be non-trivial to eliminate all the reuse of parser 
objects, so I hacked around it with some ugly introspection code to find 
the delegates and call setTokenStream on them too.

Ron

-- 
Ron Hunter-Duvar | Software Developer V | 403-272-6580
Oracle Service Engineering
Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5

All opinions expressed here are mine, and do not necessarily represent
those of my employer.



More information about the antlr-interest mailing list