[antlr-interest] Merging token vocabularies

Wed Jun 23 12:27:46 PDT 2004

I have been writing multiple parsers to and from various formats, and
have reached a point at which it is impossible to use pure ANTLR
mechanisms to manage all the different token vocabularies.  For
example, I translate language C to language X by building an AST for
C, using a tree parser to turn the C AST into an X AST, and then a
tree walker to serialize the X AST.  I also parse languages A and B
to C.  Furthermore, most of the parsers actually call other parsers in
order to do the complete job, e.g., a parser for E might get an E
token but then create a D lexer and D parser in order to build an AST
that gets embedded into the E AST.

In short, because of the rampant sharing of token space, the only way
to manage all this seems to be to have all of the lexers, parsers, and
tree parsers importVocab from a single huge CommonTokenTypes, which is
the union of everything that exists.  Scripts then have to make sure
that CommonTokenTypes is generated properly and that ANTLR is rerun
until a fixed point is reached.  Does anyone have an alternate
strategy for handling the problem of token space?
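
For concreteness, the merging step could be sketched roughly as below.
This is only an illustration, not the actual scripts: it assumes the
ANTLR 2 *TokenTypes.txt layout (a vocabulary-name line followed by
entries such as ID=4, "if"=5, or LPAREN("(")=6) and that user token
types start at 4.  The names merge_vocabularies, read_token_names, and
render are hypothetical.

```python
import re

# Assumption: ANTLR 2 reserves the low token types, so user tokens start at 4.
MIN_USER_TYPE = 4

# Assumed entry shapes: ID=4, "if"=5, LPAREN("(")=6.
# Header and comment lines do not match and are skipped.
_ENTRY = re.compile(r'^(?P<name>"[^"]*"|\w+)(?:\(.*\))?=(?P<type>\d+)\s*$')


def read_token_names(text):
    """Return token names in order of appearance, ignoring their old numbers."""
    names = []
    for line in text.splitlines():
        match = _ENTRY.match(line.strip())
        if match:
            names.append(match.group("name"))
    return names


def merge_vocabularies(texts):
    """Give every name in the union of all vocabularies one type number."""
    merged = {}
    for text in texts:
        for name in read_token_names(text):
            if name not in merged:
                merged[name] = MIN_USER_TYPE + len(merged)
    return merged


def render(vocab_name, merged):
    """Write the union back out; paraphrase labels like ("(") are dropped."""
    lines = [vocab_name + "    // output token vocab name"]
    lines += ["%s=%d" % (name, number) for name, number in merged.items()]
    return "\n".join(lines) + "\n"


c_vocab = 'C    // output token vocab name\nID=4\n"if"=5\nLPAREN("(")=6\n'
x_vocab = 'X    // output token vocab name\nNUM=4\nID=5\n'
merged = merge_vocabularies([c_vocab, x_vocab])
common_text = render("Common", merged)
```

Since a grammar that imports the merged vocabulary can still define
tokens of its own, a driver would presumably rerun ANTLR and this merge
until a pass adds no new names, which is the fixed point mentioned
above.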

-- 
Franklin
