[antlr-interest] Island grammars: returning information through tokens?

Fri Feb 25 02:04:02 PST 2011

Hi there,

I've been looking at the island-grammar example, which switches between
Simple and Javadoc at the lexer level.  What's not clear to me is how
information can be returned to the containing document from this kind
of setup, because there doesn't seem to be an easy way to attach
information to lexer tokens.  The example treats the island (Javadoc)
material as a single JAVADOC token in the containing (Simple Java)
document, and I'm not sure how to attach information discovered in the
island grammar to that JAVADOC token.

It's artificial, but let's say I wanted the Simple parser to know the
names of authors contained in the Javadoc.  In the example, all
information is simply println()'ed, never communicated upwards.

I've tried a number of different ways of doing this, and several have
turned out to be wrong-headed:

  - 'returns' from a JAVADOC token: it looks as if tokens could have
    return values at one stage in ANTLR's development, but not any more.
  - Scoped variables: don't seem to work between the lexer and the
    parser, or maybe I didn't try hard enough?  I can see it making
    sense that they don't work in this context though.
  - Subclass CommonToken and use TokenLabelType to allow JAVADOC tokens
    to contain additional information, then add this information to the
    JAVADOC token as I complete the Javadoc parse.
  - Patch my lexer to emit multiple tokens, adding a fake "AUTHOR" token
    derived from the Javadoc parse at the same time that it explicitly
    emits the JAVADOC token.
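
To make the token-subclass idea concrete, here is a minimal sketch in
plain Java.  It does not use ANTLR at all: SketchToken is only a stand-in
for CommonToken, and JavadocToken, lexJavadoc() and the token type number
are names I made up for illustration.  The "island parse" is reduced to
scanning for "@author" lines; in a real grammar that work would be done
by the Javadoc lexer/parser, and the token would presumably be created
where the outer lexer emits JAVADOC.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class JavadocTokenSketch {
    // Stand-in for the island (Javadoc) parse: builds the JAVADOC token
    // and attaches every "@author" name found in the comment text.
    static JavadocToken lexJavadoc(String comment) {
        JavadocToken token = new JavadocToken(comment);
        for (String line : comment.split("\n")) {
            // Drop leading whitespace and comment decoration ('/', '*').
            String trimmed = line.replaceFirst("^[\\s*/]+", "").trim();
            if (trimmed.startsWith("@author ")) {
                token.addAuthor(trimmed.substring("@author ".length()).trim());
            }
        }
        return token;
    }

    public static void main(String[] args) {
        // The outer parser only sees a generic token and downcasts it.
        SketchToken t = lexJavadoc("/** Demo.\n * @author Conrad\n */");
        if (t instanceof JavadocToken) {
            System.out.println(((JavadocToken) t).getAuthors());
        }
    }
}

// Stand-in for ANTLR's CommonToken (assumption: type + text is enough here).
class SketchToken {
    final int type;
    final String text;
    SketchToken(int type, String text) { this.type = type; this.text = text; }
}

// Hypothetical token subclass carrying data discovered in the island.
class JavadocToken extends SketchToken {
    static final int JAVADOC = 1; // hypothetical token type number
    private final List<String> authors = new ArrayList<>();
    JavadocToken(String text) { super(JAVADOC, text); }
    void addAuthor(String name) { authors.add(name); }
    List<String> getAuthors() { return Collections.unmodifiableList(authors); }
}
```

With TokenLabelType set to the subclass, the cast in main() would
presumably not be needed in grammar actions, but I haven't verified that.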

I have the last method working, but is there a better way to do this?
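
Roughly, the shape of the multiple-token approach is sketched below.
Again this is plain Java with no ANTLR in it: QueuedToken and
MultiEmitLexerSketch are stand-ins with made-up names and type numbers,
the input is a list of already-isolated Javadoc comments, and emitting
JAVADOC before AUTHOR is just the order chosen for the sketch.  The
point is only that emit() appends to a queue and nextToken() drains it,
so one island match can hand several tokens to the outer parser.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class MultiEmitLexerSketch {
    private final Deque<QueuedToken> pending = new ArrayDeque<>();
    private final Deque<String> input; // remaining Javadoc comments

    MultiEmitLexerSketch(List<String> javadocComments) {
        this.input = new ArrayDeque<>(javadocComments);
    }

    // Queue a token instead of holding just one "current" token.
    void emit(QueuedToken token) { pending.addLast(token); }

    // Stand-in for matching one island: emits JAVADOC, then one fake
    // AUTHOR token per "@author" line found inside it.
    private void matchJavadoc(String comment) {
        emit(new QueuedToken(QueuedToken.JAVADOC, comment));
        for (String line : comment.split("\n")) {
            String trimmed = line.replaceFirst("^[\\s*/]+", "").trim();
            if (trimmed.startsWith("@author ")) {
                String name = trimmed.substring("@author ".length()).trim();
                emit(new QueuedToken(QueuedToken.AUTHOR, name));
            }
        }
    }

    // Drain queued tokens first; only match more input when empty.
    QueuedToken nextToken() {
        if (pending.isEmpty() && !input.isEmpty()) {
            matchJavadoc(input.removeFirst());
        }
        return pending.isEmpty()
                ? new QueuedToken(QueuedToken.EOF, "<EOF>")
                : pending.removeFirst();
    }

    public static void main(String[] args) {
        MultiEmitLexerSketch lexer = new MultiEmitLexerSketch(
                Arrays.asList("/**\n * @author Conrad\n */"));
        for (QueuedToken t = lexer.nextToken();
             t.type != QueuedToken.EOF; t = lexer.nextToken()) {
            System.out.println(t.type + " " + t.text.replace("\n", "\\n"));
        }
    }
}

// Stand-in for an ANTLR token: a type number plus the matched text.
class QueuedToken {
    static final int EOF = -1, JAVADOC = 1, AUTHOR = 2; // hypothetical
    final int type;
    final String text;
    QueuedToken(int type, String text) { this.type = type; this.text = text; }
}
```

The downside is that the Simple grammar then has to know about AUTHOR
tokens and where they can appear, which is part of why I'm asking.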

Conrad