[antlr-interest] Island grammar in AntlrV3

David Holroyd dave at badgers-in-foil.co.uk
Mon Dec 4 15:20:05 PST 2006


On Sat, Sep 02, 2006 at 11:01:43PM +0000, David Holroyd wrote:
> On Sat, Sep 02, 2006 at 11:09:32PM +0400, Ilia Kantor wrote:
> > I have an island grammar that only becomes known at the parsing
> > stage.
> > 
> > How can I parse it with a separate lexer/parser,
> >   adding the result to a common tree?
> 
> I've also been wondering about this, but I don't have an answer yet.
> 
> I'm worried that by the time the parser has the chance to try and fiddle
> with the lexer, lookahead has consumed the input anyway.  Maybe the
> standard infrastructure can let the input character stream 'backtrack'?
> 
> My specific use case is regular expression literals, e.g. I'd like to be
> able to handle,
> 
>   r =   / b; f = r/m;  // regexp literal with 'm' flag
>   r = a / b; f = r/m;  // two expr-statements involving division
> 
> It seems that the lexer needs context from the grammar in order to tell
> what to do on seeing '/'.
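
One common workaround, which is not the parser-driven approach asked
about here, is to let the lexer guess from the previous token: if an
operand has just ended, '/' is division, otherwise it starts a regexp
literal. A minimal sketch in Python with a toy hand-written lexer
rather than ANTLR; the token kinds and the ENDS_OPERAND set are
assumptions made up for this example, and the input omits comments:

```python
import re

# Token kinds after which an operand has just ended, so '/' must be division.
# This set is an assumption for the toy language here, not a complete rule.
ENDS_OPERAND = {"IDENT", "NUMBER", "RPAREN", "REGEX"}

SIMPLE = re.compile(
    r"\s*(?:(?P<IDENT>[A-Za-z_]\w*)|(?P<NUMBER>\d+)"
    r"|(?P<RPAREN>\))|(?P<PUNCT>[=;(+*-]))"
)
# A regexp literal: '/', a body without unescaped '/', '/', optional flags.
REGEX = re.compile(r"\s*/(?P<body>(?:\\.|[^/\\\n])+)/(?P<flags>[a-z]*)")
DIV = re.compile(r"\s*/")


def tokenize(text):
    """Return (kind, text) pairs, choosing DIV or REGEX from the previous token."""
    text = text.strip()
    tokens, pos, prev_kind = [], 0, None
    while pos < len(text):
        m = SIMPLE.match(text, pos)
        if m:
            kind = m.lastgroup
        else:
            # Only '/' can reach this branch; decide what it means from context.
            kind = "DIV" if prev_kind in ENDS_OPERAND else "REGEX"
            m = (DIV if kind == "DIV" else REGEX).match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append((kind, m.group(0).strip()))
        prev_kind = kind
        pos = m.end()
    return tokens
```

With this sketch the first line of the example yields a single REGEX
token for "/ b; f = r/m", while the second yields two DIV tokens. The
heuristic only sees one token of context, so it is weaker than real
feedback from the grammar.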

I've been avoiding working on this bit of my grammar, but I'm starting
to need it now.

At what level should I attack the problem?

My first idea is to have an action at the point in the outer grammar
where the island grammar's start-marker is recognised, which will...

 1) take the unprocessed tail of the CommonTokenStream that the
    outer parser currently has as input, and turn it back into a string
 2) create a new island lexer/TokenStream that reprocesses the tail
    from 1)
 3) create a parser for the island grammar, and parse the new token
    stream from 2)
 4) once the end-marker has been found, take the tail of the island
    grammar's token stream and convert it back into tokens from the
    lexer for 'this' grammar
 5) replace the original 'input' reference the parser was using, and get
    going with the outer grammar again
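
The five steps above can be sketched roughly as follows. This is a toy
in Python using regex lexers, not ANTLR's classes: the '{{' and '}}'
markers, the token kinds, and the function names are all hypothetical.
It also assumes each token remembers its character offset, so "turning
the tail back into a string" is just a slice of the original source.

```python
import re
from collections import namedtuple

Token = namedtuple("Token", "kind text start")

# Both lexers have a catch-all OTHER rule, because each one is run over
# text that partly belongs to the other grammar.
OUTER = re.compile(r"(?P<WS>\s+)|(?P<START>\{\{)|(?P<WORD>\w+)|(?P<OTHER>.)")
ISLAND = re.compile(
    r"(?P<WS>\s+)|(?P<END>\}\})|(?P<NUM>\d+)|(?P<PLUS>\+)|(?P<OTHER>.)"
)


def lex(pattern, text, base=0):
    """Tokenize all of `text` eagerly, as a buffered token stream would."""
    tokens = [Token(m.lastgroup, m.group(), base + m.start())
              for m in pattern.finditer(text) if m.lastgroup != "WS"]
    tokens.append(Token("EOF", "", base + len(text)))
    return tokens


def parse_island(tokens):
    """Parse NUM ('+' NUM)* END; return (subtree, index just past END)."""
    values, i = [], 0
    while True:
        if tokens[i].kind != "NUM":
            raise SyntaxError(f"expected NUM at offset {tokens[i].start}")
        values.append(int(tokens[i].text))
        i += 1
        if tokens[i].kind == "END":
            return ("island", values), i + 1
        if tokens[i].kind != "PLUS":
            raise SyntaxError(f"expected '+' at offset {tokens[i].start}")
        i += 1


def parse_outer(source):
    """Outer grammar: a flat list of words, with islands spliced in."""
    tokens, tree, i = lex(OUTER, source), [], 0
    while tokens[i].kind != "EOF":
        tok = tokens[i]
        if tok.kind != "START":
            tree.append(tok.text)
            i += 1
            continue
        # 1) recover the unprocessed tail as text, via the token's offset
        tail_start = tok.start + len(tok.text)
        # 2) re-lex the tail with the island lexer
        island_tokens = lex(ISLAND, source[tail_start:], base=tail_start)
        # 3) parse the island grammar up to its end-marker
        subtree, after = parse_island(island_tokens)
        # 4) re-lex whatever follows the end-marker with the outer lexer
        rest_start = island_tokens[after].start
        # 5) swap the parser's input and carry on with the outer grammar
        tokens, i = lex(OUTER, source[rest_start:], base=rest_start), 0
        tree.append(subtree)  # hook the island's subtree into the outer tree
    return tree
```

For example, parse_outer("a b {{ 1 + 2 }} c d") returns
['a', 'b', ('island', [1, 2]), 'c', 'd']. The outer lexer has already
mis-tokenized the island text by the time the start-marker is seen;
that is harmless here only because the tail is rebuilt from source
offsets rather than from those tokens.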

If all that works, I can hook the AST built by the island grammar into
the AST that the outer grammar is creating.


How does that compare with the approach that others are taking?  Does it
sound like it might work, or is it wrong-headed and silly?


thanks!
dave

-- 
http://david.holroyd.me.uk/

