[antlr-interest] XML island grammar

Matthieu Riou matthieu at offthelip.org
Mon Oct 8 11:28:16 PDT 2007


Thanks a lot, that's really helpful! I roughly see how this can be pieced
together to get something working, although I don't fully understand how the
lexer can recognize a bad match.

Say you have something that looks like a regular expression but isn't really
one. The island grammar parser won't be able to match it, so you have to
"refuse" the match so that another rule in the main grammar can be tried,
right? How does that work? Does an exception thrown in parseRegexpLiteral or
parseXMLLiteral force the main grammar parser to look for another match?

Thanks again!
Matthieu

On 10/8/07, David Holroyd <dave at badgers-in-foil.co.uk> wrote:
>
> On Sun, Oct 07, 2007 at 10:34:42PM -0700, Matthieu Riou wrote:
> > I have a main grammar that can embed some pieces of XML. A bit like E4X in
> > Javascript if you're familiar with the language. I'd like to handle this
> > with an island grammar but I'm not so clear on how to detect the XML
> > block. I've had a look at the javadoc island grammar example which gives me
> > a pretty good idea of how to delegate parsing. However detecting XML in the
> > middle of some code is not as simple as with Javadoc (you can assume /** to
> > be a uniquely used token but not <).
> >
> > Here is a code snippet of what I'm trying to parse:
> >
> > process ExternalCounter {
> >   receive(my_pl, start_op) (msg_in) {
> >     resp = <message><counter>0</counter></message>
> >     while(resp.counter < 10) {
> >       invoke(partner_pl, partner_start_op) (msg_in)
> >       resp = receive(partner_pl, partner_reply_op)
> >     }
> >     reply resp
> >   }
> > }
>
> I've hacked together some stuff that tries to handle this kind of thing.
> The overview is that I do a bit of extra admin in order to have the
> parser 'direct' the lexical interpretation of the input, and
> specifically avoid ANTLR's default behaviour of lexing all input at
> startup.
>
>
> See a slightly old description here,
>
>
> http://www.antlr.org/wiki/display/ANTLR3/Island+Grammars+Under+Parser+Control
>
>
> I have a partial E4X grammar (ActionScript 3 is basically ECMAScript).
> See,
>
>   Main grammar:
>
> http://svn.badgers-in-foil.co.uk/metaas/trunk/src/main/antlr/org/asdt/core/internal/antlr/AS3.g3
>
>   Helper glue code:
>
> http://svn.badgers-in-foil.co.uk/metaas/trunk/src/main/java/uk/co/badgersinfoil/metaas/impl/parser/E4XHelper.java
>
>   E4X grammar (incomplete):
>
> http://svn.badgers-in-foil.co.uk/metaas/trunk/src/main/antlr/uk/co/badgersinfoil/metaas/impl/parser/e4x/E4X.g
>
>
>
> hope that helps,
> dave
>
> --
> http://david.holroyd.me.uk/
>
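
To make the question above concrete, here is a minimal sketch in plain Java
(no ANTLR dependency) of one way a '<' can be tried as an XML island and then
"refused". It is not the code from E4XHelper.java; the class name IslandSketch
and the methods tryMatchXml, operandExpected and tokenize are hypothetical
names invented for this illustration. It assumes two simplifications: '<' is
only considered as the start of XML where an operand is expected (after '=',
'(' or ',' or at the start of input), and XML elements have no attributes. A
failed island match is reported as a return value of -1 rather than as an
exception, and the caller simply carries on from the same position and treats
'<' as an ordinary operator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IslandSketch {
    // Hypothetical rule: tokens after which an operand (and so XML) may start.
    static final List<String> OPERAND_STARTERS = Arrays.asList("=", "(", ",");

    static boolean operandExpected(String prev) {
        return prev == null || OPERAND_STARTERS.contains(prev);
    }

    static boolean isName(String s) {
        if (s.isEmpty() || !Character.isLetter(s.charAt(0))) return false;
        for (char c : s.toCharArray()) {
            if (!Character.isLetterOrDigit(c) && c != '_') return false;
        }
        return true;
    }

    // Tries to match one well-nested element starting at pos.
    // Returns the index just past it, or -1 to refuse the match.
    static int tryMatchXml(String s, int pos) {
        List<String> open = new ArrayList<>(); // stack of open tag names
        int i = pos;
        while (i < s.length()) {
            if (s.charAt(i) == '<') {
                int close = s.indexOf('>', i);
                if (close < 0) return -1;
                String tag = s.substring(i + 1, close);
                if (tag.startsWith("/")) {
                    if (open.isEmpty()) return -1;
                    String expected = open.remove(open.size() - 1);
                    if (!expected.equals(tag.substring(1))) return -1;
                } else {
                    if (!isName(tag)) return -1;
                    open.add(tag);
                }
                i = close + 1;
                if (open.isEmpty()) return i; // outermost element closed
            } else {
                if (open.isEmpty()) return -1;
                i++; // text content inside an element
            }
        }
        return -1; // ran out of input with tags still open
    }

    static List<String> tokenize(String s) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            String prev = out.isEmpty() ? null : out.get(out.size() - 1);
            if (c == '<' && operandExpected(prev)) {
                int end = tryMatchXml(s, i);
                if (end >= 0) {
                    out.add("XML:" + s.substring(i, end));
                    i = end;
                    continue;
                }
                // Match refused: position is unchanged, fall through.
            }
            if (Character.isLetterOrDigit(c) || c == '_') {
                int j = i;
                while (j < s.length()
                        && (Character.isLetterOrDigit(s.charAt(j)) || s.charAt(j) == '_')) {
                    j++;
                }
                out.add(s.substring(i, j));
                i = j;
            } else {
                out.add(String.valueOf(c)); // single-character operator
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("resp = <message><counter>0</counter></message>"));
        System.out.println(tokenize("while(resp.counter < 10) {"));
    }
}
```

The two lines in main come from the snippet quoted above: in the first, '<'
follows '=' and the island match succeeds; in the second, '<' follows an
identifier, so the island is never attempted. In an ANTLR 3 setup the
equivalent decision would presumably live in a parser action or predicate that
rewinds the input on failure, but that is an assumption, not something this
thread confirms.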