[antlr-interest] Re: ANTLR 3: Lexer problem

Martin Probst mail at martin-probst.com
Mon Sep 12 07:50:10 PDT 2005


Hi,

Are you sure it's a good idea to do this XML parsing in the lexer? It
should be really trivial in the parser (as XML is indeed more or less
trivial to parse). Is there a specific reason to do so?

If you increase your lexer's lookahead depth k to 2, this should
(theoretically) work. If it doesn't, I would suspect a bug in ANTLR.
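
To illustrate why the lookahead depth matters here, below is a rough Python sketch of a hand-written scanner for the ELEMENT rule. It is not ANTLR output and makes no claim about how ANTLR 3 builds its prediction; the names `lex`, `element`, `match` and `LexError` are made up for this example. It assumes that with k=1 the loop sees only '<' and therefore always predicts a nested ELEMENT, while with k=2 it can tell "</" apart from "<t".

```python
class LexError(Exception):
    """Raised when the input does not match the ELEMENT rule."""


def match(text, pos, literal):
    """Consume `literal` at `pos` or raise; return the new position."""
    if not text.startswith(literal, pos):
        raise LexError(f"expected {literal!r} at offset {pos}")
    return pos + len(literal)


def element(text, pos, k, chunks):
    """Match one ELEMENT at `pos` using k characters of lookahead."""
    pos = match(text, pos, "<t>")
    while True:
        if pos >= len(text):
            raise LexError("unexpected end of input")
        if text[pos] != "<":
            # Text alternative: take everything up to the next '<' as one chunk.
            end = text.find("<", pos)
            end = len(text) if end == -1 else end
            chunks.append(text[pos:end])
            pos = end
            continue
        # The lookahead starts with '<': nested ELEMENT, or leave the loop?
        if k >= 2 and text.startswith("</", pos):
            break  # two characters are enough to recognise the end tag
        # With k == 1 only '<' is visible, so a nested ELEMENT is predicted.
        pos = element(text, pos, k, chunks)
    return match(text, pos, "</t>")


def lex(text, k):
    """Return the text chunks of a single top-level ELEMENT."""
    chunks = []
    end = element(text, 0, k, chunks)
    if end != len(text):
        raise LexError(f"trailing input at offset {end}")
    return chunks
```

In this sketch, `lex("<t>Huhu</t>", 2)` yields the text as one chunk, whereas `lex("<t>Huhu</t>", 1)` raises `LexError` at the '<' of the end tag, which mirrors the failure described in the quoted message.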

Martin

On Mon, 2005-09-12 at 16:32 +0200, Oliver Zeigermann wrote:
> Forgot to mention, the problem is that the end tag "</t>" cannot be
> parsed, as the generated lexer upon seeing '<' predicts a second "<t>"
> and chokes when it sees the '/'.
> 
> Oliver
> 
> 2005/9/12, Oliver Zeigermann <oliver.zeigermann at gmail.com>:
> > Me again - I am pretty sure I am starting to be a bother ;)
> > 
> > Anyway, any idea why this lexer grammar
> > 
> > ELEMENT
> >     : "<t>"
> >             (ELEMENT
> >             | (options {greedy=true;} : ~'<')*
> >             )*
> >       "</t>"
> >     ;
> > 
> > fails to parse this
> > 
> > <t>Huhu</t>
> > 
> > while this works ok:
> > 
> > ELEMENT
> >     : "<t>"
> >             (ELEMENT
> >             | ~'<'
> >             )*
> >       "</t>"
> >     ;
> > 
> > The second grammar is not suitable for me, as I want text (everything
> > not containing '<') to be reported as one chunk, and not as single
> > characters. And - of course - this is not my real grammar, but a
> > simplified version of it.
> > 
> > Oliver
> >
> 


