[antlr-interest] Re: ANTLR 3: Lexer problem

Martin Probst mail at martin-probst.com
Mon Sep 12 08:53:30 PDT 2005


Hi,

> > are you sure it's a good idea to do this XML parsing in the lexer? It
> > should be really trivial in the parser (as XML is indeed more or less
> > trivial to parse). Is there a specific reason to do so?
> 
> Why not? ;) Besides, AFAIK an XML parser in ANTLR isn't actually
> trivial. You would at least need semantic predicates (somehow
> incomplete in ANTLR 3), or more than one grammar.
> 
> Any better ideas?

Well, you'll only need semantic predicates for matching the open/close
tags, but in your case it's limited to a finite set of element types,
isn't it? Otherwise your lexer wouldn't work either, would it?
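
To illustrate what that open/close check amounts to, here is a rough sketch
in plain Java (not ANTLR code; the class name TagBalance and the simplified
tag regex are my own assumptions). The comparison of the closing name with
the most recently opened name is the part a semantic predicate would have
to do when the set of element names is not fixed:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagBalance {
    // Simplified tag pattern (an assumption for this sketch): matches <name ...>,
    // </name> and <name .../>. It ignores comments, CDATA, processing
    // instructions, and '>' characters inside attribute values.
    private static final Pattern TAG =
        Pattern.compile("<(/?)([A-Za-z][\\w.-]*)[^<>]*?(/?)>");

    // Returns true if every opening tag is closed by a tag of the same name,
    // in properly nested order.
    static boolean isBalanced(String xml) {
        Deque<String> open = new ArrayDeque<>();
        Matcher m = TAG.matcher(xml);
        while (m.find()) {
            String name = m.group(2);
            boolean closing = !m.group(1).isEmpty();
            boolean selfClosing = !m.group(3).isEmpty();
            if (closing) {
                // The "semantic" check: the closing name must equal the
                // name of the most recently opened element.
                if (open.isEmpty() || !open.pop().equals(name)) {
                    return false;
                }
            } else if (!selfClosing) {
                open.push(name);
            }
        }
        return open.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("<a><b/></a>"));
        System.out.println(isBalanced("<a><b></a></b>"));
    }
}
```

With a finite set of element types you can instead write one rule per
element and need no such check at all.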

Apart from that, I've found that XML isn't really where ANTLR's strength
lies. It's quite good for complex grammars, but if you only want to parse
XML there are plenty of high-quality, highly optimized parsers for that.
Maybe you can just create a token that contains all of the XML and then
parse its .getText() with Xerces or a similar parser?
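
A minimal sketch of that idea, assuming the token text is already available
as a String. It goes through the standard JAXP API (which Xerces implements)
rather than any Xerces-specific class; the class name XmlTokenText and the
sample input are made up for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XmlTokenText {
    // Parses the text of one (hypothetical) XML token into a DOM tree.
    // Throws an exception if the text is not well-formed XML.
    static Document parse(String xmlText) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xmlText)));
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for what token.getText() would return in the real parser.
        String tokenText = "<order id=\"7\"><item>disk</item></order>";
        Document doc = parse(tokenText);
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

The lexer then only has to find where the XML fragment starts and ends; the
well-formedness checks are left to the XML parser.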

> 
> > If you increase your lexer's k to 2 this should (theoretically) work. If
> > it doesn't, I would suspect a bug in ANTLR.
> 
> How do you set the lookahead in ANTLR 3?

Well, if that isn't a nice question for Terence ;-)

Seriously, I don't know; I'm using ANTLR 2.x. We have an existing parser
that works nicely with it, and in a database environment you don't
change your components just for fun ;-)

Martin


