[antlr-interest] non-LL(*) HTML grammar

denstar valliantster at gmail.com
Sat Dec 25 13:44:25 PST 2010


On Sat, Dec 25, 2010 at 10:58 AM, Andrzej wrote:
> I am converting the grammar http://www.antlr.org/grammar/HTML from ANTLR
> version 2 to version 3.
> I get this error:
> error(211): HTML.g:209:3: [fatal] rule paragraph has non-LL(*) decision
> due to recursive rule invocations reachable from alts 1,2.  Resolve by
> left-factoring or using syntactic predicates or using backtrack=true option.

Unless you have a really compelling reason to parse HTML with ANTLR,
I'd recommend using something like Jericho HTML or TagSoup instead.

Otherwise, you can check out the XML parsing example in the wiki, as
it's probably the best route to an HTML parser.
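
To give a feel for why a lexer-driven approach tends to work for markup,
here is a rough sketch in plain Java (not ANTLR, and not copied from the
wiki example). The idea is that the tokenizer keeps a flag saying whether
it is inside a tag, so text and tag contents never compete for the same
decision. The class name, token labels, and the whitespace-splitting of
tag contents are all my own assumptions for illustration; real HTML needs
quoted attributes, comments, entities, and so on.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical two-mode tokenizer illustrating the "inside a tag or not" flag. */
public class TagModeLexer {

    /**
     * Splits input into tokens. Outside a tag, everything up to the next '<'
     * becomes one "TEXT:..." token. Inside a tag, whitespace is skipped and
     * each run of other characters becomes a "WORD:..." token.
     */
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        boolean tagMode = false; // true while between '<' and the closing '>'
        int len = input.length();
        int i = 0;
        while (i < len) {
            char c = input.charAt(i);
            if (!tagMode) {
                if (c == '<') {
                    // Start of an opening or closing tag: switch modes.
                    if (i + 1 < len && input.charAt(i + 1) == '/') {
                        tokens.add("</");
                        i += 2;
                    } else {
                        tokens.add("<");
                        i++;
                    }
                    tagMode = true;
                } else {
                    // Plain text runs until the next '<' or end of input.
                    int start = i;
                    while (i < len && input.charAt(i) != '<') {
                        i++;
                    }
                    tokens.add("TEXT:" + input.substring(start, i));
                }
            } else if (c == '>') {
                tokens.add(">");
                i++;
                tagMode = false;
            } else if (c == '/' && i + 1 < len && input.charAt(i + 1) == '>') {
                tokens.add("/>");
                i += 2;
                tagMode = false;
            } else if (Character.isWhitespace(c)) {
                i++; // whitespace inside a tag only separates words
            } else {
                // A word ends at whitespace, '>', or the start of "/>".
                int start = i;
                while (i < len) {
                    char ch = input.charAt(i);
                    boolean selfClose = ch == '/' && i + 1 < len && input.charAt(i + 1) == '>';
                    if (Character.isWhitespace(ch) || ch == '>' || selfClose) {
                        break;
                    }
                    i++;
                }
                tokens.add("WORD:" + input.substring(start, i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("<p>Hi <b>there</b></p>")) {
            System.out.println(token);
        }
    }
}
```

With tokens like these, the parser rules on top can stay simple, which is
the part that matters for avoiding the non-LL(*) complaint. Whether a
given grammar actually ends up LL(*) still depends on how its rules are
written.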

Sadly, any use of predicates or backtracking seems to make using the
nifty tools for ANTLR kinda hard...

I'm a tad crazy, and am working on an HTML-ish parser using ANTLR, but
it's still pretty rough.

:Den

-- 
As a rule, all heroism is due to a lack of reflection, and thus it is
necessary to maintain a mass of imbeciles. If they once understand
themselves the ruling men will be lost.
Ernest Renan

