[antlr-interest] Parsing HTML pages

Ana Nelson nelson.ana at gmail.com
Thu Jul 3 02:32:43 PDT 2008


Hi, Alexander,

It might be simpler to use an existing HTML parsing tool rather than
writing your own.

Here's one tool I really like; it's written in Ruby:
http://code.whytheluckystiff.net/hpricot/

If you are trying to parse web pages, remember that lots of people
write very bad HTML that doesn't conform to the standard, so real
pages can be very difficult to parse with a strict grammar.
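
To make that concrete, here is a rough sketch of what a forgiving parser
buys you. It uses Python's standard-library html.parser rather than
Hpricot, so treat it as an illustration of the idea and not of the tool
linked above. The names LinkExtractor and extract_links and the sample
markup are made up for this example.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags, tolerating sloppy markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser hands us lowercased tag and attribute names.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return every href found in the given HTML string (hypothetical helper)."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up, deliberately broken input: upper-case tag, unquoted attribute,
# unclosed <p> and <a> elements, and no closing </html>.
messy = (
    "<html><body><p>Intro<A HREF=http://www.antlr.org>ANTLR"
    "<p><a href='/docs'>Docs</body>"
)

print(extract_links(messy))
```

The parser never complains about the missing close tags; it just reports
the start tags it sees, which is usually all you need when pulling a few
fields out of a page.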





2008/7/3 Alexander Nicolaysen Sørnes <alex at thehandofagony.com>:
> Hello,
>
> Since I'm quite new to parsing, I'd just like to ask if ANTLR is a good choice
> for extracting info out of HTML pages.  I greatly appreciate any feedback.
>
>
>
> Alexander N. Sørnes
>

