[antlr-interest] parsing unstructured text with xml-type tags.
priyank at osellus.com
Mon Jun 7 03:05:29 PDT 2004
Sorry if this is very trivial, but I could not find a solution in any
of the examples.
I am writing a parser for a template that contains unstructured text
with embedded XML-type tags. One example of such a template is:
This is a test template created by <mytag>some text here</mytag>.
Thanks for your time.
In the lexer, I have defined rules like:
PLAIN_TEXT: consume whatever you see until the < of a tag
STARTTAG: copied from xml parser in examples
ENDTAG: copied from xml parser in examples
So the stream of tokens I am expecting is
PLAIN_TEXT STARTTAG PLAIN_TEXT ENDTAG
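To make the expected stream concrete, here is a minimal sketch in Python rather than ANTLR. The names `TOKEN_RE` and `tokenize` are made up for illustration, and the regular expression only approximates the tag rules from the XML example (no attributes, no comments). Characters that match none of the three alternatives, such as a stray `<`, are silently skipped by `finditer`.

```python
import re

# Hypothetical illustration, not ANTLR: one alternation per token type.
# Tags are tried before PLAIN_TEXT; PLAIN_TEXT stops at the next '<'.
TOKEN_RE = re.compile(
    r"(?P<ENDTAG></[A-Za-z][\w.-]*\s*>)"
    r"|(?P<STARTTAG><[A-Za-z][\w.-]*\s*>)"
    r"|(?P<PLAIN_TEXT>[^<]+)"
)


def tokenize(text):
    """Return (token_type, lexeme) pairs for the template text."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


template = "This is a test template created by <mytag>some text here</mytag>."
for kind, lexeme in tokenize(template):
    print(kind, repr(lexeme))
```

On the example template this yields the four tokens listed above, plus one more PLAIN_TEXT for the trailing period.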
Here are my problems...
1) PLAIN_TEXT consumes everything.
2) Since this is unstructured text, how do I know where to stop (EOF)?
3) I also need to support HTML tags in the PLAIN_TEXT (I have to
consume them in PLAIN_TEXT).
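The behaviour I am after for these three points can be sketched as follows, again in Python rather than ANTLR. It rests on one assumption that is not stated above: the template tag names are known in advance (the `TEMPLATE_TAGS` set is hypothetical). Any other tag, such as an HTML `<b>`, is left inside PLAIN_TEXT, and the scan simply ends when the input runs out. In an ANTLR lexer the same idea would presumably need extra lookahead or a predicate on the PLAIN_TEXT rule, which is not shown here.

```python
import re

# Assumption for illustration: template tag names are known up front.
TEMPLATE_TAGS = {"mytag"}

# Simplified tag shape: optional '/', a name, optional whitespace, '>'.
TAG_RE = re.compile(r"<(/?)([A-Za-z][\w.-]*)\s*>")


def tokenize(text):
    """Split text into PLAIN_TEXT / STARTTAG / ENDTAG tokens, then EOF."""
    tokens = []
    buf = []  # pending plain text, including any HTML tags
    pos = 0
    while pos < len(text):  # point 2: the end of the input ends the scan
        m = TAG_RE.match(text, pos) if text[pos] == "<" else None
        if m and m.group(2) in TEMPLATE_TAGS:
            # Point 1: plain text stops only where a template tag starts.
            if buf:
                tokens.append(("PLAIN_TEXT", "".join(buf)))
                buf = []
            kind = "ENDTAG" if m.group(1) else "STARTTAG"
            tokens.append((kind, m.group()))
            pos = m.end()
        else:
            # Point 3: HTML tags and stray '<' stay in the plain text.
            buf.append(text[pos])
            pos += 1
    if buf:
        tokens.append(("PLAIN_TEXT", "".join(buf)))
    tokens.append(("EOF", ""))
    return tokens


for token in tokenize("Hello <b>bold</b> <mytag>x</mytag>!"):
    print(token)
```

Here `<b>` and `</b>` end up inside the first PLAIN_TEXT token, while `<mytag>` and `</mytag>` come out as STARTTAG and ENDTAG.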
I am stuck on how to go about it.
Any pointers would be greatly appreciated.