[antlr-interest] parsing unstructured text with xml-type tags.

Priyank RASTOGI priyank at osellus.com
Mon Jun 7 03:05:29 PDT 2004


Hi,

Sorry if this is very trivial, but I could not find a solution in any 
of the examples.

I am writing a parser for a template that contains unstructured text 
with embedded XML-style tags. One example of such a template is:

This is a test template created by <mytag>some text here</mytag>. 
Thanks for your time.

In the lexer, I have defined rules like:
PLAIN_TEXT: consume everything up to the next '<' that starts a tag
STARTTAG: copied from the XML parser in the examples
ENDTAG: copied from the XML parser in the examples

So the stream of tokens I am expecting is
PLAIN_TEXT STARTTAG PLAIN_TEXT ENDTAG

Here are my problems:
1) PLAIN_TEXT consumes everything, including the tags
2) Since this is unstructured text, how do I know where to stop (EOF)?
3) I also need to support HTML tags in the PLAIN_TEXT (I have to 
consume them as part of PLAIN_TEXT)
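
To make the goal concrete, here is a rough sketch, outside ANTLR and in 
Python, of the token stream I am after. It assumes the template tag names 
are known up front; the TEMPLATE_TAGS tuple and the tokenize function are 
made up for this illustration. Any other tag, such as an HTML tag, stays 
inside PLAIN_TEXT, and the end of the input simply closes the last 
PLAIN_TEXT. It is not ANTLR code and not a solution for the lexer itself.

```python
import re

# Hypothetical set of template tag names (an assumption for this sketch).
# Anything else that looks like a tag, e.g. HTML, is treated as plain text.
TEMPLATE_TAGS = ("mytag",)

_names = "|".join(re.escape(name) for name in TEMPLATE_TAGS)
# Group 1 captures the name of an end tag, group 2 the name of a start tag.
TAG_RE = re.compile(r"</(%s)\s*>|<(%s)(?:\s[^<>]*)?>" % (_names, _names))


def tokenize(text):
    """Return a list of (token_type, lexeme) pairs for a template string."""
    tokens = []
    pos = 0
    for match in TAG_RE.finditer(text):
        # Everything between the previous tag and this one is plain text.
        if match.start() > pos:
            tokens.append(("PLAIN_TEXT", text[pos:match.start()]))
        kind = "ENDTAG" if match.group(1) else "STARTTAG"
        tokens.append((kind, match.group(0)))
        pos = match.end()
    # End of input closes the final run of plain text, if there is one.
    if pos < len(text):
        tokens.append(("PLAIN_TEXT", text[pos:]))
    return tokens
```

For the example template above this gives PLAIN_TEXT STARTTAG PLAIN_TEXT 
ENDTAG, followed by one more PLAIN_TEXT for the trailing sentence.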

I am stuck on how to go about it.

Any pointers would be greatly appreciated.

Thanks
Priyank
