[antlr-interest] Help w/Complicated lexer/parser

Mon Mar 3 08:23:33 PST 2003

I think multiple lexers/parsers is the right way to go.  If it's really tricky,
you could do something like my filter example:
www.codetransform.com/filterexample.html

Monty

-----Original Message-----
From: Andrew Deren [mailto:andrew at adersoftware.com]
Sent: Thursday, February 27, 2003 8:43 PM
To: antlr-interest at yahoogroups.com
Subject: [antlr-interest] Help w/Complicated lexer/parser

I don't know if it's really complicated, but I can't figure it out.
Basically, I'm writing a parser for a language called ColdFusion. It's like
HTML with embedded code in it (like PHP or JSP).

The language looks like this:
Each construct starts with <CF followed by the tag name (e.g. <CFLOOP).
After the tag name come name=value attributes, as in HTML. However,
each value can itself be an expression. Expressions are enclosed in # ... #
(e.g. #function(x + 3)#).
So an example attribute could be from="#someFunction(x - 2 * f(x))#"
Another complication is that a function can accept a string, and a
string can have an expression inside of it,
e.g. #left("this is x:#x# and y:#y#", 3)#
Additionally, regular text (outside of tags) can have expressions in it
(also enclosed in # ... #).

First I tried using disambiguating predicates, but that turned out to be too
cumbersome.
My second attempt was to use multiple lexers, but I can't find much
documentation or many examples for that (except a small java/javadoc example).
Besides, I think I would need too many lexers/parsers.
The way I was thinking, I would need:
Inside of tag parser (starts when <CF is seen)
Inside of expression parser (starts when # is seen)
Regular parser (for text, switches to tag or expression parsers)
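
For illustration only, the three-way split listed above can be sketched as a
hand-written scanner that switches modes on <CF and on #. This is plain Python,
not ANTLR grammar code, and it is not taken from either poster. The token kinds
and the function names (scan_expression, scan_value, scan_tag, tokenize) are
all made up for the sketch, and it assumes attribute values are always
double-quoted and that a # inside a string always opens a nested expression.

```python
import re

# Hypothetical patterns for this sketch; real CFML has more cases than these.
TAG_NAME = re.compile(r"<(CF\w*)", re.IGNORECASE)
ATTR = re.compile(r"(\w+)\s*=\s*")

def scan_expression(src, pos):
    """Expression mode: scan from the '#' at src[pos] to its matching '#'.

    A '#' met inside a double-quoted string opens a nested expression, which
    is scanned recursively. Returns (expression_text, index_after_closing_#).
    """
    i = pos + 1
    in_string = False
    while i < len(src):
        ch = src[i]
        if ch == '"':
            in_string = not in_string
            i += 1
        elif ch == "#":
            if in_string:
                _, i = scan_expression(src, i)  # skip nested #...# in a string
            else:
                return src[pos + 1:i], i + 1
        else:
            i += 1
    raise ValueError("unterminated expression at offset %d" % pos)

def scan_value(src, pos, tokens):
    """Scan a double-quoted attribute value, splitting text from expressions."""
    if src[pos:pos + 1] != '"':
        raise ValueError("expected a quoted value at offset %d" % pos)
    pos += 1
    start = pos
    while pos < len(src) and src[pos] != '"':
        if src[pos] == "#":
            if pos > start:
                tokens.append(("TEXT", src[start:pos]))
            expr, pos = scan_expression(src, pos)
            tokens.append(("EXPR", expr))
            start = pos
        else:
            pos += 1
    if pos >= len(src):
        raise ValueError("unterminated attribute value")
    if pos > start:
        tokens.append(("TEXT", src[start:pos]))
    return pos + 1

def scan_tag(src, pos, tokens):
    """Tag mode: entered at '<CF', left at the closing '>'."""
    m = TAG_NAME.match(src, pos)
    tokens.append(("TAG_OPEN", m.group(1).upper()))
    pos = m.end()
    while True:
        while pos < len(src) and src[pos].isspace():
            pos += 1
        if pos >= len(src):
            raise ValueError("unterminated tag")
        if src[pos] == ">":
            tokens.append(("TAG_CLOSE", ">"))
            return pos + 1
        m = ATTR.match(src, pos)
        if m is None:
            raise ValueError("expected name=value at offset %d" % pos)
        tokens.append(("ATTR", m.group(1)))
        pos = scan_value(src, m.end(), tokens)

def tokenize(src):
    """Regular-text mode: hands off to tag or expression mode as needed."""
    tokens, pos, text_start = [], 0, 0
    while pos < len(src):
        is_tag = TAG_NAME.match(src, pos) is not None
        if is_tag or src[pos] == "#":
            if pos > text_start:
                tokens.append(("TEXT", src[text_start:pos]))
            if is_tag:
                pos = scan_tag(src, pos, tokens)
            else:
                expr, pos = scan_expression(src, pos)
                tokens.append(("EXPR", expr))
            text_start = pos
        else:
            pos += 1
    if pos > text_start:
        tokens.append(("TEXT", src[text_start:pos]))
    return tokens
```

Calling tokenize('Hi #name#<CFLOOP from="#f(x - 2)#" to="10">') yields a flat
token list in which each EXPR token carries the raw expression text, so a
separate expression parser could then be run on just that text. Keeping the
modes as separate functions mirrors the one-lexer-per-context idea.
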

Is that the right direction?
Thanks,
Andrew
