[antlr-interest] Lexing an interesting syntax

Wed Jan 2 08:13:07 PST 2008

Hi,

Just started work on a lexer for an Io-based language. I want the lexing 
to handle the same constructs as Io, and mostly it's really easy. I hit 
one little snag though. I have a solution, but it's incredibly ugly. So 
I'm wondering how this can be done in the Antlr way.

To keep it simple, say the lexer only deals with identifiers, where any 
combination of the letter "s" and ":" is valid, and "=", ":=" and "::=" 
are valid as well. That's all.
With these constraints, I need:

* "s:" to lex into "s:"
* "s:=" to lex into "s" and ":="
* "s::=" to lex into "s:" and ":="
* "s::::=" to lex into "s:::" and ":="
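
For reference, the four cases above can be pinned down with a small
sketch outside ANTLR. This is plain Python rather than a grammar, and
the names TOKEN and lex are made up for illustration. The idea is only
that an identifier run gives back its last ":" when a ":=" follows. It
makes no attempt to produce "::=", since the cases above don't say when
that token should win.

```python
import re

# Alternatives are tried in order:
#   1. an identifier run that stops just before a ":=" (regex lookahead)
#   2. the ":=" token itself
#   3. an identifier run with no ":=" directly after it
#   4. a lone "="
# Assumption: "::=" is deliberately left out of this sketch.
TOKEN = re.compile(r"[s:]+(?=:=)|:=|[s:]+|=")


def lex(text):
    """Return a list of (token_text, start, end) tuples for text."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at index {pos}")
        tokens.append((m.group(), m.start(), m.end()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    for sample in ["s:", "s:=", "s::=", "s::::="]:
        print(sample, [tok for tok, _, _ in lex(sample)])
```

With this, lex("s::=") gives "s:" spanning 0-2 and ":=" spanning 2-4, so
the token text and the indices agree.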

How can I accomplish this in a good way? The ugly way I have right now 
depends on an action that checks if the next character is "=" and the 
currently matched token ends with ":". If that's true, it sets a flag 
and strips the ":" away from the text of the token. Conversely, when "=" 
is matched, an action checks if the flag is set, and in that case sets 
the text to be ":=" instead. This is obviously extremely ugly, but it 
seems to work, except that the tokens' start and stop indices end up 
slightly off.
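
To make the index problem concrete, here is a rough Python imitation of
the flag approach described above. It is not the actual ANTLR action
code; RAW, lex_with_flag and pending_colon are hypothetical names, and
skipping a token whose text becomes empty is an assumption of the
sketch.

```python
import re

# Naive rules: a greedy identifier run of "s" and ":", or a lone "=".
RAW = re.compile(r"[s:]+|=")


def lex_with_flag(text):
    """Imitate the flag workaround; returns (token_text, start, end) tuples."""
    tokens = []
    pending_colon = False  # set when a ":" was stripped from an identifier
    pos = 0
    while pos < len(text):
        m = RAW.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at index {pos}")
        tok = m.group()
        next_char = text[m.end():m.end() + 1]
        if tok == "=":
            if pending_colon:
                tok = ":="  # the text is rewritten, the span is not
                pending_colon = False
        elif tok.endswith(":") and next_char == "=":
            pending_colon = True
            tok = tok[:-1]  # strip the ":" but keep the matched span
        if tok:  # assumption: drop a token whose text became empty
            tokens.append((tok, m.start(), m.end()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    print(lex_with_flag("s:="))
```

For "s:=" this reports "s" covering 0-2 and ":=" covering 2-3, where the
real spans are 0-1 and 1-3. The token texts are right, but the offsets
still describe what the naive rules matched.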

Can anyone give me a better solution to this?

Cheers

-- 
 Ola Bini (http://ola-bini.blogspot.com) 
 JRuby Core Developer
 Developer, ThoughtWorks Studios (http://studios.thoughtworks.com)
 Practical JRuby on Rails (http://apress.com/book/view/9781590598818)

 "Yields falsehood when quined" yields falsehood when quined.