[antlr-interest] Lexing an interesting syntax
Jim Idle
jimi at temporal-wave.com
Wed Jan 2 08:42:45 PST 2008
> -----Original Message-----
> From: Ola Bini [mailto:ola.bini at gmail.com]
> Sent: Wednesday, January 02, 2008 8:13 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Lexing an interesting syntax
>
> Hi,
>
> Just started work on a lexer for an Io-based language. I want the lexing
> to handle the same constructs as Io, and mostly it's really easy. I hit
> one little snag though. I have a solution, but it's incredibly ugly. So
> I'm wondering how this can be done in the Antlr way.
>
> To make it easy, the lexing is only on identifiers, where any
> combination of the letter "s" and ":" is valid, "=", ":=" and "::=" is
> valid. That's all.
> With these constraints, I need:
>
> * "s:" to lex into "s:"
> * "s:=" to lex into "s" and ":="
> * "s::=" to lex into "s:" and ":="
> * "s::::=" to lex into "s:::" and ":="
The answer is not to try to do so much of this in the lexer. Allow a
separate token, COLON, and either make the operator ":=" a token in the
lexer or perhaps even assemble it in the parser. Adding a bit more to
your requirements by guessing ;-), for the input:
s:
s:=f
s::=f
s::::=f
s:=h==i
The grammar below should do it:
grammar t;

code
    : line*
    ;

line
    : id ((COLEQ | OPASS) expr?)?
    ;

expr
    : e1 (OPEQ e1)*
    ;

e1
    : id
    ;

id
    : ID COLON*
    ;

COLEQ : ':=' ;
OPEQ  : '==' ;
OPASS : '=' ;
COLON : ':' ;
ID    : 'a'..'z'+ ;
WS    : ('\r' | ' ' | '\n' | '\t')+
        {
            $channel = HIDDEN;
        }
      ;
Note that the lexer may not do what you expect if you have other
two-character operators that contain ':'. In that case you will need a
single COLON lexer rule that matches ':' and then uses predicates to
choose between nothing, '=' and 'x' (where 'x' is your other character),
setting $type accordingly. Sounds like you won't need that though.
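
A minimal sketch of the kind of COLON rule described in the note above,
in ANTLR 3 syntax. The extra operator ':>' and the token name COLGT are
hypothetical, made up only for illustration, and declaring the token
types in a tokens section is an assumption rather than part of the
grammar above. This version gets by on one character of lookahead;
longer operators might need explicit predicates.

```
// Replaces the separate COLEQ and COLON lexer rules in grammar t.
// The tokens section goes after the grammar header, before the rules.
tokens {
    COLEQ;   // ':='
    COLGT;   // ':>'  (hypothetical extra operator)
}

// One rule owns every token that starts with ':'.
COLON
    : ':'
      ( '='  { $type = COLEQ; }
      | '>'  { $type = COLGT; }
      |      // nothing follows: the token keeps type COLON
      )
    ;
```

The parser rules can keep referring to COLEQ and COLON as before, since
only the way the lexer decides between them changes.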
Jim