[antlr-interest] Too many uses for escape character giving me lexer troubles.
Terence Parr
parrt at cs.usfca.edu
Wed Mar 14 10:05:33 PDT 2007
On Mar 13, 2007, at 6:51 PM, Jeremy D. Frens wrote:
> I'm using ANTLR v3 (and quite liking it).
>
> In my language (http://nolatte.sf.net/), the backslash character is the
> escape character, and it gets used for (at least) two different tasks.
> Here's a stripped down grammar:
>
> atom : WORD | IDENTIFIER ;
> WORD : ( ('a'..'z') | ( '\\' '{' ) )+ ;
> IDENTIFIER : '\\' ('a'..'z')+ ;
>
> The key is that the backslash gets used for two purposes: as a real
> escape character (to escape '{' in a WORD) and as the beginning of an
> IDENTIFIER. The problem comes in when my grammar tries to scan and/or
> parse something like this:
>
> abc\xyz
>
> This should be two tokens: a WORD "abc" and an IDENTIFIER "\xyz".
> However, since the backslash is allowed at all in a WORD, the lexer
> consumes it, and then it gets confused by the 'x'.
Try putting IDENTIFIER before WORD.
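
A minimal sketch of that reordering, reusing the rules from the stripped-down grammar quoted above. The grammar name Nolatte is a placeholder I made up for illustration, and I have not confirmed here that reordering alone is enough to stop WORD from consuming the backslash:

```
// Sketch only: "Nolatte" is a hypothetical grammar name.
grammar Nolatte;

atom : WORD | IDENTIFIER ;

// IDENTIFIER now appears before WORD in the grammar file.
IDENTIFIER : '\\' ('a'..'z')+ ;

// WORD is unchanged: letters, or a backslash-escaped '{'.
WORD : ( ('a'..'z') | ( '\\' '{' ) )+ ;
```

With input abc\xyz the intended result is still WORD "abc" followed by IDENTIFIER "\xyz".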
Ter