[antlr-interest] newbie question, escaped characters
Rob Shields
rob at cmsnet.org.uk
Wed Mar 12 11:03:45 PDT 2008
Richard Clark wrote:
> I have a better answer (courtesy of a long drive where I had time to think.)
>
> I suggested "k=2;" because ANTLR 2 is an LL(k) parser generator: it looks
> ahead "k" tokens (characters, in a lexer) when resolving ambiguities, and
> the default k is 1. In your case, it sees that leading '\\' in more than
> one place and resolves the ambiguity in favor of the first lexer rule
> that uses it. But raising k makes the generated code more complex and is
> a bit like swatting flies with a sledgehammer.
That's what I thought. I was a bit hesitant to change k in case it had
side effects.
> Rather than alter the lookahead, it's simpler to collapse the
> decisions into one rule and alter the text in the token for your
> couple of special cases. You should be able to write this:
>
> protected SIMPLETERM: (TERM_CHAR)+;
>
> protected TERM_CHAR: SIMPLE_TERM_CHAR | ESCAPED_TERM_CHAR;
>
> protected SIMPLE_TERM_CHAR: ~( ' ' | '\t' | '!' | '(' | ')' | ':' |
> '^' | '[' | ']' | '\\' | '\"' | '{' | '}' | '~' | '/' | '\r' | '\n' );
>
> protected ESCAPED_TERM_CHAR: '\\'! (
> '*' { $setText("\\*"); }
> | '?' { $setText("\\?"); }
> | '\\' | '+' | '-' | '!' | '(' | ')' | ':' | '^' | '[' | ']' | '\"'
> | '{' | '}' | '~' | '/'
> );
Excellent. I have tried that and can confirm that it works. I'm really
pleased, thank you :)
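For anyone reading along, here is a rough sketch in plain Java of the token text I would expect those rules to produce. It is not the code ANTLR generates; the class name EscapedTermSketch and the method termText are made up for illustration, and it assumes that '\\'! drops the backslash from the token text while $setText puts it back for '*' and '?'.

```java
public class EscapedTermSketch {
    // Characters that may follow a backslash and lose it in the token text
    // (the alternatives of ESCAPED_TERM_CHAR other than '*' and '?').
    private static final String UNESCAPED = "\\+-!():^[]\"{}~/";
    // Characters that SIMPLE_TERM_CHAR excludes, so they end a term
    // when they appear without a backslash.
    private static final String STOP = " \t!():^[]\\\"{}~/\r\n";

    // Hypothetical helper: returns the text of the term at the start of input.
    static String termText(String input) {
        StringBuilder text = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '\\' && i + 1 < input.length()) {
                char next = input.charAt(i + 1);
                if (next == '*' || next == '?') {
                    // Mirrors $setText("\\*") and $setText("\\?"): keep the backslash.
                    text.append('\\').append(next);
                } else if (UNESCAPED.indexOf(next) >= 0) {
                    // Mirrors '\\'! : the backslash is dropped.
                    text.append(next);
                } else {
                    // Assumption: stop here; a real lexer would report an error.
                    break;
                }
                i += 2;
            } else if (STOP.indexOf(c) < 0) {
                text.append(c); // ordinary term character
                i++;
            } else {
                break; // unescaped special character ends the term
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println(termText("foo\\:bar")); // prints foo:bar
        System.out.println(termText("a\\*b"));     // prints a\*b
        System.out.println(termText("foo bar"));   // prints foo
    }
}
```

So an escaped ':' comes out as a bare ':', while an escaped '*' or '?' keeps its backslash, presumably so that later stages can still tell it apart from a wildcard.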
> That should do it. (By the way, ANTLR 3 replaces $setText("foo"); with
> $text = "foo"; )
>
> ...Richard
Well, $setText("foo"); seems to work, so I guess I must be using ANTLR 2.
The jar file I have is from 2004 or thereabouts.
Rob