[antlr-interest] Lexer strangeness
Austin Hastings
Austin_Hastings at Yahoo.com
Sat Oct 6 02:40:32 PDT 2007
Clifford,
Before you get too involved in this problem, consider your statement:
"IDENTIFIER may not end with a digit, but VARIABLE might."
That right there tells me you are going to have grammar problems later,
because you will never be able to fully trust the lexer to tell
IDENTIFIERs from VARIABLEs. After all, what is "foo"? It matches both
rules.
You should probably establish a firm rule (variables always end in a
digit) to simplify things. Or handle disambiguation in the parser,
rather than in the lexer. (My favorite.)
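A minimal sketch of the parser-side approach, assuming ANTLR v3 syntax. The
grammar name "Names", the token name NAME, and the parser rules "identifier"
and "variable" are made up for illustration and are not from Clifford's
grammar. The NAME body reuses the ID shape he reports as working. The lexer
emits one token type for every name, and the parser rule that consumes the
token decides which role it plays:

```
grammar Names;   // hypothetical grammar name

// Parser: context, not spelling, decides what a NAME means.
identifier : NAME ;   // hypothetical rule, used where an identifier is expected
variable   : NAME ;   // hypothetical rule, used where a variable is expected

// Lexer: a single token type, so "foo" is never ambiguous at this level.
NAME : LETTER ( ( LETTER | DIGIT )* LETTER )? DIGIT? ;

fragment LETTER : ( '_' | 'A'..'Z' | 'a'..'z' ) ;
fragment DIGIT  : '0'..'9' ;
```

If identifiers really must not end in a digit, that check would then have to
live in a parser action or a later pass over the tree rather than in the
lexer.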
=Austin
Clifford Heath wrote:
> New to Antlr, not to parser-generators. Hi all!
> I'm using ANTLRWorks downloaded today on a Mac under OS X.
>
> I have the following snippet which works when I do it this way:
>
> IDENTIFIER: ID;
> fragment ID: LETTER ( ( LETTER|DIGIT )* LETTER)? DIGIT?;
> // with:
> fragment LETTER : ('_'|'A'..'Z'|'a'..'z');
> fragment DIGIT: '0'..'9';
>
> but not when I do it this way:
>
> IDENTIFIER: ID DIGIT?;
> fragment ID: LETTER ( ( LETTER|DIGIT )* LETTER)?;
>
> In the latter case it fails to match the IDENTIFIER 'f3', for example.
>
> I have it structured this way because in the full grammar,
> I have two similar things:
> IDENTIFIER: ID;
> VARIABLE: ID DIGIT?;
>
> where IDENTIFIER may not end with a digit, but VARIABLE might.
> Use of a VARIABLE always makes the lexer/parser fail. I thought
> I might have a greedy/non-greedy problem here, but even without
> the similar definitions, the above fails.
>
> What's happening here? Can someone point me in the right direction
> please?
>
> Clifford Heath.