[antlr-interest] Re: Lexer makes 2 valid tokens when there is only 1 invalid one

bchagenbuch bhagenbuch at didera.com
Wed Apr 16 14:03:29 PDT 2003


I agree with you that this isn't so easy, especially when you consider 123E3, etc.  My 
lexer has the same problem.  

Perhaps we can both take consolation in the fact that Oracle9i sees both 

  SELECT 123 W ...

and 

  SELECT 123W ...

as if they were 

  SELECT 123 AS W ...

while PostgreSQL rejects them both with 'parse error at or near "w"'.

It appears to me that the SQL99 standard agrees with you: 123 and W are 
<nondelimiter token>s and, hence, need whitespace between them.
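
For what it's worth, here is a rough sketch of the check I have in mind, written 
in Python rather than ANTLR so it can be run on its own.  The patterns only 
loosely mirror the grammar fragment quoted below, and the names (tokenize, 
LexError, IDENT_START) are made up for illustration.  The idea: match the 
longest possible number first (so that 123E3 stays one token), then refuse to 
continue if the very next character could start or continue an identifier.

```python
import re

# Hypothetical patterns, loosely modelled on the quoted lexer fragment.
# Case is ignored here as an assumption; the original fragment only lists 'a'..'z'.
NUMBER = re.compile(r"(?:\d+(?:\.\d*)?|\.\d+)(?:e[+-]?\d+)?", re.IGNORECASE)
IDENT = re.compile(r"[a-z][a-z0-9_]*", re.IGNORECASE)
IDENT_START = re.compile(r"[a-z_]", re.IGNORECASE)


class LexError(ValueError):
    """Raised when the input cannot be split into valid tokens."""


def tokenize(text):
    """Return a list of (type, text) pairs, rejecting inputs such as '123W'."""
    tokens = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            pos += 1
            continue
        m = NUMBER.match(text, pos)
        if m:
            end = m.end()
            # Two nondelimiter tokens need a separator between them, so a
            # letter or underscore right after the number is an error.
            if end < len(text) and IDENT_START.match(text, end):
                raise LexError(
                    f"identifier character {text[end]!r} directly after "
                    f"number {m.group()!r} at offset {end}"
                )
            tokens.append(("NUMBER", m.group()))
            pos = end
            continue
        m = IDENT.match(text, pos)
        if m:
            tokens.append(("IDENT", m.group()))
            pos = m.end()
            continue
        raise LexError(f"unexpected character {ch!r} at offset {pos}")
    return tokens
```

With this, "SELECT 123 W" and "123E3" both tokenize, while "SELECT 123W" and 
"123E" raise LexError.  In an ANTLR lexer the same test would presumably go at 
the end of the NUMBER rule, but I have not tried that.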

--- In antlr-interest at yahoogroups.com, "martinkbraid" <mbraid at s...> wrote:
> I believe I have a reasonably standard lexer for the SQL language, a 
> language in which all identifiers have to begin with an alpha. It 
> therefore correctly identifies "W123" as an identifier, however, if I 
> give it "123W" the lexer figures there are two tokens: "123" (a 
> NUMBER) and "W" (an IDENTIFIER). This is wrong: it should reject this 
> (and because by chance this can be valid at the syntactic level, the 
> parser cannot do anything about it). So what am I doing wrong? A 
> fragment of my lexer follows:
> 
> Many thanks
> Martin Braid
> 
> protected
> DIGIT    : ('0'..'9');
> 
> protected
> LETTER   : ('a'..'z');
> 
> protected
> SPECIAL  : "_" ;
> 
> protected
> EXPONENT : "e" ( PLUS | MINUS )? (DIGIT)+ ;
> 
> protected
> INTEGER : (DIGIT)+;
> 
> protected
> FLOAT  : (INTEGER '.' INTEGER) => INTEGER '.' INTEGER (EXPONENT)?
>        | (INTEGER '.'        ) => INTEGER '.'         (EXPONENT)?
>        | (        '.' INTEGER) =>         '.' INTEGER (EXPONENT)?
>        ;
> 
> NUMBER :  (FLOAT) => FLOAT   {$setType(FLOAT);}
>        |  INTEGER {$setType(INTEGER);}
>        |  '.'     {$setType(DOT);}
>        ;
> 
> IDENT   options {testLiterals = true;}
>        : (LETTER) ( SPECIAL | LETTER | DIGIT )*;


 
