[antlr-interest] Re: Lexer makes 2 valid tokens when there is only 1 invalid one

Mon Apr 14 21:27:42 PDT 2003

> I believe I have a reasonably standard lexer for the SQL language, a
> language in which all identifiers have to begin with an alpha. It
> therefore correctly identifies "W123" as an identifier, however, if I
> give it "123W" the lexer figures there are two tokens: "123" (a
> NUMBER) and "W" (an IDENTIFIER). This is wrong, it should reject this

The lexer is working fine: it is accurately tokenizing the stream of
characters presented to it. The parser grammar should be responsible
for determining whether those tokens are valid in the context in which
they occur during parsing.
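
To make the behaviour concrete, here is a minimal sketch of a
longest-match tokenizer. It is plain Python, not ANTLR, and the rule
names and patterns (NUMBER, IDENTIFIER, WS) are my assumptions about
what a typical SQL lexer defines, not the rules from your grammar. It
only illustrates why "123W" comes out as two tokens: after "123" is
consumed, "W" on its own is a perfectly good identifier.

```python
import re

# Hypothetical token rules; assumed for illustration, not taken from
# the grammar under discussion.
TOKEN_RULES = [
    ("NUMBER", re.compile(r"[0-9]+")),
    ("IDENTIFIER", re.compile(r"[A-Za-z][A-Za-z0-9_]*")),
    ("WS", re.compile(r"\s+")),
]


def tokenize(text):
    """Split text into (type, lexeme) pairs using longest match."""
    tokens = []
    pos = 0
    while pos < len(text):
        best = None
        # Try every rule at the current position and keep the longest match.
        for name, pattern in TOKEN_RULES:
            match = pattern.match(text, pos)
            if match and (best is None or match.end() > best[1].end()):
                best = (name, match)
        if best is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        name, match = best
        if name != "WS":  # whitespace is skipped, not emitted
            tokens.append((name, match.group()))
        pos = match.end()
    return tokens


print(tokenize("W123"))
print(tokenize("123W"))
```

Under these assumed rules the lexer never sees anything illegal in
"123W". Each piece matches some rule, so rejecting the sequence has to
happen either in the parser or through an extra lexer rule written for
that purpose.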

> (and because by chance this can be valid at the syntactic level, the
> parser cannot do anything about it).

Could you explain this further, please? Why do you believe the parser
can't do anything about it? Perhaps some examples of SQL text that
illustrate the issue would help.

Cheers,

Micheal
