[antlr-interest] Re: Lexer makes 2 valid tokens when there is only 1 invalid one

micheal_jor open.zone at virgin.net
Mon Apr 14 21:27:42 PDT 2003


> I believe I have a reasonably standard lexer for the SQL language, a
> language in which all identifiers have to begin with an alpha. It
> therefore correctly identifies "W123" as an identifier, however, if I
> give it "123W" the lexer figures there are two tokens: "123" (a
> NUMBER) and "W" (an IDENTIFIER). This is wrong, it should reject this

The Lexer is working fine. It is tokenizing the stream of characters 
presented to it accurately. The parser grammar should be responsible 
for determining the validity of the tokens in whatever context they 
occur during parsing.
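
To make the tokenizing behaviour concrete, here is a minimal sketch in
Python rather than ANTLR. The rule names and patterns are assumptions
made up for illustration, not the grammar from the original post. At
each position the lexer takes the first rule that matches and consumes
as much as that rule allows, which is why "123W" comes out as two
perfectly valid tokens:

```python
import re

# Hypothetical token rules, loosely modelled on a SQL lexer.
# These are illustrative assumptions, not the original poster's grammar.
TOKEN_RULES = [
    ("NUMBER", r"[0-9]+"),
    ("IDENTIFIER", r"[A-Za-z][A-Za-z0-9_]*"),
    ("WS", r"\s+"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_RULES)
)


def tokenize(text):
    """Split text into (token_type, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(
                f"unexpected character {text[pos]!r} at offset {pos}"
            )
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("W123"))  # one IDENTIFIER token
print(tokenize("123W"))  # a NUMBER token followed by an IDENTIFIER token
```

Nothing in these rules says a NUMBER may not be followed directly by a
letter, so the lexer has no reason to complain. Whether that token
sequence is acceptable is a question for whatever consumes the tokens.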

> (and because by chance this can be valid at the syntactic level, the
> parser cannot do anything about it).

Could you explain this further, please? Why do you believe the parser
can't do anything about it? Perhaps some examples of SQL text that
illustrate the issue would help.

Cheers,

Micheal



 
