[antlr-interest] Re: Lexer makes 2 valid tokens when there is only 1 invalid one

Thu Apr 17 10:52:00 PDT 2003

> Here is a valid SQL stmt: select 123 w from table;
> 
> Here is an invalid SQL stmt: select 123w from table;

I defer to Monty's post on this matter. You would either have to use 
a validating predicate, as he advises, or check in your parser for 
whitespace in the hidden TokenStream.

For "123 w" the Lexer would generate NUMBER WHITESPACE IDENTIFIER
For "123w"  the Lexer would generate NUMBER            IDENTIFIER 
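
To make the difference concrete, here is a minimal sketch in plain Python 
(not ANTLR code, and not the original poster's grammar). The token rules 
and the names tokenize and glued_number_identifier are hypothetical, made 
up for illustration. The idea is only that if whitespace is kept as a 
token rather than thrown away, a NUMBER directly followed by an IDENTIFIER 
can be detected and rejected.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str  # "NUMBER", "IDENTIFIER", "WHITESPACE", or "PUNCT"
    text: str


# Hypothetical token rules, for illustration only.
TOKEN_RE = re.compile(
    r"(?P<NUMBER>\d+)"
    r"|(?P<IDENTIFIER>[A-Za-z_]\w*)"
    r"|(?P<WHITESPACE>\s+)"
    r"|(?P<PUNCT>.)"
)


def tokenize(source: str) -> List[Token]:
    """Split source into tokens, keeping whitespace instead of discarding it."""
    return [Token(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(source)]


def glued_number_identifier(tokens: List[Token]) -> List[str]:
    """Return the text of every NUMBER immediately followed by an IDENTIFIER."""
    errors = []
    for current, following in zip(tokens, tokens[1:]):
        # No WHITESPACE token sits between the two, so they were glued together.
        if current.kind == "NUMBER" and following.kind == "IDENTIFIER":
            errors.append(current.text + following.text)
    return errors


if __name__ == "__main__":
    for stmt in ("select 123 w from table;", "select 123w from table;"):
        print(stmt, "->", glued_number_identifier(tokenize(stmt)))
```

In a real ANTLR grammar the same test would be made against the hidden 
whitespace tokens rather than a hand-rolled token list; how exactly 
depends on how your lexer and token stream are set up.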

> In the first stmt, the "w" is a column alias for the constant "123";
> in the 2nd stmt, "123w" is an invalid column name. My problem is that
> I need to weed out bad stmts, like the 2nd one, but I cannot do that
> if my lexer converts it to a valid stmt, like the first one. That's
> why the parser cannot capture this problem - it doesn't think there
> is one.

The parser just needs to use more of the info available (e.g. the 
hidden TokenStream).

Cheers,

Micheal
