[antlr-interest] Lexer doesn't agree with me (gives other tokens than I need)

Alexander Brown abrown at analytics8.com
Sun Apr 12 03:51:52 PDT 2009


Hi,

It's sort of an odd question, in the sense that LEFT and RIGHT (either as outer join type specifiers or as character value functions in TSQL) are legitimate keywords rather than identifiers (such as column names, table names or schema qualifiers).  There's no ambiguity at the parser level between those two scenarios, though, so there's no need to force the lexer to generate an identifier in one scenario and a keyword in the other.

I can only imagine that you want to treat the keywords as identifiers for one of two reasons: either the DB doesn't prevent users from using keywords as identifiers (CREATE TABLE TABLE, for example), or what you want in your AST is a generic character function node for all character functions with a specific signature (function_name LPAREN character_value_expression COMMA numeric_value_expression RPAREN, for example).  Even in the latter scenario I don't think you really want to identify the function 'RIGHT' or 'LEFT' as an identifier.

All this being said, you could probably rewrite the AST to do what you want (I haven't tried it, though).  If you provide some more detail about what you are trying to achieve at the AST level, perhaps I could suggest a way to achieve it.
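
To illustrate why the lexer does not need to change, here is a minimal sketch in Python rather than ANTLR grammar. It assumes a toy lexer that always emits LEFT and RIGHT as keyword tokens, and a hand-written rule that accepts those tokens in function-name position and records the name as a generic IDENTIFIER node in the AST. Every name in it (KEYWORDS, lex, parse_function_call, FUNCTION_CALL) is made up for illustration.

```python
import re

# Hypothetical keyword set; as in the question, keywords always win over IDENTIFIER.
KEYWORDS = {"left", "right", "outer", "join"}
TOKEN_RE = re.compile(r"\s*(?:(?P<word>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<punct>[(),]))")

# Token types the parser accepts as a function name, keywords included.
FUNCTION_NAME_TOKENS = {"IDENTIFIER", "LEFT", "RIGHT"}


def lex(text):
    """Return a list of (token_type, text) pairs."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        pos = m.end()
        if m.group("word"):
            word = m.group("word")
            kind = word.upper() if word.lower() in KEYWORDS else "IDENTIFIER"
            tokens.append((kind, word))
        elif m.group("num"):
            tokens.append(("NUMBER", m.group("num")))
        else:
            tokens.append((m.group("punct"), m.group("punct")))
    return tokens


def parse_function_call(tokens):
    """functionCall : functionName '(' arg (',' arg)* ')'

    The lexer still hands over LEFT/RIGHT tokens; the function name is only
    normalised to an IDENTIFIER node when the tree is built.
    """
    kind, name = tokens[0]
    if kind not in FUNCTION_NAME_TOKENS or tokens[1][0] != "(":
        raise SyntaxError("expected a function call")
    args, i = [], 2
    while True:
        arg_kind, arg_text = tokens[i]
        if arg_kind not in ("IDENTIFIER", "NUMBER"):
            raise SyntaxError(f"unexpected argument token {arg_kind}")
        args.append(arg_text)
        i += 1
        if tokens[i][0] == ")":
            break
        if tokens[i][0] != ",":
            raise SyntaxError("expected ',' or ')'")
        i += 1
    return ("FUNCTION_CALL", ("IDENTIFIER", name), args)


if __name__ == "__main__":
    print(lex("left outer join"))
    print(parse_function_call(lex("LEFT(name, 3)")))
```

The join rule would keep matching the LEFT token directly, so both uses coexist. The same idea should carry over to an ANTLR parser rule that accepts LEFT and RIGHT alongside IDENTIFIER in function-name position, though the exact grammar is not shown here.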

Alex

Alexander Brown
Partner | Analytics 8 | Tel +61 2 9299 4430 | Mob +61 424 043 485| abrown at analytics8.com | www.analytics8.com
 


________________________________


Hi, 

I'm creating a parser for a SQL dialect (sue me :oP) and I'm facing a problem regarding the lexer generating the wrong kind of token in a certain context. 

Basically I have defined two tokens called LEFT & RIGHT which are needed to parse SQL joins (left outer join, right outer join, etc...) 

LEFT : 'left' ;
RIGHT : 'right' ;

The problem occurs when I'm matching the SQL *functions* LEFT & RIGHT. 

LEFT (functionArgs)
RIGHT (functionArgs)

I want the function name to be an IDENTIFIER token but no can do due to the lexer... It gives me a LEFT or RIGHT token obviously :'o( 

What are the general recommendations you experienced ANTLR buffs can give me? The parser is generating an AST so I don't really care how it matches as long as I can keep my AST neat 'n tidy :o/ 

Thanks! Bill 

 