[antlr-interest] NON-reserved Words

mzukowski at yci.com
Tue Apr 29 08:08:34 PDT 2003


I used your second approach with my AREV parser and it worked ok, but it
still introduces ambiguities you have to deal with.  Then I started playing
around with the open source PICK basic parser and decided to implement a
filter between the lexer and parser.  See
http://www.codetransform.com/filterexample.html

The nice thing about a filter is that it can substantially reduce ambiguity
in the parser.  So, for instance, if CASE as a keyword is always followed by
WHEN then a simple filter could turn CASE back into an ID if it is not
followed by WHEN.
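
To make the CASE example concrete, here is a minimal sketch of such a filter in plain Python. It is not the code from the filter example page and it does not use the ANTLR API. The Token class, the REQUIRED_FOLLOWER table and keyword_filter are names made up for illustration, and the token kinds are assumed to come from some lexer that already marks CASE and WHEN as keywords.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Token:
    kind: str   # e.g. "CASE", "WHEN", "ID", "EQ", "NUM" (assumed lexer output)
    text: str


# Hypothetical table: keyword kind -> token kinds that must follow it
# for it to stay a keyword. Only the CASE/WHEN rule from the text is listed.
REQUIRED_FOLLOWER = {"CASE": {"WHEN"}}


def keyword_filter(tokens: Iterable[Token]) -> Iterator[Token]:
    """Sit between lexer and parser, using one token of lookahead."""
    it = iter(tokens)
    current = next(it, None)
    while current is not None:
        following = next(it, None)
        required = REQUIRED_FOLLOWER.get(current.kind)
        if required is not None and (
            following is None or following.kind not in required
        ):
            # Keyword in a non-keyword position: hand the parser an ID instead.
            current = Token("ID", current.text)
        yield current
        current = following
```

With this in place the parser only ever sees CASE as a keyword when WHEN follows it, so the rule for variable does not need to mention CASE at all. A real filter would need one entry (or a small function) per non-reserved word, and some words may need more than one token of lookahead.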

Monty

-----Original Message-----
From: Brian Hagenbuch [mailto:bhagenbuch at didera.com]
Sent: Monday, April 28, 2003 4:54 PM
To: antlr-interest at yahoogroups.com
Subject: [antlr-interest] NON-reserved Words


I'm trying to parse a dialect of SQL in which there are both
reserved and "non-reserved" words.  A non-reserved word is one
that can be an identifier or a syntactic marker depending on the
context.  For example CASE can be a variable, as in

  ... WHERE CASE=12 AND...

or it can begin an expression, as in

  SELECT CASE WHEN AGE<20 THEN 'kid' ELSE 'geezer' END, ...

The language has about 100 such non-reserved words and about 100
reserved words.

The approach suggested by the FAQ and the Yahoo Group seems to
be something like

- Have the lexer treat non-reserved words (like CASE) as ordinary
  identifiers, i.e., don't represent them in the literals table.

- Use semantic predicates in the parser to distinguish the cases.

I started down this road, but found it complicated to implement,
and found the non-determinism hard or impossible to remove.  So I'm
considering a different approach:

- Have the lexer treat non-reserved words as keywords and collect
  them under a parser rule such as maybeIdentifier.

- In the parser, include maybeIdentifier in the rule for variable.

- Use syntactic predicates to distinguish the cases.
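
As a rough illustration of these three steps, here is a hand-written sketch in Python rather than an ANTLR grammar. The two-token lookahead in primary() stands in for a syntactic predicate such as (CASE WHEN) => caseExpr. The Parser class, the token kinds and the tiny expression grammar are all assumptions made up for this example.

```python
NON_RESERVED = {"CASE"}  # hypothetical subset of the ~100 non-reserved words


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens)  # (kind, text) pairs; keywords already marked
        self.pos = 0

    def peek(self, offset=0):
        i = self.pos + offset
        return self.tokens[i][0] if i < len(self.tokens) else "EOF"

    def take(self, expected=None):
        if self.pos >= len(self.tokens):
            raise SyntaxError("unexpected end of input")
        kind, text = self.tokens[self.pos]
        if expected is not None and kind != expected:
            raise SyntaxError(f"expected {expected}, got {kind}")
        self.pos += 1
        return kind, text

    def variable(self):
        # maybeIdentifier: a plain ID or any non-reserved keyword.
        if self.peek() == "ID" or self.peek() in NON_RESERVED:
            return ("var", self.take()[1])
        raise SyntaxError(f"expected a variable, got {self.peek()}")

    def primary(self):
        # Stand-in for the predicate: CASE is a keyword only if WHEN follows.
        if self.peek() == "CASE" and self.peek(1) == "WHEN":
            return self.case_expr()
        if self.peek() in ("NUM", "STRING"):
            return ("lit", self.take()[1])
        return self.variable()

    def condition(self):
        left = self.primary()
        if self.peek() not in ("EQ", "LT"):
            raise SyntaxError(f"expected an operator, got {self.peek()}")
        op = self.take()[0]
        return (op, left, self.primary())

    def case_expr(self):
        self.take("CASE")
        self.take("WHEN")
        cond = self.condition()
        self.take("THEN")
        then_value = self.primary()
        self.take("ELSE")
        else_value = self.primary()
        self.take("END")
        return ("case", cond, then_value, else_value)
```

Here CASE=12 parses with CASE as a variable, while CASE WHEN AGE<20 THEN 'kid' ELSE 'geezer' END parses as a case expression. Every rule where a non-reserved word can begin a construct needs its own lookahead like the one in primary().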

So far this seems to be easier, but I'm inexperienced with Antlr
and parsing in general and I'm concerned that there's some GOTCHA
lurking in there somewhere.

Ideas?  Experiences?  Thank you.


 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 


 




