[antlr-interest] Defining keywords with varying length

John Green greenj at ix.netcom.com
Tue Nov 29 09:10:15 PST 2005


Hi Lars,

My parser for Progress has to support nearly 1000 keywords, many of which can be abbreviated. On top of that, the language is case insensitive. ANTLR handles all of this fine.

You will want to review this link, and the two links it leads to:
  http://antlr.org/doc/lexer.html#Keywords_and_literals

In your case, you will want LIBRARY to be the keyword, but your literals table will contain "LIB", "LIBR", ..., "LIBRARY", all mapped to the integer token type for LIBRARY.

In other words, the lexer deals with it, not the parser. All your parser grammar has to deal with is the LIBRARY token type, because the lexer has taken care of recognizing the string literal.
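
To make the idea concrete, here is a rough sketch in plain Java of what such a table amounts to. It is only a stand-in, not ANTLR's own API: the class name, the method names, and the token type numbers are all made up for illustration, and a generated lexer keeps its own literals table and assigns its own token types. The sketch also assumes you know the shortest allowed abbreviation for each keyword (3 characters for LIBRARY here).

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a lexer's literals table (not ANTLR's API).
class AbbreviatedKeywords {
    // Made-up token type numbers; a generated lexer defines its own.
    static final int IDENT = 4;
    static final int LIBRARY = 5;

    // Upper-cased keyword text -> token type.
    private final Map<String, Integer> literals = new HashMap<>();

    // Register the keyword and every prefix of it that is at least
    // minLength characters long, all under the same token type.
    void addKeyword(String keyword, int minLength, int tokenType) {
        String upper = keyword.toUpperCase(Locale.ROOT);
        for (int len = minLength; len <= upper.length(); len++) {
            literals.put(upper.substring(0, len), tokenType);
        }
    }

    // Case-insensitive lookup of the text the lexer matched as an
    // identifier; anything not in the table stays an identifier.
    int tokenTypeFor(String text) {
        Integer type = literals.get(text.toUpperCase(Locale.ROOT));
        return type != null ? type : IDENT;
    }
}
```

After addKeyword("LIBRARY", 3, LIBRARY), the strings "lib", "Libr" and "LIBRARY" all come back as LIBRARY, while "LI" and "LIBRARYX" fall through to IDENT. The parser never sees the difference between the spellings.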

HTH,
John
www.joanju.com


Lars von Wedel wrote:
> 
> Hello,
> 
> I am about to write parsers for (lots of) input files to an existing 
> tool. The language I have to deal with (besides other annoyances) 
> permits abbreviated keywords, so you could write something like
> 
> LIBRARY
> 
> as a keyword, but since
> 
> LIB
> 
> is unique to identify that keyword it would be sufficient, as well as 
> anything else such as LIBR, LIBRAR or so. Defining all these variants as 
> tokens and dealing with them in the parser does not appear as a clean 
> solution to me.
> 
> How would I best incorporate this into a lexer/parser developed in Antlr?
> 
> Lars