[antlr-interest] Disabling rules in the lexer

Jeff Wilcox jeff.wilcox at mac.com
Wed Jan 27 13:22:16 PST 2010


Yes, I agree with you, and in general this is how my parsers have worked.  But there are a couple of cases where disabling lexer rules is useful and/or necessary.  One example is disabling keywords that exist only in newer versions of the language and could be identifiers in older versions; there are other semi-tedious ways around that with predicates, but it should not be necessary.
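
To illustrate the versioned-keyword case, here is a minimal sketch in plain Java of one possible workaround: lex every word as an identifier, then decide afterwards whether it is a keyword for the language version being compiled. The class name, the token type constants, and the keywords "foreach" and "yield" are all hypothetical and stand in for whatever a real grammar defines; nothing here is ANTLR API.

```java
import java.util.HashMap;
import java.util.Map;

public class VersionedKeywords {
    // Hypothetical token types, standing in for a generated lexer's constants.
    public static final int IDENT = 1;
    public static final int KEYWORD = 2;

    // Hypothetical table: keyword text -> first language version that reserves it.
    private static final Map<String, Integer> INTRODUCED_IN = new HashMap<>();
    static {
        INTRODUCED_IN.put("foreach", 2);
        INTRODUCED_IN.put("yield", 3);
    }

    // Returns KEYWORD only if the word is reserved in the given version;
    // in older versions the same text stays an ordinary identifier.
    public static int classify(String text, int languageVersion) {
        Integer since = INTRODUCED_IN.get(text);
        return (since != null && languageVersion >= since) ? KEYWORD : IDENT;
    }
}
```

With a lookup like this, "yield" is an identifier when compiling version 2 source and a keyword from version 3 on, so the keyword rule itself never has to be switched off.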

This case, though, involves a table section of characters, symbols and numbers.  An N-column row of N discrete symbols could otherwise be lexed as a single number, a single identifier, a number plus an identifier, etc.  So without special-casing the lexer, the easiest thing was to accept possible candidates, suck it all into a string and re-parse it in the semantic analyzer.  But that feels like the wrong solution.
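
As a rough sketch of that re-parse step, assuming every table symbol is a single non-whitespace character (an assumption made for this example, not something the grammar guarantees), the captured row text could be split like this. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class TableRowSplitter {
    // Splits the raw text of one table row into single-character symbols.
    // Assumes one character per column; whitespace between symbols is skipped.
    public static List<String> splitRow(String rowText, int columns) {
        List<String> symbols = new ArrayList<>();
        for (int i = 0; i < rowText.length(); i++) {
            char c = rowText.charAt(i);
            if (!Character.isWhitespace(c)) {
                symbols.add(String.valueOf(c));
            }
        }
        // The semantic analyzer can report a table-specific error here.
        if (symbols.size() != columns) {
            throw new IllegalArgumentException(
                "Expected " + columns + " table symbols but found " + symbols.size());
        }
        return symbols;
    }
}
```

So a row that the lexer delivered as the number 10 followed by the identifier A and a plus sign comes back as the four symbols 1, 0, A and + when four columns are expected.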

In general, though, it seems like there is a bug in ANTLR's treatment of gated semantic predicates in the lexer.  A gated predicate does not work unless there are other alternatives in the rule.

Is there any other way to completely turn off a rule in the lexer (without throwing a FailedPredicateException)?

Thanks,
Jeff


On Jan 26, 2010, at 8:58 PM, William B. Clodius wrote:
> Generally, don't try to be too restrictive with your lexer and parser. This sort of context dependence is more naturally handled in the semantic analysis. In particular, error reporting is much better if you accept things that are ultimately illegal in the lexer and parser and determine whether they are illegal in the semantic analysis. Instead of a minimal message such as "Illegal token", you can report "Illegal token for the table structure see constraint # in the language definition", or "Token is not one of the set of ..."
> 
> On Jan 26, 2010, at 7:52 AM, Jeff Wilcox wrote:
> 
>> Hi,
>> 
>> I have a special area in this language with a table structure whose symbols are normally part of other tokens elsewhere in the language (like a couple of digits, a couple of letters and a couple of symbols).  So I am trying to set up the lexer to accept these table tokens only when in a table.  Based on what I have been able to dig up, I believe gated semantic predicates are a valid way to disable rules in the lexer.  However, I am seeing issues with this in ANTLR 3.2 with the Java language target.
>> 
>> So I expected a lexer rule like this to do the trick:
>> 
>> Level0       : {inTable}?=> '0';
>> 
>> But that actually creates a very strange loop when inTable is false.  It basically throws a FailedPredicateException (which I would not have expected for a gated predicate) and then retries the same token with the same rule, obviously resulting in an infinite loop.
>> 
>> Can someone clarify whether this is allowed and if so whether there is some trick to using it?  I am stumped.  
>> 
>> Thanks
>> Jeff

