[antlr-interest] Why does the unused rule affect parser behaviour?

Tue Jan 10 04:13:00 PST 2012

At 00:45 11/01/2012, Seref Arikan wrote:
>Thanks, very useful advice regarding token ranges. Is that the 
>reason for the trouble others went through in their grammars 
>(such as the SQL grammar) to list each and every char? (Or is it 
>one of the reasons?)

Possibly, although another common reason is case 
insensitivity.

The general rule of thumb I use (since most grammars have a 
skip-whitespace rule) is that anything which cares about 
whitespace must be a lexer rule, while anything that doesn't care 
must be a parser rule (e.g. "14 .05" should normally be parsed as 
two separate numbers -- the space is significant, thus the 
single-number rule must be a lexer rule).  Similarly, anything 
that deals with large ranges of characters should be a lexer 
rule.  There are a few exceptions to this, of course, but it 
covers the majority.
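
To make the "14 .05" case concrete, here is a rough sketch in plain 
Python rather than ANTLR. It is only an illustration: the token names 
(NUMBER, INT, DOT, WS) and the helpers tokenize() and parse_numbers() 
are made up for this example and are not taken from any real grammar 
or from the ANTLR runtime. It compares a number defined at the lexer 
level with one assembled at the parser level, after whitespace has 
been skipped.

```python
import re

# Hypothetical token sets, for illustration only.
# Lexer-level number: the whole number is one token, so whitespace
# inside it cannot be skipped away.
LEXER_LEVEL = [
    ("NUMBER", r"\d+(?:\.\d+)?|\.\d+"),
    ("WS", r"\s+"),
]
# Parser-level number: the lexer only emits the pieces (INT, DOT) and
# a parser rule glues them together after whitespace is dropped.
PARSER_LEVEL = [
    ("INT", r"\d+"),
    ("DOT", r"\."),
    ("WS", r"\s+"),
]


def tokenize(text, rules):
    """Return (type, lexeme) pairs, dropping WS like a skip-whitespace rule."""
    pattern = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in rules))
    tokens = []
    pos = 0
    while pos < len(text):
        match = pattern.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def parse_numbers(tokens):
    """Parser-level rule, roughly: number : INT (DOT INT)? | DOT INT ;"""
    numbers = []
    i = 0
    while i < len(tokens):
        kinds = [kind for kind, _ in tokens[i:i + 3]]
        if kinds == ["INT", "DOT", "INT"]:
            numbers.append(tokens[i][1] + "." + tokens[i + 2][1])
            i += 3
        elif kinds[:2] == ["DOT", "INT"]:
            numbers.append("." + tokens[i + 1][1])
            i += 2
        elif kinds[:1] == ["INT"]:
            numbers.append(tokens[i][1])
            i += 1
        else:
            raise ValueError(f"unexpected token {tokens[i]!r}")
    return numbers


# Lexer-level rule: the space keeps the two numbers apart.
lexer_result = [lexeme for _, lexeme in tokenize("14 .05", LEXER_LEVEL)]
# Parser-level rule: the space is gone before the rule runs, so the
# pieces are merged into a single number.
parser_result = parse_numbers(tokenize("14 .05", PARSER_LEVEL))
```

With the lexer-level definition, "14 .05" comes out as two numbers 
("14" and ".05"). With the parser-level definition the skipped space 
is invisible to the rule, so the same input is read as the single 
number "14.05", which is normally not what you want.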