[antlr-interest] Question about lexer/parser boundaries

Mon Jun 4 13:21:35 PDT 2007

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> There is also this, the pattern of which is very useful to "merge"
> multiple tokens into a single token for the parser in order to reduce
> lookahead 'k':
> 
> MULTIPLE : TOKEN1 TOKEN2;
> TOKEN1 : 'Test1';
> TOKEN2 : 'Test2';
> 
> So my question is this: Are DUMMY, DUMMY2, and MULTIPLE permissible
> lexer rules, or should they still be defined as parser rules? (then
> defined with lower case letters.)

If this is indeed useful in your lexer, then make TOKEN1 and TOKEN2
fragments. Basically, it should fall out sensibly if you remember not to
use non-fragment rules within other lexer rules. If that is difficult to
do without a lot of messing around creating non-fragment rules that are
just a reference to a single fragment, then you are probably trying to
do too much in the lexer.
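
As a minimal sketch of the fragment version, assuming ANTLR 3 syntax (the
grammar name Merged is made up for illustration):

```antlr
lexer grammar Merged;   // hypothetical grammar name

// The only rule here that emits a token to the parser.
MULTIPLE : TOKEN1 TOKEN2 ;

// Fragments never become tokens on their own; they exist only
// to be referenced from other lexer rules.
fragment TOKEN1 : 'Test1' ;
fragment TOKEN2 : 'Test2' ;
```

Note that written this way MULTIPLE matches the characters Test1Test2 with
nothing in between; if whitespace can separate the two, that is another
sign the rule belongs in the parser.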

So, in the simple context in which your example is presented ;-), this
would not be a lexer rule, but a parser rule saying:

multiple : TOKEN1 TOKEN2;

Jim