[antlr-interest] Similar lexer rules

Tue Apr 29 04:54:12 PDT 2008

At 22:41 29/04/2008, Gioele Barabucci wrote:
 >  quote: name S? '=' S? '<' text '>';
 >  name: ID;
 >  text: (LETTER|S)+;
 >
 >  //RAW_TEXT: (LETTER|S)+;
 >  ID: LETTER+;
 >  fragment LETTER: ('a'..'z'|'A'..'Z');
 >  S: (' '|'\n'|'\t')+;
 >
 >will not parse "xx=< yy >" because "yy" will be matched by token
 >ID, so the grammar rule 'text' will not be accepted. Is there a
 >way to solve this?

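Actually, the problem is that LETTER is a fragment rule.  Fragment 
rules are only helpers for other lexer rules: the lexer never emits 
them as tokens (unless you do it manually), so they can't be 
referenced from parser rules.  The parser never sees a LETTER 
token; "yy" always arrives as an ID.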

If you change LETTER to ID in the text rule then it should work.
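
For reference, here is roughly what I mean, as an untested sketch.
It is your grammar with only the text rule changed; I'm assuming
the rest of your grammar stays as posted:

   quote: name S? '=' S? '<' text '>';
   name: ID;
   // was (LETTER|S)+; LETTER is a fragment and never reaches the parser
   text: (ID|S)+;

   ID: LETTER+;
   fragment LETTER: ('a'..'z'|'A'..'Z');
   S: (' '|'\n'|'\t')+;

With this, "xx=< yy >" should lex as ID '=' '<' S ID S '>', and the
S ID S run in the middle is what text matches.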

Another option is to create a single token for 
"arbitrary stuff in angle brackets", e.g. one of these two:
   TAG : '<' .* '>' ;
   TAG : '<' (LETTER | S)* '>' ;

This will only work properly if angle brackets aren't used 
differently in another context, however.
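
As an untested sketch of how the single-token approach might fit
into your grammar (the changed quote rule is my assumption about
how you would use TAG, not something I have run):

   // the whole "<...>" part now arrives as one TAG token
   quote: name S? '=' S? TAG;
   name: ID;

   TAG: '<' (LETTER | S)* '>';
   ID: LETTER+;
   fragment LETTER: ('a'..'z'|'A'..'Z');
   S: (' '|'\n'|'\t')+;

Note that the text of a TAG token includes the angle brackets
themselves, so you would need to strip them yourself if you only
want what is between them.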