[antlr-interest] [Antlr3 grammar] how to specify alpha token, numeric token and mix of both

David-Sarah Hopwood david-sarah at jacaranda.org
Wed Oct 21 19:20:47 PDT 2009


Hieu Phung wrote:
> Hi all,
> 
> My grammar has 3 kinds of tokens:
> 1) number: contains only numeric characters
> 2) alpha: contains only alphabetic characters
> 3) mix: contains digits and letters, plus hyphens, full stops or spaces
> 
> For example:
> 1/VEC305/03MAR/PTY
> => in the above input data, 03MAR should be interpreted as a number of
> length 2 followed by alpha of length 3. But VEC305 is a mix of length 6.
> 
> If I define grammar like below:
> 
> NUMBER    : ('0'..'9')+ ;
> ALPHA    : ('a'..'z'|'A'..'Z')+;
> MIX    : (NUMBER | ALPHA | OTHER)+;
> fragment OTHER    : (' ' | '-' | '.')+;
> SLANT    :    '/';
> 
> Antlr returns VEC305 and 03MAR as two MIX tokens. Is there any way to
> define the tokens so that Antlr returns number, slant, mix, slant,
> number, alpha, slant, alpha for the input "1/VEC305/03MAR/PTY"?

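Since you don't want "03MAR" to be interpreted as a MIX, presumably you
mean that a MIX cannot start with a digit. In that case, try the rules
below, where a MIX must begin either with letters followed by at least
one digit or symbol, or with a symbol: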

  fragment DIGIT  : '0'..'9' ;
  fragment LETTER : 'a'..'z' | 'A'..'Z' ;
  fragment SYMBOL : ' ' | '-' | '.' ;

  NUMBER : DIGIT+ ;
  ALPHA  : LETTER+ ;
  MIX    : LETTER+ (DIGIT | SYMBOL) (DIGIT | LETTER | SYMBOL)*
         | SYMBOL (DIGIT | LETTER | SYMBOL)*
         ;
  SLANT  : '/';
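
To sanity-check the token boundaries these rules are meant to produce, here
is a minimal sketch in Python using regular expressions. It is not ANTLR and
does not reproduce ANTLR's lexer prediction; it only mirrors the four rules
above as patterns and tries MIX first. The names TOKEN_RE and tokenize are
made up for this illustration.

```python
import re

# Character classes mirroring the DIGIT, LETTER and SYMBOL fragments.
DIGIT = r"[0-9]"
LETTER = r"[A-Za-z]"
SYMBOL = r"[ .-]"
ANY = r"[0-9A-Za-z .-]"  # DIGIT | LETTER | SYMBOL

# MIX is listed first so it wins whenever its pattern applies; a run of
# letters that is not followed by a digit or symbol falls through to ALPHA.
# (Assumption: this ordering stands in for the lexer's choice between rules.)
TOKEN_RE = re.compile(
    rf"(?P<MIX>{LETTER}+(?:{DIGIT}|{SYMBOL}){ANY}*|{SYMBOL}{ANY}*)"
    rf"|(?P<NUMBER>{DIGIT}+)"
    rf"|(?P<ALPHA>{LETTER}+)"
    rf"|(?P<SLANT>/)"
)


def tokenize(text):
    """Split text into (token_type, lexeme) pairs, left to right."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for kind, lexeme in tokenize("1/VEC305/03MAR/PTY"):
        print(kind, repr(lexeme))
```

For the input "1/VEC305/03MAR/PTY" this sketch yields NUMBER, SLANT, MIX,
SLANT, NUMBER, ALPHA, SLANT, ALPHA, which is the sequence asked for. Whether
the ANTLR 3 lexer needs extra help (for example, more lookahead) to choose
between ALPHA and MIX is something to check in ANTLRWorks.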

-- 
David-Sarah Hopwood  ⚥  http://davidsarah.livejournal.com


