[antlr-interest] Parsing simple file

Sat Nov 29 15:33:47 PST 2008

Hello list,

I want to parse the string TEST 00125. The result should be tokenized 
like this: name=TEST, jobId=00 and mailpieceId=125.
The problem is that for the jobId token, the lexer discards the first 
three digits and matches the piece 25 (with an UnwantedTokenException 
for 001), and for mailpieceId I get a MissingTokenException.
I tried using greedy=false on every token definition and rule 
definition, but that didn't help.
I need to parse longer strings (approx. 100 chars) composed of codes 
made up of up to 8 digits.
There are no separators between the codes.
How can I solve this?
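
To make the intended result concrete, here is a minimal plain-Java sketch 
(not ANTLR) of the split I am after. The class name FixedWidthSplit, the 
split method and the FIELD_NAMES/FIELD_WIDTHS layout are made up for 
illustration; the widths 2 and 3 are simply the ones from the example 
above, and the real input would have more and longer fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FixedWidthSplit {

    // Hypothetical layout: each code's name and its width in digits.
    static final String[] FIELD_NAMES = {"jobId", "mailpieceId"};
    static final int[] FIELD_WIDTHS = {2, 3};

    /**
     * Splits a line of the form "NAME digits" into the name plus the
     * fixed-width codes, purely by position. Characters beyond the
     * declared layout are ignored, and the codes are not checked for
     * being digits.
     */
    static Map<String, String> split(String line) {
        int space = line.indexOf(' ');
        if (space < 0) {
            throw new IllegalArgumentException("expected '<name> <codes>': " + line);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", line.substring(0, space));

        String codes = line.substring(space + 1).trim();
        int pos = 0;
        for (int i = 0; i < FIELD_NAMES.length; i++) {
            int end = pos + FIELD_WIDTHS[i];
            if (end > codes.length()) {
                throw new IllegalArgumentException("too few characters for " + FIELD_NAMES[i]);
            }
            fields.put(FIELD_NAMES[i], codes.substring(pos, end));
            pos = end;
        }
        return fields;
    }

    public static void main(String[] args) {
        // Prints {name=TEST, jobId=00, mailpieceId=125}
        System.out.println(split("TEST 00125"));
    }
}
```

So the meaning of each code depends only on its position in the line, 
and that is what I would like the grammar to express.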

This is my grammar:

grammar Test;
options{
language=Java;
k=1;
}
tokens{
   SPACE = ' ';
}

start    :
   name NEWLINE EOF
   ;
name     :
   WORD+ SPACE jobId mailpieceId
   ;

jobId    :    TWO_DIGIT_CODE ;

mailpieceId
   :    THREE_DIGIT_CODE
   ;

NEWLINE    :    '\r'? '\n';
WORD    :    'A'..'z';

fragment DIGIT
   :    '0'..'9'
   ;

THREE_DIGIT_CODE
   :    DIGIT DIGIT DIGIT
   ;

TWO_DIGIT_CODE
   :    DIGIT DIGIT
   ;

ONE_DIGIT_CODE
   :    DIGIT
   ;