I'm no expert, but it looks to me like you have
combined the work of your parser and lexer: your lexer
rules try to recognize whole lines, which is really the
parser's job. I'm just guessing here, but something
like this might be what you want:
//parser rules
file //assuming all are optional
    : (des)? (tn)? (type)? (cden)?
      (cust)? (kls)? (fdn)? (tgar)?
      (ldn)? (ncos)? (sgrp)? (rnpg)?
      (sci)? (ssu)? (xlst)? (scpw)? (sflt)?
      //don't know where the date rule goes
      EOF
    ;
protected tn : "TN" (NUMERIC)+ NEWLINE ;
protected des : "DES" anything NEWLINE ;
protected anything
    : ( ALPHA    // one loop instead of nested (..)+ loops,
      | NUMERIC  // which ANTLR would flag as ambiguous
      | PUNCTUATION
      )+ //I'm guessing...
    ;
protected date
    : "DATE"
      // the lexer below hands '/' to the parser as PUNCTUATION
      NUMERIC NUMERIC PUNCTUATION
      NUMERIC NUMERIC PUNCTUATION
      NUMERIC NUMERIC
    ;
//lexer rules
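/* there are probably better ways to wrap
   single-character tokens into "word" tokens;
   as written, ALPHA yields one token per letter,
   so I think the "TN" and "DES" literals above
   need a multi-letter word rule before they can match
*/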
WS : (' ' | '\t') { $setType(Token.SKIP); } ;
ALPHA : ('a'..'z' | 'A'..'Z') ;
NUMERIC : ('0'..'9') ;
PUNCTUATION
    : '_' | '-' | '+' | '/' | ';' | '#'
    | '*' | '\\' | ':' | ',' | '\'' | '.' | '?'
    ;
NEWLINE
    : ( ( options { generateAmbigWarnings=false; }
        : "\r\n" | '\r' | '\n' // needs k >= 2 in the lexer
        )
        { newline(); } // count every line break
      )+
    ;
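
To show the split without any ANTLR at all, here is a rough plain-Java sketch of the same two steps. classify plays the lexer and reduces each line to its leading keyword; isValidRecord plays the parser and mirrors the file rule above, where every line type is optional but has to keep its place in the sequence. The class and method names are made up for illustration, the keyword order is simply the order in your sample record, and a generated ANTLR parser would look nothing like this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not ANTLR output.
class RecordSketch {
    // Assumed order: the order of the lines in the sample record.
    static final List<String> ORDER = Arrays.asList(
        "DES", "TN", "TYPE", "CDEN", "CUST", "KLS", "FDN", "TGAR",
        "LDN", "NCOS", "SGRP", "RNPG", "SCI", "SSU", "XLST", "SCPW", "SFLT");

    // "Lexer" step: reduce each non-blank line to its leading keyword.
    static List<String> classify(String text) {
        List<String> kinds = new ArrayList<>();
        for (String line : text.split("\\r\\n|\\r|\\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            kinds.add(trimmed.split("[ \\t]+")[0]);
        }
        return kinds;
    }

    // "Parser" step: any keyword may be missing, but the ones present
    // must be known and must appear in ORDER, each at most once.
    static boolean isValidRecord(List<String> kinds) {
        int last = -1;
        for (String kind : kinds) {
            int pos = ORDER.indexOf(kind);
            if (pos <= last) {
                return false; // unknown (-1), repeated, or out of order
            }
            last = pos;
        }
        return true;
    }

    public static void main(String[] args) {
        String sample = "DES MAIL1\nTN 001 0 02 00\nTYPE SL1\nSFLT NO\n";
        System.out.println(isValidRecord(classify(sample)));
    }
}
```

Running main prints true for the shortened sample. Swapping two lines, or adding a keyword that is not in ORDER, makes isValidRecord return false, which is the kind of check the parser rules would be doing for you.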
-Matt
--- setuk_x <set at nortelnetworks.com> wrote:
> I am new to Java and ANTLR.
> I have written a basic parser in Perl before - but
> it is proving slow
> and unwieldy and so I am looking to Antlr to fill
> the gap.
> I need to parse a text file which contains text in
> the format
> (simplest form)
> DES MAIL1
> TN 001 0 02 00
> TYPE SL1
> CDEN DD
> CUST 0
> KLS 1
> FDN
> TGAR 0
> LDN NO
> NCOS 4
> SGRP 0
> RNPG 0
> SCI 0
> SSU
> XLST
> SCPW
> SFLT NO
>
> I need to be able to classify each line as a specific
> type so I can pass
> these types to the parser and validate that what I
> have is a valid
> record.
>
> Is the best way to do this using Lexer tokens? Such
> as:-
>
> class TNBLexer extends Lexer;
> options { k = 5;
> defaultErrorHandler = true;
> }
> // TNB is mostly uppercase but we need lowercase in
> // here because of the CPND
>
> TN : (("TN")+ (NUMERIC)+ NEWLINE);
> DES : (("DES") (ANYTHING)+);
> DATE: (("DATE")+ (NUMERIC NUMERIC '/' NUMERIC NUMERIC '/' NUMERIC NUMERIC));
> WS: ((' ')|('\t')){$setType(Token.SKIP);};
>
> protected ANYTHING : ((ALPHA|NUMERIC|PUNCTUATION));
> protected ALPHA : ('a'..'z'|'A'..'Z');
> protected NUMERIC :('0'..'9');
> protected PUNCTUATION : ('_'|'-'|'+'|'/'|';'|'#'|'*'|'\\'|':'|','|'\''|'.'|'?');
> protected NEWLINE: ((('\r' '\n')+ |('\n')+ | ('\r'))
> { newline(); });
>
> Or am I completely on the wrong track?
> I am wading my way through the doc at the moment so
> any advice would
> be helpful.
>
> Thanks Simon