[antlr-interest] Re: Guidance Required
setuk_x
set at nortelnetworks.com
Thu Jul 3 11:51:34 PDT 2003
OK, I have recoded it as follows.
Given a file which starts "TN 000 00 00 00",
it fails with an error about not recognising a character.
When debugging the parser code, the first comparison against the literal
TN doesn't evaluate to true, even though the first token is a literal
TN.
Any ideas?
//Attempt to classify the TNB File into TNB Records
// Containing explicit tokens
class TNBParser extends Parser;
options { k = 3;
defaultErrorHandler = true;
}
//A tnbfile consists of one or more tnbrecords
tnbfile : (record)+ EOF;
record //A tnbrecord consists of this number of explicit values
: ( (tn) (des)? (date))
;
protected tn : "TN" (NUMERIC)+ NEWLINE ;
protected des : "DES" anything NEWLINE ;
protected anything :
(
(ALPHA)+
| (NUMERIC)+
| (PUNCTUATION)+
);
protected date
: "DATE"
NUMERIC NUMERIC FW_SLASH
NUMERIC NUMERIC FW_SLASH
NUMERIC NUMERIC
;
class TNBLexer extends Lexer;
options { k = 3;
defaultErrorHandler = true;
}
// TNB is mostly uppercase but we need lowercase in here because of
// the CPND
WS: '\t' {$setType(Token.SKIP);} ;
protected ALPHA : ('a'..'z'|'A'..'Z');
protected NUMERIC :('0'..'9');
protected PUNCTUATION :('_'|'-'|'+'|FW_SLASH|';'|'#'|'*'|'\\'|':'|','|'\''|'.'|'?');
protected NEWLINE: ((('\r' '\n')+ |('\n')+ | ('\r')) { newline(); });
protected SPACE: ' ';
protected FW_SLASH: '/';
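
For comparison, the record structure that the parser rules above describe (a TN line, an optional DES line, then a DATE line) can be sketched in plain Java without ANTLR. This is only an illustrative sketch under stated assumptions: the class name TnbRecordSketch, the regular expressions, and the idea that fields are separated by whitespace are not part of the grammar above, and the DES pattern is deliberately looser than the "anything" rule. It does not diagnose the lexer error; it only shows the intended line classification.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not generated by ANTLR: classifies TNB lines by shape.
class TnbRecordSketch {
    // Assumed line shapes, loosely mirroring the tn, des and date rules.
    private static final Pattern TN = Pattern.compile("TN(\\s+\\d+)+");
    private static final Pattern DES = Pattern.compile("DES\\s+\\S+");
    private static final Pattern DATE = Pattern.compile("DATE\\s+\\d{2}/\\d{2}/\\d{2}");

    /** Returns "TN", "DES", "DATE" or "UNKNOWN" for a single line. */
    static String classify(String line) {
        String s = line.trim();
        if (TN.matcher(s).matches()) return "TN";
        if (DES.matcher(s).matches()) return "DES";
        if (DATE.matcher(s).matches()) return "DATE";
        return "UNKNOWN";
    }

    /** True if the lines form one or more records: TN, optional DES, DATE. */
    static boolean isValidFile(List<String> lines) {
        int n = lines.size();
        if (n == 0) return false;          // the tnbfile rule needs at least one record
        int i = 0;
        while (i < n) {
            if (!classify(lines.get(i)).equals("TN")) return false;
            i++;
            if (i < n && classify(lines.get(i)).equals("DES")) i++;   // optional DES
            if (i >= n || !classify(lines.get(i)).equals("DATE")) return false;
            i++;
        }
        return true;
    }
}
```

With this sketch, classify("TN 000 00 00 00") yields "TN", and a list holding a TN line, a DES line and a DATE line in that order passes isValidFile, while a record with no DATE line is rejected.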
--- In antlr-interest at yahoogroups.com, Matt Benson <gudnabrsam at y...>
wrote:
> I'm no expert, but it looks to me like you have
> combined the work of your parser and lexer. I'm just
> guessing here, but something like this might be what
> you want:
>
> //parser rules
>
> file //assuming all are optional
> : (des)? (tn)? (type)? (cden)?
> (cust)? (kls)? (fdn)? (tgar)?
> (ldn)? (ncos)? (sgrp)? (rnpg)?
> (sci)? (ssu)? (xlst)? (scpw)? (sflt)?
> //don't know where the date rule goes
> EOF
> ;
>
> protected tn : "TN" (NUMERIC)+ NEWLINE ;
> protected des : "DES" anything NEWLINE ;
> protected anything
> : (
> (ALPHA)+
> | (NUMERIC)+
> | PUNCTUATION
> )+ //I'm guessing...
> ;
>
> protected date
> : "DATE"
> NUMERIC NUMERIC '/'
> NUMERIC NUMERIC '/'
> NUMERIC NUMERIC
> ;
>
> //lexer rules
>
> /* there are probably better ways to wrap
> single-character tokens into "word" tokens...
> */
> WS: (' ' | '\t') {$setType(Token.SKIP);} ;
> ALPHA : ('a'..'z'|'A'..'Z');
> NUMERIC :('0'..'9');
> PUNCTUATION
> : '_' | '-' | '+' | '/' | ';' | '#'
> | '*' | '\\' | ':' | ',' | '\'' | '.' | '?'
> ;
>
> NEWLINE
> : ( ('\r' '\n')+ | ('\n')+ | ('\r')+ )
> { newline(); }
> ;
>
>
> -Matt
>
>
> --- setuk_x <set at n...> wrote:
> > I am new to Java and Antlr.
> > I have written a basic parser in Perl before - but
> > it is proving slow
> > and unwieldy and so I am looking to Antlr to fill
> > the gap.
> > I need to parse a text file which contains text in
> > the format
> > (simplest form)
> > DES MAIL1
> > TN 001 0 02 00
> > TYPE SL1
> > CDEN DD
> > CUST 0
> > KLS 1
> > FDN
> > TGAR 0
> > LDN NO
> > NCOS 4
> > SGRP 0
> > RNPG 0
> > SCI 0
> > SSU
> > XLST
> > SCPW
> > SFLT NO
> >
> > I need to be able to classify each line as a specific
> > type so I can pass
> > these types to the parser and validate that what I
> > have is a valid
> > record.
> >
> > Is the best way to do this using Lexer tokens? Such
> > as:-
> >
> > class TNBLexer extends Lexer;
> > options { k = 5;
> > defaultErrorHandler = true;
> > }
> > // TNB is mostly uppercase but we need lowercase in
> > // here because of the CPND
> >
> > TN : (("TN")+ (NUMERIC)+ NEWLINE);
> > DES : (("DES") (ANYTHING)+);
> > DATE: (("DATE")+ (NUMERIC NUMERIC '/'NUMERIC NUMERIC
> > '/'NUMERIC
> > NUMERIC));
> > WS: ((' ')|('\t')){$setType(Token.SKIP);};
> >
> > protected ANYTHING : ((ALPHA|NUMERIC|PUNCTUATION));
> > protected ALPHA : ('a'..'z'|'A'..'Z');
> > protected NUMERIC :('0'..'9');
> > protected PUNCTUATION :('_'|'-'|'+'|'/'|';'|'#'|'*'|'\\'|':'|','|'\''|'.'|'?');
> > protected NEWLINE: ((('\r' '\n')+ |('\n')+ | ('\r'))
> > { newline(); });
> >
> > Or am I completely on the wrong track?
> > I am wading my way through the doc at the moment so
> > any advice would
> > be helpful.
> >
> > Thanks Simon