[antlr-interest] Re: NEWBIE: Parsing log files

Wed Jul 16 17:37:14 PDT 2003

If DASH is only used at the beginning of a message, then the easiest way
I can think of is to use DASH to recognize the start of the
"message"; I mean having a rule like this in the lexer:

// a MESSAGE is a DASH followed by 0 or more 'non \n' chars and a '\n'
// at the end 
MESSAGE : DASH (~'\n')* '\n';

and then in your parser:

record : date time priority MESSAGE
       ;

If the last log record does not end with '\n', you'll have to add EOF:

MESSAGE : DASH (~('\n'|EOF))* ('\n'|EOF) ;
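
For a quick check of the same idea outside ANTLR, here is a rough
sketch in Python. It is not ANTLR code; the record layout
(date, time, priority, dash, message), the RECORD pattern and the
parse_records helper are all assumptions made up for illustration.
The message part mirrors the rule above: a dash, then any run of
'non \n' chars, ended by '\n' or by the end of the input.

```python
import re

# Assumed record layout (hypothetical): "<date> <time> <priority> - <message>"
# The message is everything after the dash up to '\n' or end of input,
# so embedded dashes and a missing final newline are both accepted.
RECORD = re.compile(
    r"^(?P<date>\S+)[ \t]+(?P<time>\S+)[ \t]+(?P<priority>\S+)[ \t]+"
    r"-[ \t]?(?P<message>[^\n]*)(?:\n|\Z)",
    re.MULTILINE,
)


def parse_records(text):
    """Return one (date, time, priority, message) tuple per log line."""
    return [
        m.group("date", "time", "priority", "message")
        for m in RECORD.finditer(text)
    ]


# Made-up sample input; the last line has no trailing newline.
sample = (
    "2003-07-16 17:37:14 INFO - server started\n"
    "2003-07-16 17:37:15 WARN - disk usage at 91% - check /var"
)
records = parse_records(sample)
```

Only the first dash after the priority field starts the message, so
later dashes on the same line stay inside it.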

Does this help?

Cheers, 

Enrique

--- In antlr-interest at yahoogroups.com, "cunctator69" <agarrett at a...>
wrote:
> I've got what seems to be a fairly straightforward
> problem, but I've had no luck solving it, nor have I
> found an answer in the fairly copious antlr docs. I'm
> sure it's there, but I don't think I'm recognizing it.
> 
> I'm trying to use antlr to parse a Log4J log file. The
> problem is that I'm trying to match something like:
> 
> // from the parser
> startRule
> 	:	(record)+
> 	;
> record 
> 	: 	date time priority DASH! message
> 	;
> 
> and message represents everything from DASH to the end
> of the line. I tried a rule like:
> 
> message
> 	:	UP_TO_EOL
> 	;
> 
> // lexer
> 
> UP_TO_EOL
> 	:LETTER (options {greedy=false;} :.)* '\n'
> 	;
> 
> but that conflicts with:
> 
> WORD
> 	:	('a'..'z' | 'A'..'Z')+
> 	;
> 
> As I said, it seems pretty straightforward. I
> understand why I'm getting the ambiguity warning -- I
> just can't figure out how to build a single
> unambiguous token that matches from an arbitrary point
> to the end of the line. I would appreciate any
> guidance offered.
> 
> Thanks,
> Alex
