[antlr-interest] Stuck

Ryan Daum ryan at darksleep.com
Mon Aug 20 14:01:10 PDT 2007


I see exactly what you mean now. I was able to do as you suggest with the
toEof rule, but I now realize I will probably want to keep the same
lexer/parser/walker around for multiple lines, and toEof of course eats
everything up to the end of file.  Are you saying that it is not possible
to make it eat only up to the next EOL character without the
context-sensitive flagging?  If so, is there a good place for me to look
for tips on how to do this?

Ryan
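
A minimal sketch of what a line-bounded variant of the toEof rule quoted
below might look like. It assumes the lexer emits an EOL token for each
line break; EOL and toEol are hypothetical names that do not appear in
the grammar quoted in this thread. The lexer still tokenizes the rest of
the line with its normal rules, so every character that can occur there
would need to match some lexer rule.

```antlr
// Hypothetical lexer rule, assumed for this sketch only:
// EOL : '\r'? '\n' ;

messageContinue
    :   STAR SPACE datatag SPACE IDENT COLON SPACE line=toEol (EOL | EOF)
        -> ^(MESSAGE_CONTINUE datatag $line)
    ;

// Match one or more tokens of any type except EOL, so the rule stops
// at the end of the current line instead of at end of file.
toEol : (~EOL)+ ;
```

With this shape, each messageContinue should end at its own line break,
so the same parser could be fed several lines in one stream.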

On Mon, 2007-20-08 at 11:29 -0700, Jim Idle wrote:
> You are trying to make your lexer context sensitive. If this must be
> done in the lexer, then you will have to keep a flag to say that it is
> time to return LINE, which will match everything up to the end of file
> (are you sure you mean end of file?)
> 
> However, assuming that you really mean end of file, then just get rid of
> LINE and match .+ EOF in the parser:
> 
> messageContinue
>     :   STAR SPACE datatag SPACE IDENT COLON SPACE line=toEof
>         -> ^(MESSAGE_CONTINUE datatag $line)
>     ;
> 
> toEof : .+ EOF;
> 
> Or some similar variant.
> 
> Jim
> 
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> > bounces at antlr.org] On Behalf Of John B. Brodie
> > Sent: Monday, August 20, 2007 11:08 AM
> > To: Ryan Daum
> > Cc: antlr-interest at antlr.org
> > Subject: Re: [antlr-interest] Stuck
> > 
> > Ryan Daum asked (in part):
> > >Hi all,
> > 
> > Greetings!
> > 
> > >I'm writing a fairly simple grammar for the following protocol:
> > >
> > >http://www.belfry.com/fuzzball/trebuchet/mcp.html
> > >
> > >However, I'm stuck on a problem at the lexer level that I can't seem to
> > >solve.  I believe it's my final issue before I have a working parser.
> > >
> > >Basically, I have a number of rules which can match a combination of
> > >characters:
> > >
> > >        fragment
> > >        LINE 	:	(LINE_CHAR)* EOF;
> > >        IDENT 	:	ALPHA (ALPHA|DIGIT|'-' ~(SPACE | COLON | OTHER_CHAR))* ;
> > 
> > ...snipped...
> > 
> > >This is all fine, individually they work well.  However, in the rule:
> > >
> > >        messageContinue
> > >          	:	STAR SPACE datatag SPACE IDENT COLON SPACE LINE
> > >          		-> ^(MESSAGE_CONTINUE datatag LINE);
> > >
> > >Working against the following line:
> > >
> > >        * 9b76 text: This is some sample text.
> > >
> > >I always get a MismatchedTokenException because the parser seems to want
> > >to turn everything after SPACE into an IDENT, rather than a line.  The
> > >intention of "LINE" is just to collect all input after the SPACE in a
> > >messageContinue; I do not want the rest of the lexer's rules to apply at
> > >all.
> > >
> > 
> > I have not really looked too much at your grammar...
> > 
> > But right off I see that LINE is a fragment. This means that the
> > Parser will never see a LINE token.
> > 
> > Could that be your problem?
> > 
> > HTH
> >    -jbb
> 


