[antlr-interest] Re: Newbie Question about Syntactic Predicates

lgcraymer lgc at mail1.jpl.nasa.gov
Fri Nov 7 14:35:05 PST 2003


Mike--

It looks like you are doing a lot in the lexer.  That's okay if your input format is fairly static, but it may trip you up if you end up doing
more complicated parsing.  What your second solution is still missing is left factoring:  "NUMBER.OF." is the common prefix of both
alternatives, so you can match it once and rewrite the rule as:

SURFACE_OR_STANDOFF
    :
    "NUMBER" DOT "OF" DOT
    (   "SURFACE"  /* other stuff */ { $setType(SURFACE); }
    |   "STANDOFF" /* more stuff */  { $setType(STANDOFF); }
    )
    ;
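
To show why the factored rule needs less lookahead, here is a rough hand-written sketch in Python. It is not ANTLR output; the function name and the ALTERNATIVES table are made up for illustration, and the tails are copied from the token text quoted later in this thread. The shared prefix is consumed once, and the choice between alternatives is made only afterwards, where the two diverge within two characters ("SU" vs. "ST").

```python
PREFIX = "NUMBER.OF."

# Hypothetical table: token type -> text that must follow the shared prefix.
# The tails are taken from the rules quoted in this thread.
ALTERNATIVES = {
    "SURFACE": "SURFACE.TO.AIR.THREAT.CLASSES:",
    "STANDOFF": "STANDOFF.RANGE.AIRCRAFT.CLASSES:",
}


def match_surface_or_standoff(text, pos=0):
    """Return (token_type, end_pos), or None if nothing matches at pos."""
    # Consume the common prefix exactly once.
    if not text.startswith(PREFIX, pos):
        return None
    pos += len(PREFIX)
    # Decide between the alternatives only after the prefix is gone.
    for token_type, tail in ALTERNATIVES.items():
        if text.startswith(tail, pos):
            return token_type, pos + len(tail)
    return None
```

Without the factoring, the decision would have to be made at the first character of "NUMBER", which is what forces the lookahead past the whole common prefix.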

Do that and you may be able to avoid the large lexer lookahead.  You may be better off using literals, though, and planning to do
more in the parser.
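
A rough sketch of that split, again in hand-written Python rather than ANTLR: the lexer only produces small tokens (words, DOT, COLON, DIGITS), a keyword table promotes known words to their own token types (roughly the role a literals table plays), and the parser recognizes the whole phrase.  The names KEYWORDS, tokenize and parse_count are assumptions made up for this example, and the input format is the short one from the original question.

```python
import re

# Hypothetical keyword table; words found here get their own token type.
KEYWORDS = {"NUMBER", "OF", "SURFACE", "STANDALONE"}

TOKEN_RE = re.compile(
    r"(?P<WORD>[A-Z]+)|(?P<DIGITS>[0-9]+)|(?P<DOT>\.)|(?P<COLON>:)|(?P<WS>\s+)"
)


def tokenize(text):
    """Split text into (kind, value) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        kind, value = m.lastgroup, m.group()
        pos = m.end()
        if kind == "WS":
            continue
        # Promote a plain word to a keyword token if it is in the table.
        if kind == "WORD" and value in KEYWORDS:
            kind = value
        tokens.append((kind, value))
    return tokens


def parse_count(tokens):
    """Parse NUMBER . OF . (SURFACE | STANDALONE) : DIGITS into (name, count)."""
    kinds = [kind for kind, _ in tokens]
    if len(tokens) != 7 or kinds[:4] != ["NUMBER", "DOT", "OF", "DOT"]:
        raise ValueError("expected NUMBER.OF.<name>: <digits>")
    if kinds[4] not in ("SURFACE", "STANDALONE") or kinds[5:] != ["COLON", "DIGITS"]:
        raise ValueError("expected SURFACE or STANDALONE, then ':' and digits")
    return kinds[4], int(tokens[6][1])
```

For example, parse_count(tokenize("NUMBER.OF.SURFACE: 3")) gives ("SURFACE", 3).  The lexer never needs more than one character of lookahead here, because the long phrases are assembled in the parser.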

--Loring

--- In antlr-interest at yahoogroups.com, "hawkwall" <hawkwall at y...> wrote:
> Why is it I can't figure something out until I post to the newsgroup?
> 
> I think I have a solution.  My predicate was wrong on the type
> STANDOFF.  Changing it to:
> 
> SURFACE_OR_STANDOFF
> : ("NUMBER" DOT "OF" DOT "SURFACE" ) =>
> "NUMBER" DOT "OF" DOT "SURFACE" DOT "TO" DOT "AIR" DOT "THREAT" DOT
> "CLASSES" COLON
> {$setType(SURFACE); }
> | ("NUMBER" DOT "OF" DOT "STANDOFF" ) =>
> "NUMBER" DOT "OF" DOT "STANDOFF" DOT "RANGE" DOT "AIRCRAFT" DOT
> "CLASSES" COLON
> {$setType(STANDOFF);}
> ;
> 
> and then making the original SURFACE and STANDOFF protected seems to
> have fixed my problem.  I read that protected keeps the tokens from
> being sent to the parser, but I still don't quite understand it.  Is
> this the correct way to handle large tokens without a large k value?
> 
> Thanks for your time.
> 
> Mike 
> 
> 
> > Hello,
> > 
> > I need to match the following data
> > 
> > NUMBER.OF.SURFACE: 3
> > NUMBER.OF.STANDALONE: 5
> > 
> > Where all I am really concerned about is that the surface has a number
> > 3, and the standalone has a number 5. I put the following in my Lexer
> > 
> > DIGITS : ('0'..'9')+ ;
> > 
> > DOT : '.' ;
> > 
> > COLON : ':' ;
> > 
> > SURFACE : "NUMBER" DOT "OF" DOT "SURFACE" COLON ;
> > 
> > STANDALONE : "NUMBER" DOT "OF" DOT "STANDALONE" COLON;
> > 
> > WS : ( ' '
> > | '\t'
> > | '\f'
> > | ( options {generateAmbigWarnings=false;}
> > : "\r\n" // DOS
> > | '\r' // Macintosh
> > | '\n' // Unix
> > )
> > {newline();}
> > )+
> > 
> > // now the overall whitespace action -- skip it!
> > { $setType(Token.SKIP); }
> > ;
> > 
> > And my Parser looks like:
> > 
> > start : rule1 rule2;
> > 
> > rule1 : SURFACE DIGITS ;
> > 
> > rule2 : STANDALONE DIGITS ;
> > 
> > with some actions to print out the number it finds. If k<12 in the
> > lexer, I get a nondeterminism error, and can see the problem in the
> 
> > [snip]
> 
> > SURFACE_OR_STANDOFF
> > : ("NUMBER" DOT "OF" DOT "SURFACE" ) =>
> > "NUMBER" DOT "OF" DOT "SURFACE" DOT "TO" DOT "AIR" DOT "THREAT" DOT
> > "CLASSES" COLON
> > {$setType(SURFACE); }
> > | ("NUMBER" DOT "OF" DOT "STANDOFF" DOT "RANGE" DOT "AIRCRAFT" DOT
> > "CLASSES" COLON ) =>
> > "NUMBER" DOT "OF" DOT "STANDOFF"
> > {$setType(STANDOFF);}
> > ;
> > 
> > [snip]


 
