[antlr-interest] Re: A question about extracting comments

mzukowski at yci.com mzukowski at yci.com
Fri Oct 11 10:24:20 PDT 2002


"rem" and IDENT are ambiguous: the string "rem" matches both rules.  Try out
Ter's new lexer-only predicate hoisting in the latest red hot alpha release.
Ter posted something about it to the list recently, within the past month or so.
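
To make the ambiguity concrete, here is a rough sketch of the decision such a
predicate would have to make. It is plain Python rather than ANTLR grammar, and
the names (IDENT, REST_OF_LINE, next_token) are made up for the illustration.
It assumes "rem" opens a comment only when a space or tab follows it, which may
not be the full rule for the language in question.

```python
import re

# Hypothetical stand-ins for the lexer rules in the quoted grammar.
IDENT = re.compile(r"[a-z_][a-z0-9_]*")
REST_OF_LINE = re.compile(r"[^\r\n]*(?:\r\n|\r|\n)?")


def next_token(text, pos=0):
    """Return (kind, lexeme) for the token that starts at pos."""
    # An apostrophe comment is unambiguous: no identifier starts with '.
    if text.startswith("'", pos):
        end = REST_OF_LINE.match(text, pos).end()
        return ("COMMENT", text[pos:end])

    m = IDENT.match(text, pos)
    if m is None:
        return ("OTHER", text[pos:pos + 1])

    word = m.group()
    following = text[m.end():m.end() + 1]
    # The "predicate": the word rem only opens a comment when a blank
    # follows it (assumption: space or tab). Otherwise it is an identifier.
    if word == "rem" and following in (" ", "\t"):
        end = REST_OF_LINE.match(text, pos).end()
        return ("COMMENT", text[pos:end])
    return ("IDENT", word)
```

With this check, "remark" and a bare "rem" stay identifiers, while "rem"
followed by a space or tab swallows the rest of the line.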

Monty

> -----Original Message-----
> From: Fan Yang [mailto:yhhf_dy at yahoo.com]
> Sent: Friday, October 11, 2002 10:07 AM
> To: antlr-interest at yahoogroups.com
> Subject: [antlr-interest] Re: A question about extracting comments
> 
> 
> Hi,
> 
> Thank you very much for your response.
> 
> I have already tried k=4 for four characters of lookahead, but it
> didn't work; it still gives me the following warning.
> 
> warning: lexical nondeterminism between rules COMMENT and IDENT upon
> MyVB.g:0:       k==1:'r'
> MyVB.g:0:       k==2:'0'..'9','_','a'..'z'
> MyVB.g:0:       k==3:<end-of-token>,'0'..'9','_','a'..'z'
> MyVB.g:0:       k==4:<end-of-token>,'0'..'9','_','a'..'z'
> 
> Is it possible to put "rem" into the literals table (I know it's possible
> for the parser, not for the lexer), given that I have set the options
> {testLiterals=true;} for IDENT? Or is there some other way?
> 
> 
> >But are you sure "rem " is what you really want? What about "rem" 
> >followed by a tab? 
> 
> By the way, you are right about the production for "rem". It must change 
> to something like the following rule:
> 
> COMMENT
>  : ("rem " | '\'')
>    (~('\n'|'\r'))* ('\n'|'\r'('\n')?)
>    {$setType(ANTLR_USE_NAMESPACE(antlr)Token::SKIP); newline();}
>  ;
> 
> Thanks,
> 
> fan
> 
> --- In antlr-interest at y..., Bogdan Mitu <bogdan_mt at y...> wrote:
> > Hi,
> > 
> > If you use k=4, you will probably get rid of the nondeterminism, since
> > "rem " contains a space, while IDENT doesn't.
> > 
> > But are you sure "rem " is what you really want? What about "rem"
> > followed by a tab?
> > 
> > Regards,
> > Bogdan
> > 
> > 
> > --- Fan Yang <yhhf_dy at y...> wrote:
> > > Hi everybody,
> > > 
> > > I'm new to ANTLR. I want to develop a parser for a language that uses
> > > REM and ' as comment keywords. I wrote the following grammar to deal
> > > with comments. The ' kind of comment is OK, but "rem " is obviously
> > > nondeterministic with respect to IDENT, and I don't know how to remove
> > > that. Would you please help me get rid of the nondeterminism error?
> > > 
> > > Thanks a lot.
> > > 
> > > COMMENT
> > > : "rem "
> > > | '\'' (~('\n'|'\r'))* ('\n'|'\r'('\n')?)
> > > {$setType(ANTLR_USE_NAMESPACE(antlr)Token::SKIP); newline();}
> > > ; 
> > > 
> > > IDENT
> > > options {testLiterals=true;}
> > > :('a'..'z'|'_') ('a'..'z'|'_'|'0'..'9')* 
> > > ;
> > > 
> > > WS_ 
> > > :(' '
> > > | '\t'
> > > ){ _ttype = ANTLR_USE_NAMESPACE(antlr)Token::SKIP; }
> > > ;
> > > 
> > > Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/