[antlr-interest] Re: check tokens for whitespace?
mazypath
eitan at cs.ucla.edu
Wed Sep 29 17:05:06 PDT 2004
Thanks, I knew there had to be an easy way. Unfortunately, I think
your answer uncovered a bug in ANTLR (or maybe it is just uncovering my
ignorance).
I want function names to be mapped to tokens because I am defining my
own AST classes. So my token definition looked like:
tokens {
FUNC<AST=my.ast.here>;
}
So, according to my reading of the ANTLR docs, I should be able to do this:
tokens {
FUNC="func"<AST=my.ast.Class>;
}
The problem is that it generates parser code that looks like this:
protected void buildTokenTypeASTClassMap(){
tokenTypeToASTClassMap.put(new Integer("func"), my.ast.Class);
}
Which of course causes an exception when Java tries to make an integer
out of the string "func".
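To make the failure concrete, here is a minimal sketch in plain Java. It assumes the generated line was meant to receive a numeric token type rather than the literal text; the class name and the value "4" are stand-ins invented for illustration, not anything ANTLR produces. It uses Integer.parseInt, which applies the same decimal parsing as the Integer(String) constructor and throws NumberFormatException on non-numeric input.

```java
public class IntegerParseDemo {
    // Returns true when the text parses as a decimal int, false when
    // Integer.parseInt rejects it with a NumberFormatException.
    static boolean parsesAsInt(String text) {
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "4" stands in for a numeric token type (hypothetical value).
        System.out.println(parsesAsInt("4"));
        // "func" is the literal text that ended up in the generated code.
        System.out.println(parsesAsInt("func"));
    }
}
```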
So, before I report this as a bug:
Am I doing something wrong, or is this an ANTLR bug?
If this is a bug, to whom do I report it?
BTW, Right now the only working solution I've found is a very tedious
semantic predicate:
VAR : {FunctionTests.isFunc(LA(1), LA(2), LA(3), LA(4))}?
('a'..'z') ('a'..'z'|'0'..'9')* { $setType(FUNC) ;} |
('a'..'z') ('a'..'z'|'0'..'9')* ;
Which isn't a very good solution at all.
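For comparison, this is how I understand the literals check suggested in the reply quoted below: match the whole identifier first, then look its complete text up in a keyword table and change the token type only on an exact hit. The sketch below is plain Java, not ANTLR's generated code, and the class name, token type numbers, and method name are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class LiteralsSketch {
    // Hypothetical token type numbers, chosen only for this example.
    static final int VAR = 4;
    static final int FUNC = 5;

    // Keyword text mapped to its token type, playing the role of a
    // tokens { FUNC="func"; } entry.
    static final Map<String, Integer> LITERALS = new HashMap<>();
    static {
        LITERALS.put("func", FUNC);
    }

    // Takes the complete text matched by the identifier rule and
    // returns FUNC only on an exact keyword match, otherwise VAR.
    static int typeOf(String text) {
        Integer literalType = LITERALS.get(text);
        return literalType != null ? literalType : VAR;
    }

    public static void main(String[] args) {
        for (String text : new String[] {"helloWorld", "func", "func1"}) {
            String name = typeOf(text) == FUNC ? "FUNC" : "VAR";
            System.out.println(text + " -> " + name);
        }
    }
}
```

Because the lookup uses the full matched text, "func1" stays a VAR without any lookahead predicate.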
--- In antlr-interest at yahoogroups.com, Joan Pujol <joanpujol at g...> wrote:
> Hi,
>
> I think what you have to do is use the tokens section of the lexer for
> your reserved keywords
> (in your case func)
>
> tokens {
> FUNC="func";
>
> }
> VAR: ('a'..'z') ('a'..'z'|'0'..'9')*;
>
>
> Make sure that in VAR the testLiterals option is set to true. This is
> the default, but be sure that you haven't set it to false in the global
> options.
>
> Cheers,
>
> On Fri, 24 Sep 2004 01:29:13 -0000, mazypath <eitan at c...> wrote:
> >
> > Thanks for your quick answer. My question may not have been clear.
> >
> > I would like VAR to be any string, including those that start with
> > "plus" (or another keyword/token) followed by
> > letters or integers. So:
> > helloWorld ---> VAR
> > plus ---> FUNC
> > plus1 ---> VAR
> >
> > In your reply VAR must start with "plus". Add the original VAR
> > definition ('a'..'z') ('a'..'z'|'0'..'9' | '.')* to the rules below and
> > you get nondeterminism.
> >
> > VAR :
> > ("plus " ( 'a'..'z'|'0'..'9')) => ('a'..'z') ('a'..'z'|'0'..'9' |
> > '.')* |
> > (('a'..'z') ('a'..'z'|'0'..'9' | '.')*) |
> > ("plus ") => "plus " {$setType(FUNC); } ;
> >
> > There is now nondeterminism between blocks 2 and 3. Move the last
> > block up and "plus1" is labeled FUNC again. Even if this were to work,
> > I have a lot of keywords, and defining them WITHIN another token
> > definition seems bad.
> >
> > What would be ideal (in my mind) is if I could leave VAR as is and
> > change FUNC to be something like
> > FUNC: "plus" ~( 'a'..'z'|'0'..'9')
> > And then have that last character not be consumed (or re-inject it into
> > the stream).
> >
> > Thank you again!
> >
>
> --
> Joan Jesús Pujol Espinar