[antlr-interest] Re: check tokens for whitespace?
mazypath
eitan at cs.ucla.edu
Thu Sep 23 18:29:13 PDT 2004
Thanks for your quick answer. My question may not have been clear.
I would like VAR to be any string, including those that start with
"plus" (or another keyword/token) followed by letters or integers. So:
helloWorld ---> VAR
plus ---> FUNC
plus1 ---> VAR
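For illustration only: the mapping above is what a "match the whole word first, then look it up in a keyword table" scheme would produce. The following is a minimal Python sketch of that idea, not ANTLR code. The names KEYWORDS, IDENT and classify are made up for this example, and the sample words are all lowercase because the VAR rule only accepts 'a'..'z'.

```python
import re

# Hypothetical keyword table: exact spellings that should become FUNC.
KEYWORDS = {"plus": "FUNC"}

# Same shape as the VAR rule: a lowercase letter, then letters or digits.
IDENT = re.compile(r"[a-z][a-z0-9]*")


def classify(word):
    """Return the token type for a whole word, or None if it is not an identifier."""
    if IDENT.fullmatch(word) is None:
        return None
    # Only an exact match against the table counts as a keyword;
    # anything longer (such as "plus1") stays an ordinary VAR.
    return KEYWORDS.get(word, "VAR")


for w in ("hello", "plus", "plus1"):
    print(w, "--->", classify(w))
```

With this arrangement the keywords live in one table rather than inside the VAR rule.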
In your reply VAR must start with "plus". Add the original VAR
definition ('a'..'z') ('a'..'z'|'0'..'9' | '.')* to the rules below and
you get nondeterminism.
VAR :
    ("plus " ('a'..'z'|'0'..'9')) => ('a'..'z') ('a'..'z'|'0'..'9'|'.')*
  | (('a'..'z') ('a'..'z'|'0'..'9'|'.')*)
  | ("plus ") => "plus " {$setType(FUNC); } ;
There is now nondeterminism between blocks 2 and 3. Move the last
block up and "plus1" is labeled FUNC again. Even if this were to work,
I have a lot of keywords, and defining them WITHIN another token
definition seems bad.
What would be ideal (in my mind) is if I could leave VAR as is and
change FUNC to be something like
FUNC: "plus" ~( 'a'..'z'|'0'..'9')
and then have that last character not be consumed (or have it
re-injected into the stream).
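The "check the next character but do not consume it" idea is what regular-expression engines call a negative lookahead. A small Python sketch of the behaviour I am after (again not ANTLR; TOKEN and tokenize are hypothetical names for this example) might look like this:

```python
import re

# FUNC is "plus" only when the next character is NOT a letter or digit.
# The (?!...) lookahead inspects that character without consuming it,
# so it is still available to the next token.
TOKEN = re.compile(r"""
    (?P<FUNC>plus(?![a-z0-9]))
  | (?P<VAR>[a-z][a-z0-9.]*)
  | (?P<WS>[ \t\f]+)
""", re.VERBOSE)


def tokenize(text):
    """Return (type, text) pairs for the input, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Here tokenize("plus plus1") yields a FUNC for "plus" and a single VAR for "plus1", with VAR left unchanged.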
Thank you again!
--- In antlr-interest at yahoogroups.com, "kozchris" <csnyder at a...>
wrote:
> Something like this is one way.
>
> class LTest extends Lexer;
>
> tokens {
> FUNC;
> }
>
> VAR : ("plus" ('a'..'z'|'0'..'9')) => ('a'..'z') ('a'..'z'|'0'..'9')*
> | ("plus")=> "plus" {$setType(FUNC);};
>
> WS : ( ' '| '\t' | '\f') { $setType(Token.SKIP); } ;
>
> Chris
>
> --- In antlr-interest at yahoogroups.com, "mazypath" <eitan at c...>
wrote:
> > Sorry if this is a newbie question but I can't seem to find an
> > answer in the docs or online.
> >
> > Is there any way to define a token as a string and to only have
> > that string recognized as a token if it is not followed by
> > whitespace?
> >
> > For example if I define the Lexer as follows:
> > class L extends Lexer;
> >
> > FUNC : "plus";
> > WS : ( ' '| '\t' | '\f') { $setType(Token.SKIP); } ;
> > VAR : ('a'..'z') ('a'..'z'|'0'..'9')*;
> >
> > Can I get the Lexer to parse the string "plus1" as a VAR token
> > and not a FUNC token followed by "1"?
> >
> > Thanks in advance!