[antlr-interest] non-determinism.

Greg Lindholm glindholm at yahoo.com
Wed Mar 26 06:10:34 PST 2003


If each token type had a "required" distinguishing character, there would be no non-determinism, but that is not what you have written in the rules below.
Did you decide which token type an 'a' is? How about a '9'? You're not going to get very far building a lexer until you make these basic decisions.
Once you have some example cases, people on this list can help if you have trouble building a lexer that matches them.
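
To make the overlap concrete, here is a small sketch in Python rather than ANTLR. It assumes the character classes in the quoted NAME, ID, TOKEN, and NUMBER rules can be approximated by regular expressions; RULES and matching_rules are names made up for this illustration.

```python
import re

# Character classes transcribed from the quoted lexer rules. This is a
# regex approximation for illustration only, not ANTLR itself.
RULES = {
    "NAME": r"[A-Za-z][A-Za-z_.]*",
    "ID": r"[A-Za-z][A-Za-z0-9_.@]*",
    "TOKEN": r"[A-Za-z0-9_.@%;~]+",
    "NUMBER": r"[0-9]+",
}


def matching_rules(text):
    """Return the name of every rule that matches the whole input."""
    return [name for name, pattern in RULES.items()
            if re.fullmatch(pattern, text)]


for sample in ["a", "9", "a~b"]:
    print(repr(sample), "->", matching_rules(sample))
```

Any input for which more than one name comes back is a case where you have to decide which token type you want.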
mark kant <markkant2001 at yahoo.com> wrote:
There is a slight difference. Each of them also has
extra characters to distinguish it. For example, TOKEN
also has the '~' character in it. If I expected an ID but
I return TOKEN_OR_ID, then how do I know it is a valid
ID? (It may have '~' in it, which makes it an invalid ID
but a valid TOKEN.)


Mark
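
One possible answer to the TOKEN_OR_ID question above is to re-check the lexeme once the expected type is known. The sketch below is Python rather than ANTLR and only illustrates the idea: refine, expect_id, and ID_PATTERN are hypothetical names, and the pattern is a regex approximation of the ID rule quoted later in this thread.

```python
import re

# Hypothetical helper: the lexer is assumed to emit one broad TOKEN_OR_ID
# type, and a later phase decides whether the text also qualifies as an ID.
ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_.@]*")


def refine(text, expect_id):
    """Resolve a TOKEN_OR_ID lexeme once the parser knows what it expects."""
    if not expect_id:
        return "TOKEN"
    if ID_PATTERN.fullmatch(text):
        return "ID"
    # Characters such as '~' are legal in TOKEN but not in ID.
    raise ValueError(f"{text!r} is not a valid ID")
```

With this approach a lexeme containing '~' is accepted where a TOKEN is expected and rejected where an ID is expected.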

------------------------------
--- Greg Lindholm wrote:
> 
> To understand the non-determinism it might help you
> if you consider some example tokens with this lexer.
> If your lexer sees the single character 'a' what
> type of token would you like it to return? One of
> the non-determinisms in this lexer is that 'a'
> matches the NAME, ID, and TOKEN rules. Which is it?
> Note that ANTLR doesn't care what order the rules
> appear in (unlike lex). Same thing goes with the
> single character '9', it matches both TOKEN and
> NUMBER.
> So I recommend you work up some example cases and decide
> what you want your lexer to return for each case. 
> In some languages a given sequence of characters can
> mean completely different things (different token
> type) based on the context of those characters. 
> ANTLR is basically a context-free lexer (predicates
> can help sometimes). In these cases you might need
> to delay exact identification of the token type
> until you know the context (semantic analysis
> phase). For example you might have the lexer return
> a token type NAME_OR_ID then later figure out which
> it is once you know the context.
> Hope this helps,
> Greg
> 
> mark kant wrote:
> How about the following lexer?
> 
> 
> // Helper rules: protected, so they are never returned as tokens.
> protected
> ALPHA: ('a'..'z'|'A'..'Z')
> ;
> protected
> ALPHA_NUM: ('a'..'z'|'A'..'Z'|'0'..'9')
> ;
> protected
> DIGIT: '0'..'9'
> ;
> 
> 
> NAME: ALPHA ( ALPHA | '_' | '.' )*
> ;
> 
> ID: ALPHA ( ALPHA_NUM | '_' | '.' | '@' )*
> ;
> 
> TOKEN: ( ALPHA_NUM | '_' | '.' | '@' | '%' | ';' | '~' )+
> ;
> 
> NUMBER: ( DIGIT )+
> ;
> 
> 
> Thanks
> 
> Mark
> --- mzukowski at yci.com wrote:
> > remove your AT rule and then add a literal keyword
> > AT='@' to the keywords
> > section and test for it in TOKEN by turning on the
> > option testLiterals=true.
> > See the docs on literals.
> > 
> > Monty
> > 
> > -----Original Message-----
> > From: mark kant [mailto:markkant2001 at yahoo.com]
> > Sent: Tuesday, March 25, 2003 9:42 AM
> > To: antlr-interest at yahoogroups.com
> > Subject: [antlr-interest] non-determinism.
> > 
> > 
> > Hi,
> > 
> > I get non-determinism in the following lexer (relevant
> > portion of parser and lexer)
> > 
> > hosport: host COLON password
> > 
> > password: TOKEN
> > 
> > host: NAME AT TOKEN
> > 
> > 
> > lexer ...............
> > 
> > COLON: ':'
> > 
> > SEMI: ';'
> > 
> > AT: '@'
> > 
> > TOKEN: ('a'..'z' | 'A'..'Z'
> > |'0'..'9'|'.'|':'|';'|'@')+
> > 
> > 
> > What is the best way to resolve it:
> > 1. multiple lexers
> > 2. syntactic predicates - not appropriate as I have
> > other similar rules for special characters
> > 3. some kind of flag set in parser and lexer checks it
> > before matching a rule in lexer (how do I communicate
> > the flag state from parser to lexer). I have done this
> > in Lex and Yacc.
> > 
> > Thanks
> > 
> > Mark