[antlr-interest] lexical nondeterminism between IDENT & LABEL

John D. Mitchell johnm-antlr at non.net
Wed Nov 3 12:44:36 PST 2004


>>>>> "Paul" == Paul J Lucas <pauljlucas at mac.com> writes:
[...]

> 	What's wrong with using a syntactic predicate in the lexer?

Theoretically?  Nothing.

> Just because a label happens to have the same character pattern as an
> identifier doesn't mean it's conceptually the same kind of token.

Indeed.

However:

(A) Newbies (and even experienced folks :-) too often try to jam way too
much into the lexer.  This is a Very Bad Thing(tm) and, IMHO, should be
generally discouraged.

(B) A common reason given is that "the language is simple" so just do it in
the lexer.  All too often, that's not the case and semantic context is
required. (See my comment below).

(C) When people start their "simple" solutions in the lexer and things get
wacky, they all too often try to hack things to e.g. push context back into
the lexer from the parser to "fix" the problem (and that's Pure Evil(tm) :-).
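
To make the alternative concrete, here is a minimal sketch in Python rather
than ANTLR grammar syntax. It assumes a toy language in which a label is
simply an identifier followed by ':'; that definition, the token names and
both function names are hypothetical, not taken from the grammar under
discussion. The lexer only ever emits IDENT, and the parser decides with one
token of lookahead whether that IDENT is acting as a label:

```python
import re

# Hypothetical toy token set: the lexer knows nothing about labels.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<IDENT>[A-Za-z_]\w*)|(?P<COLON>:)|(?P<SEMI>;))"
)

def tokenize(text):
    """Split text into (kind, value) pairs; IDENT is the only name-like token."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

def parse_statements(tokens):
    """Classify each IDENT in the parser: IDENT COLON is a label, else a plain use."""
    out, i = [], 0
    while i < len(tokens):
        kind, value = tokens[i]
        next_is_colon = i + 1 < len(tokens) and tokens[i + 1][0] == "COLON"
        if kind == "IDENT" and next_is_colon:
            out.append(("label", value))
            i += 2  # consume IDENT and COLON together
        elif kind == "IDENT":
            out.append(("ident", value))
            i += 1
        elif kind == "SEMI":
            i += 1  # statement separator, nothing to record
        else:
            raise SyntaxError(f"unexpected {kind}")
    return out
```

With this split, parse_statements(tokenize("loop: x; y")) classifies "loop"
as a label and "x" and "y" as ordinary identifiers, and the lexer never needs
any context from the parser.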

> 	Since ANTLR has a much more powerful lexer than most, why not take
> advantage of it?

For a complex example of how to deal with this sort of confusion in the lexer,
check out the Number rule near the bottom of the StdC grammar.  That rule
deals with a purely syntactic ambiguity caused by the many uses of '.'.
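
As a rough illustration of that kind of purely syntactic decision (this is
not the StdC Number rule, just a simplified Python sketch with hypothetical
token names), a lexer can settle the meaning of '.' by looking only at the
characters that follow it:

```python
DIGITS = "0123456789"

def lex_dot_or_number(text, pos):
    """Return (kind, lexeme) for the token starting at text[pos].

    text[pos] must be '.' or a decimal digit. Only character lookahead is
    used; no information from the parser is needed.
    """
    if text[pos] != "." and text[pos] not in DIGITS:
        raise ValueError("token must start with '.' or a digit")
    if text.startswith("...", pos):
        return ("ELLIPSIS", "...")
    if text[pos] == ".":
        # A lone '.' is a member-access dot unless a digit follows it.
        if pos + 1 >= len(text) or text[pos + 1] not in DIGITS:
            return ("DOT", ".")
    # Otherwise scan a simple decimal number with at most one '.'.
    end, seen_dot = pos, False
    while end < len(text):
        ch = text[end]
        if ch == "." and not seen_dot:
            seen_dot = True
        elif ch not in DIGITS:
            break
        end += 1
    return ("NUMBER", text[pos:end])
```

Here ".5" and "3.14" come out as NUMBER, the '.' in "a.b" as DOT, and "..."
as ELLIPSIS.  A real C lexer also has to handle exponents, suffixes and hex
literals, which this sketch deliberately leaves out.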

Take care,
	John



 