[antlr-interest] Lexer Predicates?

Raphael Reitzig r_reitzi at cs.uni-kl.de
Wed Aug 6 03:12:20 PDT 2008


"Gavin Lambert" <antlr at mirality.co.nz> wrote (Mon Aug  4 11:21:31 2008):

> The lexer has to generate a single set of tokens, yes, but as long  
> as you don't assign too much semantic meaning at the lexer level  
> then it's usually ok :)

I think Gavin means: let your lexer break the input into pieces
(= tokens) so "small" that no conflicts occur. In other words,
abstract. Do not try to create tokens that are as "large" as
possible, each carrying rich syntactic or even semantic information.
Do not think in terms of "value" or "identifier", but of "an arbitrary
string starting with a lowercase letter". Fold alternatives that look
alike into a single token.

Put the pieces back together in your parser. With many small pieces
available there, you should be able to recover the context you need.
If I find a token QUOT before my token LOWERCASE_STARTED_STRING, it is
a value. If not, it is an identifier.
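
To make the QUOT / LOWERCASE_STARTED_STRING idea concrete, here is a
minimal sketch in plain Python rather than an ANTLR grammar. The token
names QUOT and LOWERCASE_STARTED_STRING come from the text above;
everything else (the EQUALS token, the helper names lex and classify,
the exact character classes) is a hypothetical choice made only for
illustration.

```python
import re

# Hypothetical token set: small pieces that carry no semantic meaning.
TOKEN_SPEC = [
    ("QUOT", r'"'),
    ("LOWERCASE_STARTED_STRING", r"[a-z][A-Za-z0-9_]*"),
    ("EQUALS", r"="),
    ("WS", r"\s+"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def lex(text):
    """Split text into (type, lexeme) pairs; whitespace is dropped."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def classify(tokens):
    """Parser-level step: tell values from identifiers by context."""
    result = []
    i = 0
    while i < len(tokens):
        kind, lexeme = tokens[i]
        if kind == "QUOT":
            # QUOT LOWERCASE_STARTED_STRING QUOT is read as a value.
            if (i + 2 < len(tokens)
                    and tokens[i + 1][0] == "LOWERCASE_STARTED_STRING"
                    and tokens[i + 2][0] == "QUOT"):
                result.append(("value", tokens[i + 1][1]))
                i += 3
                continue
            raise ValueError("malformed or unterminated quoted value")
        if kind == "LOWERCASE_STARTED_STRING":
            # No QUOT in front of it, so it is an identifier.
            result.append(("identifier", lexeme))
        else:
            result.append((kind, lexeme))
        i += 1
    return result
```

Calling classify(lex('name = "foo"')) treats name as an identifier and
foo as a value, although the lexer produced the same token type for
both. The lexer never has to choose between the two, so it cannot run
into a conflict over them.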

Regards

Raphael
