[antlr-interest] Really big generated C lexer?

Jim Idle jimi at temporal-wave.com
Thu Apr 21 11:36:20 PDT 2011


You are trying to do too much in the lexer, which is why you get so many
tables. Left-factor the rules and don't try to validate things in the
lexer. For instance, you only need a very generic rule for matching a GUID;
then verify it semantically.
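
As a rough sketch of that semantic check (not code from the grammar in
question): suppose the lexer accepts any loose run of hex digits and dashes
as a GUID token, and a later pass validates the token text. The helper names
below are hypothetical and are not part of the ANTLR C runtime, and the
8-4-4-4-12 hexadecimal layout is an assumption; the format your grammar
expects may differ.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an ANTLR runtime function).
 * Returns 1 if text has the assumed 8-4-4-4-12 hex layout, 0 otherwise. */
static int guid_is_well_formed(const char *text)
{
    /* Zero-based positions where a dash must appear (assumed layout). */
    static const int dash_at[] = { 8, 13, 18, 23 };
    size_t i;
    size_t d = 0;

    if (strlen(text) != 36)
        return 0;

    for (i = 0; i < 36; i++) {
        if (d < 4 && (int)i == dash_at[d]) {
            if (text[i] != '-')
                return 0;
            d++;
        } else if (!isxdigit((unsigned char)text[i])) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical reporting helper: returns 0 and clears buf when the GUID is
 * acceptable; otherwise returns -1 and writes a positioned message to buf.
 * line and offset are assumed to come from the token being checked. */
static int check_guid(const char *text, int line, int offset,
                      char *buf, size_t buflen)
{
    if (guid_is_well_formed(text)) {
        if (buflen > 0)
            buf[0] = '\0';
        return 0;
    }
    snprintf(buf, buflen,
             "Line %d, offset %d: \"%s\" is not a well-formed GUID",
             line, offset, text);
    return -1;
}
```

The point of the split is that the lexer stays small because it no longer
encodes the exact GUID shape, and the check runs where the token's line and
offset are available for a useful message.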

Search antlr.markmail.org for advice on pushing error messages as far
down the tool chain as you can, and for why these kinds of things happen.
Basically, though, if you validate in the lexer you will get:

Unexpected character 'x'

versus, if you validate semantically:

Line 12, offset 33: "A GUID should be of the form XXXX-XXXXXXX- ...."

That said, the C compiler will do a good job of dealing with the code you
generated.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Chris McConnell
> Sent: Thursday, April 21, 2011 11:23 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Really big generated C lexer?
>
> The attached grammar generates a C lexer file of 150,000 lines.  Is
> this typical or did I do something dumb in the grammar?  I'd attach the
> C lexer file, but it is 10mb...

