[antlr-interest] Lexer generated for C# more than 100 times larger than for Java

Farr, John john.farr at medtronic.com
Tue Jul 24 13:23:05 PDT 2007


For a grammar I've been working on, the lexer file generated for
"language=CSharp" is over 100 times as large as that generated for
"language=Java". The C# lexer file is 26 MB (26,019,532 bytes), whereas
the Java lexer file is 240 KB (240,282 bytes). Obviously such a huge
file taxes the C# compiler (amazingly, it does build, but ever so slowly).

The size difference seems to be, at least in part, in the way the "DFA
transition" tables are generated. For Java these tables are generated as
Strings; for C# they're generated as arrays of shorts. There may be
other contributors to the size difference as well.
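
To illustrate why the encoding could matter so much, here is a minimal
sketch of a run-length-encoded table stored in a String and expanded at
class-load time. The class and method names (PackedTableSketch, unpack)
are hypothetical, and the (count, value) pair layout is my assumption
about how a string-encoded table might work, not a copy of the generated
code or of the ANTLR runtime.

```java
import java.util.Arrays;

public class PackedTableSketch {
    // Hypothetical helper: expand (count, value) char pairs into a short[].
    // Each pair says "repeat this value count times".
    static short[] unpack(String encoded) {
        // First pass: add up the counts to size the result.
        int size = 0;
        for (int i = 0; i < encoded.length(); i += 2) {
            size += encoded.charAt(i);
        }
        // Second pass: write each value 'count' times.
        short[] table = new short[size];
        int pos = 0;
        for (int i = 0; i < encoded.length(); i += 2) {
            char count = encoded.charAt(i);
            char value = encoded.charAt(i + 1);
            for (int j = 0; j < count; j++) {
                table[pos++] = (short) value; // char 0xFFFF becomes -1
            }
        }
        return table;
    }

    public static void main(String[] args) {
        // Three times -1, two times 5, four times -1: six chars in the source.
        String packed = "\u0003\uffff\u0002\u0005\u0004\uffff";
        short[] table = unpack(packed);
        // prints [-1, -1, -1, 5, 5, -1, -1, -1, -1]
        System.out.println(Arrays.toString(table));
    }
}
```

Written out as an array-of-shorts initializer, the same table needs one
literal per entry, so long runs of identical transitions (typically -1)
cost source text in proportion to the table size rather than to the
number of runs.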

It seems peculiar that there would be such a huge difference in
generated source code size for two targets that are so similar. Is there
any possibility of reducing the size of the generated C# lexer code?

Thanks,
John


