[antlr-interest] Problems with memory consumption when generating parsers
Marcin Rzeźnicki
marcin.rzeznicki at gmail.com
Sun Dec 13 18:28:23 PST 2009
2009/12/13 Gavin Lambert <antlr at mirality.co.nz>:
> At 07:37 14/12/2009, Marcin Rzeźnicki wrote:
>>Specifically I constructed a sort of catch-all rule which I
>>called LINEOFTEXT and was like ~('\n' | '\r')*. After
>>replacing that with simple .* LINETERMINATOR my problems went
>>away.
>
> Actually, the former is better than the latter (more specific) -- you were
> just missing some parentheses:
> (~('\n' | '\r'))* LINETERMINATOR
>
Yep, sorry, I typed that from memory instead of pasting it, hence the error.
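
For reference, a minimal sketch of the parenthesized form in ANTLR 3 lexer syntax. The definition of LINETERMINATOR is an assumption made for illustration, since the real one is not shown in this thread:

```antlr
// Catch-all rule: any run of non-newline characters, then the line end.
// The negated set is wrapped in its own parentheses before '*' is applied.
LINEOFTEXT
    : (~('\n' | '\r'))* LINETERMINATOR
    ;

// Assumed definition (hypothetical): CRLF, LF, or a lone CR.
fragment LINETERMINATOR
    : '\r\n'
    | '\n'
    | '\r'
    ;
```
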
>>ANTLR wasn't sure about typeArguments because they can be
>>arbitrarily nested (like in List<List<List<String>>>) so I
>>changed that to:
>>IDENTIFIER ( ( '<' ) => typeArguments )? ( '.' IDENTIFIER ( ( '<' )
>>=>typeArguments )? )*
>>
>>because when I expect typeIdentifier '<' inevitably marks beginning
>>of type parameter list (I hope that's good reasoning)
>
> That's odd, the original shouldn't have been ambiguous. It could be
> something about how the '<' character is being lexed -- bear in mind that by
> using it as a quoted literal in a parser rule you are effectively creating a
> new (unnamed) token. It's usually easier to spot lexer ambiguity and fix it
> if you explicitly define all the lexer rules yourself and don't use any
> quoted literals in the parser.
>
>
That's interesting. I think you are right and I was wrong.
I will try it without quoted literals.
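
A rough sketch of what that could look like, assuming ANTLR 3 syntax. The token names LT, GT, DOT and COMMA, the grammar name, the rule name typeIdentifier, and the shapes of typeArguments and IDENTIFIER are all placeholders for illustration, not the definitions from the actual grammar:

```antlr
grammar TypeNames; // hypothetical grammar name

// Same shape as the rule quoted above, with named tokens
// in place of the quoted literals '<' and '.'.
typeIdentifier
    : IDENTIFIER ( (LT) => typeArguments )?
      ( DOT IDENTIFIER ( (LT) => typeArguments )? )*
    ;

// Assumed shape of the type argument list; allows arbitrary nesting.
typeArguments
    : LT typeIdentifier ( COMMA typeIdentifier )* GT
    ;

// Every token is declared explicitly, so any overlap between
// lexer rules is visible in one place.
LT    : '<' ;
GT    : '>' ;
DOT   : '.' ;
COMMA : ',' ;

IDENTIFIER
    : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '0'..'9' | '_')*
    ;
```

If the syntactic predicates turn out to be unnecessary once the tokens are named, they can presumably be dropped again.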
--
Greetings
Marcin Rzeźnicki