[antlr-interest] Problems with memory consumption when generating parsers
Gavin Lambert
antlr at mirality.co.nz
Sun Dec 13 12:08:46 PST 2009
At 07:37 14/12/2009, Marcin Rzeźnicki wrote:
>Specifically I constructed a sort of catch-all rule which I
>called LINEOFTEXT and was like ~('\n' | '\r')*. After
>replacing that with simple .* LINETERMINATOR my problems went
>away.
Actually, the former is better than the latter (more specific) --
you were just missing some parentheses:

  (~('\n' | '\r'))* LINETERMINATOR
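
As a rough sketch, the whole thing might look like this as lexer rules. The rule names are the ones you mentioned; the body of LINETERMINATOR and the choice to make it a fragment are assumptions on my part, since I haven't seen your actual definition:

```
// Everything up to the end of the line, then the terminator itself.
// The inner parentheses make the negated set the thing being repeated.
LINEOFTEXT
    : (~('\n' | '\r'))* LINETERMINATOR
    ;

// Assumed definition: an optional carriage return followed by a newline.
fragment LINETERMINATOR
    : ('\r')? '\n'
    ;
```

Because the loop can only consume characters that are not line breaks, it has to stop at the terminator, which is what makes it more specific than .* here.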
>ANTLR wasn't sure about typeArguments because they can be
>arbitrarily nested (like in List<List<List<String>>>) so I
>changed that to:
>IDENTIFIER ( ( '<' ) => typeArguments )? ( '.' IDENTIFIER ( ( '<' ) => typeArguments )? )*
>
>because when I expect typeIdentifier '<' inevitably marks beginning
>of type parameter list (I hope that's good reasoning)
That's odd, the original shouldn't have been
ambiguous. It could be something about how the
'<' character is being lexed -- bear in mind that
by using it as a quoted literal in a parser rule
you are effectively creating a new (unnamed)
token. It's usually easier to spot lexer
ambiguity and fix it if you explicitly define all
the lexer rules yourself and don't use any quoted
literals in the parser.
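
For illustration only, here is roughly what I mean. The token names LANGLE, RANGLE and DOT are made up for this example, the parser rule is my guess at what your original looked like before you added the predicates, and typeArguments is assumed to be defined elsewhere in your grammar:

```
// Parser rule: refers only to named tokens, no quoted literals.
typeIdentifier
    : IDENTIFIER typeArguments? ( DOT IDENTIFIER typeArguments? )*
    ;

// Lexer rules: every token the parser uses is declared explicitly,
// so any overlap between them is visible in one place.
LANGLE : '<' ;
RANGLE : '>' ;
DOT    : '.' ;
```

With the tokens spelled out like this, it is easier to see whether some other lexer rule can also match a '<' and end up claiming it before the parser ever sees it.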