[antlr-interest] Q: How do I left-factor this?
Austin Hastings
Austin_Hastings at Yahoo.com
Tue Nov 13 03:01:05 PST 2007
I'm trying to lexically recognize a (recursive) block of
code+comments+strings as a single token. I'm building the inverse of an
island grammar -- a "hole" grammar? -- a grammar with a hole in the
middle, like a donut. I have a rule to recognize a code block, thus:

    CODE_BLOCK : NestedCodeBlock { setText(getText().substring(1,
                     getText().length() - 1)); } ;

    fragment MultiLineComment : '/*' .* '*/';
    fragment SingleLineComment : '//' ~('\r' | '\n')* '\r'? '\n';
    fragment NestedCodeBlock
        : '{'
          (options {greedy=false;}
          : MultiLineComment
          | NestedCodeBlock
          | SingleLineComment
          | QUOTED_LITERAL
          | .
          )*
          '}'
        ;

The problem is that ANTLR complains about non-LL(*) left recursion in
alternatives 1, 2, and 5, and suggests left-factoring them. I have found
that adding the option k=2 makes the problem (apparently) go away.
Can anyone tell me how I would "left factor" these alternatives together?
Is ANTLR just asking for help with building a predicate because the .*
overlaps some of the other alternatives? I had thought that having the
concrete '{' token out front would prevent any recursion issues with
NestedCodeBlock. What did I miss?
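
To pin down the intended behavior independently of ANTLR, the matching that CODE_BLOCK is meant to perform can be sketched as a small hand-written scanner in plain Java. This is only an illustration under assumptions, not ANTLR output and not a fix for the grammar: the class name BlockScanner and the methods matchBlock and blockText are hypothetical, QUOTED_LITERAL (not shown above) is assumed to be a double-quoted string with backslash escapes, and a single-line comment is simplified to "everything up to the next '\n'".

```java
public class BlockScanner {

    /**
     * Returns the index just past the '}' that closes the block opening at
     * src[start], or -1 if src[start] is not '{' or the block never closes.
     */
    static int matchBlock(String src, int start) {
        if (start >= src.length() || src.charAt(start) != '{') {
            return -1;
        }
        int i = start + 1;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (src.startsWith("/*", i)) {
                // MultiLineComment: skip to the first closing "*/"
                int end = src.indexOf("*/", i + 2);
                if (end < 0) return -1;
                i = end + 2;
            } else if (src.startsWith("//", i)) {
                // SingleLineComment (simplified): skip through the next '\n'
                int end = src.indexOf('\n', i + 2);
                if (end < 0) return -1;
                i = end + 1;
            } else if (c == '"') {
                // Assumed shape of QUOTED_LITERAL: "..." with backslash escapes
                i++;
                while (i < src.length() && src.charAt(i) != '"') {
                    if (src.charAt(i) == '\\') i++; // skip the escaped character
                    i++;
                }
                if (i >= src.length()) return -1;
                i++; // consume the closing quote
            } else if (c == '{') {
                // NestedCodeBlock: recurse, then continue after the inner '}'
                i = matchBlock(src, i);
                if (i < 0) return -1;
            } else if (c == '}') {
                return i + 1; // end of this block
            } else {
                i++; // any other character
            }
        }
        return -1; // ran out of input before the closing '}'
    }

    /** Mirrors the setText action on CODE_BLOCK: strips the outer braces. */
    static String blockText(String src, int start) {
        int end = matchBlock(src, start);
        return end < 0 ? null : src.substring(start + 1, end - 1);
    }
}
```

Note that the scanner decides between alternatives by looking at one or two characters ('/' followed by '*' or '/', a quote, or a brace), so braces inside comments and strings do not end the block.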
=Austin