[antlr-interest] greedy vs nongreedy lexer rules
Marcin Rzeźnicki
marcin.rzeznicki at gmail.com
Sun Apr 18 15:40:53 PDT 2010
On Sun, Apr 18, 2010 at 11:02 PM, Terence Parr <parrt at cs.usfca.edu> wrote:
>
> Can you folks give me examples that are really difficult to implement without the non-greedy operator? I'm trying to find use cases to push me one direction or the other. Assume you will have lexical states. The /* ... */ comment is an obvious one I guess that you can implement without a non-greedy loop or a semantic predicate or lexical states. Hmm...seems a shame to destroy my beautiful DFA for this one case that I can solve easily enough, cutting and pasting again for the rest of my life ;) (or importing it with grammar import statement).
>
> any thoughts are welcome.
>
>
Hi,
Well, once I posted here the example of some construct which, in my
opinion, is hard to get right without non-greedy rules. Let me repost:
fragment
VerbatimString
:
(
'[' BlanksOrTabs NewLine BlanksOrTabs
( options {greedy=false;}:
~(
'\r'
| '\n'
)*
NewLine BlanksOrTabs
)*
']'
)
|
(
'{' BlanksOrTabs NewLine BlanksOrTabs
( options {greedy=false;}:
~(
'\r'
| '\n'
)*
NewLine BlanksOrTabs
)*
'}'
)
;
What;s going on here is that you may have two kinds of strings -
either with '[' ']' as delimiters, or '{' '}' - there are different
semantics that depend on chosen delimiter. Lexer states can be used
for eliminating clumsy alternative, I suppose - if you see '{' on
input enter the 1st mode, otherwise enter the 2nd mode . But the inner
loop here is not solvable with lexer states unless one is willing to
duplicate it in both modes (am I right here?).
More information about the antlr-interest
mailing list