[antlr-interest] greedy vs nongreedy lexer rules
Terence Parr
parrt at cs.usfca.edu
Sun Apr 18 16:02:39 PDT 2010
Hi Marcin,
First, can you do this in v3?
fragment
VerbatimString
    :   ( '[' GUTS ']' )
    |   ( '{' GUTS '}' )
    ;

fragment
GUTS
    :   BlanksOrTabs NewLine BlanksOrTabs
        (   options {greedy=false;}:
            ~( '\r' | '\n' )*
            NewLine BlanksOrTabs
        )*
    ;
Then, with lexical modes, you'd share the same mode for the inside/guts.
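
As a rough illustration of why the shared body needs a non-greedy loop, here is a small sketch that uses Python's re module as a stand-in for the lexer. It is not ANTLR and says nothing about what v3 will accept. The BLANKS and NEWLINE patterns are assumptions, since BlanksOrTabs and NewLine are not shown in this thread, and the helper names guts and verbatim are made up for the example.

```python
import re

# Assumed definitions: the thread does not show BlanksOrTabs or NewLine.
BLANKS = r"[ \t]*"
NEWLINE = r"\r?\n"


def guts(greedy: bool) -> str:
    """Shared body pattern, loosely mirroring the GUTS fragment."""
    loop = "*" if greedy else "*?"  # "*?" is the regex non-greedy loop
    line = r"[^\r\n]*" + NEWLINE + BLANKS
    return BLANKS + NEWLINE + BLANKS + "(?:" + line + ")" + loop


def verbatim(greedy: bool) -> re.Pattern:
    """Both delimiter pairs reuse the same body pattern."""
    g = guts(greedy)
    return re.compile(r"\[" + g + r"\]|\{" + g + r"\}")


if __name__ == "__main__":
    text = "[\n a\n] x [\n b\n]"
    # The non-greedy loop stops at the first closing delimiter that
    # follows a newline; the greedy loop runs on to the last one.
    print(repr(verbatim(False).match(text).group(0)))
    print(repr(verbatim(True).match(text).group(0)))
```

On an input holding two bracketed strings, the non-greedy version stops after the first one, while the greedy version swallows everything up to the final ']'. A regex engine backtracks where a lexer predicts, so treat this only as a picture of the intended behaviour.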
Ter
On Apr 18, 2010, at 3:40 PM, Marcin Rzeźnicki wrote:
> Hi,
> Well, I once posted here an example of a construct which, in my
> opinion, is hard to get right without non-greedy rules. Let me repost:
>
> fragment
> VerbatimString
>     :   (   '[' BlanksOrTabs NewLine BlanksOrTabs
>             (   options {greedy=false;}:
>                 ~( '\r' | '\n' )*
>                 NewLine BlanksOrTabs
>             )*
>             ']'
>         )
>     |   (   '{' BlanksOrTabs NewLine BlanksOrTabs
>             (   options {greedy=false;}:
>                 ~( '\r' | '\n' )*
>                 NewLine BlanksOrTabs
>             )*
>             '}'
>         )
>     ;
>
> What's going on here is that you may have two kinds of strings,
> delimited either by '[' ']' or by '{' '}', and the semantics differ
> depending on the chosen delimiter. Lexer states can be used to
> eliminate the clumsy alternative, I suppose: if you see '{' on input,
> enter the 1st mode, otherwise enter the 2nd mode. But the inner loop
> here is not solvable with lexer states unless one is willing to
> duplicate it in both modes (am I right here?).