[antlr-interest] onerous lex pattern
Jeff Barnes
jbarnesweb at yahoo.com
Tue Jan 24 01:11:25 PST 2006
Hi Bryan,
Looks good! I've not done a lot of LR stuff; looks
like your bias is towards that kind of thinking. Good
job with the analysis.
Thanks!
:)
--- Bryan Ewbank <ewbank at gmail.com> wrote:
> Hi Jeff,
>
> How about if you change the way you think about this multi-line token so that
> it starts with a "|" in col 1, and continues through the first newline not
> followed by a "|" char? It requires k=2, but that shouldn't be a problem...
>
> I'm not too good with ANTLR lexer rules - I just use lex - but it would look
> something like this:
>
> MULTILINESTRING:
>     ( {inputState.guessing != 0 || getColumn() == 1}?
>       '|'!
>       ( options {greedy=true;}: ~('\r' | '\n') )*
>       ( options {greedy=true;}:
>         NL
>         '|'!
>         ( options {greedy=true;}: ~('\r' | '\n') )*
>       )*
>     )
>     ;
>
> Is the final NL of the last line starting with "|" considered part of the
> token? I'd assume "no", right?
>
> Note that there is a difference between what you described and the rule that
> you wrote:
>
> > Rose serializes strings that have a quote or a newline
> > in them by starting them at column 1 and beginning
> > each line of the string with a '|'. So my lexer rule
> > looks like this:
>
> > MULTILINESTRING:
> >     ({inputState.guessing != 0 || getColumn() == 1}?
> >     '|'!)
> >     ( options { greedy = false; }:
> >       ~('\r' | '\n')
> >     )*
> >     (NL)+
> >     ;
>
> The description requires every line in the string to have a leading "|", but
> the rule allows blank lines to be part of the token. Is this desired, rather
> than requiring a "|" between adjacent newlines?
>
> E.g.
> |this is the question - one string or two?
>
> |is this the same string?
> |description says no, rule says yes...
>
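
To make the "one string or two?" question concrete, here is a minimal
Python sketch of the token shape Bryan describes: a "|" in column 1,
the rest of that line, then any further lines that begin with "|"
immediately after the newline. It is an illustration only, not ANTLR
code. The names MULTILINE_STRING and multiline_strings are made up for
this sketch, and it assumes NL is "\n" or "\r\n" and that the final
newline is not part of the token.

```python
import re

# One token: '|' in column 1, the rest of that line, then zero or more
# continuation lines that each start with '|' right after the newline.
# The newline following the last '|' line is not consumed.
# Assumption: NL is "\n" or "\r\n"; the thread does not define NL.
MULTILINE_STRING = re.compile(
    r"^\|[^\r\n]*(?:\r?\n\|[^\r\n]*)*",
    re.MULTILINE,
)


def multiline_strings(text):
    """Return each multi-line string with the leading '|' of every line removed."""
    tokens = []
    for match in MULTILINE_STRING.finditer(text):
        lines = re.split(r"\r?\n", match.group(0))
        # Every line in the match starts with '|', so drop the first character.
        tokens.append("\n".join(line[1:] for line in lines))
    return tokens


sample = (
    "|this is the question - one string or two?\n"
    "\n"
    "|is this the same string?\n"
    "|description says no, rule says yes...\n"
)

for token in multiline_strings(sample):
    print(repr(token))
```

Under these assumptions the blank line ends the first token, so the
sample comes out as two strings, which matches the description rather
than the original rule.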
=========
Jeff Barnes
(206)245-6100
There are two rules for being a successful consultant: Rule 1 - Don't tell people everything you know.