[antlr-interest] Two more lexer bugs in antlr-03-16-2007.10
Terence Parr
parrt at cs.usfca.edu
Sun Mar 18 14:00:55 PDT 2007
Howdy. I think I made it work. Two false starts wasted a day; the
fix took about 30 minutes once I reverted. ;) Try the new build:
antlr-03-18-2007.14.tar.gz
Ter
On Mar 16, 2007, at 7:30 PM, Gavin Lambert wrote:
> At 13:49 17/03/2007, you wrote:
> >OK, I figured out how to refactor/clean up, but it will take some
> >work. ;) Might get it done tomorrow.
>
> Ok, now I'm a little more puzzled. I thought it was the reference
> to WS that it was objecting to, especially given your earlier
> comment about the order of input. But the following grammar fails
> in the same way:
>
> lexer grammar Test;
>
> NormalChar
> : ~('"' | '\\' | '\r' | '\n' | ' ' | '\t')
> ;
>
> QSTRING
> : '"' (NormalChar | ' ' | '\t')* '"'
> ;
>
> ... even if I make NormalChar a fragment.
>
> ....
>
> Ok, a little more fiddling around reveals that it's the (NormalChar
> | anything) bit that it's really objecting to. If I change it to
> just NormalChar* then it compiles.
>
> I tried declaring a fragment rule in between the two (shown below),
> but it wouldn't compile that either.
>
> fragment ExtendedChar: NormalChar | ' ' | '\t';
> QSTRING: '"' ExtendedChar* '"';
>
>
> Anyway, if you're reworking sets, one idea that's crossed my mind
> is that it'd be nice (read: completely optional, ignore me if it's
> too much work) to be able to exclude characters from an existing
> set rule as well. So you could for example take a WS rule
> containing the twenty different characters that are considered
> whitespace normally, and in one particular lexer rule say you want
> anything that's whitespace unless it's this one character you don't
> want. Or take the NormalChar rule above and exclude an additional
> character (say, single quote) when referring to it from another
> rule. (Another example might be for handling things like octal
> digits, where you want a digit but only in a smaller range than
> normal.) This is not really a big deal since you can fairly easily
> (once set combining works, anyway) factor the existing rule out to
> a smaller set (or larger exclusion set), but it could come in handy
> sometimes. I have no idea what a reasonable syntax for that would
> be though (except maybe something like 'set1 & ~set2', which is a
> bit bizarre), so maybe it's not worth worrying about.
>
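
To make the set algebra in the quoted message concrete, here is a minimal
sketch in Python. It is not ANTLR syntax and says nothing about how ANTLR
represents sets internally. It models lexer character sets as plain Python
sets over a 7-bit ASCII universe (an assumption; a real lexer alphabet is
larger), and every name in it (UNIVERSE, complement, NORMAL_CHAR, and so on)
is hypothetical. It shows that the proposed 'set1 & ~set2' form gives the same
set as hand-factoring into a larger exclusion set, and that the
(NormalChar | ' ' | '\t') union is just a smaller exclusion set.

```python
import string

# Hypothetical model: the lexer alphabet is assumed to be 7-bit ASCII.
UNIVERSE = {chr(c) for c in range(128)}


def complement(chars):
    """Model of ~( ... ): every character in UNIVERSE that is not in chars."""
    return UNIVERSE - set(chars)


# NormalChar from the quoted grammar:
#   ~('"' | '\\' | '\r' | '\n' | ' ' | '\t')
NORMAL_CHAR = complement('"\\\r\n \t')

# The proposed 'set1 & ~set2': NormalChar with the single quote excluded too.
NORMAL_NO_SQUOTE = NORMAL_CHAR & complement("'")

# The same set built by hand-factoring into a larger exclusion set.
FACTORED = complement('"\\\r\n \t\'')

# Octal digits as "a digit, but only in a smaller range than normal".
DIGIT = set(string.digits)
OCTAL_DIGIT = DIGIT & complement("89")

# The union QSTRING asks for: NormalChar | ' ' | '\t'.
EXTENDED_CHAR = NORMAL_CHAR | {" ", "\t"}

# Adding space and tab back is the same as never excluding them.
INLINED = complement('"\\\r\n')
```

In this model NORMAL_NO_SQUOTE equals FACTORED and EXTENDED_CHAR equals
INLINED, which is the sense in which the exclusion syntax would be a
convenience rather than a new capability.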