[antlr-interest] MismatchedSetException in lexer grammar

Kevin J. Cummings cummings at kjchome.homeip.net
Wed Apr 14 10:50:58 PDT 2010


On 04/14/2010 07:51 AM, Tyler Distad wrote:
> I have the following grammar:
> 
> fragment TAB : '\t';
> fragment PRINTABLE : '\u0020'..'\u007F' | TAB;
> 
> fragment DELIM: '|||||';
> FILE_DELIMITER : DELIM (PRINTABLE ~ '|')+ DELIM;

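The group (PRINTABLE ~'|') does not do what you want it to.  Elements
written side by side form a sequence, so it does not mean "a PRINTABLE
that is not '|'".

It matches a pair of characters: the first must be a PRINTABLE and the
second must be anything other than '|'.  Because the (...)+ loop consumes
two characters per iteration, your grammar only matches an even number of
characters between your delimiters.  With "123", the second iteration
matches '3' as PRINTABLE and then finds '|' where ~'|' is required, which
is where the MismatchedSetException comes from.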
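
One way to get "a printable character other than '|'" is to spell out
the allowed ranges in a fragment of their own and repeat that single
element.  This is an untested sketch, not a definitive fix: NON_PIPE is a
name made up for illustration, and the ranges assume you want exactly
your PRINTABLE set minus '|' (which is '\u007C').

    fragment TAB : '\t';
    fragment DELIM : '|||||';

    // Hypothetical helper: PRINTABLE with '|' ('\u007C') carved out.
    fragment NON_PIPE : '\u0020'..'\u007B' | '\u007D'..'\u007F' | TAB;

    // Each loop iteration now consumes exactly one character.
    FILE_DELIMITER : DELIM NON_PIPE+ DELIM;

With a rule shaped like this, both |||||12||||| and |||||123||||| should
be accepted, since the loop no longer consumes characters in pairs.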

> I have the following sample inputs:
> |||||12|||||
> |||||123|||||
> 
> The ANTLR interpreter accepts the first one, but the generated tree diagram
> looks like:
> FILE_DELIMITER:
>  |--DELIM
>  |--PRINTABLE
>  |--DELIM
> (Note there's only one PRINTABLE.)
> 
> For the second one, the ANTLR interpreter blows up with a
> MismatchedSetException and a diagram like:
> FILE_DELIMITER:
>   |--DELIM
>   |--PRINTABLE
>   |--PRINTABLE
>   |--MismatchedSetException
> 
> This pattern is reproducible: any input with an even number of PRINTABLE
> characters succeeds. All odd inputs fail.
> 
> Any thoughts?
> 
> Tyler Distad
> 


-- 
Kevin J. Cummings
kjchome at rcn.com
cummings at kjchome.homeip.net
cummings at kjc386.framingham.ma.us
Registered Linux User #1232 (http://counter.li.org)

