[antlr-interest] Strangeness when parsing strings and spaces

Bart Kiers bkiers at gmail.com
Mon Jan 17 23:21:22 PST 2011


On Tue, Jan 18, 2011 at 7:41 AM, Kevin Jackson <foamdino at gmail.com> wrote:

> Hi,
>
> I know that this is a problem with my lexer and I'm doing something
> stupid, but I have a problem with simple k,v pairs of the format:
>
> [String "quoted string with spaces and non-alpha chars"]
>
> My grammar
> ------------------
>
> LEFT_SQUARE: '[';
> RIGHT_SQUARE: ']';
> STRING: 'a'..'z'|'A'..'Z';
> TEXT: ('a'..'z'|'A'..'Z'|' '|',')+;
>
>
Your STRING and TEXT rules have too much in common: both match letters, so
the lexer has no reliable way to tell the key from the text. It is better to
let TEXT include the double quotes as well. You could also simply skip the
spaces outside your quoted text. Finally, your STRING rule only matches a
single character, which is probably a mistake.

Try something like:

pair
  :  LEFT_SQUARE IDENTIFIER QUOTED_TEXT RIGHT_SQUARE
  ;

LEFT_SQUARE  : '[';
RIGHT_SQUARE : ']';
IDENTIFIER   : ('a'..'z'|'A'..'Z')+;
QUOTED_TEXT  : '"' ('a'..'z' | 'A'..'Z' | ' ' | ',' | '-')+ '"';
SPACES       : (' ' | '\t')+ {skip();};
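
To see how these rules would split your sample input, here is a rough
sketch in Python that imitates the token rules with regular expressions.
It is not ANTLR and not generated code, just an approximation: the names
tokenize and parse_pair are made up for this illustration, and the
character classes are my translation of the rules above.

```python
import re

# Regex approximations of the lexer rules above (an assumption, not ANTLR output).
TOKEN_SPEC = [
    ("LEFT_SQUARE",  r"\["),
    ("RIGHT_SQUARE", r"\]"),
    ("IDENTIFIER",   r"[A-Za-z]+"),
    ("QUOTED_TEXT",  r'"[A-Za-z ,\-]+"'),
    ("SPACES",       r"[ \t]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (token_name, lexeme) pairs, dropping SPACES like skip() does."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SPACES":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def parse_pair(text):
    """Mimic the 'pair' rule: LEFT_SQUARE IDENTIFIER QUOTED_TEXT RIGHT_SQUARE."""
    tokens = tokenize(text)
    kinds = [kind for kind, _ in tokens]
    if kinds != ["LEFT_SQUARE", "IDENTIFIER", "QUOTED_TEXT", "RIGHT_SQUARE"]:
        raise ValueError(f"not a pair, got token sequence {kinds}")
    key = tokens[1][1]
    value = tokens[2][1][1:-1]  # strip the surrounding double quotes
    return key, value


if __name__ == "__main__":
    print(parse_pair('[String "quoted string with spaces and non-alpha chars"]'))
```

Because the double quotes belong to QUOTED_TEXT, the spaces inside the
quotes stay part of the value, while the space between the key and the
quoted text is thrown away by SPACES.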


Regards,

Bart.

