[antlr-interest] Simple lexer grammar doesn't match '''
Mauro Pellicioli
nightwolf at email.it
Wed Aug 29 05:07:44 PDT 2007
Great! It works. I only modified the STRING rule, adding '\u2019'.
Thanks a lot,
Regards
--------- Original Message --------
From: Gavin Lambert <antlr at mirality.co.nz>
To: antlr-interest at antlr.org <antlr-interest at antlr.org>
Subject: Re: [antlr-interest] Simple lexer grammar doesn't match '''
Date: 29/08/07 13:43
>
>
>
> At 22:40 29/08/2007, Mauro Pellicioli wrote:
> >fragment
> >LINK:'<a href="' STRING_LINK {System.out.println("Link: "+$STRING_LINK.text);}
> >'">' STRING {System.out.println("Hotel: "+$STRING.text);} '</a>';
> [...]
> >fragment
> >STRING: ( ('\u0020'..'\u003B') | '\u003D' | ('\u003F'..'\u007E')
> >|('\u0080'..'\u017F') )+;
> [...]
> ><a
> >href="/hotel/us/enfant-plaza.html?sid=b02d5b4438247c402f4a43539dfc9
> >d8c">L’Enfant
> >Plaza Hotel</a>
> >
> >Output:
> >
> >Link:
> >/hotel/us/enfant-plaza.html?label=short-index.htmlerrorc_search_in_
> >invalid%3Dsi;sid=1892815e8db2e96caca618e2377948d8
> >Hotel: L
> >
> >Instead of:
> >
> >Link:/hotel/us/enfant-plaza.html?sid=b02d5b4438247c402f4a43539dfc9d
> >8c
> >Hotel:L'Enfant Plaza Hotel
> >Address:480 L'Enfant Plaza, SW, Washington (Washington DC)
> >
> >
> >It seems that the STRING rule fails when it encounters a ' char (hex
> >value 0x27), but STRING has the correct range of chars.
>
> Are you certain that it's a ' character (0x27) and not a ’
> character (0x2019)? Because it definitely looks like the latter
> one in your email message....
>
> (Given 0x2019 is also 0x92 in the standard 1252 codepage, it's not
> that uncommon to see it in the wild. You should probably be
> accepting it too.)
>
>
>
>
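For anyone hitting the same problem, a quick way to tell the two apostrophes apart is to dump the code points of the text that is actually fed to the lexer. The following is only a minimal Java sketch, not part of the grammar above: the class name ApostropheCheck and both helper methods are made up for illustration, and it assumes the JDK ships the windows-1252 charset (it is not one of the charsets the Java specification guarantees, though common JDKs include it).

```java
import java.nio.charset.Charset;

public class ApostropheCheck {

    // Hypothetical helper: renders every code point of the input as U+XXXX,
    // so that look-alike quote characters can be told apart.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    // Hypothetical helper: decodes one byte as windows-1252 and returns the
    // resulting Unicode code point. Assumes the charset is available.
    static int decodeCp1252(int b) {
        Charset cp1252 = Charset.forName("windows-1252");
        return new String(new byte[] {(byte) b}, cp1252).codePointAt(0);
    }

    public static void main(String[] args) {
        // Typographic apostrophe, as suspected in the page being parsed.
        System.out.println(codePoints("L\u2019Enfant"));
        // Plain ASCII apostrophe, which STRING already accepts.
        System.out.println(codePoints("L'Enfant"));
        // The byte 0x92 in a windows-1252 page decodes to the typographic one.
        System.out.printf("0x92 in windows-1252 -> U+%04X%n", decodeCp1252(0x92));
    }
}
```

If the second code point printed for the real input is U+2019 rather than U+0027, the character falls outside the ranges listed in STRING, which would explain why the match stopped after "L".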