[antlr-interest] Simple lexer grammar doesn't match '''

Mauro Pellicioli nightwolf at email.it
Wed Aug 29 05:07:44 PDT 2007


Great! It works. I only modified the STRING rule, adding '\u2019'.
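
For anyone who hits the same problem, here is a minimal Java sketch of the check involved. It is not part of the grammar: the class and method names (QuoteCheck, inOriginalStringRange, inFixedStringRange, decodeCp1252) are made up for illustration, and the ranges are copied by hand from the quoted STRING rule below. It also assumes the JVM ships the windows-1252 charset, which is common but not guaranteed by the Java specification.

```java
import java.nio.charset.Charset;

class QuoteCheck {
    // Ranges copied from the quoted STRING rule, before the fix.
    static boolean inOriginalStringRange(int cp) {
        return (cp >= 0x0020 && cp <= 0x003B)
            || cp == 0x003D
            || (cp >= 0x003F && cp <= 0x007E)
            || (cp >= 0x0080 && cp <= 0x017F);
    }

    // Same ranges plus U+2019, mirroring the change described above.
    static boolean inFixedStringRange(int cp) {
        return inOriginalStringRange(cp) || cp == 0x2019;
    }

    // Decode a single byte as windows-1252 and return its Unicode code point.
    // Assumption: the "windows-1252" charset is available in this JVM.
    static int decodeCp1252(int b) {
        String s = new String(new byte[] {(byte) b}, Charset.forName("windows-1252"));
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        // Hotel name with a typographic apostrophe (U+2019), not U+0027.
        String name = "L\u2019Enfant Plaza Hotel";
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (!inOriginalStringRange(c)) {
                System.out.printf("U+%04X at index %d is outside the original STRING ranges%n",
                        (int) c, i);
            }
        }
        System.out.printf("0x27 accepted by original rule: %b%n", inOriginalStringRange(0x27));
        System.out.printf("0x2019 accepted after the fix: %b%n", inFixedStringRange(0x2019));
        System.out.printf("cp1252 byte 0x92 decodes to U+%04X%n", decodeCp1252(0x92));
    }
}
```

Printing the code points of the failing input this way is a quick way to tell a plain apostrophe from a typographic one before changing the lexer ranges.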

Thanks a lot,
Regards	

--------- Original Message --------
	From: Gavin Lambert <antlr at mirality.co.nz>
	To: antlr-interest at antlr.org <antlr-interest at antlr.org>
	Subject: Re: [antlr-interest] Simple lexer grammar doesn't match '''
	Date: 29/08/07 13:43
	
	> 
> 
> 
> At 22:40 29/08/2007, Mauro Pellicioli wrote:
>  >fragment
>  >LINK:'<a href="' STRING_LINK {System.out.println("Link: "+$STRING_LINK.text);}
>  >'">' STRING {System.out.println("Hotel: "+$STRING.text);} '</a>';
> [...]
>  >fragment
>  >STRING: ( ('\u0020'..'\u003B') | '\u003D' | ('\u003F'..'\u007E')
>  >|('\u0080'..'\u017F') )+;
> [...]
>  ><a
>  >href="/hotel/us/enfant-plaza.html?sid=b02d5b4438247c402f4a43539dfc9
>  >d8c">L'Enfant
>  >Plaza Hotel</a>
>  >
>  >Output:
>  >
>  >Link:
>  >/hotel/us/enfant-plaza.html?label=short-index.htmlerrorc_search_in_
>  >invalid%3Dsi;sid=1892815e8db2e96caca618e2377948d8
>  >Hotel: L
>  >
>  >Instead of:
>  >
>  >Link:/hotel/us/enfant-plaza.html?sid=b02d5b4438247c402f4a43539dfc9d
>  >8c
>  >Hotel:L'Enfant Plaza Hotel
>  >Address:480 L'Enfant Plaza, SW, Washington (Washington DC)
>  >
>  >
>  >It seems that STRING rule fails when it encounters a ' char (hex
>  >value 0x27), but STRING has the correct range of chars.
> 
> Are you certain that it's a ' character (0x27) and not a ’ 
> character (0x2019)?  Because it definitely looks like the latter 
> one in your email message....
> 
> (Given 0x2019 is also 0x92 in the standard 1252 codepage, it's not 
> that uncommon to see it in the wild.  You should probably be 
> accepting it too.)
> 
> 
> 
>  