[antlr-interest] How can I write lexer with multiple codepage support?

alekseyandreev andreev at quorum.ru
Mon Sep 16 03:22:57 PDT 2002


How can I write a lexer that supports multiple code pages?

In my grammar, identifiers are not limited to characters from the
0x00-0x7f range.

They also use part of the 0x80-0xff range, but which bytes in that
range count as letters depends on the active code page.

The grammar is not case-sensitive, so the toUpper method must work
correctly for the active code page.

What is the best way to add "active code page" support
to the ANTLR runtime and the grammar file?

What is the best way to write lexer rules for identifiers
whose vocabulary depends on an isLetter() method?
Should I use guard predicates like the ones below, or something else?
  ID : LETTER (LETTER|DIGIT)* ;

  LETTER : (.) => { isLetter($getText) }? . ;
  DIGIT : ('0'..'9');
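
To make the question concrete, here is a minimal sketch of the kind of
helper such a predicate could call. It assumes a single-byte code page
and uses Java's java.nio.charset to map bytes to Unicode; the class
name CodePageChars and its methods are hypothetical and are not part of
the ANTLR runtime.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Hypothetical helper (not an ANTLR class): classifies and upper-cases
// raw bytes according to one active single-byte code page.
final class CodePageChars {
    private final Charset charset;

    CodePageChars(String codePageName) {
        // Throws an unchecked exception if the JVM does not know the name.
        this.charset = Charset.forName(codePageName);
    }

    // Decode one raw byte (0x00-0xff) to a Unicode char under the code page.
    // Bytes the code page does not define decode to U+FFFD.
    char decode(int rawByte) {
        byte[] one = { (byte) rawByte };
        return charset.decode(ByteBuffer.wrap(one)).charAt(0);
    }

    // True if the byte is a letter in the active code page.
    boolean isLetter(int rawByte) {
        return Character.isLetter(decode(rawByte));
    }

    // Upper-case a byte within the code page. If the upper-case character
    // cannot be encoded in this code page, the byte is returned unchanged.
    int toUpper(int rawByte) {
        char upper = Character.toUpperCase(decode(rawByte));
        if (!charset.newEncoder().canEncode(upper)) {
            return rawByte;
        }
        ByteBuffer out = charset.encode(String.valueOf(upper));
        return out.remaining() == 1 ? (out.get(0) & 0xff) : rawByte;
    }
}
```

With something like this, the lexer would hold one instance per input
stream and the identifier rules would ask it about each character,
instead of hard-coding the 0x80-0xff letters in the grammar.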

aleksey

