[antlr-interest] Lexer vs Parser

Mon May 20 12:31:00 PDT 2002

Hi All,

Following on from my Unicode post, I would appreciate some advice on 
where to place some lexing/parsing decisions.

For instance, with my definitions for Unicode categories, the 
characters to be recognized include the individual characters in the 
category as well as Unicode escape sequences that resolve to a 
character in the category.

This is what I did in the lexer:

protected UNICODE_CLASS_Nl
  : ( { IsUnicodeClass_Nl(LA(1)) }? . 
    | { IsUnicodeClass_Nl(esc_char.getText()) }?
      esc_char:UNICODE_ESCAPE_SEQUENCE
    )
  ;
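
To make the lexer-side option concrete, here is a minimal Java sketch 
of what the two predicate helpers could look like. The class and 
method names (UnicodeNl, isNl, decodeEscape, isNlEscape) are 
hypothetical and are not the helpers used by the grammar above. The 
sketch also assumes that the escape has the form backslash, 'u', and 
exactly four hex digits, and it only covers characters in the Basic 
Multilingual Plane.

```java
// Hypothetical helpers; the names are illustrative only.
final class UnicodeNl {
    private UnicodeNl() {}

    // True if the UTF-16 code unit is in Unicode general category Nl
    // (letter number), e.g. the Roman numeral characters.
    static boolean isNl(char c) {
        return Character.getType(c) == Character.LETTER_NUMBER;
    }

    // Decodes an escape of the ASSUMED form: backslash, 'u', four hex digits.
    static char decodeEscape(String text) {
        if (text.length() != 6 || text.charAt(0) != '\\' || text.charAt(1) != 'u') {
            throw new IllegalArgumentException("not a four-digit escape: " + text);
        }
        int value = 0;
        for (int i = 2; i < 6; i++) {
            int digit = Character.digit(text.charAt(i), 16);
            if (digit < 0) {
                throw new IllegalArgumentException("bad hex digit in: " + text);
            }
            value = value * 16 + digit;
        }
        return (char) value;
    }

    // True if the escape text decodes to a character in category Nl.
    static boolean isNlEscape(String text) {
        return isNl(decodeEscape(text));
    }
}
```

With helpers like these, the first alternative would test the raw 
character and the second would test the decoded escape text.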

My question is: should I check in the lexer that the escape sequence 
resolves to a character in the category, or should that check be 
postponed to the parser's battery of semantic analysis?
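
For comparison, the postponed variant might look roughly like the 
Java sketch below: the lexer accepts any escape, records the decoded 
character, and a later pass rejects the ones outside the category. 
Everything here (DeferredCategoryCheck, CharToken, validate, the 
message format) is a hypothetical stand-in, not ANTLR API, and the 
category tested is Nl as in the rule above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a post-lexing validation pass.
final class DeferredCategoryCheck {
    private DeferredCategoryCheck() {}

    // Minimal stand-in for a token: the decoded character and its column.
    static final class CharToken {
        final char value;
        final int column;

        CharToken(char value, int column) {
            this.value = value;
            this.column = column;
        }
    }

    // Returns one message per token whose character is not in category Nl.
    static List<String> validate(List<CharToken> tokens) {
        List<String> errors = new ArrayList<>();
        for (CharToken t : tokens) {
            if (Character.getType(t.value) != Character.LETTER_NUMBER) {
                errors.add(String.format(
                    "column %d: U+%04X is not in category Nl",
                    t.column, (int) t.value));
            }
        }
        return errors;
    }
}
```

The trade-off this illustrates is that a late check can report every 
offending character with its position, whereas a failed predicate in 
the lexer changes which rule matches at that point.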

In general, what is a good rule of thumb for deciding what goes into 
the lexer and what goes into the parser? With Flex/Bison the decision 
is much easier, since the lexer and parser have very different 
capabilities.

What do you fine people suggest?

Cheers,

Micheal
