[antlr-interest] Re: Parsing keyword vs symbol

Bryan Ewbank ewbank at gmail.com
Thu Apr 7 04:31:05 PDT 2005


I think the problem is that we were miscommunicating, and I missed it.
Sorry about that :-(

There is only one /token/ (EQ), and the lexer uses that /token/ to
tell the parser that it saw one of two strings: the string "=" or the
string "eq".

Here's a sketch of the lexer:

  class MyLex extends Lexer;
  options { testLiterals=false; }   // don't need to test everywhere
  tokens { EQ="eq"; }
  EQ: "=";
  IDENT options { testLiterals=true; }: ...; // test for literals here

What happens:

  if lexer sees "=", return EQ
  else if lexer sees whatever is an IDENT, then
    if the ident is "eq" return EQ
    else return IDENT
    endif
  endif
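
For anyone who wants to poke at that decision outside of ANTLR, here is a
rough Python sketch of the same logic. It is not what ANTLR generates; the
names classify and LITERALS are made up for illustration, and
str.isidentifier() is only a stand-in for the IDENT rule, whose body is
elided above.

```python
# Hypothetical token type names, standing in for ANTLR's integer token types.
EQ, IDENT = "EQ", "IDENT"

# Plays the role of the literals table built from tokens { EQ="eq"; }.
LITERALS = {"eq": EQ}


def classify(lexeme):
    """Return the (type, text) pair for one already-matched lexeme."""
    if lexeme == "=":
        # The EQ lexer rule matched the symbol directly.
        return (EQ, lexeme)
    if lexeme.isidentifier():
        # Assumed stand-in for the IDENT rule. The literals test happens
        # only here, mirroring testLiterals=true on that one rule.
        return (LITERALS.get(lexeme, IDENT), lexeme)
    raise ValueError("no lexer rule matches %r" % lexeme)
```

With this sketch, classify("=") and classify("eq") both come back with type
EQ, while classify("foo") comes back as IDENT.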

The parser need only deal with one /token/, but the generated nodes
(AST nodes) in the parser will contain the pair <EQ,"=">, or the pair
<EQ,"eq">.
  getType -> EQ
  getText -> "=" or "eq"

Hope this helps,
- B

On Apr 6, 2005 4:09 PM, Peter Kronenberg <PKronenberg at technicacorp.com> wrote:
> Bryan,
>    thanks for your response.  I'm trying to understand the best way to
> use the tool.  It's not always straightforward.
> 
> If I understand you correctly, I need two different tokens: one, defined
> in the lexer, to represent the symbol (e.g., EQ1: '=') and another,
> defined in the parser, to represent the keyword (tokens: EQ2="EQ").  And
> then, in the parser, I would need to test for all possibilities, e.g.
>   relationalExpr: term (EQ1 | EQ2 | LT1 | LT2 | GT1 | GT2 ...) term
> 
> Is this correct?

