[antlr-interest] difference between tokens and string literals

Mon Dec 31 07:34:32 PST 2007

I'm confused about how to deal with the difference between tokens
and string literals. In the example below I have a lexer, parser and
tree grammar all pared down to be as simple as possible. In particular,
I don't know the correct way to write the "list" rule in the tree
grammar. When I give this the input "list variables", why is the
output "failed" rather than "list variables isn't supported yet"?

--- lexer grammar ---

lexer grammar BasicLexer;

LIST: 'list';
LIST_OPTION: 'functions' | 'variables';

NEWLINE: ('\r'? '\n')+;
WHITESPACE: ' '+ { $channel = HIDDEN; };

--- parser grammar ---

parser grammar BasicParser;

options {
  output = AST;
  tokenVocab = BasicLexer;
}

list: LIST LIST_OPTION terminator -> ^(LIST LIST_OPTION);

// It seems that you cannot refer to EOF in a lexer rule,
// so I made this a parser rule.
terminator: NEWLINE | EOF;

--- tree grammar ---

tree grammar BasicTree;

options {
  ASTLabelType = CommonTree;
  tokenVocab = BasicParser;
  output = template;
}

list
  : ^(LIST 'functions')
    { System.out.println("list functions isn't supported yet"); }
  | ^(LIST 'variables')
    { System.out.println("list variables isn't supported yet"); }
  | ^(LIST LIST_OPTION) { System.out.println("failed"); }
  ;

If needed I can provide the Java code I wrote that uses the
ANTLR-generated classes, but that's probably not relevant. I'm pretty
sure the issue is in one of my grammar files.

-- 
R. Mark Volkmann
Object Computing, Inc.