[antlr-interest] difference between tokens and string literals

Mark Volkmann r.mark.volkmann at gmail.com
Mon Dec 31 14:40:05 PST 2007


On Dec 31, 2007 3:07 PM, Gavin Lambert <antlr at mirality.co.nz> wrote:
> At 04:34 1/01/2008, Mark Volkmann wrote:
>  >
>  >I'm confused about how to deal with the difference between
>  >tokens and string literals.
>
> Tokens are tokens, and string literals are mere sequences of
> characters.  In certain cases, string literals can be promoted
> automatically to tokens, but I usually find that more confusing
> than helpful.
>
>  >I don't know the correct way to write the "list" rule in the
>  >tree grammar. When I give this the input "list variables", why is
>  >the output "failed"?
>
> It's because the actual tree from the rewrite is ^(LIST
> LIST_OPTION), which matches option 3.  In fact options 1 and 2
> really ought to give you a compiler error since it's impossible
> for those to ever match anything, even if you did keep the actual
> text of the option around -- but ANTLR 3's grammar error checking
> is a little flaky at the moment because it's still using ANTLR 2
> to do much of the work.
>
> You could use a semantic predicate to compare the contents of the
> LIST_OPTION token (assuming you modified the parser to actually
> give it some contents), but personally I think it'd make things
> much simpler if you broke this up into multiple tokens anyway:
>
> lexer grammar BasicLexer;
>
> tokens {
>    LIST = 'list';
>    FUNCTIONS = 'functions';
>    VARIABLES = 'variables';
> }

I found out that you can't assign literal values to tokens in a
lexer-only grammar; ANTLR reports an error when you try. As far as I
can tell, there's no point in using a tokens specification (section
4.4 in the book) in a lexer-only grammar. I fixed this by defining the
tokens as ordinary lexer rules in my lexer-only grammar instead.

LIST: 'list';
FUNCTIONS: 'functions';
VARIABLES: 'variables';

> NEWLINE: ('\r'? '\n')+;
> WHITESPACE: ' '+ { $channel = HIDDEN; };
>
> parser grammar BasicParser;
> options {
>    output = AST;
>    tokenVocab = BasicLexer;
> }
>
> list: LIST list_option terminator -> ^(LIST[$LIST] $list_option);

Instead of the line above, I was able to use the following.

list: LIST list_option terminator -> ^(LIST list_option);

> list_option: FUNCTIONS | VARIABLES;
> terminator: NEWLINE | EOF;
>
> tree grammar BasicTree;
> options {
>    ASTLabelType = CommonTree;
>    tokenVocab = BasicLexer;
>    output = template;
> }
>
> list
>    : ^(LIST FUNCTIONS)
>      { System.out.println("list functions isn't supported yet"); }
>    | ^(LIST VARIABLES)
>      { System.out.println("list variables isn't supported yet"); }
>    | ^(LIST .) { System.out.println("failed"); }
>    ;
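
Putting the pieces above together, the lexer-only grammar would look
roughly like the sketch below. It assumes ANTLR 3 syntax and simply
combines the three keyword rules with the quoted NEWLINE and WHITESPACE
rules; the comments are added for illustration, and a real grammar
would likely need further rules.

```
lexer grammar BasicLexer;

// Keywords written as ordinary lexer rules, replacing the
// tokens { ... } section with literal assignments.
LIST: 'list';
FUNCTIONS: 'functions';
VARIABLES: 'variables';

// One or more line endings, with an optional carriage return.
NEWLINE: ('\r'? '\n')+;

// Spaces go to the hidden channel so the parser never sees them.
WHITESPACE: ' '+ { $channel = HIDDEN; };
```

Because the parser and tree grammars both set tokenVocab = BasicLexer,
they can refer to LIST, FUNCTIONS, and VARIABLES by name without
repeating the literals.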

-- 
R. Mark Volkmann
Object Computing, Inc.

