[antlr-interest] Re: literals, identifiers, tokens oh my

Fri Apr 9 14:58:04 PDT 2004

--- In antlr-interest at yahoogroups.com, ronald.petty at m... wrote:
> I have the following parser rule
> 
> type
>         :       "string"
>         ;

Do you really intend to list every literal that could possibly be a type?
(Hint: that's impossible, since Class/Type declarations in VB6 create new
types.)

> and the following lexer rules
> 
> ID
>         options {
>                 testLiterals=true;
>                 paraphrase = "an identifier";
> 
>         }
>         :       ('a'..'z') ('a'..'z'|'0'..'9'|'_'|'.')*
>         ;

> When I run my parser and give the following input (string)
> ANTLR Parser Generator   Version 2.7.3   1989-2004 jGuru.com
> ANTLR Parser Generator   Version 2.7.3   1989-2004 jGuru.com
> Note: * uses or overrides a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
>  > program; string
>  > lexer mID; c==s
>  < lexer mID; c==
> LA(1)==string
>  < program; LA(1)==string
> 
> Doesn't "string" become a literal because it is in a parser rule?

Look in your XXXTokenTypes[.txt|.java] file. You should see an entry
for LITERAL_string.

> I have 
> the start up parser rule of
> 
> start
>         :       (type (WS)+)+
>         ;
> 
> So I assume I would be able to type in string  whitespace string 
> whitespace etc...

You should.

> Could someone clear up using Tokens, Literals, and separating them from 
> Identifiers.  I have read the docs about 3 times now, and will start the 
> 4th run now.

Identifiers are tokens that all share one token type (ID) but may carry
different semantic attributes (i.e. the text of the identifier).

Literals have both a fixed token type and fixed text. The "tokens" section
of an ANTLR grammar lets you declare literals explicitly.
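
As a rough sketch of the two ways a literal can end up in the vocabulary
(ANTLR 2.7 syntax; the names VBParser and STRING_KW are made up for
illustration and are not from your grammar):

    class VBParser extends Parser;

    tokens {
        STRING_KW = "string";   // hypothetical name given to the literal "string"
    }

    type
        :   STRING_KW           // refers to the literal declared above
        |   "integer"           // undeclared literal; gets a generated LITERAL_integer name
        ;

Either way the literal is only recognized if the lexer shares the parser's
vocabulary and the ID rule has testLiterals=true, as yours does.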

Cheers,

Micheal
ANTLR/C#

PS More below...

> 
> Thanks
> Ron
> 
> ps.  Here is the parser / lex in the flesh in case I did something wrong 
> describing (which I do sometimes).  Also if anyone has other advice on how to 
> split languages up into subrules, I'd like to hear it.

Quick comments after an even quicker scan of your grammars.

1. Rule "program" needs EOF at the end.
2. You'd normally skip whitespace and newlines in the lexer (this
makes your parser MUCH less complex):
   { $setType(Token.SKIP); }
3. Study the Java and TinyBasic/TinyC sample grammars.
4. Rule "type" would normally match ID plus any literals that could name
a type, such as DIM, "string", etc. in your grammars.
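
Points 1, 2 and 4 together might look something like this. It is only a
sketch in ANTLR 2.7 syntax: the class names VBParser and VBLexer are
assumptions, the ID rule is copied from your post, and line counting
(newline()) is left out to keep it short. Token.SKIP is the spelling of
the constant in the Java runtime.

    class VBParser extends Parser;

    program
        :   (type)+ EOF             // point 1: insist on end of input
        ;

    type
        :   ID                      // point 4: user-defined type names
        |   "string"                // plus whatever built-in type keywords you need
        ;

    class VBLexer extends Lexer;

    ID
        options { testLiterals = true; }
        :   ('a'..'z') ('a'..'z' | '0'..'9' | '_' | '.')*
        ;

    WS
        :   (' ' | '\t' | '\r' | '\n')+
            { $setType(Token.SKIP); }   // point 2: the parser never sees whitespace
        ;

With whitespace skipped in the lexer, the parser rules no longer need to
mention WS at all.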

Question: Are you planning to build a recognizer for VB6 syntax only
or for the various VB6 source file formats?
