[antlr-interest] Re: strings and vocab?

Mon Apr 12 17:34:58 PDT 2004

What you would have to do is

SEMI
options { testLiterals = true; }
     :  ';'  ;

in the lexer.  The testLiterals flag is tested when deciding whether or not to do a table lookup, and the local option overrides the global option within the lexer rule.
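
To make the lookup decision concrete, here is a minimal Java sketch of the behavior described above. It is not the code ANTLR generates: the class name LiteralLookupSketch, the method resolveType, and the token type numbers are all made up for illustration, and the global default of false is assumed only to match the scenario in the question.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the literals-table check; not ANTLR's generated code.
final class LiteralLookupSketch {
    // Token type numbers are invented for this example.
    static final int SEMI = 4;
    static final int LITERAL_SEMI = 5; // stands in for the type of ";" used in the parser

    // Assumption: the lexer's options section says testLiterals = false.
    static final boolean GLOBAL_TEST_LITERALS = false;

    // Hash table of literals, keyed by token text.
    static final Map<String, Integer> LITERALS = new HashMap<>();
    static {
        LITERALS.put(";", LITERAL_SEMI);
    }

    // ruleOption is null when the rule carries no testLiterals option of its own;
    // otherwise the rule-level value overrides the global one.
    static int resolveType(String text, int ruleType, Boolean ruleOption) {
        boolean test = (ruleOption != null) ? ruleOption : GLOBAL_TEST_LITERALS;
        if (!test) {
            return ruleType; // no table lookup at all
        }
        Integer literalType = LITERALS.get(text);
        return (literalType != null) ? literalType : ruleType;
    }
}
```

In this sketch, resolveType(";", SEMI, null) keeps the rule's own type because the global flag is off, while resolveType(";", SEMI, Boolean.TRUE) returns the literal's type because the rule-level option wins and the text is found in the table.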

The ANTLR manual is the specification; we might get around to having a separate spec for ANTLR 3, but I would not count on that.  
ANTLR through 2.7.x is public domain; starting with 2.8, it will be under a minimally restrictive open source license.

--Loring

--- In antlr-interest at yahoogroups.com, "idontwantanidwith2000init" <idontwantanidwith2000init at y...> wrote:
> What if I have a token rule that lexes ';' in the lexer
> and the literal ";" in the parser?
> Will that work?
> Would I get LITERAL_;?
> Moreover, it seems that testLiterals can be set both in the options
> and in a lexer rule. What is it translated to?
> If I specify testLiterals=false in the options and set it to true
> again in a specific rule, then it will stay true for that rule. Is that so?
> Before I debug it, what does the specification say about it?
> Is there such a spec?
> ANTLR is not open source, right?
> 
> Tal.
> 
> --- In antlr-interest at yahoogroups.com, "lgcraymer" <lgc at m...> wrote:
> > This one has to be thought of in implementation terms.  For any
> > lexer rule in which testLiterals is true:  tokens are constructed and
> > then checked against a hash table of literals.  If the table
> > contains a corresponding literal definition, then the token type is
> > changed to match the literal; if not, it is given the default token
> > type for that rule.  Note that this is independent of the parser.  I
> > believe that the current implementation requires that all literals be
> > defined in the same file as the lexer grammar.
> > 
> > Rules for which testLiterals=false are not checked against the
> > hash table.  So if you have a rule
> > SEMI : ';' ;
> > and the literal ";" in the parser grammar, you will get strange
> > results: the literal ";" has a different token type than the SEMI
> > rule, and since table lookup does not occur, you will never see the
> > LITERAL_; value in the parser.
> > 
> > --Loring
> > 
> > 
> > --- In antlr-interest at yahoogroups.com, ronald.petty at m... wrote:
> > > Alright, I give up :(.  What is the secret to Antlr?  (jk.)  I am still
> > > having some trouble getting started with Antlr, and I believe most of my
> > > confusion comes from how strings/tokens/vocab are handled.
> > > 
> > > I was reading the java.g grammar and was wondering about this.  In the
> > > parser there is the rule
> > > 
> > > builtInType
> > >         :       "void"
> > >         |       "boolean"
> > >         |       "byte"
> > >         ..
> > >         ;
> > > 
> > > Then in the Lexer there is
> > > 
> > > IDENT
> > > options { testLiterals=true; }
> > >         : ('a'..'z'|'A'..'Z'|'_'|'$') ('a'..'z'|'A'..'Z'|'_'|'0'..'9'|'$')*
> > >         ;
> > > 
> > > NUM_INT
> > > {boolean isDecimal=false; Token t=null;}
> > >         :       '.' {_ttype=DOT;}
> > >                 (       ('0'..'9')+ (EXPONENT)? (f1:FLOAT_SUFFIX {t=f1;})?
> > >                         {
> > >                                 ......
> > > 
> > > protected 
> > > FLOAT_SUFFIX
> > >         :       'f'|'F'|'d'|'D'
> > >         ;
> > > 
> > > 
> > > When the parser asks for the next token (nextToken), the Lexer will eat
> > > the next token based on the Lexer rules.  Now if the string "void" comes
> > > in, the Lexer says, let me check whether there is a literal for this token.
> > > However, I do not see what is going on here.  The word "void" in the
> > > parser may not have been seen yet (builtInType may not have been called).
> > > I have read the vocab document, but I still don't think I understand.  I
> > > have tried using tokens {} and don't understand why that works.  Could
> > > someone explain these simple concepts?  I know I am missing something very
> > > simple here.  I can follow along with the grammars just fine, but I don't
> > > understand the real workings of these issues, especially how or where you
> > > check Identifiers vs. Keywords (I have read a dozen things, and none of
> > > them seem to explain it in a way I can follow).
> > > 
> > > Also, does protected mean that the Lexer will never call FLOAT_SUFFIX
> > > directly?  That is, when it is trying to get the next token, it will only
> > > reach FLOAT_SUFFIX through the call in NUM_INT.  Correct?  Is this to keep
> > > issues similar to the IDENT vs. Keywords one from happening?
> > > 
> > > Thanks Ron
> > > 
> > > ps.  When I get this all figured out, I will write another tutorial
> > > hopefully documenting the same issues I have, maybe help someone one day
> > > :)
> > > 
> > > 
