[antlr-interest] Re: strings and vocab?

idontwantanidwith2000init idontwantanidwith2000init at yahoo.com
Mon Apr 12 17:06:40 PDT 2004


What if I have a lexer rule that matches ';' and the parser uses the
literal ";"?
Will that work?
Would I get LITERAL_;?
Moreover, it seems that testLiterals can be set both in the lexer's
options section and in an individual lexer rule. What is it translated
to in the generated code?
If I specify testLiterals=false in the options and then set it to true
in a specific rule, does it stay true for that rule?
Before I start debugging, what does the specification say about this?
Is there such a spec?
ANTLR is not open source, right?

Tal.

--- In antlr-interest at yahoogroups.com, "lgcraymer" <lgc at m...> wrote:
> This one has to be thought of in implementation terms.  For any lexer
> rule in which testLiterals is true, tokens are constructed and then
> checked against a hash table of literals.  If the table contains a
> corresponding literal definition, then the token type is changed to
> match the literal; if not, it is given the default token type for
> that rule.  Note that this is independent of the parser.  I believe
> that the current implementation requires that all literals be defined
> in the same file as the lexer grammar.
> 
> Rules for which testLiterals=false are not checked against the hash
> table.  So if you have a rule
> SEMI : ';' ;
> and the literal ";" in the parser grammar, you will get strange
> results: the literal ";" has a different token type than the SEMI
> rule, and since table lookup does not occur, you will never see the
> LITERAL_; value in the parser.
> 
> --Loring
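
To make the lookup Loring describes concrete, here is a rough Java
sketch. It is not ANTLR's generated code: the class name, the
tokenType helper, and the token type numbers are all made up for
illustration, and LITERAL_SEMI merely stands in for whatever type the
parser literal ";" would receive.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: mimics the described behaviour, not ANTLR's real lexer.
class LiteralsSketch {
    // Hypothetical token type numbers; ANTLR assigns its own values.
    static final int IDENT = 4;
    static final int SEMI = 5;
    static final int LITERAL_void = 6;
    static final int LITERAL_SEMI = 7; // assumed type of the parser literal ";"

    // The "hash table of literals": token text -> literal token type.
    static final Map<String, Integer> literals = new HashMap<>();
    static {
        literals.put("void", LITERAL_void);
        literals.put(";", LITERAL_SEMI);
    }

    // ruleType is the default type of the lexer rule that matched the text.
    // The table is consulted only when testLiterals is true for that rule.
    static int tokenType(String text, int ruleType, boolean testLiterals) {
        if (testLiterals) {
            Integer literalType = literals.get(text);
            if (literalType != null) {
                return literalType; // text is a known literal: retype the token
            }
        }
        return ruleType; // no lookup, or no match: keep the rule's own type
    }

    public static void main(String[] args) {
        System.out.println(tokenType("void", IDENT, true)); // 6 (LITERAL_void)
        System.out.println(tokenType("foo", IDENT, true));  // 4 (IDENT)
        System.out.println(tokenType(";", SEMI, false));    // 5 (SEMI)
        System.out.println(tokenType(";", SEMI, true));     // 7 (LITERAL_SEMI)
    }
}
```

The third call is the case asked about: with testLiterals=false the
token keeps the SEMI type, so a parser waiting for the literal's type
never sees it.
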
> 
> 
> --- In antlr-interest at yahoogroups.com, ronald.petty at m... wrote:
> > Alright, I give up :(.  What is the secret to Antlr?  jk.  I am
> > still having some trouble getting started with Antlr, and I believe
> > most of my confusion comes from how strings/tokens/vocab are done.
> > 
> > I was reading the java.g grammar and was wondering: in the parser
> > there is the rule
> > 
> > builtInType
> >         :       "void"
> >         |       "boolean"
> >         |       "byte"
> >         ..
> >         ;
> > 
> > Then in the Lexer there is
> > 
> > IDENT
> > options { testLiterals=true; }
> >         : ('a'..'z'|'A'..'Z'|'_'|'$')
> >           ('a'..'z'|'A'..'Z'|'_'|'0'..'9'|'$')*
> >         ;
> > 
> > NUM_INT
> > {boolean isDecimal=false; Token t=null;}
> >         :       '.' {_ttype=DOT;}
> >                 (       ('0'..'9')+ (EXPONENT)? (f1:FLOAT_SUFFIX {t=f1;})?
> >                         {
> >                                 ......
> > 
> > protected 
> > FLOAT_SUFFIX
> >         :       'f'|'F'|'d'|'D'
> >         ;
> > 
> > 
> > When the parser says "give me the next token" (nextToken), the Lexer
> > will eat the next token based on the Lexer rules.  Now if the string
> > "void" comes in, the Lexer says, let me check if there is a literal
> > yet for this token.  However, I do not see what is going on here.
> > The word "void" in the parser may not have been seen yet (calling
> > builtInType).  I have read the vocab document, but still don't think
> > I understand.  I have tried using tokens {} and don't understand why
> > that works.  Could someone explain these simple concepts?  I know I
> > am missing something very simple here.  I can follow along the
> > grammars just fine, but I don't understand the real workings of
> > these issues, especially how or where you check Identifiers vs.
> > Keywords (I have read a dozen things, and none of them seem to
> > explain it in a way I can follow).
> > 
> > Also, does protected mean that the Lexer will never call
> > FLOAT_SUFFIX directly?  If it is trying to get the nextToken, it
> > will only try to get it from the FLOAT_SUFFIX call in NUM_INT.
> > Correct?  Is this to keep similar issues (like IDENT vs. Keywords)
> > from happening?
> > 
> > Thanks Ron
> > 
> > ps.  When I get this all figured out, I will write another tutorial,
> > hopefully documenting the same issues I have; maybe it will help
> > someone one day :)
> > 
> > 



 