[antlr-interest] ANTLR3ea+ANTLWorks is *really cool*, but how do I insert PythonTokenStream.java

Elden Crom eldencrom at comcast.net
Sat Jul 23 11:59:43 PDT 2005


Same here for now (putting a class between the lexer and parser works 
fine), but with respect to your (Terence's) quote
"Why program by hand in five days what you can spend five years of your 
life automating."
are we out of time yet?  :-)
Perhaps it could be added to ANTLR 3's grammar somehow....
======================================
======================================
<example that covers most of my cases... (though it wouldn't pass coding 
standards!) (make sure it's in a non-proportional font to read..).>
  ...
  for x (0,2,12 //no DENT
 34,43):
    y +=x*2;if(x>12):p+=y; i++;  //INDENT before 'p' but not before 'i'
                     p/=z;
      /*this too?*/  p/=2;  //all of the p assignments are in the if(...) block
                             //comments don't gen dent errors or cause indentation to be acquired
                       // the  /*..*/ comment has no effect but the spacing is preserved
    q*=p;
  while(next()):; //gen INDENT ";" and DEDENT
  func(SomeObnoxiouslyLongIdentifier,   //no DENT here
       AnotherObnoxiouslyLongIdentifier);
   func2();  //error -- alignment wrong
   if(b):
   c=1;   //error -- indent of 'c' must be greater than 'if'

======================================
======================================

The first question is: does it belong in the parser or the lexer? (It 
appears to usually sit in between, as a matter of necessity for now.)
I know that for the grammar I'm designing, I only want to generate an 
error if the beginning of a statement is not aligned with the previous 
statement and is not preceded by a ":".  Same for Python.  My tendency 
is to think it (control) belongs in the parser.
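
To make that rule concrete, here is a minimal Java sketch of the alignment check. It is only an illustration under assumptions: `IndentTracker`, `Result`, and `check` are made-up names (nothing here is ANTLR API), and the caller is assumed to already know the starting column of each statement and whether it directly follows a ":". Statements that share a line after ";" are not handled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not part of ANTLR: tracks the columns of open blocks.
class IndentTracker {
    enum Result { SAME, INDENT, DEDENT, ERROR }

    // Top of the stack is the column of the innermost open block.
    private final Deque<Integer> columns = new ArrayDeque<>();

    IndentTracker(int baseColumn) {
        columns.push(baseColumn);
    }

    // column: where the statement starts; afterColon: it directly follows ":".
    Result check(int column, boolean afterColon) {
        int current = columns.peek();
        if (afterColon) {
            // A new block must start to the right of the enclosing one.
            if (column > current) {
                columns.push(column);
                return Result.INDENT;
            }
            return Result.ERROR;
        }
        if (column == current) {
            return Result.SAME;
        }
        if (column < current && columns.contains(column)) {
            // Close every block opened to the right of this column.
            // One DEDENT result may therefore stand for several closed levels.
            while (columns.peek() != column) {
                columns.pop();
            }
            return Result.DEDENT;
        }
        // Not aligned with any open block and not preceded by ":".
        return Result.ERROR;
    }
}
```

With a base column of 2, `check(4, true)` gives INDENT, a later `check(2, false)` gives DEDENT, and `check(3, false)` gives ERROR, which is the `func2()` case in the example above. A fresh tracker at column 3 rejects `check(3, true)`, which is the `c=1` case.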
Some options (certainly not exhaustive):
1) Have a grammar flag that says to always generate an INDENT or DEDENT 
token after 'newline()' has been called.
     CON: every statement that can contain a DENT token (arrays, lists, 
between "if" and "(", etc.) must have "(DENT)? {ignoreDent()}" all over it.
     PRO: maybe not so bad, just make a rule "ig_dent: (INDENT | DEDENT)? 
{ignoreDent()}" ... still ugly
2) Have lexer functions that say to look for a new indentation after a 
certain point (":" in my case).
     CON: Somewhat restricts how it can be used.
     PRO: Relatively easy to describe in the lexer and does not put 
a lot of gibberish in the parser chunk.
3) Have the parser mess with the lexer's head, by saying where changes 
of indent are allowed.
     CON: parser and lexer are no longer standalone (colorization for 
grammars becomes more difficult, etc.)
     PRO: most flexible

(the grammar for my language.....)
See 1) above -- in the parser
statement: statment ';' {checkNextIndent=true;}
block: ":" {aquireIndent();} compundStatement;
if: "if" ig_dent expersion ig_dent block
while: "while" ig_dent expersion ig_dent block
....

See 2) above -- in the lexer
SEMI: ';' {checkIndent=true;};
COLON: ':' {Indent(),looking_for_indent=true};
ID:  IDENT { 
(looking_for_indent)?aquireIndent():(checkIndent)?verifysameline_or_same_column=true 
}

See 3) above
{lexor.setIndentionMustBeEqual();}
statement: statment ';' {lexor.setCheckNextIndent(true);}
block: ":" {lexor.aquireIndent();} compundStatement;

Of course what we have now in 2.7.5 works, but ANTLRWorks will need to 
allow for a thing to be inserted between the lexer and parser.
Maybe just allow for this in the grammar:
<parser>
{ option: insertPostLexor="IndentionSensitiveLexor"}
<lexor>
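
For reference, the "thing in between" could look roughly like the Java sketch below. It is an assumption-laden illustration, not ANTLR code: `Tok` and `IndentFilter` are hypothetical names, it works on a plain list of tokens instead of a real ANTLR token stream, and it uses the Python-style rule (dents come from column changes at the start of a line) and not the ":"-triggered rule from my grammar. Tokens inside parentheses never generate dents, as in the "no DENT" lines of the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical token: a type name plus its source position (not an ANTLR class).
final class Tok {
    final String type;   // e.g. "ID", "LPAREN", "RPAREN", "INDENT", "DEDENT"
    final int line;
    final int column;

    Tok(String type, int line, int column) {
        this.type = type;
        this.line = line;
        this.column = column;
    }
}

// Hypothetical filter that would sit between the lexer and the parser.
final class IndentFilter {
    static List<Tok> filter(List<Tok> input) {
        List<Tok> out = new ArrayList<>();
        Deque<Integer> indents = new ArrayDeque<>(); // columns of open blocks
        int parenDepth = 0;
        int lastLine = -1;

        for (Tok t : input) {
            boolean startsLine = t.line != lastLine;
            lastLine = t.line;

            // Only the first token of a line, outside parentheses, can dent.
            if (startsLine && parenDepth == 0) {
                if (indents.isEmpty()) {
                    indents.push(t.column);              // baseline column
                } else if (t.column > indents.peek()) {
                    indents.push(t.column);
                    out.add(new Tok("INDENT", t.line, t.column));
                } else {
                    while (indents.size() > 1 && t.column < indents.peek()) {
                        indents.pop();
                        out.add(new Tok("DEDENT", t.line, t.column));
                    }
                    if (t.column != indents.peek()) {
                        throw new IllegalStateException(
                            "alignment wrong at line " + t.line);
                    }
                }
            }

            if (t.type.equals("LPAREN")) {
                parenDepth++;
            } else if (t.type.equals("RPAREN") && parenDepth > 0) {
                parenDepth--;
            }
            out.add(t);
        }

        // Close any blocks still open at end of input.
        while (indents.size() > 1) {
            indents.pop();
            out.add(new Tok("DEDENT", lastLine, 0));
        }
        return out;
    }
}
```

The parser then only ever sees INDENT and DEDENT as ordinary tokens, which is why the lexer and parser can stay standalone with this arrangement.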

Just Musing........................
(sorry for the length....)
Elden



Terence Parr wrote:

>
> On Jul 11, 2005, at 2:40 PM, Rodrigo B. de Oliveira wrote:
>
>> On 7/11/05, Terence Parr <parrt at cs.usfca.edu> wrote:
>>
>> Boo also needs virtual indent/dedent tokens. Our current approach is
>> to insert a IndentTokenStreamFilter that preprocesses white space
>> tokens and generates indent/dedent virtual tokens as necessary. Will
>> this approach still be supported in antlr 3.0?
>
>
> Yes, as long as ANTLR's lexer sees imaginary indent/dedent tokens,  
> it's cool :)  We'll have to find a way to have this automatically  
> detected and added to the input stream.
>
> Ter
> -- 
> CS Professor & Grad Director, University of San Francisco
> Creator, ANTLR Parser Generator, http://www.antlr.org
> Cofounder, http://www.jguru.com
>
>
>



