[antlr-interest] Multipass Design Dilemma

Thu Apr 28 06:30:24 PDT 2011

> Are you doing a second pass over the original input text?  Or are you
> writing a tree grammar to walk your already parsed AST that you
> generated from your first pass?  In which case your secondary stuff
> should be matching trees, and not text.

I'm not sure what a second pass over the original input text would entail.

Right now I'm pursuing the tree grammar route.  I only really have 
experience with lexers and parsers so far, so this is a step outside my 
comfort zone.  I worked up these examples yesterday:

grammar StepOne;

options { output=AST; }

tokens { MATCHED; UNMATCHED; }

many_parts : single_part+;

single_part
     : unspecified -> ^(UNMATCHED unspecified)
     | whitespace -> ^(MATCHED whitespace)
     | whitespace -> ^(MATCHED whitespace)
     | whitespace -> ^(MATCHED whitespace)
     ;

tree grammar StepTwo;

options {
     backtrack=true;
     tokenVocab=StepOne;
}

tokens { NOTHING; }

tree : (matched | unmatched)+;

matched : ^(MATCHED NOTHING);

unmatched : ^(UNMATCHED rematch);

rematch
     : 'one'
     | 'two'
     | 'three'
     | .*
     ;

The problem I hit is that the rematch rule always matches ".*" and none 
of the preceding literals.
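
To make the intended behaviour concrete, here is a rough sketch in plain Java of what I expect the second pass to do: walk the parts produced by the first pass and, for each UNMATCHED part, try the specific literals before falling back to a catch-all.  This is not ANTLR code and says nothing about how the generated tree parser works; the names RematchSketch, Part, rematch and secondPass are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: none of these names come from ANTLR.
class RematchSketch {
    enum Kind { MATCHED, UNMATCHED }

    // Stand-in for one ^(MATCHED ...) or ^(UNMATCHED ...) node from the first pass.
    static final class Part {
        final Kind kind;
        final String text;

        Part(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    // Stand-in for the rematch rule: specific literals first, catch-all last.
    static String rematch(String text) {
        switch (text) {
            case "one":   return "ONE";
            case "two":   return "TWO";
            case "three": return "THREE";
            default:      return "OTHER"; // plays the role of the ".*" alternative
        }
    }

    // Second pass: leave MATCHED parts alone, re-examine UNMATCHED ones.
    static List<String> secondPass(List<Part> parts) {
        List<String> out = new ArrayList<>();
        for (Part p : parts) {
            if (p.kind == Kind.UNMATCHED) {
                out.add(rematch(p.text));
            } else {
                out.add("MATCHED");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Part> parts = Arrays.asList(
            new Part(Kind.MATCHED, " "),
            new Part(Kind.UNMATCHED, "two"),
            new Part(Kind.UNMATCHED, "xyz"));
        System.out.println(secondPass(parts)); // prints [MATCHED, TWO, OTHER]
    }
}
```

In other words, I would expect "two" to be picked up by the literal alternative and only "xyz" to fall through to the catch-all.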

Courtney Falk