[antlr-interest] context information through rule parameters

Fri Jul 4 18:37:13 PDT 2008

I'm not sure what happened with your reply but it didn't come up 
quoted properly on my system.  Anyhow:

At 19:53 4/07/2008, Gerard van de Glind wrote:
 >Yes, you are right about that. I know that backtracking inserts
 >syntactic predicates under the hood. But is this the only thing it
 >does?
 >I have seen that ANTLR also generates the following statement:
 >if (backtracking==0 ) So is the behavior of ANTLR with the backtrack
 >option compared to ANTLR with my own syntactic predicates really
 >the same?

I haven't looked at the specifics in a while, but IIRC the 
backtracking checks just keep the parser quiet while it is 
speculating: when backtracking > 0 it skips embedded actions and, 
on a mismatch, simply backs out of the rule instead of raising an 
error.  As long as you use left-edge synpreds or gated sempreds 
then it should have the same effect, since those will normally be 
hoisted to the parent rule anyway.

 >And is it always possible to replace the backtrack option with my own
 >syntactic predicates in such a way that it accepts the same language?

As far as I know, yes.

 >Please explain, can you give me a hint of what I should do? So far,
 >I didn't succeed in being able to make my grammar to accept the
 >same language with my own syntactic predicates compared to the
 >backtrack option.

Basically you'll need to work out the alternative paths that the 
parser could follow from any given point; where there's ambiguity, 
you need to insert a synpred or sempred to guide it to take the 
path you want it to follow.  Synpreds let you define exactly how 
far it needs to look ahead (which is critical to get it to look 
past loop constructs), while sempreds let you do more esoteric 
checks, perhaps based on some kind of semantic knowledge of the 
input state.

The predicates need to be inserted early enough in the recognition 
sequence such that ANTLR never commits itself to the "wrong" path.
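
As a rough sketch of what a left-edge synpred looks like (ANTLR 3 
syntax; the grammar name, rules and tokens below are made up for 
illustration and are not taken from your grammar):

```antlr
grammar SynpredSketch;

// Both alternatives start with e, and e can nest to any depth, so no
// fixed amount of lookahead can tell them apart.
s   : (e '%')=> e '%'   // speculatively match e '%' first
    | e '!'             // taken when the synpred fails
    ;

e   : '(' e ')'
    | INT
    ;

INT : '0'..'9'+ ;
WS  : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```

The synpred sits at the very start of the alternative, so the 
decision is made before the parser has consumed anything for either 
path.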

 >I am aware of the fact that my grammar is ambiguous, but that's
 >something I have to live with.
 >I don't want to resolve the ambiguity, I want to recognize it and
 >give a warning to the end users.

I think you missed my point.

The first alt of relationalExpression is this:
   formula[true] (LET^ | GET^ | LT^ | GT^) formula[true]

The second alt is this:
   dateAtom[true] (LET^ | GET^ | LT^ | GT^) dateAtom[true]

Both formula and dateAtom are defined as a single IDENTIFIER, so 
this means that the first alt is:
   IDENTIFIER (LET | GET | LT | GT) IDENTIFIER
And the second alt is:
   IDENTIFIER (LET | GET | LT | GT) IDENTIFIER

There is absolutely no difference and thus no possible way that 
ANTLR can decide which alt to use.  This is an example of a case 
where you'd need to use a sempred to disambiguate (you can't use a 
synpred because the token sequences are identical).
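
Something along these lines is what I have in mind.  It is only a 
sketch (ANTLR 3 syntax, Java target) and I haven't run it against 
your grammar: the isDate() helper and the dateNames set are 
hypothetical, the token definitions are my guess at what LET/GET/LT/GT 
stand for, and I've left out the [true] arguments and the ^ tree 
operators to keep it short.

```antlr
grammar SempredSketch;

@members {
    // Hypothetical symbol table: identifiers known to refer to dates.
    // Something else (e.g. a declaration rule) would have to fill it.
    java.util.Set<String> dateNames = new java.util.HashSet<String>();

    boolean isDate(String name) { return dateNames.contains(name); }
}

relationalExpression
    : {isDate(input.LT(1).getText())}?   // sempred on the next token
      dateAtom (LET | GET | LT | GT) dateAtom
    | formula (LET | GET | LT | GT) formula
    ;

dateAtom : IDENTIFIER ;
formula  : IDENTIFIER ;

LET : '<=' ;
GET : '>=' ;
LT  : '<' ;
GT  : '>' ;
IDENTIFIER : ('a'..'z' | 'A'..'Z' | '_')+ ;
WS  : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```

Note that input.LT(1) here is the token stream's lookahead method, 
not your LT token.  The predicate is what picks the alternative; the 
token sequences themselves stay identical.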