[antlr-interest] determining tokens at runtime

Mon Jun 21 13:25:26 PDT 2010

> See TDAR pg 285.
> 
> A parser rule is going to work well.

I tweaked the grammar (below) to get it to compile, but I am still not seeing dynamic ED behavior when I run "test data 2" (below), which uses "^" as the ED delimiter instead of "*".

The ED lexer rule appears to hard-code the allowed delimiters, and the capture action plus semantic predicate seem only to validate the delimiter rather than provide dynamic token behavior.
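
To make concrete what I mean by dynamic ED behavior, here is a minimal plain-Java sketch that does not use ANTLR at all. The class DynamicDelimiter and its methods are hypothetical names for illustration only, and the sketch assumes the delimiter is always the character immediately after the STA tag.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ANTLR: it only illustrates taking the
// element delimiter (ED) from the input instead of hard coding it.
class DynamicDelimiter {

    // Assumption: the input starts with the 3-character tag "STA" and the
    // character right after it is the element delimiter.
    static char detectDelimiter(String input) {
        if (input.length() < 4 || !input.startsWith("STA")) {
            throw new IllegalArgumentException(
                "input must start with STA followed by a delimiter");
        }
        return input.charAt(3);
    }

    // Splits one segment line into its elements using the detected delimiter.
    static List<String> splitSegment(String line, char sep) {
        List<String> elements = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == sep) {
                elements.add(line.substring(start, i));
                start = i + 1;
            }
        }
        elements.add(line.substring(start));
        return elements;
    }

    public static void main(String[] args) {
        String input = "STA^HEADER^SEGMENT\nBEG^TRANSACTION^HEADER\nEND^FOOTER^SEGMENT\n";
        char sep = detectDelimiter(input);
        for (String line : input.split("\r?\n")) {
            System.out.println(splitSegment(line, sep));
        }
    }
}
```

Here the delimiter is read once from the header and then used for every later segment. What I have not worked out is how to get the same effect from the generated lexer.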

I get the following errors:

line 1:3 no viable alternative at character '^'
...
line 1:4 missing ED at 'HEADER'
...

I'm guessing I didn't implement your suggestion correctly or fully.  Ideas?

Jon

// Simple.g
grammar Simple;

tokens {
  STA = 'STA';
  BEG = 'BEG';
  END = 'END';
}

@members {
  // Element delimiter captured from the first ED in the header.
  char sep;
}

transaction : header beg_segment footer;

header : STA ed=ED { sep = $ed.text.toCharArray()[0]; } DATA
         ed=ED { $ed.text.toCharArray()[0] == sep }? DATA SD
       ;
beg_segment : BEG segment_body;
footer : END segment_body;
segment_body : ED DATA ED DATA SD;

DATA : (ALPHA_CAPS | DIGIT | '_')+;
// Hard coded to '*', so the '^' in test data 2 is never tokenized as ED.
ED : '*';
SD : '\r'? '\n';
fragment DIGIT : '0'..'9';
fragment ALPHA_CAPS : 'A'..'Z';

// test data 2 (fail)
STA^HEADER^SEGMENT
BEG^TRANSACTION^HEADER
END^FOOTER^SEGMENT