[antlr-interest] Parsing whole-line comments?
Junkman
j at junkwallah.org
Sun Jun 6 08:19:29 PDT 2010
Christian Convey wrote:
>> ----------------
>> /* Tokens */
>> NEWLINE: '\n' ;
>> E: 'E';
>> C: 'C';
>> CALL: 'CALL';
>> // default greediness ensures "CALL" is matched as CALL instead of C.
>
> Thanks, but 'C' can also be the name of a variable, as long as it's
> not in the first column. So I don't think greediness solves the whole
> problem.
>
I wonder if this would work better in that case:
---------------------------
/* Tokens */
NEWLINE: '\n' ;
/* Parsing rules */
stmt : 'E' ... NEWLINE
| 'C' ... NEWLINE
| 'CALL' ... NEWLINE
;
---------------------------
Not sure, since I don't know how explicitly defined tokens are treated
differently from tokens implicitly defined in parsing rules.
Alternatively, you can apply a semantic predicate to a lexer rule like this:
------------------------
C: { $pos == 0 }?=> 'C' ;
------------------------
It should only match "C" at the beginning of the line, but I found (in
my noob experience) that semantic predicates can be pretty tricky because
of the "hoisting out" business and how it affects prediction DFA
construction. I'm sure more experienced hands can tell you better.
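
To make the intent of the column check concrete, here is a rough hand-written sketch in Python of what I would expect the predicated rule to do. It is not ANTLR output and does not model hoisting or DFA prediction at all. The function name, the ID/OTHER token types and the regex are my own assumptions for illustration; only NEWLINE, C and CALL come from the grammar above.

```python
import re

# Hypothetical token pattern for illustration only; not generated by ANTLR.
TOKEN_RE = re.compile(
    r"(?P<NEWLINE>\n)"
    r"|(?P<WS>[ \t]+)"
    r"|(?P<WORD>[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<OTHER>.)"
)


def tokenize(text):
    """Return (type, text) pairs; a lone 'C' is the C token only at column 0."""
    tokens = []
    col = 0  # 0-based column of the next character on the current line
    for m in TOKEN_RE.finditer(text):
        kind = m.lastgroup
        value = m.group()
        if kind == "NEWLINE":
            tokens.append(("NEWLINE", value))
            col = 0  # the next token starts a new line
            continue
        if kind == "WORD":
            if value == "C" and col == 0:
                kind = "C"      # same condition as the { $pos == 0 } check
            elif value == "CALL":
                kind = "CALL"   # whole word matched, so never split into C + ALL
            else:
                kind = "ID"     # assumed type for ordinary names, including a later 'C'
        if kind != "WS":        # whitespace is skipped but still advances the column
            tokens.append((kind, value))
        col += len(value)
    return tokens
```

With this sketch, tokenize("C x\ny = C\n") gives the C token for the first "C" and a plain ID for the one after the "=", which is the behaviour Christian described wanting.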
Good luck.