[antlr-interest] syntactic predicates in the lexer
Matt Barringer
mbarringer at suse.de
Sat Aug 11 17:34:22 PDT 2007
Hi,
I'm trying to parse some strange syntax that looks like this:
    # Comment
    #Comment
    #include <file>
    include <file>
    # include (this is a valid comment)
Lines 1, 2, and 5 should be COMMENT tokens (they need to remain on the
main token stream with all the others), and lines 3 and 4 should be
INCLUDE tokens.
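The intended classification can be sketched outside ANTLR as follows. This is only a Python illustration of the rule, not lexer code: the function name `classify` is made up, and it assumes the form without '#' (line 4) needs the same ' ' or '<' after "include" that the ANTLR2 predicate checks.

```python
from typing import Optional


def classify(line: str) -> Optional[str]:
    """Return "INCLUDE", "COMMENT", or None for one input line."""
    # Drop a single leading '#', if present.
    rest = line[1:] if line.startswith("#") else line
    # INCLUDE: "include" directly after the optional '#',
    # followed by a space or '<' (assumed lookahead, as in the ANTLR2 rule).
    if rest.startswith("include") and rest[7:8] in (" ", "<"):
        return "INCLUDE"
    # Anything else that starts with '#' is an ordinary comment.
    if line.startswith("#"):
        return "COMMENT"
    # Not covered by this sketch (some other token).
    return None
```

Note that '# include' falls through to COMMENT because the space after '#' means the remainder no longer starts with "include".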
With ANTLR2, I used a predicate like this, which worked fine:
    COMMENT_OR_INCLUDE
        :
            ( '#' "include" (' '|'<'))=>INCLUDE
            { $setType(INCLUDE); }
        |   ( COMMENT{ $setType(COMMENT); } )
        ;
Trying that predicate with the C target of ANTLR 3 causes a compiler
error about a missing REWINDFULL() function or something, so I tried the
version below, without success: only COMMENT tokens are ever produced.
    COMMENT_OR_INCLUDE
        :   '#' ('include')=>INCLUDE
            { $type = INCLUDE; }
        |   '#' COMMENT
            { $type=COMMENT; }
        ;

    fragment
    COMMENT
        :   (~('\n'|'\r'))* ('\n'|'\r'('\n')?)
        ;
Variations on this didn't work either:
    COMMENT_OR_INCLUDE
        :   '#'
            (   INCLUDE
            |   COMMENT )
        ;
Does the lexer no longer support syntactic predicates? Is there a better
way to distinguish '# include' from '#include' in the lexer?
Thanks,
Matt