[antlr-interest] C-style includes: problem with parser vs. lexer rules

Bjoern Doebel doebel at tudos.org
Mon Aug 27 03:52:47 PDT 2007


Hi,

I want to parse C-style #include statements and have a working version that
looks like this:

fragment DIGIT  : '0'..'9';
fragment CHAR : 'a'..'z' | 'A'..'Z';

IMPORT : '#include' ;
GT : '>' ;
LT : '<' ;
WORD : CHAR (CHAR|DIGIT|'_'|'-')*;
WS     : (' '|'\t'|'\n'|'\r')+ { self.skip(); } ;

filename : WORD ('/' WORD)* '.' WORD ;

import_r : IMPORT LT filename GT ;


This works, but now I'd like to turn the filename rule into a lexer rule, so
that the whole file name comes back as one single token. To do that, I changed
the last two rules:

FNAME : WORD ('/' WORD)* '.' WORD ;

import_r : IMPORT LT FNAME GT;

But when I run it with e.g., "#include <foo/bar/baz.h>", I get an error:
line 1:8 mismatched input 'foo/baz/bar.h' expecting FNAME

What am I doing wrong and why does the lexer not recognize the filename as
FNAME?
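
To make clear what I expect, here is a minimal sketch of the token stream I
am hoping for. It is plain Python using the standard `re` module, not ANTLR,
and it does not explain the error above. The names TOKEN_SPEC, MASTER and
tokenize are hypothetical helpers made up for this illustration, and the
regexes are my own approximation of the lexer rules. Note that `re`
alternation picks the first alternative that matches, so FNAME is listed
before WORD on purpose; ANTLR's lexer makes its decision differently.

```python
import re

# Approximation of the lexer rules above as regular expressions.
# This is NOT ANTLR; it only illustrates the token stream I expect.
WORD = r"[A-Za-z][A-Za-z0-9_-]*"
TOKEN_SPEC = [
    ("IMPORT", r"#include"),
    # FNAME comes before WORD so the longer alternative is tried first.
    ("FNAME", rf"{WORD}(?:/{WORD})*\.{WORD}"),
    ("WORD", WORD),
    ("LT", r"<"),
    ("GT", r">"),
    ("WS", r"[ \t\r\n]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token_type, token_text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":  # mirrors the skip() action on WS
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize("#include <foo/bar/baz.h>"):
        print(token)
```

With this sketch the file name comes back as a single FNAME token between LT
and GT, which is the behaviour I would like to get from the ANTLR lexer.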

Regards,
Bjoern

