[antlr-interest] C-style includes: problem with parser vs. lexer rules
Johannes Luber
jaluber at gmx.de
Mon Aug 27 05:32:04 PDT 2007
Bjoern Doebel wrote:
> Hi,
>
> I want to parse C-style #include statements and got a working version like
> this:
>
> fragment DIGIT : '0'..'9';
> fragment CHAR : 'a'..'z' | 'A'..'Z';
>
> IMPORT : '#include' ;
> GT : '>' ;
> LT : '<' ;
> WORD : CHAR (CHAR|DIGIT|'_'|'-')*;
> WS : (' '|'\t'|'\n'|'\r')+ { self.skip(); } ;
>
> filename : WORD ('/' WORD)* '.' WORD ;
>
> import_r : IMPORT LT filename GT ;
>
>
> This works, but now I'd like to transfer the filename rule into a lexer
> rule, so I get only one single token from it. Therefore, I change the last
> two rules:
>
> FNAME : WORD ('/' WORD)* '.' WORD ;
>
> import_r : IMPORT LT FNAME GT;
>
> But when I run it with e.g., "#include <foo/bar/baz.h>", I get an error:
> line 1:8 mismatched input 'foo/baz/bar.h' expecting FNAME
>
> What am I doing wrong and why does the lexer not recognize the filename as
> FNAME?
>
> Regards,
> Bjoern
>
My guess is that FNAME should be a parser rule, not a lexer rule.
Alternatively, WORD has to be changed into a fragment rule.
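
A minimal sketch of the fragment variant, assuming nothing else in the grammar needs a standalone WORD token. The grammar name Include and the options block are placeholders I made up; everything else is taken from the rules quoted above. With WORD marked as a fragment it can only be used as a helper inside other lexer rules, so it should no longer compete with FNAME for the same input:

```antlr
grammar Include;   // hypothetical name, not from the original post

options { language = Python; }   // assumed from the self.skip() action

import_r : IMPORT LT FNAME GT ;

// Helper rules: fragments never produce tokens of their own.
fragment DIGIT : '0'..'9' ;
fragment CHAR  : 'a'..'z' | 'A'..'Z' ;
fragment WORD  : CHAR (CHAR | DIGIT | '_' | '-')* ;

IMPORT : '#include' ;
GT     : '>' ;
LT     : '<' ;

// The whole path is matched as one FNAME token.
FNAME  : WORD ('/' WORD)* '.' WORD ;

WS     : (' ' | '\t' | '\n' | '\r')+ { self.skip(); } ;
```

One consequence of this choice: a bare word without a '.' extension no longer matches any token, so if other parts of the grammar need plain identifiers, keeping filename as a parser rule is probably the simpler route.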
Best regards,
Johannes Luber