[antlr-interest] Grammar issue

Zachary Palmer zep_antlr at bahj.com
Sun Nov 7 09:04:36 PST 2010


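I'm going to guess that this has to do with the fact that you are 
creating a RESTOFLINE token.  Suppose the tokenizer encounters the 
following sequence of characters:

// hello

This matches both LINE_COMMENT and RESTOFLINE, so the lexer has no way 
to tell which token you meant.  As written, RESTOFLINE matches any line 
at all, so it overlaps with your other lexer rules as well.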

I see two choices.  You could create a single token for DEFINESTATEMENT 
(where RESTOFLINE is now a tokenizer fragment); the presence of the 
#define while lexing would be sufficient to disambiguate.  Or you could 
pull the whole thing up into the parser (like the C preprocessor 
actually does).  The only trick with the latter case is that something 
as trivial as

definestatement: '#define' (~NEWLINE)* NEWLINE;

probably won't work because you're almost certainly skipping newlines as 
whitespace in your grammar.  My guess is that the approach here would be 
to fiddle with the token stream so that the parser can react to the 
newline tokens on the hidden channel.

For what you describe, though, it sounds like simply lexing the whole 
#define line as a single token would be enough.
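
A minimal sketch of that first option, assuming ANTLR 3 syntax as in 
your grammar.  The handleDefine call is a hypothetical name for your 
processing function, and the exact action syntax depends on your target 
language.  Note that the token text includes the leading '#define' and 
the line ending, so the handler would need to strip those.

```antlr
// Lexer: the whole #define line becomes a single token.
DEFINESTATEMENT
    : '#define' RESTOFLINE
    ;

// A fragment rule never produces a token on its own, so it no longer
// competes with LINE_COMMENT or any other lexer rule.
fragment RESTOFLINE
    : ~('\n'|'\r')* '\r'? '\n'
    ;

// Parser: the separate defineoption rule is no longer needed.
definestatement
    : d=DEFINESTATEMENT { handleDefine($d.text); } // handleDefine is hypothetical
    ;
```

As in your original rule, this only matches when the line ends with a 
newline, so a #define on the very last line of a file with no trailing 
newline would not be recognized.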

Cheers,

Zach
> I have a very simple grammar where I am attempting to parse some C++ code.  The input is very simple and I am having trouble figuring out how to parse (lex?) a line.  What I want to do is match a '#define' and then the rest of the line.  I don't care what is in the rest of the line (even if empty) but I do want it passed to a processing function where I can examine its contents.  The code snippet I have used is
>
> definestatement
>      : '#define' defineoption
>      ;
>
> defineoption
>      : RESTOFLINE
>      ;
> ...
> COMMENT
>      :   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
>      ;
>
> LINE_COMMENT
>      : '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
>      ;
>
> RESTOFLINE
>   : ~('\n'|'\r')* '\r'? '\n'
>   ;
>
> This was taken out of the example grammar for 'C' and modified.
>
> The problem is that when I attempt to use a RESTOFLINE in the grammar, the parser stops with an Unexpected Token at the terminal */ of the comment in the header.  It doesn't seem to make any difference if I modify LINE_COMMENT to contain the RESTOFLINE item or not.
>
> Questions:
> 1.  How can I capture the rest of the line into a string that I can examine in the function handling that expression?
> 2.  Why doesn't the above construct work?
>
> The grammar generates and compiles ok in Visual Studio 2008.
>
> Thanks
> Sterling