[antlr-interest] Interface between a C preprocessor and the C grammar parsers

Vincent De Groote list.encelade at gmail.com
Mon Mar 16 11:32:49 PDT 2009


Hello,

I need to build a program which reads C language files, modifies the 
files (code transformations), then saves the files back to disk. The 
saved files must keep the original file structure, with unexpanded 
include files, unexpanded macros, inactive lines (skipped by the 
preprocessor), ... Besides the code-rewriting functionality, the program 
must also be able to reformat source code, based on its syntactic structure.

This means that the tokens hidden from the C grammar parser must be 
accessible to the final application.

I'm really a newbie in parsing, and I need some advice on how to do this.

My first questions are about the interface between the preprocessor and 
the C grammar parser:

- Should the preprocessor parser be embedded in the C grammar? (This 
seems a little ugly.)
- Should the preprocessor parser be a syntactic parser (with 
productions like active/inactive lines, start and end of includes, ...), 
or a lexical parser?
- What should this preprocessor parser return?
   - A list of tokens (with their channel set to hidden / visible)? (Is 
it possible for a grammar parser to return a token list?)
   - A tree structure with the structure of the file?
   - Something else?
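
To make the "list of tokens with a channel" option concrete, here is a 
minimal sketch in Python (not ANTLR code). It assumes a toy lexer that 
treats a whole preprocessor line as one token and ignores line 
continuations; the names Token, DEFAULT, HIDDEN and tokenize are made up 
for this illustration. The point is only that a parser can read the 
visible tokens while the full list still reproduces the file exactly.

```python
import re
from dataclasses import dataclass

DEFAULT, HIDDEN = 0, 1  # channel numbers, chosen for this sketch only


@dataclass
class Token:
    kind: str
    text: str
    channel: int


# Alternatives are tried in order, so earlier ones win.
TOKEN_RE = re.compile(r"""
    (?P<DIRECTIVE>^[ \t]*\#[^\n]*)     # a whole preprocessor line
  | (?P<COMMENT>/\*.*?\*/|//[^\n]*)    # block or line comment
  | (?P<WS>[ \t]+|\n)                  # blanks, or a single newline
  | (?P<ID>[A-Za-z_]\w*)
  | (?P<NUMBER>\d+)
  | (?P<PUNCT>.)                       # any other single character
""", re.VERBOSE | re.MULTILINE | re.DOTALL)

HIDDEN_KINDS = ("DIRECTIVE", "COMMENT", "WS")


def tokenize(source):
    """Split source into tokens; nothing is dropped, only re-channelled."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        channel = HIDDEN if kind in HIDDEN_KINDS else DEFAULT
        tokens.append(Token(kind, match.group(), channel))
    return tokens


source = "#include <stdio.h>\nint x = 1; /* keep me */\n"
tokens = tokenize(source)

# What a C grammar parser would see.
visible = [t.text for t in tokens if t.channel == DEFAULT]

# What the final application can write back to disk.
rebuilt = "".join(t.text for t in tokens)
```

Here `visible` holds only `int`, `x`, `=`, `1` and `;`, while `rebuilt` 
is identical to `source`, include line and comment included. Inactive 
lines could be handled the same way, by putting their tokens on the 
hidden channel instead of discarding them.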

Other questions about the C grammar parser:

In the reference book (The Definitive ANTLR Reference: Building 
Domain-Specific Languages), I read that an AST should not contain 
syntax-only tokens, like the ';' statement terminator, parentheses used 
to change operator precedence ... I do not understand why an AST should 
not contain such tokens. I suppose they are just useless in an AST. Are 
there other reasons?

This book is well written, but I'm not sure I can make the best choice 
between an AST, a Tree, custom-made structures ...

If the AST is not the right structure for returning the parse result to 
the caller, I suppose I could use custom-made structures. But is that 
the best choice?

I do not really understand the differences between an abstract tree 
and a concrete tree (I'm really a newbie ...).
Any hints about these differences are welcome.
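
As far as I understand it so far, the difference could be sketched like 
this for the statement "(1 + 2) * 3;". This is a hand-built Python 
illustration, not the output of any tool; the rule names (statement, 
expr, term, factor) and the helper functions are assumptions made up for 
the example.

```python
# Concrete (parse) tree: one node per grammar rule, every token kept as
# a leaf, including '(' ')' and ';'. The first tuple item is the rule name.
concrete = (
    "statement",
    ("expr",
     ("term",
      ("factor",
       "(",
       ("expr", ("term", ("factor", "1")), "+", ("term", ("factor", "2"))),
       ")"),
      "*",
      ("factor", "3"))),
    ";",
)

# Abstract tree: operators are nodes, operands are children. The nesting
# already encodes the grouping, so '(' ')' and ';' carry no extra meaning.
ast = ("*", ("+", 1, 2), 3)


def leaves(node):
    """Return the tokens of a concrete tree, left to right."""
    if isinstance(node, str):
        return [node]
    tokens = []
    for child in node[1:]:  # skip the rule name
        tokens.extend(leaves(child))
    return tokens


def evaluate(node):
    """Evaluate the abstract tree; only '+' and '*' are handled."""
    if isinstance(node, int):
        return node
    op, left, right = node
    lhs, rhs = evaluate(left), evaluate(right)
    return lhs + rhs if op == "+" else lhs * rhs


tokens_back = leaves(concrete)
value = evaluate(ast)
```

The concrete tree gives back every token of the statement, which matters 
if the file has to be written out again unchanged. The abstract tree is 
much smaller and easier to process, but the parentheses and the ';' 
would have to be recovered from somewhere else, for example from the 
token list.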


Thanks for your replies,

Vincent De Groote