[antlr-interest] How to do preprocessing in antlr v4?
Bernard Kaiflin
bkaiflin.ruby at gmail.com
Mon Nov 19 11:35:31 PST 2012
Hi,
great, the CHUNK token. I always had trouble when I wanted to ignore (part
of) lines.
The code/extras/CPPBaseLexer.g4 and Co. is worth studying. Programming this
way gives great flexibility and power.
Nevertheless I find it more difficult to work in the lexer than in the
parser. It took me a couple of hours to get what I wanted.
One element too many, as in CHUNK : ~'#'+ '\n' ; and it fails with a # inside
a string; one element too few, as in
'#define' ID REPLACE, and you get a token recognition error at: '#define '.
Without adding ~'d' in OTHER_CMD, all preprocessor statements were captured
by OTHER_CMD. It gives a feeling of fragility.
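To make the two failure modes concrete, here is a minimal sketch that approximates single lexer rules with java.util.regex. It is not ANTLR and ignores how competing rules are ranked against each other; the class name LexerRuleSketch, the helper matchLength and the shortened input line are assumptions made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LexerRuleSketch {
    // Regex approximations of single lexer rules; stand-ins only, not ANTLR itself.
    static final String CHUNK_WITH_NEWLINE = "[^#]+\\n";      // CHUNK : ~'#'+ '\n' ;
    static final String CHUNK_PLAIN = "[^#]+";                 // CHUNK : ~'#'+ ;
    static final String OTHER_CMD = "#[^d][^\\r\\n]*\\r?\\n";  // '#' ~'d' ~[\r\n]* '\r'? '\n'

    // Length of the match anchored at the start of input, or -1 if the rule cannot match there.
    static int matchLength(String rule, String input) {
        Matcher m = Pattern.compile(rule).matcher(input);
        return m.lookingAt() ? m.end() : -1;
    }

    public static void main(String[] args) {
        // Hypothetical, shortened version of the first line of the sample input.
        String line = "char *msg = \"before #!ruby line\";\n";

        // No newline precedes the '#', so the newline-terminated CHUNK cannot match at all.
        System.out.println("CHUNK with '\\n': " + matchLength(CHUNK_WITH_NEWLINE, line));
        // The plain CHUNK simply stops right before the '#'.
        System.out.println("plain CHUNK:     " + matchLength(CHUNK_PLAIN, line));

        // Because of ~'d', OTHER_CMD accepts #endif but refuses anything starting with "#d".
        System.out.println("#endif:  " + matchLength(OTHER_CMD, "#endif\n"));
        System.out.println("#define: " + matchLength(OTHER_CMD, "#define X 1\n"));
    }
}
```

A result of -1 means the rule has no match at that position, which is where the lexer would report a token recognition error if no other rule applied. The ~'d' trick keeps #define out of OTHER_CMD, but it would equally reject any other directive whose name starts with d.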
The grammar rewritten in "lexer style" follows, then a sample input and
an execution.
grammar Cmacros_d;
/* Process #define statements in a C file.
TODO : extract information from DEFINE_PARAM.
*/
program
@init {System.out.println("Cmacros_d last update 2013");}
    :   (   DEFINE_PARAM
            {System.out.print(">>>macro(parameters) " + $DEFINE_PARAM.text);}
        |   DEFINE_SIMPLE
            {System.out.print(">>>simple macro : " + $DEFINE_SIMPLE.text);}
        |   OTHER_CMD
        |   CHUNK
        )+
    ;
DEFINE_PARAM
: '#define' WS ID '(' WS? ID ( WS? ',' WS? ID )* WS? ')' REPLACE
;
DEFINE_SIMPLE
: '#define' WS ID WS REPLACE
;
OTHER_CMD
: '#' ~'d' ~[\r\n]* '\r'? '\n' ; // can't use .*; scarfs \n\n after include
WS : [ \t]+ -> channel(HIDDEN) ;
CHUNK : ~'#'+ ; // anything else
fragment ID : ( ID_FIRST (ID_FIRST | DIGIT)* ) ;
fragment DIGIT : [0-9] ;
fragment ID_FIRST : LETTER | '_' ;
fragment LETTER : [a-zA-Z] ;
fragment REPLACE : ~[\r\n]* '\r'? '\n' ;
static char *usage_msg[] = {"-x[directory] strip off text before #!ruby line ..."};
#ifndef CharNext
#define CharNext(p) ((p) + mblen(p, RUBY_MBCHAR_MAXSIZE))
#define CharNext simple replacement
#endif
#define BITSTACK_PUSH(stack, n) (stack = (stack<<1)|((n)&1))
$ grun Cmacros_d program -tokens -diagnostics tcpreproc.c
[@0,0:66='static char *usage_msg[] = {"-x[directory] strip off text before ',<5>,1:0]
[@1,67:85='#!ruby line ..."};\n',<3>,1:67]
[@2,86:102='#ifndef CharNext\n',<3>,2:0]
[@3,103:160='#define CharNext(p) ((p) + mblen(p, RUBY_MBCHAR_MAXSIZE))\n',<1>,3:0]
[@4,161:199='#define CharNext simple replacement\n',<2>,4:0]
[@5,200:206='#endif\n',<3>,5:0]
[@6,207:267='#define BITSTACK_PUSH(stack, n)\t(stack = (stack<<1)|((n)&1))\n',<1>,6:0]
[@7,268:267='<EOF>',<-1>,7:61]
Cmacros_d last update 2013
>>>macro(parameters) #define CharNext(p) ((p) + mblen(p, RUBY_MBCHAR_MAXSIZE))
>>>simple macro : #define CharNext simple replacement
>>>macro(parameters) #define BITSTACK_PUSH(stack, n) (stack = (stack<<1)|((n)&1))
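For the TODO in the grammar header (extract information from DEFINE_PARAM), one possible approach is to take the token text apart after lexing. The following is only a sketch under that assumption: the class DefineParamSketch, its nested Macro holder and the parse helper are hypothetical names, and the regex is deliberately looser than the lexer rule because it only sees text the lexer has already accepted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DefineParamSketch {
    // Hypothetical holder for the three parts of a parameterized macro.
    static final class Macro {
        final String name;
        final List<String> params;
        final String body;

        Macro(String name, List<String> params, String body) {
            this.name = name;
            this.params = params;
            this.body = body;
        }
    }

    // Same overall shape as DEFINE_PARAM: '#define' WS ID '(' ids ')' REPLACE.
    // The parameter list is captured as raw text up to ')' and split afterwards.
    private static final Pattern SHAPE = Pattern.compile(
        "#define[ \\t]+([A-Za-z_][A-Za-z_0-9]*)\\(([^)]*)\\)([^\\r\\n]*)\\r?\\n?");

    // Returns the macro parts, or null if the text does not have the expected shape.
    static Macro parse(String tokenText) {
        Matcher m = SHAPE.matcher(tokenText);
        if (!m.matches()) {
            return null;
        }
        List<String> params = new ArrayList<>();
        for (String p : m.group(2).split(",")) {
            params.add(p.trim());
        }
        return new Macro(m.group(1), params, m.group(3).trim());
    }

    public static void main(String[] args) {
        Macro m = parse("#define BITSTACK_PUSH(stack, n)\t(stack = (stack<<1)|((n)&1))\n");
        System.out.println(m.name + " " + m.params + " -> " + m.body);
    }
}
```

In the grammar, such a helper could presumably be called from the existing action with $DEFINE_PARAM.text as its argument, provided the class is on the classpath of the generated parser.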
2012/11/19 Terence Parr <parrt at cs.usfca.edu>
> Hi. in the extras code dir from book you'll find a C preprocessor like
> sample.
> Ter