[antlr-interest] Re: How to set filename in parser
David Wigg
wiggjd at lsbu.ac.uk
Fri Jul 15 05:14:05 PDT 2005
Original message.
Message: 2
Date: Wed, 13 Jul 2005 19:26:29 +0300
From: shmuel siegel <antlr at shmuelhome.mine.nu>
Subject: [antlr-interest] How to set filename in parser
To: 'ANTLR Interest' <antlr-interest at antlr.org>
Message-ID: <42D540B5.6000705 at shmuelhome.mine.nu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
My grammar parses the output of a c preprocessor. It interprets
the #line directives so that syntax errors can be easily tracked
back to the original file. My lexer sets the filename and
linenumber parameters appropriately and then treats the
directive like a single line comment.
This works fine for the lexer; when an error occurs, the proper
error message is printed. But the parser never finds out about
this resequencing, so syntactical errors at the parser level do
not reflect the original file information. Short of overriding
the various match routines, is there any simple way to propagate
the #line information from the lexer to the parser?
Reply.
I agree this is a real problem.
At one time I was hoping to be able to pretty print the
preprocessed input file as well but it turned out to be so
difficult that I gave up (it would only have been a by-product
of what we wanted to do).
However, we do need to know when the file of the source code we
are interested in is being read so when we parse the line
directives we call a module to extract the data from the line
and to store it in main.cpp at the highest level where it is
available for other modules to see when the User file of
interest is being read.
This is too complex to describe in full here but to give you a
starting point here is a copy of the relevant productions from
the lexer,
PREPROC_DIRECTIVE
options{paraphrase = "a line directive";}
: '#' LineDirective
{_ttype = ANTLR_USE_NAMESPACE(antlr)Token::SKIP;
newline();}
;
protected
LineDirective
:
("line")? // this would be for if the directive
started "#line"
(Space)+
n:Decimal
(Space)+
(sl:StringLiteral)
((Space)+ Decimal)* // To support cpp flags (GNU)
{
process_line_directive((sl->getText()).data(),
(n->getText()).data()); // see main()
}
EndOfLine
;
Further information can be obtained from our C/C++ parser on the
Antlr website.
We were originally interested in including comments in the
pretty print but since they could turn up inside statements more
or less at random this was an even greater problem than line
directives.
I'm wondering now, if we are only interested in line directives,
whether we could pass them to the parser as a token or tokens.
Since they cannot split statements perhaps they could be treated
as just another statement and processed appropriately?
Any comments?
David Wigg
More information about the antlr-interest
mailing list