[antlr-interest] skipping whitespaces in code and avoiding it in comments
Maciej Gawinecki
mgawinecki at gmail.com
Mon Mar 9 02:37:10 PDT 2009
Hello,
Thanks for your response.
Sam Barnett-Cormack wrote:
[cut]
> It's far more common to make VALUE, ID, and COMMENT token types (and
> comment different to what you have now - from // to newline inclusive is
> more normal). Then you put the comments and the WS on the hidden
> channel.
If I put comments on the hidden channel, how can I make the parser
cache them?
My goal is to associate single-line comments with "corresponding"
identifiers of schema elements in SQL.
The specification of the language does not define which comment relates
to which schema element (table or column). Moreover, the SQL'92 standard
defines comments as just another separator (like whitespace), which, as
you said, the lexer normally sends to the hidden channel.
Therefore I don't want to define explicitly in my grammar where
comments about a given identifier should appear (that would narrow the
SQL standard). Instead I would like to cache (somehow) the comments and
the identifiers of schema elements within rule actions and then apply
some kind of heuristic, for instance:
1. If a comment is between <table_definition>s, associate it with the
   following <table_definition>, not the previous one.
2. If a comment is inside a <table_definition>, then:
   (a) if the comment is on any line of a <column_definition>,
       associate it with the <column_name> value of that
       <column_definition> (a <column_definition> can span more
       than one line);
   (b) otherwise, i.e. if the comment is on a separate line between
       two <column_definition>s, associate it with the <column_name>
       value of the following <column_definition>, not the previous one.
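
To make the heuristic concrete, here is a minimal sketch in plain Python
rather than in grammar actions. It assumes that the first and last line
of every <table_definition> and <column_definition> have already been
recorded, and that the line of every comment is known. The class and
function names (Column, Table, Comment, owner_of) are hypothetical and
are not part of ANTLR. A comment that follows the last column inside a
table is not covered by the rules above, so the sketch leaves it
unassociated.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:
    name: str
    first_line: int  # line where the <column_definition> starts
    last_line: int   # line where it ends (it may span several lines)


@dataclass
class Table:
    name: str
    first_line: int
    last_line: int
    columns: List[Column] = field(default_factory=list)


@dataclass
class Comment:
    line: int
    text: str


def owner_of(comment: Comment, tables: List[Table]) -> Optional[str]:
    """Return the name of the element a comment belongs to, or None.

    Assumes tables, and the columns inside each table, are sorted by line.
    """
    for table in tables:
        if comment.line < table.first_line:
            # Rule 1: a comment between tables goes to the following table.
            return table.name
        if comment.line <= table.last_line:
            # Rule 2: the comment is inside this table.
            for col in table.columns:
                if comment.line <= col.last_line:
                    # Either the comment is on a line of this column (2a),
                    # or it sits before it and this is the following
                    # column (2b).
                    return f"{table.name}.{col.name}"
            return None  # after the last column: not covered by the rules
    return None  # after the last table: nothing follows


# Hypothetical positions for a script such as:
#   1  -- customers of the shop
#   2  CREATE TABLE customer (
#   3    id INTEGER, -- primary key
#   4    -- full legal name
#   5    name
#   6      VARCHAR(80)
#   7  );
sample_tables = [
    Table("customer", 2, 7, [Column("id", 3, 3), Column("name", 5, 6)]),
]
```

With these positions, the comment on line 1 goes to the table, the one
on line 3 to column id, and the one on line 4 to column name.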
That would require caching the line numbers of the comments found by the
lexer and passing them to the parser, wouldn't it?
Or is there another way to do it?
Maciej