[antlr-interest] [C Target] Filter lexer...

Garry Iglesias garry.iglesias at gmail.com
Tue Aug 5 16:02:20 PDT 2008


Hi Jim,

  Thanks for your answer. Sorry, I'm not used to mailing lists and don't
know how to reply below the previous messages. Anyway, thanks for your
answer about the multiple return parameters.

  Now I have (again :) ) some other suggestions/remarks...

  * I use a lot of 'filter lexer grammars' and I had trouble finding the
information you just sent about overriding the error message. So I tried
your snippet, replacing the parser with the lexer:

  The problem is that the macro 'RECOGNIZER' expands to 'ctx->pLexer->rec'
whereas it should expand to 'lexCtx->pLexer->rec' (or maybe the local
'lexCtx' could be renamed 'ctx'?). This happens in
  ANTLR3_API pMYLEXER MYLEXERNewSSD
 (pANTLR3_INPUT_STREAM instream, pANTLR3_RECOGNIZER_SHARED_STATE state)
{ ... }

  A separate macro for the lexer would be fine, but since the macro's scope
is the compilation unit (it is defined at the top of the .c file), it could
be the same for lexer and parser (and the recognizer is the same component
anyway, so using the same macro makes sense...).

  Anyway for now I just don't use the macro and I'm doing it 'by hand'...
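
  To make the problem concrete, here is a minimal sketch. The Mock* types
are hypothetical stand-ins and NOT the real ANTLR3 C runtime structures;
only the macro body 'ctx->pLexer->rec' is taken from the expansion quoted
above. It shows the 'by hand' workaround, plus a second option (a local
alias named 'ctx') that lets the macro resolve:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the runtime structures (assumption: the real
 * ones are richer; only the pointer chain matters for this illustration). */
typedef struct MockRecognizer { int errorOverridden; } MockRecognizer;
typedef struct MockLexer      { MockRecognizer *rec; } MockLexer;
typedef struct MockLexerCtx   { MockLexer *pLexer;   } MockLexerCtx;

/* Same shape as the expansion described above: it hard-codes the name 'ctx',
 * so it only compiles where a variable called 'ctx' is in scope. */
#define RECOGNIZER ctx->pLexer->rec

/* Workaround 1: skip the macro and spell the access out "by hand". */
static void override_by_hand(MockLexerCtx *lexCtx)
{
    assert(lexCtx != NULL && lexCtx->pLexer != NULL);
    lexCtx->pLexer->rec->errorOverridden = 1;
}

/* Workaround 2: introduce a local alias with the name the macro expects. */
static void override_with_alias(MockLexerCtx *lexCtx)
{
    MockLexerCtx *ctx = lexCtx;   /* alias so that RECOGNIZER resolves */
    assert(ctx != NULL && ctx->pLexer != NULL);
    RECOGNIZER->errorOverridden = 1;
}
```

  Both functions end up touching the same recognizer; the alias version
just keeps the macro usable without editing the generated code.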

  * I also just noticed that the error messages ANTLR returns have wrong
line numbers, and I suspect the multi-line preprocessor macro definitions in
my .g that use '\' (the C preprocessor line-continuation character). I may
be wrong because I haven't tested that case separately, but it might be the
reason for the wrong line numbers I see (it's visually credible...).

  * About ANTLR: the ANTLR compiler tries to replace '$identifiers' even
when they are inside comments (OK, comments are target-language specific, so
it may be hard for the generic part of ANTLR to know how to 'avoid target
language comments', but if that information could be used one way or
another, that would be helpful too...).
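
  As an illustration of what 'avoiding target language comments' could look
like for the C target, here is a rough sketch. The function name is
hypothetical and this is not how ANTLR itself is implemented; it only counts
'$' characters in a piece of action text while skipping C block and line
comments. It deliberately ignores string and character literals, which a
real implementation would also have to handle:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count '$' characters in C action text, ignoring
 * anything inside block comments and line comments.
 * Simplification: string/char literals are NOT treated specially. */
static size_t count_dollar_refs(const char *s)
{
    size_t count = 0;
    assert(s != NULL);
    while (*s) {
        if (s[0] == '/' && s[1] == '*') {            /* block comment */
            s += 2;
            while (*s && !(s[0] == '*' && s[1] == '/'))
                s++;
            if (*s)
                s += 2;                              /* skip the closing delimiter */
        } else if (s[0] == '/' && s[1] == '/') {     /* line comment */
            while (*s && *s != '\n')
                s++;
        } else {
            if (*s == '$')
                count++;
            s++;
        }
    }
    return count;
}
```

  A translator built along these lines would substitute only the references
it finds outside comments and copy the comments through unchanged.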

  * General remarks about test cases: from my point of view, there is a
lack of 'lexer filter' samples. Also, some sample grammars don't parse what
they should, because the 'input data' doesn't exercise the whole grammar...
I'm not blocked anymore, but I spent a lot of time trying to use the
official:
   C_COMMENT : '/*' .* '*/' which doesn't work (greedy or not) and which is
also used in the C.g grammar. I thought it *should* work until I saw that
the input data doesn't contain any multi-line comments (as it is itself the
output of the preprocessor, and comments have already been removed...)... In
the end I just used a good old regexp rule from the Dragon Book era and it
works well... It's just that I spent a lot of time scratching out my last
hairs because I trusted the samples and the documentation with too much
faith :)...
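
  For reference, the classic textbook pattern for a C comment is commonly
written as /\*([^*]|\*+[^*/])*\*+/ in regexp notation. The sketch below is a
hand-written C state machine that accepts the same strings; it is an
illustration of that pattern, not necessarily the exact rule used in the
grammar, and the function name is made up for the example:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: if 's' starts with a complete C block comment,
 * return its length in characters (delimiters included); otherwise 0.
 * The comment ends at the FIRST closing delimiter, as in C itself. */
static size_t match_c_comment(const char *s)
{
    const char *p = s;
    int seen_star = 0;            /* 1 if the previous character was '*' */

    assert(s != NULL);
    if (p[0] != '/' || p[1] != '*')
        return 0;                 /* no opening delimiter */

    for (p += 2; *p; p++) {
        if (seen_star && *p == '/')
            return (size_t)(p - s) + 1;   /* closing delimiter found */
        seen_star = (*p == '*');
    }
    return 0;                     /* unterminated comment */
}
```

  Note that newlines need no special handling here, which is exactly the
multi-line case the sample input never exercised.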

  By the way, I post these remarks because I USE this nice tool, and I
use it because I like it, so sorry for showing up again only to
complain... :).

  I wish everybody a nice day. Because despite the remarks, ANTLR makes my
day more beautiful every day ;).

  Thanks all for the nice job !

Regards

Garry.