[antlr-interest] [C Target] Filter lexer...

Tue Aug 5 19:35:23 PDT 2008

On Tue, 2008-08-05 at 19:02 -0400, Garry Iglesias wrote:
> Hi Jim,
>  
>   Thanks for your answer. Sorry, I'm not used to mailing lists and
> don't know how to reply below the previous messages. Anyway, thanks
> for your answer about the multiple return parameters.
>  
>   Now I have (again :) ) other suggestions/remarks....
>  
>   * I use a lot of 'filter lexer grammars' and I had trouble finding
> the information you just sent about overriding the error message. So I
> tried your snippet, replacing parser with lexer:
>  
>   The problem is that the macro 'RECOGNIZER' expands to
> 'ctx->pLexer->rec' whereas it should expand to
> 'lexCtx->pLexer->rec' (or perhaps the local 'lexCtx' could be renamed
> 'ctx'?).

Hmm - I think that everything should be ctx now - maybe this was missed
in filtering lexers... ah yes, it is because the constructor uses a
local variable and it has been called lexCtx. I will change this to ctx
so that it is consistent everywhere... fixed.
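
To illustrate why the variable name matters, here is a minimal self-contained
sketch. The struct types and function names below are made-up stand-ins, not
the real ANTLR3 C runtime definitions; only the macro body
'ctx->pLexer->rec' is taken from the message above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the runtime types (NOT the real ANTLR3 structs). */
typedef struct Recognizer { int errorCount; } Recognizer;
typedef struct Lexer      { Recognizer *rec; } Lexer;
typedef struct MyLexerCtx { Lexer *pLexer;   } MyLexerCtx;

/* The macro hard-codes the identifier `ctx`, so it only compiles in a
 * scope where a variable with exactly that name exists. */
#define RECOGNIZER ctx->pLexer->rec

/* Compiles: the parameter is named ctx, so the macro resolves. */
static void resetErrors(MyLexerCtx *ctx)
{
    assert(ctx != NULL && ctx->pLexer != NULL);
    RECOGNIZER->errorCount = 0;
}

/* Here the parameter is named lexCtx. Writing RECOGNIZER in this function
 * would not compile (`ctx` is undeclared), so the access is spelled out
 * by hand, which is the workaround described in the quoted message. */
static void bumpErrors(MyLexerCtx *lexCtx)
{
    assert(lexCtx != NULL && lexCtx->pLexer != NULL);
    lexCtx->pLexer->rec->errorCount++;
}
```

Renaming the local to ctx, as in the fix, lets the same macro be used in
every function of the compilation unit.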

By the way, rather than override error messages in a lexer, it is best
to construct it so that it cannot throw errors. For example, have
BADCHAR : . ;  as the last rule. In a filtering lexer this should not be
necessary, of course.

>  This happens in:
>   ANTLR3_API pMYLEXER MYLEXERNewSSD         
>  (pANTLR3_INPUT_STREAM instream, pANTLR3_RECOGNIZER_SHARED_STATE
> state) { ... }
>  
>   Another macro for the lexer would be all right, but as the macro
> scope is the 'compilation unit' (defined at the top of the .c file) it
> can be the same for lexer and parser (and the recognizer is the same
> component anyway, so using the same macro makes sense...).
>  
>   Anyway for now I just don't use the macro and I'm doing it 'by
> hand'...
>  
>   * Also, I just noticed that the error messages ANTLR returns have
> wrong line numbers, and I suspect the multiline preprocessor macro
> definitions in my .g that use '\' (the C preprocessor line
> continuation character). I may be wrong because I haven't tested the
> case separately, but it might be a reason for the wrong line numbers I
> see (it's visually credible...).

The lexer should auto-increment the line number when it sees '\n' (by
default) - you can change the trigger character or, if EOL is something
more complicated, you can track it with your own counter. It could be
your multiline lexing comment, but I don't think so. I can check this
though.
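
As a rough sketch of the "own counter" idea (the helper name lineAt and its
signature are made up for illustration and are not part of the ANTLR
runtime), counting a configurable end-of-line character up to a given offset
could look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the 1-based line number of byte offset
 * `pos` in `text`, bumping the count once for every `eol` character that
 * occurs before `pos`. */
static unsigned lineAt(const char *text, size_t pos, char eol)
{
    unsigned line = 1;
    size_t i;

    assert(text != NULL);
    for (i = 0; i < pos && text[i] != '\0'; i++) {
        if (text[i] == eol) {
            line++;
        }
    }
    return line;
}
```

Note that a backslash-continued macro line still ends in a real '\n', so a
counter like this one advances on it just as it does on any other line.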

>   
>   * About ANTLR: the ANTLR compiler tries to replace the
> '$identifiers' even when they are inside comments (OK, it's target
> code specific, so maybe it's hard for the generic part of ANTLR to
> know how to 'avoid target language comments', but if that information
> could be used in one way or another, that would be helpful too...).

ANTLR does not know what comments are for the target language, hence it
just replaces any $ reference anywhere. I think you can write \$ to
prevent the substitution.
>  
>   * General remarks about test cases: from my point of view, there is
> a lack of 'filter lexer' samples. Also, some sample grammars don't
> parse what they should, because the 'input data' doesn't exercise the
> whole grammar... I'm not blocked anymore, but I spent a lot of time
> trying to use the official:
>    C_COMMENT : '/*' .* '*/' that doesn't work (greedy or not) and
> which is also used in the C.g grammar. I thought it *should* work
> until I saw that the input data doesn't contain any multiline comment
> (as it is itself the result of preprocessing, so comments have
> already been removed)... In the end I just used a good old regexp
> rule from the Dragon Book era and it works well... It's just that I
> spent a lot of time scratching my last hair because I trusted the
> samples and the documentation with too much faith :)...

I think you have something not installed correctly - all the examples
work as advertised if you have the correct runtime and the correct ANTLR
jar AND the correct versions of the examples for the correct version of
ANTLR ;-). 3.1 will be released very shortly and all this will become
neat and tidy again unless you are trying to use the next beta :-)
>  
>   By the way, I post these remarks because I USE this nice tool,
> and I use it because I like it, so sorry for showing up again only
> to complain... :).

No problem. No one has any problem with people pointing out bugs/errors
or suggesting improvements. It is only the WAY that they are phrased
that matters. If people play nice then Je m'en fous.

So, first make sure that you have the current version of all the jars
(best to use 3.1 beta 2 now, to be honest, as it is so close to release).
You can update to the released versions in a few days' time (hopefully).

Jim