[antlr-interest] Fuzzy parsing ('filter' option)

Jean-Christophe Bach jeanchristophe.bach at inria.fr
Thu Aug 26 01:39:50 PDT 2010


Hi list,

Does anyone have any idea about this problem?

We were using the 'filter' option in the old parser, and I wanted to use it here
since it seemed a correct and easy way to proceed. But maybe this is not the right
way to keep the host code and parse only our specific constructs.
How do you handle this type of problem? How do you build your tree when you
want to leave the existing host code unmodified and transform only your own
specific constructs, without parsing the whole host language?

I think that I am missing something very simple and that the problem is a classic
one, but I have difficulty pinpointing it.
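
To make the goal concrete, here is a minimal sketch in plain Java (no ANTLR
involved) of the behaviour I am after: host code is accumulated untouched in a
buffer and flushed into a "host block" node whenever one of our own constructs
is found. The delimiters "%[" and "]%", the class FuzzySplit and the Node type
are hypothetical names chosen only for this illustration; they are not part of
our real grammar.

```java
import java.util.ArrayList;
import java.util.List;

class FuzzySplit {
    /** One tree node: either untouched host code or one of our own constructs. */
    static final class Node {
        final boolean host;   // true = host code, false = our construct
        final String text;
        Node(boolean host, String text) { this.host = host; this.text = text; }
    }

    // Hypothetical delimiters marking "our" constructs inside the host code.
    static final String OPEN = "%[";
    static final String CLOSE = "]%";

    static List<Node> split(String input) {
        List<Node> nodes = new ArrayList<>();
        // Plays the same role as a lexer-side buffer that collects host characters.
        StringBuilder target = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            // Only treat OPEN as a construct start if a matching CLOSE exists.
            int end = input.startsWith(OPEN, i)
                    ? input.indexOf(CLOSE, i + OPEN.length())
                    : -1;
            if (end >= 0) {
                // Flush the pending host code into its own node, then clear the buffer.
                if (target.length() > 0) {
                    nodes.add(new Node(true, target.toString()));
                    target.setLength(0);
                }
                nodes.add(new Node(false, input.substring(i + OPEN.length(), end)));
                i = end + CLOSE.length();
            } else {
                // Anything that is not one of our constructs is kept verbatim.
                target.append(input.charAt(i));
                i++;
            }
        }
        if (target.length() > 0) {
            nodes.add(new Node(true, target.toString()));
        }
        return nodes;
    }
}
```

For example, FuzzySplit.split("int x = 1; %[ match(x) ]% return x;") gives
three nodes: a host block, one construct, and a second host block. An
unterminated "%[" is simply kept as host code. What I do not see is how to get
the equivalent of this from a combined antlr3 grammar.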

Thanks in advance,

JC

PS: my previous emails, as a reminder of the problem, are just below.


* Jean-Christophe Bach <jeanchristophe.bach at inria.fr> [23.08.2010. @11:35:28 +0200]:
> Hello antlr users,
> 
> I am a bit stuck on the problem of fuzzy parsing and the filter option. I looked at the fuzzy example,
> but I am not sure it helps me.
> 
> How do you usually handle it? How do you keep all the host code without any
> modification when you write a language that you embed into Java (or any other
> language)? I thought I could create a node containing a "big string" of host
> code in my tree, but it does not seem to be so obvious/easy. The parser has
> difficulty "recognizing unknown constructs" (everything but my specific
> constructs == host code). Have you ever had this problem?
> 
> Thanks in advance,
> 
> JC
> 
> 
> * Jean-Christophe Bach <jeanchristophe.bach at inria.fr> [11.08.2010. @16:54:30 +0200]:
> > I have a combined grammar and I would like to do fuzzy parsing. I know that the
> > filter option is designed for the lexer part, but is there any way to use this
> > option while keeping my combined grammar?
> > In our old parser (antlr2), we used this option like this (in a combined
> > grammar):
> > 
> > class HostParser extends Parser;
> > ...
> > // returns the current goal language code
> > private String getCode() {
> >   String result = targetlexer.target.toString();
> >   targetlexer.clearTarget();
> >   return result;
> > }
> > ...
> > and parser rules
> > 
> > ...
> > class HostLexer extends Lexer;
> > options {
> >   ...
> >   filter=TARGET;
> >   ...
> >  }
> > 
> > public StringBuilder target = new StringBuilder("");
> > // clear the buffer
> > public void clearTarget() { target.delete(0,target.length()); }
> > 
> > ...
> > 
> > protected
> > TARGET : (.) { target.append($getText); } ;
> > 
> > 
> > It was very easy to do fuzzy parsing and to use what was parsed: when needed,
> > we only had to call the getCode() function to get the code of a HostBlock and
> > put it in a node of our tree. But with antlr3, I am not sure I understand how
> > to proceed. I tried to give the option 'filter=true;' directly to my fragment
> > TARGET rule, but it does not seem to be a good idea (I get errors).
> > Would anyone have any idea?
> > 
> > Regards,
> > 
> > JC

