[antlr-interest] Re: special c/c++ parsing

lgcraymer lgc at mail1.jpl.nasa.gov
Wed May 14 13:25:12 PDT 2003


Ter--

Your algorithm (with improvements) is directly expressible in ANTLR.  
Let's say that you want to match rule1 or rule2 or skip a token:

filter :
        (       (rule1) => rule1
        |       (rule2) => rule2
        |        .!
        )+
        ;

This wouldn't fit the case at hand, but might be a good 
quick-and-dirty approach to extracting interesting data fragments 
from text.
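
For anyone who wants to try the idea outside ANTLR, here is a rough 
Python sketch of the same try-each-rule-or-skip-a-token loop.  The 
names (tokenize, match_call, match_assignment, filter_tokens) and the 
two toy rules are made up for illustration; they are not ANTLR APIs 
and are nowhere near a C++ grammar.  Each rule is tried once and its 
result reused, where the ANTLR predicate form above matches 
speculatively and then matches again.

```python
import re

IDENT = r"[A-Za-z_]\w*"

def tokenize(text):
    """Toy lexer: identifiers, integers, and single punctuation characters."""
    return re.findall(IDENT + r"|\d+|\S", text)

def match_call(tokens, i):
    """Hypothetical rule1: IDENT '(' ... ')' with balanced parentheses.
    Returns the index just past the match, or None if nothing matches at i."""
    if i + 1 >= len(tokens) or not re.fullmatch(IDENT, tokens[i]):
        return None
    if tokens[i + 1] != "(":
        return None
    depth = 0
    for j in range(i + 1, len(tokens)):
        if tokens[j] == "(":
            depth += 1
        elif tokens[j] == ")":
            depth -= 1
            if depth == 0:
                return j + 1
    return None  # unbalanced parentheses: treat as no match

def match_assignment(tokens, i):
    """Hypothetical rule2: IDENT '=' INT."""
    if i + 2 < len(tokens) and re.fullmatch(IDENT, tokens[i]) \
            and tokens[i + 1] == "=" and tokens[i + 2].isdigit():
        return i + 3
    return None

def filter_tokens(tokens, rules):
    """At each position try every rule in order; on success record the
    match and jump past it, otherwise skip one token (the '.!' alternative)."""
    matches = []
    i = 0
    while i < len(tokens):
        for name, rule in rules:
            end = rule(tokens, i)
            if end is not None:
                matches.append((name, tokens[i:end]))
                i = end
                break
        else:
            i += 1  # no rule matched here: drop this token and move on
    return matches

if __name__ == "__main__":
    toks = tokenize("x = 3 ; y + foo(1, foo(2, 0)) ;")
    rules = [("call", match_call), ("assignment", match_assignment)]
    for name, span in filter_tokens(toks, rules):
        print(name, "".join(span))
```

Note that the loop resumes after the end of each match, so the inner 
foo(2, 0) is swallowed by the outer call and never reported on its 
own.  That is one concrete reason this is only a quick-and-dirty 
extractor.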

For the case at hand, Joakim needs to recognize many of the C++ 
language features:
     Loops
     Other conditionals
     Expressions--  3 * foo()
     Nested function calls--  foo(1, foo(2, 0))

I hope that he can ignore indirect function calls-- (*foo)() would 
make counting messy, especially if foo could be reassigned.  Come to 
think of it, though, he may have exactly that problem--virtual methods 
are called through pointers.

--Loring

--- In antlr-interest at yahoogroups.com, Terence Parr <parrt at j...> wrote:
> 
> On Wednesday, May 14, 2003, at 11:41  AM, lgcraymer wrote:
> 
> > I'll echo Monty's comment.  Function calls can appear in enough places
> > (including complex expressions and argument lists to functions) that
> > it would be difficult to identify a subset grammar.  It is much easier
> > to prune, even when you are dealing with a language as cumbersome as
> > C++.
> 
> I've often wondered if something like the following (insanely slow) 
> approach would work:
> 
> 1. You provide a set of possible top-level match rules you are 
> interested in matching like expr and method.
> 
> 2. You provide a lexer that knows how to ignore comments and how to 
> identify all tokens that could be seen (not just ones you are 
> interested in).
> 
> 3. Start walking the input token-by-token, attempting to match one of
> the top-level rules starting at token i.  If an attempt fails, try
> another top-level rule.  Failing that, move to next token and try again.
> 
> This mirrors the naive string search algorithm done by freshman CS 
> students, but might actually work.  If you didn't care about speed, 
> just ease of building the translator, I wonder if this would work.  It
> sounds actually like a very simple TokenStream object :)
> 
> Anybody wanna comment on the cases where this would fail?
> 
> Ter
> --
> Co-founder, http://www.jguru.com
> Creator, ANTLR Parser Generator: http://www.antlr.org
> Co-founder, http://www.peerscope.com link sharing, pure-n-simple
> Lecturer in Comp. Sci., University of San Francisco


 
