[antlr-interest] Re: special c/c++ parsing
lgcraymer
lgc at mail1.jpl.nasa.gov
Wed May 14 13:25:12 PDT 2003
Ter--
Your algorithm (with improvements) is directly expressible in ANTLR.
Let's say that you want to match rule1 or rule2 or skip a token:
filter :
    (   (rule1) => rule1
    |   (rule2) => rule2
    |   .!
    )+
    ;
This wouldn't fit the case at hand, but might be a good
quick-and-dirty approach to extracting interesting data fragments
from text.
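
For readers without ANTLR to hand, here is a rough Python sketch of the same match-or-skip loop (the three-step scan from the quoted message below). Everything in it is an assumption made for illustration: the rule interface (a function taking the token list and a start index, returning the end index or None), the names `scan` and `match_call`, and the toy "call" rule, which only balances parentheses and would happily treat `if (x)` as a call. It is not ANTLR's API and not anyone's actual implementation.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical rule interface: given the tokens and a start index, return the
# index just past the match, or None if the rule does not match there.
Rule = Callable[[List[str], int], Optional[int]]


def match_call(tokens: List[str], i: int) -> Optional[int]:
    """Toy rule: IDENT '(' ... ')' with balanced parentheses."""
    if i + 1 >= len(tokens):
        return None
    if not tokens[i].isidentifier() or tokens[i + 1] != "(":
        return None
    depth = 0
    for j in range(i + 1, len(tokens)):
        if tokens[j] == "(":
            depth += 1
        elif tokens[j] == ")":
            depth -= 1
            if depth == 0:
                return j + 1
    return None  # ran out of input before the parentheses balanced


def scan(tokens: List[str],
         rules: List[Tuple[str, Rule]]) -> List[Tuple[str, int, int]]:
    """Try each rule at token i; on success consume the match, else skip one token."""
    matches = []
    i = 0
    while i < len(tokens):
        for name, rule in rules:
            end = rule(tokens, i)
            if end is not None and end > i:
                matches.append((name, i, end))
                i = end
                break
        else:
            i += 1  # no rule matched here: skip this token (the ".!" alternative)
    return matches


tokens = ["x", "=", "foo", "(", "1", ",", "foo", "(", "2", ",", "0", ")", ")", ";"]
result = scan(tokens, [("call", match_call)])
```

On this input the scan reports a single match covering the outer foo(...) and never sees the nested call, because the whole match is consumed before scanning resumes. That is one concrete way the approach falls short for counting calls.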
For the case at hand, Joakim needs to recognize many of the C++
language features:
Loops
Other Conditionals
Expressions-- 3 * foo()
nested function calls-- foo(1, foo(2, 0))
I hope that he can ignore indirect function calls-- (*foo)() would
make counting messy, especially if foo could be reassigned. Come to
think of it, though, he may have exactly that problem--virtual methods
are called through pointers.
--Loring
--- In antlr-interest at yahoogroups.com, Terence Parr <parrt at j...> wrote:
>
> On Wednesday, May 14, 2003, at 11:41 AM, lgcraymer wrote:
>
> > I'll echo Monty's comment. Function calls can appear in enough places
> > (including complex expressions and argument lists to functions) that
> > it would be difficult to identify a subset grammar. It is much easier
> > to prune, even when you are dealing with a language as cumbersome as
> > C++.
>
> I've often wondered if something like the following (insanely slow)
> approach would work:
>
> 1. You provide a set of possible top-level match rules you are
> interested in matching like expr and method.
>
> 2. You provide a lexer that knows how to ignore comments and how to
> identify all tokens that could be seen (not just ones you are
> interested in).
>
> 3. Start walking the input token-by-token, attempting to match one of
> the top-level rules starting at token i. If an attempt fails, try
> another top-level rule. Failing that, move to next token and try again.
>
> This mirrors the naive string search algorithm done by freshman CS
> students, but might actually work. If you didn't care about speed,
> just ease of building the translator, I wonder if this would work. It
> sounds actually like a very simple TokenStream object :)
>
> Anybody wanna comment on the cases where this would fail?
>
> Ter
> --
> Co-founder, http://www.jguru.com
> Creator, ANTLR Parser Generator: http://www.antlr.org
> Co-founder, http://www.peerscope.com link sharing, pure-n-simple
> Lecturer in Comp. Sci., University of San Francisco