[antlr-interest] Fast Token-Recognition (Possible in ANTLR?)
Timmy Turner
timm.turn at gmail.com
Sun Feb 24 02:08:34 PST 2008
Thanks for your replies!
> IMHO it's much simpler to
> use a standalone component implementing fast search for multiple strings
> (there's a lot of them available, in various languages) than ANTLR.
Hm... I couldn't find any trie implementations for Java that would fit
my needs, maybe you could provide me with some links?
2008/2/24, Vaclav Barta <vbar at comp.cz>:
> On Sunday 24 February 2008 00:44:32 Timmy Turner wrote:
> > I want to find occurences of certain strings in a rather large text.
> > Like finding {"sdf","foo","bar"} in "asdffoosomerandomtextbar".
> >
> > Of course I could try myLongString.indexOf(myListOfKeywords.get(i))
> > but I was hoping to find a faster way. Then I found this here:
> > http://en.wikipedia.org/wiki/Trie were I couldn't help it but think of
> > ANTLR (and that chapter about the maze in Terence Parr's book).
> >
> > So I thought of creating a grammar consiting just of tokens, and
> > letting ANTLR to generate a lexer for it which then I could use to
> > spot these strings.
> >
> > The question remaining is just wther ANTLR really uses tries - or
> > maybe there's an even better approach?
>
> AFAIK it does (it's really the standard way), but it might be too heavy for
> the task you describe. Consider the case when you're looking for
> {"ab","abc"} - do you want to find the "ab" in "abc"? ANTLR by default doesn't
> (because programming languages generally don't), and while it might be
> possible to recognize common prefixes with ANTLR, IMHO it's much simpler to
> use a standalone component implementing fast search for multiple strings
> (there's a lot of them available, in various languages) than ANTLR.
>
> Bye
> Vasek
>
> --
> http://www.mangrove.cz/
> Open Source integration
>
More information about the antlr-interest
mailing list