[antlr-interest] Re: Handling Lots of Keywords?
Thomas Brandon
tom at psy.unsw.edu.au
Mon Oct 6 20:28:39 PDT 2003
As long as you define them as actual keywords in the tokens section,
Antlr should scale OK (see
http://www.antlr.org/doc/metalang.html#TokensSection). Having 1000
separate rules for your keywords would probably be a rather large
performance hit due to large bitsets (depending on where they are
used, I guess). But if you declare them as keywords, all Antlr does is
add them to a Hashtable and test against it in the testLiterals routine.
However, it might be better to use your own checking code to avoid
having to put all the keywords in the grammar. If you maintain your
own Hashtable and use a semantic action like:
IDENT_OR_KEYWORD
    :   IDENT
        { if (isBrailleKeyword($getText)) $setType(BRAILLE_KEYWORD); }
    ;
where boolean isBrailleKeyword(String) is your function to check
against your hashtable. That way you just need to maintain your
hashtable and don't need to maintain keywords in your grammar. I did
something similar with the data stored in an XML file. That way you
can associate other info with the keywords all in one place. This
should scale as well as a Hashtable scales, which for only 1000 items
shouldn't be too bad.
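A minimal sketch of what the lookup side could look like, assuming the
keywords live in an in-memory map that also carries the associated info.
The class name BrailleKeywords, the describe method, and the two entries
are placeholders for illustration only (they are not real braille math
keywords), and the sketch uses generics, so it needs a newer Java than
the plain Hashtable mentioned above. In practice you would load the
table from your own file instead of hard-coding it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class: the name, entries, and descriptions are
// placeholders, not part of Antlr and not real braille math keywords.
final class BrailleKeywords {
    // Keyword text -> associated info (here just a description string).
    private static final Map<String, String> TABLE = new HashMap<>();

    static {
        // Placeholder entries; a real table would be loaded from a file.
        TABLE.put("kwone", "first placeholder symbol");
        TABLE.put("kwtwo", "second placeholder symbol");
    }

    private BrailleKeywords() {}

    // The check called from the lexer action: true if text is a keyword.
    static boolean isBrailleKeyword(String text) {
        return text != null && TABLE.containsKey(text);
    }

    // Returns the info stored for a keyword, or null if it is not one.
    static String describe(String text) {
        return TABLE.get(text);
    }
}
```

If this class sits outside the lexer, the action would call
BrailleKeywords.isBrailleKeyword($getText) rather than the bare
isBrailleKeyword($getText); either way the grammar never has to list
the keywords itself.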
Tom.
--- In antlr-interest at yahoogroups.com, "dotlessbraille"
<easjolly at i...> wrote:
> I am trying to analyze braille texts using the current US standard
> representation for braille math. Braille uses 63 characters (and
> the space). It is typically represented electronically with the 63
> ASCII codes corresponding to the small (xor capital) letters and all
> but 5 of the special characters so the input is well-defined.
>
> If the tokenization is treated as a lexical problem, there is the
> unusual feature that there are more than 1000 keywords, some with
> more than a dozen characters. (The keywords are mainly used to
> represent mathematical symbols by a notation more intuitive than
> Unicode character codes.)
>
> If any of you have ever dealt with this number of keywords, I'd be
> grateful for advice.
>
> Thanks,
> Susan