[antlr-interest] Multipass Design Dilemma
Kevin J. Cummings
cummings at kjchome.homeip.net
Wed Apr 27 07:14:56 PDT 2011
On 04/27/2011 09:08 AM, Courtney Falk wrote:
> All:
>
> I've been banging my head against a design problem for over a week now,
> and I thought I'd finally ask for help.
>
> I have a fuzzy parser in place that breaks a stream up into tokens based
> on whitespace and punctuation, preserving both. All other characters
> are grouped together into tokens. So: "Gratuitous reply?" might yield
> something like: "Gratuitous" WS "reply" QUESTION_MARK.
>
> Here's my problem! I want to then take all the secondary tokens (i.e.
> "Gratuitous" and "reply") and perform a second pass to see if these
> tokens match a second set of patterns. I'm building additional parsing
> into these secondary rules. They could look like:
>
> secondary_pattern : numeral | ordinal;
> numeral returns [int i] : 'two' { $i = 2; } ;
> ordinal returns [int o] : 'second' { $o = 2; } ;
>
> So the final result of "Second gratuitous reply?" could look like:
> ORDINAL WS "gratuitous" WS "reply" QUESTION_MARK.
>
> Thoughts? Suggestions?
Are you doing a second pass over the original input text, or are you
writing a tree grammar to walk the AST that you generated from your
first pass? In the latter case, your secondary rules should be
matching trees, not text.
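
For comparison, here is a rough sketch of a third possibility: running the
second pass over the token list produced by the first pass, rather than over
the raw text or an AST. It is plain Python, not ANTLR API. The Token class,
the WORD and PUNCT type names, and the NUMERALS/ORDINALS tables are all
made up for illustration and stand in for your real lexer and secondary
rules.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical lookup tables standing in for the secondary rules.
NUMERALS = {"one": 1, "two": 2, "three": 3}
ORDINALS = {"first": 1, "second": 2, "third": 3}

# Whitespace runs, word runs, and any other single character (punctuation).
TOKEN_RE = re.compile(r"(?P<WS>\s+)|(?P<WORD>\w+)|(?P<PUNCT>.)", re.DOTALL)


@dataclass
class Token:
    type: str
    text: str
    value: Optional[int] = None


def first_pass(text: str) -> List[Token]:
    """Fuzzy tokenizer: split on whitespace and punctuation, keeping both."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind == "PUNCT" and match.group() == "?":
            kind = "QUESTION_MARK"
        tokens.append(Token(kind, match.group()))
    return tokens


def second_pass(tokens: List[Token]) -> List[Token]:
    """Relabel WORD tokens that match a secondary pattern; keep the rest."""
    result = []
    for tok in tokens:
        key = tok.text.lower()
        if tok.type == "WORD" and key in NUMERALS:
            result.append(Token("NUMERAL", tok.text, NUMERALS[key]))
        elif tok.type == "WORD" and key in ORDINALS:
            result.append(Token("ORDINAL", tok.text, ORDINALS[key]))
        else:
            result.append(tok)
    return result
```

Calling second_pass(first_pass("Second gratuitous reply?")) relabels only the
first token and leaves whitespace and punctuation untouched. Whether this is
preferable to a tree grammar depends on how much structure the first pass
already builds.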
> Courtney Falk
> court at infiauto.com
--
Kevin J. Cummings
kjchome at verizon.net
cummings at kjchome.homeip.net
cummings at kjc386.framingham.ma.us
Registered Linux User #1232 (http://counter.li.org)