[antlr-interest] Multipass Design Dilemma

Courtney Falk court at infiauto.com
Wed Apr 27 06:08:10 PDT 2011


All:

I've been banging my head against a design problem for over a week now, 
and I thought I'd finally ask for help.

I have a fuzzy parser in place that breaks a stream up into tokens based 
on whitespace and punctuation, preserving both.  All other characters 
are grouped together into tokens.  So: "Gratuitous reply?" might yield 
something like: "Gratuitous" WS "reply" QUESTION_MARK.

Here's my problem!  I want to then take all the secondary tokens (i.e. 
"Gratuitous" and "reply") and perform a second pass to see if these 
tokens match a second set of patterns.  I'm also building additional 
parsing (such as computing a value) into these secondary rules.  They 
could look like:

secondary_pattern : numeral | ordinal ;
numeral returns [int i] : 'two' { $i = 2; } ;
ordinal returns [int o] : 'second' { $o = 2; } ;

So the final result of "Second gratuitous reply?" could look like: 
ORDINAL WS "gratuitous" WS "reply" QUESTION_MARK.
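
To make the desired behavior concrete, here is a minimal sketch of the two 
passes in plain Java rather than ANTLR.  Everything in it is an assumption 
made for illustration: the TwoPassSketch class, the Token type, the WORD and 
PUNCT type names, the lookup tables standing in for the numeral and ordinal 
rules, and the case-insensitive matching.  It only shows the shape of the 
result, not how it would be wired into a generated lexer or parser.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class TwoPassSketch {
    // Hypothetical token: a type name, the original text, and an optional value.
    static final class Token {
        public final String type;
        public final String text;
        public final Integer value; // null unless the second pass computed one

        Token(String type, String text, Integer value) {
            this.type = type;
            this.text = text;
            this.value = value;
        }
    }

    // Hypothetical tables standing in for the numeral and ordinal rules.
    static final Map<String, Integer> NUMERALS = new HashMap<>();
    static final Map<String, Integer> ORDINALS = new HashMap<>();
    static {
        NUMERALS.put("two", 2);
        ORDINALS.put("second", 2);
    }

    // First pass: split on whitespace and punctuation, preserving both.
    // Assumption: anything that is not a letter, digit, or whitespace is punctuation.
    static List<Token> firstPass(String input) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            int start = i;
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                while (i < input.length() && Character.isWhitespace(input.charAt(i))) i++;
                out.add(new Token("WS", input.substring(start, i), null));
            } else if (Character.isLetterOrDigit(c)) {
                while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) i++;
                out.add(new Token("WORD", input.substring(start, i), null));
            } else {
                i++; // one punctuation character per token
                String type = (c == '?') ? "QUESTION_MARK" : "PUNCT";
                out.add(new Token(type, String.valueOf(c), null));
            }
        }
        return out;
    }

    // Second pass: retype WORD tokens that match a secondary pattern and
    // attach the computed value; every other token passes through unchanged.
    static List<Token> secondPass(List<Token> tokens) {
        List<Token> out = new ArrayList<>();
        for (Token t : tokens) {
            String key = t.text.toLowerCase(Locale.ROOT);
            if (t.type.equals("WORD") && NUMERALS.containsKey(key)) {
                out.add(new Token("NUMERAL", t.text, NUMERALS.get(key)));
            } else if (t.type.equals("WORD") && ORDINALS.containsKey(key)) {
                out.add(new Token("ORDINAL", t.text, ORDINALS.get(key)));
            } else {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (Token t : secondPass(firstPass("Second gratuitous reply?"))) {
            System.out.println(t.type + " \"" + t.text + "\"");
        }
    }
}
```

For the input "Second gratuitous reply?" the sketch yields the token types 
ORDINAL WS WORD WS WORD QUESTION_MARK, with the first token carrying the 
value 2 because "second" is matched by the ordinal table.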

Thoughts?  Suggestions?


Courtney Falk
court at infiauto.com

