[antlr-interest] Identifying certain patterns and capturing everything else

Nick Vincent nick at vtype.com
Sat Mar 6 02:31:39 PST 2010


Hi,

Apologies in advance for the really basic question. I've been working on
a CSS preprocessor. It isn't meant to be a fully validating parser; in
places it acts almost like a filter, outputting most data verbatim but
performing calculations where they exist. The structural parsing works,
but when I parse a property value I want to recognise mathematical
expressions within the value and capture everything else as a literal.
I'm having trouble working out how to achieve this. I think the following
trivial example illustrates what I'm trying to do:

grammar trivialambig;

propertyvalue
: (expr | anything)* EOF
;

expr
: NUM '*'
;

anything
: .
;

NUM: '0'..'9';
CHAR: 'a'..'z';


With an input stream of "123", the "anything" rule is disabled as ambiguous
and thus never gets a chance to match. Where "expr" fails to parse, I'd like
"anything" to act as a fallback, capture a single character, and then repeat
the process. Is there a way of achieving this precedence? I have tried using
syntactic predicates as the book suggests:

propertyvalue
: ((expr)=>expr | anything)* EOF
;

but this produces a mismatched token exception in the ANTLRWorks interpreter,
and it won't compile for debugging because a RecognitionException is caught
where it's never thrown. Now I'm a bit stuck. All of the other solutions
I've seen to this problem use a PEG-based parser that considers whitespace,
but I'm sure this must be possible in ANTLR somehow!
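
To make the fallback I'm after concrete, here is a rough sketch in plain
Python rather than ANTLR. It only illustrates the intended behaviour and is
not a grammar fix: the name split_value and the regular expression standing
in for "expr" are made up for this example.

```python
import re

# Stand-in for the toy rule  expr : NUM '*' ;
# (a single digit followed by '*'). This regex is an assumption made
# for illustration, not something generated by ANTLR.
EXPR = re.compile(r"[0-9]\*")


def split_value(text):
    """Split text into ("expr", s) and ("literal", s) pieces.

    At each position, try "expr" first; if it does not match, fall back
    to capturing exactly one character as a literal, then repeat.
    """
    pieces = []
    pos = 0
    while pos < len(text):
        m = EXPR.match(text, pos)
        if m:
            # "expr" wins whenever it matches at the current position.
            pieces.append(("expr", m.group()))
            pos = m.end()
        else:
            # Fallback: consume a single character verbatim.
            pieces.append(("literal", text[pos]))
            pos += 1
    return pieces
```

With this, "123" comes out as three literals, while "a2*b" comes out as a
literal, an expr, and a literal, which is the precedence I'd like the
grammar to give me.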

Any help is much appreciated,

Thanks,

Nick

