[antlr-interest] onerous lex pattern

Jeff Barnes jbarnesweb at yahoo.com
Sat Jan 21 19:28:58 PST 2006


Hi all,

I'm writing a parser for Rose .mdl files. It has been mostly
straightforward, but one thing is getting past me...

Rose serializes strings that have a quote or a newline
in them by starting them at column 1 and beginning
each line of the string with a '|'. So my lexer rule
looks like this:

MULTILINESTRING:
    ({inputState.guessing != 0 || getColumn() == 1}? '|'!)
    ( options { greedy = false; }:
        ~('\r' | '\n')
    )*
    (NL)+
;

NL:
    (
        '\r' 
    |   '\n' {newline();}
    ) { _ttype = Token.SKIP; }
;

The thing is, I don't want a multi-line string to be
split across more than one token. It's only one string
that happens to span many lines. But right now the lexer
emits one token per line, so my parser rule looks like
this:

value
{
}
:
        list
    |   object
    |   STRING
    |   (MULTILINESTRING)+ 
    |   INT 
    |   DOUBLE 
    |   BOOLEAN
    |   REFERENCE 
    |   valueSet 
    |   point 
;

I want to get rid of the '+' on MULTILINESTRING, so that
the whole string arrives as a single token.
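
To make the goal concrete, here is a minimal Java sketch of the folding I'm after, written outside ANTLR. It is only an illustration of the desired result, not a lexer rule: the class name PipeStringFolder and the method fold are made up for this example, and joining the pieces with '\n' is an assumption about how the line breaks should map back into the string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the grammar: folds each run of
// '|'-prefixed lines into one logical string.
class PipeStringFolder {

    // Returns the input lines with every run of consecutive lines that
    // start with '|' in column 1 replaced by a single entry. The '|'
    // markers are dropped and the pieces are joined with '\n' (an
    // assumption). All other lines pass through unchanged.
    static List<String> fold(List<String> lines) {
        List<String> out = new ArrayList<String>();
        StringBuilder current = null; // non-null while inside a run
        for (String line : lines) {
            if (line.startsWith("|")) {
                if (current == null) {
                    current = new StringBuilder();
                } else {
                    current.append('\n');
                }
                current.append(line.substring(1));
            } else {
                if (current != null) {
                    out.add(current.toString());
                    current = null;
                }
                out.add(line);
            }
        }
        if (current != null) {
            out.add(current.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<String>();
        input.add("|first line");
        input.add("|second \"quoted\" line");
        input.add("42");
        // Two entries: the folded string, then "42".
        System.out.println(fold(input).size());
    }
}
```

In token terms, each entry that fold produces from a run of '|' lines is what I'd like a single MULTILINESTRING token to carry.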

Any help appreciated.

Jeff



