[antlr-interest] Lexer question

Johannes Luber jaluber at gmx.de
Mon Apr 9 03:15:24 PDT 2007


John Howard wrote:
> With Gavin Lambert input (Many thanks Gavin) I have moved my grammar
> forward, but still have an issue with one aspect.  I'm trying to parse
> tokens such as '53xx' '6xxx' '3334' and the following simple grammar
> works if I have token SHAPE defined, but if I use shapeDist I get a
> mismatch against ID for the first 'x'.  333x parses OK, but 33xx
> doesn't.  I can't use SHAPE, because that causes other problems with the
> grammar.  Is there any way I can get shapeDist to work?
> 
> Thanks,
> 
> John
> 
> // This works
> dist    :    '^' SHAPE
> ;
> 
> ID : ('a'..'z'|'A'..'Z'|'_')('a'..'z'|'A'..'Z'|'_'|'0'..'9')+;
> SHAPE     : (DIGIT (DIGIT|'*'|'x'|'X') (DIGIT|'*'|'x'|'X')
>             (DIGIT|'*'|'x'|'X'));
> DIGIT     : ('0'..'9')    ;
> WS  : (' '|'\r'|'\t'|'\n')+{$channel=HIDDEN;} ;
> 
> 
> 
> // This fails
> dist    :    '^' shapeDist
> ;
> 
> shapeDist
> : (DIGIT (DIGIT|'*'|'x'|'X') (DIGIT|'*'|'x'|'X') (DIGIT|'*'|'x'|'X'))
> ;
> 
> ID : ('a'..'z'|'A'..'Z'|'_')('a'..'z'|'A'..'Z'|'_'|'0'..'9')+;
> DIGIT     : ('0'..'9')    ;
> WS  : (' '|'\r'|'\t'|'\n')+{$channel=HIDDEN;} ;

I haven't tested my suggestions (without the whole grammar that may be
of little use anyway), but the problem may be a non-determinism or an
ambiguity as described on page 287 of the Beta Book. The difference
between SHAPE and shapeDist is that SHAPE is a lexer rule and shapeDist
is a parser rule. When using SHAPE, DIGIT may have to be a fragment rule.
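
A likely reading of the failure, offered as a guess rather than a tested
diagnosis: the lexer tokenizes the input without knowing which parser
rule is active. In '33xx' the two characters 'xx' satisfy ID (one letter
followed by at least one more character), so the parser receives
DIGIT DIGIT ID instead of the 'x' literals that shapeDist expects. In
'333x' the lone 'x' is too short for ID and is matched as the literal.
The untested sketch below shows the lexer-rule variant with DIGIT as a
fragment. It assumes that no parser rule refers to DIGIT directly, which
may not hold for the full grammar:

    // Untested sketch: SHAPE stays a lexer rule, DIGIT becomes a fragment.
    // A fragment rule never produces a token of its own, so the parser
    // can no longer reference DIGIT once this change is made.
    dist  : '^' SHAPE ;

    ID    : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')+ ;
    SHAPE : DIGIT (DIGIT|'*'|'x'|'X') (DIGIT|'*'|'x'|'X') (DIGIT|'*'|'x'|'X') ;
    fragment DIGIT : '0'..'9' ;
    WS    : (' '|'\r'|'\t'|'\n')+ {$channel=HIDDEN;} ;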

Three other things I've noticed in your grammar. First, ID doesn't
allow single-character identifiers, because you use + and not * on the
second group. This looks like an oversight to me. Second, you should
factor (DIGIT|'*'|'x'|'X') out into another rule (possibly making it
a fragment as well). Lastly, you shouldn't use parentheses to group rule
elements unless necessary; they are distracting in long rules like
SHAPE/shapeDist.
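
Applied to the lexer-rule version of the grammar, the three points could
look roughly like this. It is an untested sketch; SHAPE_CHAR is a
made-up name for the factored-out rule, and DIGIT is assumed to be
usable as a fragment:

    // Untested sketch; SHAPE_CHAR is a hypothetical helper name.
    // ID now ends in * so that single-character identifiers are allowed.
    ID    : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')* ;

    // The repeated alternative is factored out and the outer
    // parentheses are dropped.
    SHAPE : DIGIT SHAPE_CHAR SHAPE_CHAR SHAPE_CHAR ;

    fragment SHAPE_CHAR : DIGIT | '*' | 'x' | 'X' ;
    fragment DIGIT      : '0'..'9' ;

Note that changing + to * in ID may also change how a lone 'x' or 'X' is
tokenized elsewhere, so it should be checked against the rest of the
grammar.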

Best regards,
Johannes Luber

