[antlr-interest] Why doesn't this work?

Thomas Brandon tbrandonau at gmail.com
Wed Jun 6 21:04:23 PDT 2007


All lexing occurs before parsing, so the lexer does not know that the parser
expects a size qualifier after the '.'. "UW" is being matched by the
IDENTIFIER rule rather than by the 'U' and 'W' rules generated from the
parser literals. I think ANTLR prefers the longer match, so this cannot be
fixed with precedence (putting the 'U' and 'W' rules above IDENTIFIER, which
I think is where they already are). "100.W" does parse correctly, because
there both rules match a single character and the higher precedence of the
'W' rule lets it win out over IDENTIFIER (which also means that, with this
grammar, "U", "W" etc. cannot be used as identifiers). That is presumably
also why "100.U W" is accepted: it lexes as separate 'U' and 'W' tokens,
which is exactly what size_qualifier asks for. Assuming that "UW" is a valid
identifier (and so can't be excluded from IDENTIFIER), you might need to make
your size_qualifier rule deal with IDENTIFIERs and use predicates to test the
text. Something like:
size_qualifier:
    '.' {input.LT(1).getText().equals("UW") || input.LT(1).getText().equals("W")}? IDENTIFIER -> SIZE_16
    ;
That assumes the Java target, and I'm not sure of the exact syntax there.
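
As a rough illustration of the predicate idea (a Python sketch, not ANTLR
code), the rule below accepts a '.' followed by an IDENTIFIER token and then
checks the identifier's text. The token type names "DOT" and "IDENTIFIER",
the SIZES table, and the function name are all assumptions made up for this
example, and it assumes the 'U', 'W', etc. literals have been removed from
the grammar so that the qualifier always arrives as an IDENTIFIER.

```python
# Hypothetical mapping from size letter to the AST token it should produce.
SIZES = {"B": "SIZE_8", "W": "SIZE_16", "L": "SIZE_32", "Q": "SIZE_64"}


def size_qualifier(tokens, pos):
    """Match DOT IDENTIFIER where the identifier text names a size.

    tokens is a list of (type, text) pairs. Returns (size, next_pos) on
    success, or None if the rule does not match at pos.
    """
    if pos + 1 >= len(tokens):
        return None
    (dot_type, _), (id_type, id_text) = tokens[pos], tokens[pos + 1]
    if dot_type != "DOT" or id_type != "IDENTIFIER":
        return None
    # The "predicate": drop an optional unsigned prefix 'U' from a
    # two-letter qualifier, then look the remaining letter up.
    if len(id_text) == 2 and id_text.startswith("U"):
        letter = id_text[1:]
    else:
        letter = id_text
    size = SIZES.get(letter)
    return (size, pos + 2) if size else None
```

With this shape, "UW" and "W" both give SIZE_16, while an ordinary
identifier such as "foo" after the '.' is rejected by the text check rather
than by the lexer.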
Or you could exclude the size qualifiers from IDENTIFIER by giving each one
its own literal, and then have a rule in your parser that includes them
again. Something like:
size_qualifier:
    '.' ('W'|'UW') -> SIZE_16
    ;
identifier:
    (IDENTIFIER|'W'|'UW')
    ;
Now 'UW' should win out over IDENTIFIER, and in your parser you can use
identifier rather than IDENTIFIER to also match "W" and "UW".
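
The lexer behaviour described in this reply can be sketched with a toy
longest-match tokenizer in Python. This is a simplified model and not how
ANTLR is actually implemented: it assumes the longest match wins, that ties
go to the rule listed first, and that literals are listed before the named
rules. The tokenize function and the whitespace-skipping WS rule are
assumptions added for the example (the original grammar has no WS rule).

```python
import re


def tokenize(text, literals):
    """Toy lexer: longest match wins, earlier rules win ties."""
    # Literal rules come first, mimicking rules generated from parser literals.
    rules = [(f"'{lit}'", re.compile(re.escape(lit))) for lit in literals]
    rules += [
        ("NUMBER", re.compile(r"-?(?:0|[1-9][0-9]*)")),
        ("IDENTIFIER", re.compile(r"[A-Za-z][A-Za-z0-9_-]*")),
        ("WS", re.compile(r"\s+")),  # assumed: whitespace is skipped
    ]
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = pattern.match(text, pos)
            # Strict '>' keeps the earliest rule when lengths tie.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


original = [".", "U", "B", "W", "L", "Q"]
print(tokenize("100.UW", original))            # "UW" becomes IDENTIFIER
print(tokenize("100.W", original))             # 'W' ties with IDENTIFIER and wins
print(tokenize("100.U W", original))           # separate 'U' and 'W' tokens
print(tokenize("100.UW", original + ["UW"]))   # a 'UW' literal now wins the tie
```

In this model the parser never gets a chance to ask for 'U' followed by 'W'
on "100.UW", because the two-character IDENTIFIER match has already beaten
the one-character 'U' match by the time parsing starts.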

Tom.

On 6/7/07, Cameron Esfahani <dirty at apple.com> wrote:
>
> I've spent a few hours trying to figure out why this test grammar doesn't
> work.  If I use ANTLRWorks to debug it, starting at rule "number_size", with
> an input of 100.UW, I end up getting a NoViableAltException error.
> If I remove the IDENTIFIER rule, it seems to work.
>
> And, if I add a space to the input (100.U W), then it seems to work as
> well.  Obviously I don't want that to work, 100.U W shouldn't be legal.
> Why is it allowing it?
>
> I can't wrap my head around it, but it seems like the UW portion of the
> input isn't matching to size_qualifier, it's matching to IDENTIFIER?  That
> can't be right, as IDENTIFIER doesn't contain a period character.
>
> grammar Test;
>
> options {
> output = AST;
> ASTLabelType = CommonTree;
> }
>
> tokens {
> SIZE_DEFAULT;
> SIZE_8;
> SIZE_16;
> SIZE_32;
> SIZE_64;
> }
>
> size_qualifier
> : '.' ('U')? ('B') -> SIZE_8
> | '.' ('U')? ('W') -> SIZE_16
> | '.' ('U')? ('L') -> SIZE_32
> | '.' ('U')? ('Q') -> SIZE_64
> ;
>
> number_size
> : NUMBER size_qualifier -> size_qualifier NUMBER
> | NUMBER -> SIZE_DEFAULT NUMBER
> ;
>
> NUMBER
> : '-'? ( '0' | '1'..'9' '0'..'9'*)
> ;
>
> fragment LETTER
> : 'a'..'z'
> | 'A'..'Z'
> ;
>
> IDENTIFIER
> : LETTER ( LETTER | '-' | '_' | '0'..'9' )*
> ;
>
> Cameron Esfahani
> dirty at apple.com
>
> "In the elder days of Art, Builders wrought with greatest care each minute
> and unseen part; For the gods see everywhere."
>
> "The Builders", H. W. Longfellow
>
>
>
>