[antlr-interest] New user issues / questions

Thu Jul 19 20:02:22 PDT 2007

I’m new to ANTLR and am using v3.  I do intend to buy the book, for what that’s worth, but it’s not in my hands at the moment.  (And I’ll probably purchase some support if I decide to use this in production.)

I’m having just enough success to get excited about using ANTLR, but I’m running into some things I don’t quite understand.  I have a moderately sophisticated grammar almost working, but it behaves oddly in places, so I’ve tried to reduce the problems to a few simple examples that I could post here.

I think I’m having three real issues.  The first is not critical, as I can work around it.

1)       Ranges don’t seem to work.  I get different behavior in my regular Ant build than when testing in ANTLRWorks, but neither works.  In my compiled project, the ANTLR-generated code calls matchRange(), which is not defined on the parser class.  Is it supposed to be present in a base class (org.antlr.runtime.Parser)?  In ANTLRWorks, on the other hand, if I interpret the input abc using rule “value” with the grammar below, I get a NoViableAltException.  It sure seems like it ought to match.

grammar Test;

options {output=AST; k=2; backtrack=true; memoize=true;}

value : ('a'..'z'|'A'..'Z')* ;
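
One variant that may be worth comparing against, as an untested sketch: it assumes character ranges are intended for lexer rules (rule names starting with an uppercase letter), which would fit matchRange() being defined on org.antlr.runtime.Lexer rather than on Parser.  The rule name LETTERS is made up for illustration, and the k, backtrack, and memoize options are dropped only to keep the example small.

```antlr
grammar Test;

options {output=AST;}

// Parser rule: refers to a token, not to individual characters.
value : LETTERS? ;

// Lexer rule (hypothetical name): the character ranges live here.
LETTERS : ('a'..'z'|'A'..'Z')+ ;
```

With this split, the input abc should reach rule “value” as a single LETTERS token rather than as a range over implicit literal tokens.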

2)       I can’t get the following to work in ANTLRWorks.  When I parse the input record.abc against rule “expression”, I get a FailedPredicateException.  The same rule does, however, parse 'foo' properly as a literal.  I would have expected record.abc to match the first alternative (field).  I have a feeling that if I understand this better, it’ll help with my third issue.

grammar Test;

options {output=AST; k=2; backtrack=true; memoize=true;}

expression
            :           field | literal
            ;

field
            :           'record.' ('a'|'b'|'c'|'d'|'e'|'f'|'g'|'h'|'j'|'k'|'l'|'m'|'n'|'o'|'p'|'q'|'r'|'s'|'t'|'u'|'v'|'w'|'x'|'y'|'z')+
            ;

literal
            :           '\'' (Escape | ~('\''|'\\'))* '\''
            ;

Escape
            :           '\\' ('\\'|'\'')
            ;

3)       This is a bit hard to explain, but I keep seeing ANTLR skip what I’d call “non-matching” tokens and try to use a rule anyway, when the “next” rule would have matched the entire sequence.  Take a single-quoted literal and the grammar from example #2 above.  If I interpret 'foorecord' against rule “expression” using that grammar, I get ' f o o and then a NoViableAltException.  'foorecords' works, though.  If ANTLR matched the first single quote against the literal rule and grabbed everything up to the next unescaped single quote, I’d get what I want.  I’ve tried adding further discriminators to my grammar, such as starting fields with {, functions with $, and string literals with ', but I still see cases where this kind of thing causes me problems.

I apologize for what I’m sure are very rudimentary questions, and I appreciate any help.  I now realize I probably should have taken that extra compiler course instead of that course on African drumming...

Regards,

Dan
