[antlr-interest] New user issues / questions

Jim Idle jimi at temporal-wave.com
Thu Jul 19 20:43:29 PDT 2007

Not sure about your ANTLRWorks vs command line differences, I don’t think there are any unless you have a much newer ANTLR tool jar than the one in ANTLRWorks. However, there is a runtime jar and a tool jar. It looks like you are not including the runtime jar in your command line compile somehow. Make sure you use the tool+runtime binary downloads.


However, your rule, value, allows nothing to match, and it looks like you have confused a parser rule (starts with lower case) with a LEXER rule (starts with upper case).


Try this:


value : LETTER+;


LETTER : ('a'..'z'|'A'..'Z');


Or even:


value : WORD;



WORD : ('a'..'z'|'A'..'Z')+;




2) The same thing is happening here: you are confusing lexing and parsing. Until you get the hang of it more, don’t use the literals directly in the parser rules (rule names starting with lower case), just UPPER CASE token names, and write lexer rules to make the tokens. Then don’t try parsing things in the lexer; just define the most raw, unambiguous TOKENS and use parser rules to work out how they combine.


3) You are seeing the byproducts of your confusion. I suggest a 7/6 time bass rhythm mixed with a 14/2 tom tom and camping on the river bank opposite my house with the other hippies :-) Other than that, I think that you are seeing problems because you are using the interpreter rather than the debugger in ANTLRWorks, but you have turned on all the predicates you could possibly think of ;-). While backtracking makes the errors seem to go away, you then need a quad core QX6800 to parse it, and you can’t run predicates in interpretive mode. I suggest a brand new drumstick, turning off the backtracking stuff, and paying close attention to the ambiguity errors ANTLR tells you about. :-)
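To make points 2 and 3 concrete, here is a rough, untested sketch of how the second grammar might look with the lexing and parsing split apart and the backtracking options removed. The token names RECORD, DOT, NAME, STRING and ESCAPE are just illustrative choices of mine, not anything from the original grammar, and I have assumed field names are lower-case letters only.

```
grammar Test;

// backtrack and memoize deliberately left off, so ANTLR reports ambiguities
options {output=AST;}

// Parser rules (lower case) only combine tokens
expression : field | literal ;
field      : RECORD DOT NAME ;
literal    : STRING ;

// Lexer rules (UPPER CASE) define the raw tokens
RECORD : 'record' ;
DOT    : '.' ;
NAME   : ('a'..'z')+ ;

// The whole quoted literal is one token, so its contents are never
// looked at by the parser rules
STRING : '\'' ( ESCAPE | ~('\''|'\\') )* '\'' ;

// fragment: only usable from other lexer rules, never emitted as a token
fragment ESCAPE : '\\' ('\\'|'\'') ;
```

With something along these lines, 'foorecord' should reach the parser as a single STRING token rather than as separate characters, which is the behaviour asked for in question 3. One caveat of this particular split: a field that is itself called record would lex as RECORD rather than NAME, so the field rule would need adjusting if that matters.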





From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Dan Hardy
Sent: Thursday, July 19, 2007 8:02 PM
To: antlr-interest at antlr.org
Subject: [antlr-interest] New user issues / questions


I’m new to ANTLR, and am using v3.  I do intend to buy the book, for what that’s worth, but it’s not in my hands at the moment.  (And I’ll probably purchase some support if I decide to use this in production)…


I’m having just enough success to get excited about using ANTLR…but am running into some things I don’t quite understand.  


I have a moderately sophisticated grammar almost working, but have seen some odd things that I don’t understand.  So, I’ve tried to recreate some simple examples that might get to the core of the problem so I could post them here. 


I think I’m having three real issues.  The first is not critical, as I can work around it.


1)  Ranges don’t seem to work.  I get different behavior in my regular ant compile vs testing in ANTLRWorks, but neither works.  In my compiled project, the ANTLR generated code calls matchRange(), which is an undefined method on the parser class.  Is it supposed to be present in a base class (org.antlr.runtime.Parser?).  On the other hand, if I try the following in ANTLRWorks, and I try to interpret input abc using rule value with the below grammar, I get a NoViableAltException.  It sure seems like it ought to match.


grammar Test;


options {output=AST; k=2; backtrack=true; memoize=true;}


value    :           ('a'..'z'|'A'..'Z')*;



2)  I can’t get the following to work in ANTLRWorks.  When parsing input record.abc against rule “expression” I get a “FailedPredicateException”.  It does, however parse ’foo’ properly as a literal against the same rule.  record.abc should match the first rule, I would have thought.  I have a feeling that if I understand this better, it’ll help with my third issue.


grammar Test;


options {output=AST; k=2; backtrack=true; memoize=true;}



expression  :           field | literal



field      :           'record.' ('a'|'b'|'c'|'d'|'e'|'f'|'g'|'h'|'j'|'k'|'l'|'m'|'n'|'o'|'p'|'q'|'r'|'s'|'t'|'u'|'v'|'w'|'x'|'y'|'z')+



literal :  '\'' (Escape | ~('\''|'\\'))* '\''



Escape :           '\\' ('\\'|'\'')



3)  This is a bit hard to explain, but I keep seeing ANTLR completely skip what I’d call “non-matching” tokens and try to use a rule anyway, when the “next” rule would have matched the entire sequence.  For example, when presented with a single-quoted literal for the grammar of example #2 above.  If I try to interpret expression ‘foorecord’ using that grammar, I get ‘ f o o and then NoViableAltException.   ‘foorecords’ works though.  If it matched the first single-quote against the literal rule and grabbed everything up to the next unescaped single-quote, I’d get what I want.  I’ve tried adding additional discriminators to my grammar, such as starting fields with {, functions with $, string literals with ‘, etc., but I still see cases where this type of thing is causing me problems.



I apologize for what I’m sure are very rudimentary questions, and appreciate any help.  I now realize I probably should have taken that extra compiler course instead of that course on African drumming….





