[antlr-interest] Simple parsing question
George J. Shannon
George.Shannon at raphaelanalytics.com
Sat Sep 6 15:53:13 PDT 2008
John:
Eureka! It worked. Thanks so much for the great help.
George
-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of John B. Brodie
Sent: Friday, September 05, 2008 10:57 PM
To: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Simple parsing question
Greetings!
On Friday 05 September 2008 10:58:35 pm George J. Shannon wrote:
> Attached is a snippet of the grammar in question, where tagCommentNbr is
> the integer value enclosed in brackets that I referred to in my email
> post.
> George
>
> tagCommentElement returns [ParserTagCommentElement pTagCommentElement]
> @init {
> pTagCommentElement = new ParserTagCommentElement(); //db not req'd
> }
> :
> tagCommentNbr (elementName)?
> {
> pTagCommentElement.tagCommentNbr = $tagCommentNbr.text;
> pTagCommentElement.elementName = $elementName.text;
> }
> ;
>
> tagCommentNbr
> :
> '[' IntValue ']'
> ;
>
> elementName
> :
> '.' alphaN
> ;
>
> IntValue
> :
> ('0'..'9')+
> ;
>
The above snippets from your Grammar are semi-useful. It would be best if
you posted the smallest, simplest, yet *COMPLETE* Grammar that exhibits the
problem at hand. That way others can simply try the grammar in order to
work out where the problem lies.
However, I see a reference to a Parser Rule - alphaN - in your snippet above,
which leads me to speculate that you have used a '0' in that rule (or
perhaps elsewhere).
If you have used '0' in a Parser Rule, then a single 0 is a KEYWORD in your
language, i.e. a separate Token that will be emitted by your Lexer.
Recall that ANTLR Lexers are greedy and will match the longest sequence
possible. But when a given sequence matches more than one lexer rule, the
rule that appears first wins.
So a sequence of "00" is greedily identified as an IntValue token. But a
single "0" might match both IntValue and the '0' token from a Parser Rule
(this is speculation on my part, based on your grammar snippet above, that
alphaN might refer to a '0' inside). So now I postulate that you have two
lexer rules that can match the single "0": the explicit IntValue and the
implicit '0' parser ref. As it happens, the implicit tokens introduced by
using quoted strings in the parser (e.g. the postulated '0') are considered
to come first when breaking such a tie. So your tagCommentNbr, when given
the string "[0]", sees the three tokens '[', <implicit '0'>, and ']' rather
than '[', IntValue, and ']'.
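The tie-breaking behaviour described above can be illustrated with a small
Python model. This is only a sketch of the rule as stated in this message
(longest match wins, and on a tie the earliest rule wins, with quoted parser
literals ordered first); it is not ANTLR's generated lexer. The rule tables
and the tokenize() helper are hypothetical names made up for the
illustration, and the '0' literal is the postulated one, since the real
alphaN rule was not posted.

```python
import re

# Hypothetical rule tables modelling the lexer behaviour described above.
# Order matters: earlier entries win ties. Literals quoted in parser rules
# (here the postulated '0') are listed ahead of the named lexer rules.
RULES_WITH_LITERAL = [
    ("'['", r"\["),
    ("']'", r"\]"),
    ("'0'", r"0"),            # implicit token from the postulated '0'
    ("IntValue", r"[0-9]+"),  # explicit lexer rule from the posted grammar
]

# Same table with the implicit '0' literal removed.
RULES_WITHOUT_LITERAL = [r for r in RULES_WITH_LITERAL if r[0] != "'0'"]


def tokenize(text, rules):
    """Longest match wins; on equal length the earliest rule wins."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strict '>' keeps the earlier rule when lengths tie.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}: {text[pos]!r}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


if __name__ == "__main__":
    print(tokenize("[0]", RULES_WITH_LITERAL))
    print(tokenize("[00]", RULES_WITH_LITERAL))
    print(tokenize("[0]", RULES_WITHOUT_LITERAL))
```

In this model "[0]" yields the '0' literal while "[00]" yields IntValue, and
dropping the literal from the table lets "[0]" come through as IntValue.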
The mismatched token error message you are getting should be something
like: "expecting IntValue, got '0'" or something similar.
Try making alphaN (and any other Parser Rule that involves single
characters) into Lexer Rules.
Hope this helps
-jbb