[antlr-interest] Problem using predicates in V3
Dr. Hartmut Kocher
hwk.cortex-brainware at t-online.de
Sun Feb 11 13:38:39 PST 2007
No, it doesn't work. I appended the C# code (which is the same for Java).
Here's the rule:
ID : ('abc' DIGIT) => 'abc'
   | ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')+
   ;
If you look at the code, you see that it correctly checks for the predicate.
It then matches the string, but this does not emit a token!
Instead it emits a token at the end of the ID rule, with type ID.
This is the part I don't understand, or where a bug might hide...
Hartmut
Here comes the code:
if ( (LA3_0 == 'a') )
{
    int LA3_1 = input.LA(2);
    if ( (LA3_1 == 'b') )
    {
        int LA3_3 = input.LA(3);
        if ( (LA3_3 == 'c') )
        {
            if ( (synpred1()) )
            {
                alt3 = 1;
            }
This matches the predicate. => alt3 = 1.
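The prediction step above can be modeled with a minimal Java sketch. It is not the code ANTLR generates: the class name, the String-based input, and the method signatures are assumptions made for illustration. It only shows the idea that the predicate looks ahead for 'abc' followed by a digit without consuming anything, and that its result selects the alternative.

```java
// Hypothetical model of the prediction step; not ANTLR-generated code.
final class SynPredSketch {
    // Speculative check for 'abc' followed by a digit, starting at pos.
    // Nothing is consumed: the caller keeps its own position.
    static boolean synpred1(String input, int pos) {
        return input.startsWith("abc", pos)
            && pos + 3 < input.length()
            && Character.isDigit(input.charAt(pos + 3));
    }

    // Chooses the alternative of the ID rule:
    // 1 = the 'abc' branch, 2 = the plain identifier branch.
    static int predictAlt(String input, int pos) {
        return synpred1(input, pos) ? 1 : 2;
    }

    public static void main(String[] args) {
        System.out.println(predictAlt("abc1", 0)); // prints 1
        System.out.println(predictAlt("abcd", 0)); // prints 2
    }
}
```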
Now the rule is processed:
switch (alt3)
{
    case 1 :
        // Test.g:30:7: ( 'abc' DIGIT )=> 'abc'
        {
            Match("abc"); if (failed) return ;
        }
        break;
This matches the string, but does not set the token type.
Then the token is created:
if ( backtracking == 0 )
{
    if ( (token == null) && (ruleNestingLevel == 1) )
    {
        Emit(_type, _line, _charPosition, _channel,
             _start, CharIndex-1);
    }
}
}
Which is "abc" but of type ID!!!
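To make the effect concrete, here is a small Java model of the flow described above. It is a sketch under assumptions, not generated code: the class, the token type numbers, and the setTypeInAlt1 flag are hypothetical. The flag stands in for an explicit type assignment inside the first alternative, which is one possible way to get a distinct token type; whether ANTLR 3 offers that, and with which syntax, is not confirmed anywhere in this thread.

```java
// Hypothetical model of the ID rule; not ANTLR-generated code.
final class TokenTypeSketch {
    static final int ID = 4;   // made-up token type numbers
    static final int ABC = 5;

    static boolean isIdPart(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9') || c == '_';
    }

    // Lexes one token from the start of input and returns "TYPE:text".
    // Simplification: input is assumed to start with a letter.
    static String lexId(String input, boolean setTypeInAlt1) {
        int type = ID;  // the rule starts out with its own type
        int end;
        boolean alt1 = input.startsWith("abc") && input.length() > 3
            && Character.isDigit(input.charAt(3));
        if (alt1) {
            end = 3;                 // corresponds to Match("abc")
            if (setTypeInAlt1) {
                type = ABC;          // the step missing in the code above
            }
        } else {
            end = 1;                 // first letter, then letters, digits, '_'
            while (end < input.length() && isIdPart(input.charAt(end))) {
                end++;
            }
        }
        // One token is emitted at the end of the rule, with whatever type is set.
        return (type == ABC ? "ABC" : "ID") + ":" + input.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println(lexId("abc1", false)); // prints ID:abc
        System.out.println(lexId("abc1", true));  // prints ABC:abc
    }
}
```

Without the explicit assignment the first alternative yields the text "abc" with type ID, which is the behavior reported here.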
Hope this helps.
-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Terence Parr
Sent: Sunday, February 11, 2007 22:24
To: ANTLR Interest
Subject: Re: [antlr-interest] Problem using predicates in V3
On Feb 11, 2007, at 1:15 PM, Dr. Hartmut Kocher wrote:
> The language is fixed :-( And no, I didn't invent it.
;)
> Your second solution is also not possible because then "t123a" would
> parse OK, but "t 123 a" too, which is not allowed. (Of course there's
> a whitespace rule)...
actually, it would only match "t 123" as part of that rule, but no
biggie.
> In ANTLR2 I did the following:
>
> tokens {
> "abc";
> }
>
> IDENT
> options {
> testLiterals=true;
> }
> :
> ("abc" DIGIT) => "abc"
> | ('a'..'z') (LD | '_')*; // LD is letter or digit
>
> This worked quite well. Now I'm trying to accomplish the same with
> ANTLR 3.
> No such luck so far.
OHhhhhh....ok, abc is a keyword and for some reason it's not taken as
a special case; oh, because antlr matches the longest it can.
IDENT : ("abc" DIGIT)=> "abc" | ('a'..'z') (LD | '_')* ;
should work. It doesn't?
Ter