[antlr-interest] Re: Syntactic predicates question
Artem Dmytrenko
admytren at engin.umich.edu
Mon Jan 30 14:36:50 PST 2006
Hmm, I'm really confused by the behavior then. "A12345" definitely doesn't
match rule 'A' so (1) should fail and not consume the first character of
the string. Shouldn't ANTLR examine at least k characters (in my case
k=2, so it should be looking at 'A' and '1') from input stream before
making a decision about which token matched? The generated code for
matching 'A' in the lexer is as follows:
    if ((LA(1)=='A') && (true)) {
        match('A');
    }
Shouldn't it be something similar to the following?
    if ((LA(1)=='A') && (LA(2)==END_OF_TOKEN)) {
        match('A');
    }
I'm trying to use syntactic predicates for parsing a language with
keywords that may be part of identifiers (e.g. keyword "Action",
identifier "Action/*/123"). Is there a better approach than syntactic
predicates to attack this scenario?
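
To make the question concrete, here is a rough sketch of the behavior I was
expecting, written as plain Java and not as ANTLR output. It scans the
longest identifier-like run first and only then checks whether the whole run
is a keyword, which resembles the keyword-table idea behind ANTLR 2's
testLiterals option. The class name, the token type names, the keyword table,
and the set of characters allowed in identifiers are all assumptions made for
illustration; only "Action", 'A', "A12345" and "Action/*/123" come from the
examples in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not ANTLR-generated code: scan the longest
// identifier-like run first, then decide whether the whole run is a keyword.
class KeywordLexerSketch {
    // Assumed keyword table; the token type names are made up for illustration.
    static final Map<String, String> KEYWORDS =
            Map.of("Action", "ACTION", "A", "A_KEYWORD");

    // Assumption: identifiers may contain letters, digits, '/' and '*',
    // as in the example identifier "Action/*/123".
    static boolean isIdentPart(char c) {
        return Character.isLetterOrDigit(c) || c == '/' || c == '*';
    }

    // Returns tokens rendered as "TYPE:text", splitting the input on whitespace.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (Character.isWhitespace(c)) {
                pos++;
                continue;
            }
            if (!isIdentPart(c)) {
                throw new IllegalArgumentException(
                        "unexpected character '" + c + "' at offset " + pos);
            }
            // Consume the longest run of identifier characters.
            int start = pos;
            while (pos < input.length() && isIdentPart(input.charAt(pos))) {
                pos++;
            }
            String text = input.substring(start, pos);
            // Only a run that equals a keyword in full becomes a keyword token.
            tokens.add(KEYWORDS.getOrDefault(text, "ID") + ":" + text);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("A A12345 Action Action/*/123"));
    }
}
```

With this approach "A12345" is never split after the first character, because
the keyword check happens only after the full run has been consumed.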
Thank you again for your help.
Sincerely,
Artem Dmytrenko
On Mon, 30 Jan 2006, Xue Yong Zhi wrote:
>
>
> Artem Dmytrenko wrote:
>
>>
>> line 1:94: expecting ID, found 'A'
>>
>> It appears that the match is stuck in the middle - e.g. ActionToken rule
>> rejected the string but ID did not match it. Is that the expected behavior
>> for syntactic predicates? Are there any workarounds for this problem?
>>
>
> Your parser is thinking this way when parsing "A12345":
>
> 1. Try ActionToken, and match the first 'A'.
> 2. Try ActionToken again with the rest of the input "12345", do not match.
> 3. Then try ID, still no match.
> 4. Give you the warning.
>
> Most of the time Antlr does not follow the "longest one that matches wins"
> rule.
>
> --
> Xue Yong Zhi
> http://seclib.blogspot.com
>
>