[antlr-interest] Lookahead problem
Ilia Kantor
ilia at obnovlenie.ru
Sun May 15 11:16:41 PDT 2005
I want to create a language with functions like ~func{}, variables like #var, and
array accesses like #var{1}{~func{}}.
In addition, curly braces are allowed as plain text: ~func{my list: {1,2,3} }
Here is a simple lexer/parser for part of the task.
-----------------------------
class SimpleTaskLexer extends Lexer;
LCURL: '{';
RCURL: '}' ;
ANY: (~('~' | '{' | '}' | '#'))+;
protected NAME: ('A'..'Z' | 'a'..'z' | '0'..'9' | '_')+;
VARIABLE: '#' NAME;
FUNCTION: '~' NAME;
class SimpleTaskParser extends Parser;
expr: function EOF;
function: FUNCTION curly_text;
curly_text: LCURL entries RCURL;
entries: entry entries |;
entry: ANY | function | curly_text;
---------------------------
The problem is that I can't work out how to add arrays, i.e. how to read
#var{...}{...}, where the curly braces after #var denote the array member index.
The natural way to write it would be:
1. entry: ANY | function | curly_text | variable;
2. variable: VARIABLE (curly_text)*;
But that leads to the following warning for line 2:
warning:nondeterminism upon
k==1:LCURL
between alt 1 and exit branch of block
I guess that's because, after reading the variable, the parser cannot tell where
the array members end and ordinary curly text begins: both start with LCURL.
The logical answer is simple: take as many {} groups after #var as possible as
array members.
How can I implement that? I tried a syntactic predicate like:
entry: ANY | function | curly_text |
(VARIABLE (curly_text)*) => VARIABLE (curly_text)*
| VARIABLE;
But it did not work.
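One direction that may be worth trying, sketched under assumptions and not tested against this grammar: ANTLR 2 subrules accept an options block, and greedy=true asks the loop to keep matching on an ambiguous lookahead instead of exiting. The rule name variable is the one proposed above, and the lexer is assumed unchanged.

-----------------------------
class SimpleTaskParser extends Parser;
expr: function EOF;
function: FUNCTION curly_text;
curly_text: LCURL entries RCURL;
entries: entry entries |;
entry: ANY | function | curly_text | variable;
// Assumption: on LCURL the loop stays in its alternative rather than exiting,
// so every {...} directly after VARIABLE is read as an array member,
// not as plain curly text.
variable: VARIABLE (options { greedy = true; } : curly_text)*;
---------------------------

If this behaves as assumed, #var{1}{2} would be parsed as one variable with two array members, and a brace group would only be read as plain curly text when it does not directly follow a variable.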