[antlr-interest] Tokens
Ronald Sok
ronald.sok at gmail.com
Thu Nov 26 19:47:36 PST 2009
I am not very familiar with language grammars or ANTLR,
and I have ended up with a grammar that I am not happy with.
Rather than post my entire grammar, I have created
a very simple example that demonstrates the problem.
I want to parse the following:
BEGIN_SOMETHING
Name: Pear
Type: Apple
END_SOMETHING
The tokens BEGIN_SOMETHING and END_SOMETHING mark
the start and end of the block. The Name can have any value, and
the Type must be one of Apple, Pear, or Orange. The problem
is that the Name, as the example shows, can also be one of the values
from the Type list, in this case Pear.
The grammar I have is this:
grammar Some;
someFile
: 'BEGIN_SOMETHING' NEWLINE someName someType 'END_SOMETHING' NEWLINE
;
someName
: 'Name:' ID NEWLINE
;
someType
: 'Type:' someTypeOption NEWLINE
;
someTypeOption
: APPLE
| PEAR
| ORANGE
;
APPLE
: 'Apple'
;
PEAR
: 'Pear'
;
ORANGE
: 'Orange'
;
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
;
NEWLINE
: '\r'? '\n'
;
WS : ( ' '
| '\t'
| '\r'
| '\n'
) {$channel=HIDDEN;}
;
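To make the tokenization of this example easier to see, here is a rough Python sketch. It is only an approximation under stated assumptions, not ANTLR's actual lexer algorithm: it picks the longest match at each position and breaks ties in favour of the rule listed first. The names TOKEN_SPEC, tokenize, NAME_KW and TYPE_KW are made up for this illustration.

```python
import re

# Hypothetical token table for illustration only. Literals and fruit names
# are listed before ID, so they win when the match lengths are equal.
TOKEN_SPEC = [
    ("BEGIN", r"BEGIN_SOMETHING"),
    ("END", r"END_SOMETHING"),
    ("NAME_KW", r"Name:"),
    ("TYPE_KW", r"Type:"),
    ("APPLE", r"Apple"),
    ("PEAR", r"Pear"),
    ("ORANGE", r"Orange"),
    ("ID", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NEWLINE", r"\r?\n"),
    ("WS", r"[ \t]+"),
]


def tokenize(text):
    """Return (token_name, lexeme) pairs, skipping whitespace.

    Assumption: longest match wins, and the earliest rule wins a tie.
    """
    pos = 0
    tokens = []
    while pos < len(text):
        best = None
        for name, pattern in TOKEN_SPEC:
            m = re.compile(pattern).match(text, pos)
            # Strictly longer matches replace the current best, so an
            # equal-length match from a later rule never overrides it.
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m)
        if best is None:
            raise ValueError("no token matches at position %d" % pos)
        name, m = best
        if name != "WS":
            tokens.append((name, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    for line in ("Name: Pear\n", "Name: Pears\n"):
        print(tokenize(line))
```

In this sketch 'Pear' and the ID pattern both match four characters, so the earlier PEAR rule wins, while a longer word such as 'Pears' falls through to ID.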
Obviously this grammar cannot recognize the sequence 'Name: Pear',
because 'Pear' is matched by the token PEAR and not by ID. I can of course
add the tokens APPLE, PEAR and ORANGE to the rule someName:
someName
: 'Name:' (APPLE|PEAR|ORANGE|ID) NEWLINE
;
But I have a feeling that this is not the right way to go. I hope somebody can
clarify this for me.
Thanks.
Ronald