[antlr-interest] FEN grammar

Jonne Zutt jonne.zutt.ml at gmail.com
Sun Aug 28 13:26:25 PDT 2011


Hi all,

I made my first attempts to use ANTLR today.
Although I read some tutorials, example programs, and a page about
common pitfalls, I guess I still stepped into several of them myself.
Would anybody be willing to shed some light on the grammar below, which
is meant to parse chess FEN strings
(see http://en.wikipedia.org/wiki/Forsyth%E2%80%93Edwards_Notation)?

I am debugging with the string:
"rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
without the quotes (this is the initial position for chess).

I have several problems:
- I was using more tokens, but several of them overlap. For example,
  for the enPassant rule I used to have FILE RANK, where RANK was a
  lexer token '1'..'8', but that overlaps with the NUMBER token and
  also with the digits in pieces.
  I'm not sure how to deal with overlapping tokens. Should they always
  be changed into fragments?
  I also wanted to make a token for each piece, such as
  KNIGHT : 'n' | 'N'; but the bishop turns out to be quite overloaded
  as well ('b' would also have to match BLACK and FILE).

- For some reason, 0 seems to match my NUMBER, but 1 does not; this is
  what the debugger shows me. If I switch "0 1" to "1 0", it is the
  halfmoveClock that fails to match instead.

- If I press Ctrl-Y in the ANTLRWorks plugin, I lose all my data!!
  Arghh. In IntelliJ that is my shortcut for deleting a line.

Below is my grammar. Any help or comments would be welcome :)
Thanks,
Jonne.


grammar Fen;

input
	:	fen EOF;

fen
	:	piecePlacement WS activeColor WS castling WS enPassant WS
		halfmoveClock WS fullmoveNumber;

piecePlacement
	:	pieces SEP pieces SEP pieces SEP pieces SEP
		pieces SEP pieces SEP pieces SEP pieces;

pieces
	:	('p'|'P' | 'n'|'N' | 'b'|'B' | 'r'|'R' | 'q'|'Q' | 'k'|'K' | '1'..'8')+;

activeColor
	:	'w' | 'b';
	
castling
	:	NONE
	|	('K' | 'Q' | 'k' | 'q')+;
	
enPassant
	:	NONE
	|	FILE '1'..'8';
		
halfmoveClock
	:	NUMBER;

fullmoveNumber
	:	NUMBER;	
	
// LEXER

WS	:	(' ' | '\t')+;
SEP 	:	'/';

NONE	:	'-';
FILE	:	'a'..'h';
NUMBER	:	'0' | ('1'..'9' ('0'..'9')*);
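
One possible direction for the overlapping-token problem described
above, offered as an untested sketch and not as a confirmed fix. It
assumes ANTLR 3 syntax, and it rests on the assumption that quoted
literals used inside parser rules (such as 'b' or '1'..'8') become
implicit lexer tokens that can take precedence over explicit ones like
FILE and NUMBER, which would explain why 0 matches NUMBER while 1 does
not. The sketch keeps all literals out of the parser rules and makes
every lexer token cover a disjoint set of characters, so the parser
rules decide what a character means in context. The grammar name
FenSketch and the token names LOWER_B, LOWER_W, OTHER_FILE,
CASTLE_PIECE, OTHER_PIECE, ZERO, DIGIT_1_8 and NINE are hypothetical.

```
grammar FenSketch;	// hypothetical name; ANTLR 3 syntax assumed

input
	:	fen EOF;

fen
	:	piecePlacement WS activeColor WS castling WS enPassant WS
		halfmoveClock WS fullmoveNumber;

piecePlacement
	:	pieces SEP pieces SEP pieces SEP pieces SEP
		pieces SEP pieces SEP pieces SEP pieces;

// Any piece letter or an empty-square count; no quoted literals here.
pieces
	:	(OTHER_PIECE | CASTLE_PIECE | LOWER_B | DIGIT_1_8)+;

// 'b' means "black to move" only in this position of the fen rule.
activeColor
	:	LOWER_W | LOWER_B;

castling
	:	NONE
	|	CASTLE_PIECE+;

enPassant
	:	NONE
	|	file DIGIT_1_8;

// Files a..h; 'b' arrives as LOWER_B, the rest as OTHER_FILE.
file
	:	LOWER_B | OTHER_FILE;

halfmoveClock
	:	number;

fullmoveNumber
	:	number;

// Same shape as the original NUMBER token: 0, or no leading zero.
number
	:	ZERO
	|	(DIGIT_1_8 | NINE) (ZERO | DIGIT_1_8 | NINE)*;

// LEXER: every token below matches a disjoint set of characters.

WS		:	(' ' | '\t')+;
SEP		:	'/';
NONE		:	'-';

LOWER_B		:	'b';	// black bishop, black to move, or file b
LOWER_W		:	'w';
OTHER_FILE	:	'a' | 'c'..'h';
CASTLE_PIECE	:	'K' | 'Q' | 'k' | 'q';	// also king and queen pieces
OTHER_PIECE	:	'p' | 'P' | 'n' | 'N' | 'B' | 'r' | 'R';

ZERO		:	'0';
DIGIT_1_8	:	'1'..'8';	// empty squares, ranks, or part of a number
NINE		:	'9';
```

With tokens split this way, the debug string above should tokenize
without ambiguity: "0" becomes ZERO and "1" becomes DIGIT_1_8, and both
are accepted by the number rule. Like the original grammar, the sketch
does not check that each rank adds up to eight squares, nor that the
castling letters are unique.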

