[antlr-interest] same char but different context
codeman at bytefusion.de
Sat Nov 28 01:08:05 PST 2009
I have a record-per-line format like this:
<single-char><sequence-of-chars><crlf>
<single-char> => single letter
<sequence-of-chars> => any except end-of-line
<crlf> => end of line
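To make the intended reading of this format concrete, here is a minimal Python sketch that splits one line by position alone. It assumes every line carries exactly one leading type character; the function name split_record_line is hypothetical and only serves the illustration.

```python
def split_record_line(line):
    """Split one record line into (type_char, text).

    Assumption: the first character is the type marker and everything
    up to the line terminator is free text.
    """
    body = line.rstrip("\r\n")
    if not body:
        raise ValueError("empty line has no type character")
    return body[0], body[1:]


print(split_record_line("WHello World\r\n"))
```

Only the first position is treated as a type marker here, so any later occurrence of the same letter stays part of the text.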
My problem is the following:
WHello World
"W" => recognized as the single char
"Hello " => recognized, but then the parse breaks: the "W" in "World" seems to be treated as a new start char
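One plausible reading of this symptom is that characters are labelled without regard to where they sit in the line. The following Python sketch is a simplified simulation of that behaviour, not ANTLR's actual algorithm. The TYPE_CHARS set is hypothetical, and "W" is included only to mirror the example line.

```python
# Hypothetical set of single-character type markers; 'W' is included only
# to mirror the example line above.
TYPE_CHARS = {"W", "D", "T", "M"}


def tokenize(line):
    """Label every character with no knowledge of its position in the line."""
    tokens = []
    for ch in line:
        if ch in "\r\n":
            tokens.append(("NEWLINE", ch))
        elif ch in TYPE_CHARS:
            # A literal type token matches wherever the character occurs.
            tokens.append(("TYPE", ch))
        else:
            tokens.append(("ANY", ch))
    return tokens


tokens = tokenize("WHello World\n")
print([kind for kind, _ in tokens])
```

Under these assumptions both occurrences of "W" come out as TYPE tokens, which matches the behaviour described: the second one looks like the start of a new item.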
Here is my grammar. The goal is to parse a Quicken Interchange
Format (QIF) file. Any ideas?
grammar myExample;
options {
    output=AST;
}
tokens {
    TYPE_DATE = 'D';
    TYPE_AMOUNT = 'T';
    TYPE_MEMO = 'M';
    TYPE_CLEARED = 'C';
    TYPE_CHECK_NUMBER = 'N';
    TYPE_PAYEE = 'P';
    TYPE_PAYEE_ADDRESS = 'A';
    TYPE_CATEGORY = 'L';
    TYPE_REIMBURSE = 'F';
    TYPE_SPLIT_CATEGORY = 'S';
    TYPE_SPLIT_MEMO = 'E';
    TYPE_SPLIT_AMOUNT = '$';
    TYPE_SPLIT_PERCENTAGE = '%';
    TYPE_SECURITY_NAME = 'Y';
    TYPE_PRICE = 'I';
    TYPE_SHARE_QUANTITY = 'Q';
    TYPE_COMMISSION_COSTS = 'O';
}
start : header record+ NEWLINE* EOF;
header : KEYWORD_TYPE description NEWLINE;
description : ANY+;
record : item+ END_OF_RECORD;
item : item_type description NEWLINE;
item_type : ( TYPE_DATE
            | TYPE_AMOUNT
            | TYPE_MEMO
            | TYPE_CLEARED
            | TYPE_CHECK_NUMBER
            | TYPE_PAYEE
            | TYPE_PAYEE_ADDRESS
            | TYPE_CATEGORY
            | TYPE_REIMBURSE
            | TYPE_SPLIT_CATEGORY
            | TYPE_SPLIT_MEMO
            | TYPE_SPLIT_AMOUNT
            | TYPE_SPLIT_PERCENTAGE
            | TYPE_SECURITY_NAME
            | TYPE_PRICE
            | TYPE_SHARE_QUANTITY
            | TYPE_COMMISSION_COSTS
            );
KEYWORD_TYPE : '!Type:';
NEWLINE : ('\r'|'\n'|'\r\n');
END_OF_RECORD : '^';
ANY : ~('\r'|'\n'); // negate single characters only; NEWLINE also contains the two-character '\r\n'