[antlr-interest] MissingTokenException and skip tokens
Tobias Wunner
tobias.wunner at gmail.com
Wed Apr 29 03:47:31 PDT 2009
Hello,
I am trying to write rules that match several numbers in a text
(i.e. numbers in a specific format embedded in arbitrary token
sequences). My number rules work when I assume one number per line and
match them with:
file: ('\n' number)*
When I change the newline to ".*", the numbers are no longer matched
correctly. I narrowed the problem down to a very simple rule set
that can match inputs like
"one"
"two"
"oneandone"
"oneandthree"
"oneandoneplusoneandthree"
"oneandoneplustwo"
where "and" and "plus" act as number connectors. The simplified rule
set is:
grammar simpleNumbers;
in : (.* numB)*;
numB : numA 'plus' numA | numA 'plus' | 'plus' numA | numA;
numA : num 'and' num | num;
num : 'one' | 'two' | 'three';
I assumed that input of the form
numA someTokens numA
would match the last alternative of rule numB twice. In some cases,
however, the parser picks the first alternative of numB and reports a
MissingTokenException, as in the following examples.
(1) twoandone xx one
matches
[attachment parse_1.jpg: http://www.antlr.org/pipermail/antlr-interest/attachments/20090429/f57ed694/attachment.jpg]
numB( numA(num("two"),"and",num("one")),
MissingTokenException, numA(num("one")) )
where I would have expected the last alternative of numB to match
twice, giving numB(numA(num("two"),"and",num("one"))) and
numB(numA(num("one"))).
(2) plus xx one
matches
[attachment parse_2.jpg: http://www.antlr.org/pipermail/antlr-interest/attachments/20090429/f57ed694/attachment-0001.jpg]
where I would have expected only
numA(num("one"))
with "plus" skipped.
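The expected behaviour in both examples can be sketched outside ANTLR as a hand-written scan. This is a minimal Python illustration, not an ANTLR solution. The function names (tokenize, match_num_a, match_num_b, find_numbers) and the OTHER placeholder token for unrecognised characters are assumptions made for this sketch and do not come from the grammar or from ANTLR. It also assumes that whitespace is ignored while any other unrecognised character separates numbers.

```python
import re

# Keywords come from the grammar; whitespace is dropped and every other
# character becomes an OTHER token (an assumption of this sketch).
TOKEN_RE = re.compile(
    r"(?P<kw>one|two|three|and|plus)|(?P<ws>\s+)|(?P<other>.)", re.DOTALL
)
NUMS = {"one", "two", "three"}


def tokenize(text):
    tokens = []
    for m in TOKEN_RE.finditer(text):
        if m.lastgroup == "kw":
            tokens.append(m.group())
        elif m.lastgroup == "other":
            tokens.append("OTHER")
    return tokens


def match_num_a(tokens, i):
    """numA : num 'and' num | num; returns (tree, next_index) or None."""
    if i < len(tokens) and tokens[i] in NUMS:
        if i + 2 < len(tokens) and tokens[i + 1] == "and" and tokens[i + 2] in NUMS:
            return ("numA", tokens[i], "and", tokens[i + 2]), i + 3
        return ("numA", tokens[i]), i + 1
    return None


def match_num_b(tokens, i):
    """numB : numA 'plus' numA | numA 'plus' | 'plus' numA | numA."""
    left = match_num_a(tokens, i)
    if left:
        tree, j = left
        if j < len(tokens) and tokens[j] == "plus":
            right = match_num_a(tokens, j + 1)
            if right:
                return ("numB", tree, "plus", right[0]), right[1]
            return ("numB", tree, "plus"), j + 1
        return ("numB", tree), j
    if i < len(tokens) and tokens[i] == "plus":
        right = match_num_a(tokens, i + 1)
        if right:
            return ("numB", "plus", right[0]), right[1]
    return None


def find_numbers(text):
    """Collect every numB match, skipping one token when nothing matches."""
    tokens = tokenize(text)
    i, found = 0, []
    while i < len(tokens):
        hit = match_num_b(tokens, i)
        if hit:
            found.append(hit[0])
            i = hit[1]
        else:
            i += 1  # skip a token that cannot start a number
    return found
```

With this scan, find_numbers("twoandone xx one") yields two separate numB trees, and find_numbers("plus xx one") yields a single tree for "one" because the "plus" is skipped.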
I would be grateful for any ideas on a better way to skip tokens that
are not part of a valid number.
Regards,
Toby