[antlr-interest] Non-disjoint tokens
Steve Bennett
stevagewp at gmail.com
Sun Dec 2 20:24:28 PST 2007
On 11/26/07, Gavin Lambert <antlr at mirality.co.nz> wrote:
> The usual trick with common-prefix literals (or perhaps the
> "other" usual trick, since Austin already posted the semantic
> predicate version) is to compose them into a single rule. The key
> point is to explicitly give ANTLR the alternatives so that it
> doesn't try to plunge ahead without looking first.
Hey, I just gave this a go on a similar problem and it works really well!
In this case, I want to recognise ISBNs in normal text and treat them
specially. However, if the ISBN is malformed (even slightly), I want
to treat it like any other sequence of random letters and numbers.
This solution is elegant enough for me:
----
ISBN_LINK:
    // Commit to an ISBN only if the whole thing matches and is followed by
    // letters, punctuation or a newline; otherwise fall back to LETTERS.
    ( (ISBN_LINK_ACTUAL (LETTERS | PUNCTUATION | N)) => ISBN_LINK_ACTUAL
    | LETTERS { $type=LETTERS; }
    );

fragment
ISBN_LINK_ACTUAL:
    'ISBN'
    ' '+
    ('97' ('8' | '9'))?          // optional ISBN-13 prefix
    ((' ' | '-')? '0'..'9')
    ((' ' | '-')? '0'..'9')
    ((' ' | '-')? '0'..'9')
    ((' ' | '-')? '0'..'9')
    ((' ' | '-')? '0'..'9')
    ((' ' | '-')? '0'..'9')
    ((' ' | '-')? '0'..'9')
    ((' ' | '-')? '0'..'9')
    ((' ' | '-')? '0'..'9')
    // Check digit: grouped so the optional separator also applies to 'X'/'x'.
    ((' ' | '-')? ('0'..'9' | 'X' | 'x'));

LETTERS: ('a'..'z' | 'A'..'Z')+;
DIGITS: ('0'..'9')+;
PUNCTUATION: '-' | ' ' | '.' | ',';
N: '\r'? '\n';
----
And as an added bonus, it becomes possible to add trailing
requirements (i.e., the ISBN must be followed by a non-digit) because you
already have the syntactic predicate.
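For example, the "followed by a non-digit" requirement could be spelled
out directly in the predicate. This is only an untested sketch of a
variant of ISBN_LINK: it reuses ISBN_LINK_ACTUAL and LETTERS from the
grammar above and assumes ANTLR 3 lexer syntax, where ~ negates a set of
single characters (here the range '0'..'9').
----
ISBN_LINK:
    // The predicate only looks ahead; the trailing character is not
    // consumed, because the alternative itself matches ISBN_LINK_ACTUAL alone.
    ( (ISBN_LINK_ACTUAL ~('0'..'9')) => ISBN_LINK_ACTUAL
    | LETTERS { $type=LETTERS; }
    );
----
One likely caveat: an ISBN at the very end of the input has no following
character for the predicate to match, so it would probably fall through
to the LETTERS alternative.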
This has made my day :)
QUESTION: Why doesn't putting ~DIGITS in the syntactic predicate work?
Steve