[antlr-interest] use of semantic predicates and hoisting
Arthur Goldberg
goldberg at cbio.mskcc.org
Mon Nov 22 16:22:35 PST 2010
All,
I've built a grammar that uses a couple of sets of keywords in multiple
places. They're called dataTypeNames and dataTypeLevels (they're actually
genetic measurement data types, and levels for discrete values).
The grammar works -- ANTLR is cool -- but I'm having trouble producing
satisfactory error messages.
Here are elided versions of some key rules.
dataTypeSpec
    : dataTypeName
    | dataTypeLevel
    | discreteDataType
    ;

discreteDataType
    : ( dataTypeName comparisonOP dataTypeLevel )
    | ( dataTypeName SIGNED_INT )
    ;

dataTypeName
    : { DataTypeSpecEnumerations.isDataTypeName( input.LT(1).getText() ) }?
      ID
    ;

dataTypeLevel
    : { DataTypeSpecEnumerations.isDataTypeLevel( input.LT(1).getText() ) }?
      ID
    ;

comparisonOP
    : COMPARISON_OP
      {
        // ACTION: convert to enumeration
        $theComparisonOp = ComparisonOp.convertCode( $COMPARISON_OP.text );
      }
    ;

COMPARISON_OP
    // Awkward to convert to an enumeration in COMPARISON_OP because of the
    // char / text distinction for 1-char vs. longer tokens; see bottom of
    // p. 139, T. Parr.
    : ( '<=' | '<' | '>' | '>=' )
    ;

ID  : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
    ;

SIGNED_INT : ('-')? ('0'..'9')+ ;

DataTypeSpecEnumerations.isDataTypeName and
DataTypeSpecEnumerations.isDataTypeLevel indicate whether a String is a
valid dataTypeName or dataTypeLevel, respectively. Those functions are a
little complex, so they cannot be hard-coded in the lexer.
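
For readers who want to try the rules above, here is a minimal stand-in for
that class. It is only a sketch: the real checks are more involved, and the
set contents ("type_a", "level_low", and so on) are placeholder values
assumed for illustration, not the actual data type names or levels.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for DataTypeSpecEnumerations; the real checks are
// described as more complex than a simple set lookup.
final class DataTypeSpecEnumerations {
    // Placeholder values, assumed for illustration only.
    private static final Set<String> NAMES =
            new HashSet<String>(Arrays.asList("type_a", "type_b"));
    private static final Set<String> LEVELS =
            new HashSet<String>(Arrays.asList("level_low", "level_high"));

    private DataTypeSpecEnumerations() { }

    // True when text is one of the known data type names.
    static boolean isDataTypeName(String text) {
        return text != null && NAMES.contains(text);
    }

    // True when text is one of the known data type levels.
    static boolean isDataTypeLevel(String text) {
        return text != null && LEVELS.contains(text);
    }
}
```

Both methods take the token text, which is what the predicates pass in via
input.LT(1).getText().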
The parser correctly recognizes well-formed dataTypeSpecs. But when
the input is wrong, I want to be able to report errors like
    <token> is not a valid <dataTypeName>
or
    <token> is not a valid <dataTypeLevel>
(Given that dataTypeName and dataTypeLevel are each just an ID, the same
token may get reported multiple times. That's OK.)
My thought was to override
    String org.antlr.runtime.BaseRecognizer.getErrorMessage(
        RecognitionException e, String[] tokenNames )
and report these errors when e is a FailedPredicateException.
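
A minimal sketch of the message-building half of that idea, kept free of
ANTLR types so it runs on its own. PredicateErrorMessages and describe are
hypothetical names, not part of ANTLR or of my grammar; the assumption is
that the override would pass in the ruleName field of the
FailedPredicateException and the text of the offending token.

```java
// Hypothetical helper; not part of ANTLR or of the original grammar.
final class PredicateErrorMessages {
    private PredicateErrorMessages() { }

    // Builds a message for the two predicated rules. Returns null for any
    // other rule so the caller can fall back to the default error message.
    static String describe(String ruleName, String tokenText) {
        if ("dataTypeName".equals(ruleName) || "dataTypeLevel".equals(ruleName)) {
            return tokenText + " is not a valid " + ruleName;
        }
        return null;
    }
}
```

Returning null for other rules keeps the default ANTLR message for every
error that does not come from one of these two predicates.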
But to my surprise, bad dataTypeNames or dataTypeLevels don't generate
a FailedPredicateException, because the predicates are hoisted into the
prediction decision for dataTypeSpec.
What's a good way to handle this?
I don't want to combine dataTypeName and dataTypeLevel into a single
production, because they're used in different places.
The predicates must go before the IDs; otherwise dataTypeSpec won't
compile.
Is it possible to turn off hoisting?
Thanks
Arthur
--
Senior Research Scientist
Computational Biology
Memorial Sloan-Kettering Cancer Center