[antlr-interest] Basic predicate question
Mikesell, Darin B.
Darin.Mikesell at gd-ais.com
Thu Jul 1 13:25:23 PDT 2010
I believe that's why, in most languages that support several kinds of numeric literal, each kind has its own unique prefix or suffix.
For example, in C:
0x54 - Hex
54.0f - Float
54 - Decimal
054 - Octal
Because if you think about it, there really is no way for the parser to recognize that a decimal value is supposed to be a hex value if that hex value is 54.
You could do something like:
HEX_LITERAL : '0' ('x' | 'X') HexDigit+ ;
fragment
HexDigit : ('0'..'9'|'a'..'f'|'A'..'F') ;
That's how a hex literal is defined in the C grammar.
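To make the prefix/suffix point concrete, here is a minimal Python sketch that classifies a C-style numeric literal from its spelling alone. The helper name `classify_literal` and the simplified patterns (no exponents, no integer suffixes) are assumptions for illustration, not part of any C grammar or of ANTLR.

```python
import re

# Hypothetical, simplified patterns: each literal kind is recognised purely
# by its prefix or suffix, never by where it appears in the input.
_PATTERNS = [
    ("hex",     re.compile(r"0[xX][0-9a-fA-F]+")),
    ("float",   re.compile(r"[0-9]+\.[0-9]*[fF]?|\.[0-9]+[fF]?")),
    ("octal",   re.compile(r"0[0-7]+")),
    ("decimal", re.compile(r"0|[1-9][0-9]*")),
]

def classify_literal(text):
    """Return the literal kind for text, or None if no pattern matches."""
    for kind, pattern in _PATTERNS:
        if pattern.fullmatch(text):
            return kind
    return None

if __name__ == "__main__":
    for sample in ["0x54", "54.0f", "54", "054", "2D"]:
        print(sample, "->", classify_literal(sample))
```

A bare `54` can only ever come out as decimal here, and a bare `2D` is not recognised at all, which is the same ambiguity the unprefixed two-character hex field runs into.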
- Darin
-----Original Message-----
From: Zeafla, Larry [mailto:zeaflal at aai.textron.com]
Sent: Thursday, July 01, 2010 1:08 PM
To: Mikesell, Darin B.
Subject: RE: Basic predicate question
Thanks. That looks like progress. I am guessing that as I progress
further allowing the first integer to be parsed as either an integer or
a hex value will add some new complications when I need to evaluate the
underlying values. I guess I will worry about that later.
I also see that my example was a little oversimplified. The letter
following the float is totally unrelated to the float. And overall
there are a lot of differing letter choices. I just simplified that
aspect of the example to the point where the independence was lost.
Thanks
Larry
-----Original Message-----
From: Mikesell, Darin B. [mailto:Darin.Mikesell at gd-ais.com]
Sent: Thursday, July 01, 2010 2:53 PM
To: Zeafla, Larry
Subject: RE: Basic predicate question
Something like this will prevent ANTLR from getting confused:
grammar sample;
prog : test+ ;
test : 'TEST' COMMA (INT | HEX_OBJ) COMMA FLOAT_OBJ
COMMA HEX_OBJ ;
HEX_OBJ : HEX_DIGIT HEX_DIGIT ;
HEX_DIGIT
: '0'..'9' | 'A'..'F' | 'a'..'f' ;
INT
: '0'..'9'+ ;
FLOAT_OBJ
: '0'..'9'+ ('.' '0'..'9'*)? ('A' | 'B');
COMMA : ',' ;
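The confusion comes from the lexer choosing token types without knowing what the parser expects next. The Python sketch below models that with a deliberately simplified tokenizer: the longest match wins and ties go to the rule listed first. This is an assumption made for illustration, not ANTLR's actual prediction algorithm, and the rule tables (`ORIGINAL`, `REVISED`), the regex translations, and the placement of the literal tokens ahead of the named rules are likewise hypothetical.

```python
import re

def tokenize(rules, text):
    """Split text into (rule_name, lexeme) pairs.

    Simplified model: longest match wins; on a tie the earlier rule wins.
    """
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strict '>' keeps the first-listed rule when lengths are equal.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Lexer rules of the original grammar from the first message in the thread,
# with the literals from the parser rule placed first (an assumption).
ORIGINAL = [
    ("'TEST'", r"TEST"), ("'A'", r"A"), ("'B'", r"B"),
    ("HEX_DIGIT", r"[0-9A-Fa-f]"),
    ("INT", r"[0-9]+"),
    ("FLOAT", r"[0-9]+(\.[0-9]*)?"),
    ("COMMA", r","),
]

# Lexer rules of the revised grammar above.
REVISED = [
    ("'TEST'", r"TEST"),
    ("HEX_OBJ", r"[0-9A-Fa-f]{2}"),
    ("HEX_DIGIT", r"[0-9A-Fa-f]"),
    ("INT", r"[0-9]+"),
    ("FLOAT_OBJ", r"[0-9]+(\.[0-9]*)?[AB]"),
    ("COMMA", r","),
]

if __name__ == "__main__":
    for line in ["TEST,123,5.6A,2D", "TEST,321,4.20A,3B", "TEST,45,5.68B,78"]:
        print(line)
        print("  original:", [name for name, _ in tokenize(ORIGINAL, line)])
        print("  revised: ", [name for name, _ in tokenize(REVISED, line)])
```

Under this model the original rules turn the trailing `3B` into HEX_DIGIT followed by the literal 'B', and the trailing `78` into a single INT, which is consistent with the two reported mismatches. With the revised rules a two-character hex-like field such as `45` or `78` comes out as HEX_OBJ wherever it appears, which is why the first field has to accept (INT | HEX_OBJ).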
- Darin
-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Zeafla, Larry
Sent: Thursday, July 01, 2010 11:03 AM
To: antlr-interest at antlr.org
Subject: [antlr-interest] Basic predicate question
I am new to Antlr, which I am trying to use to parse simple existing
messages. The message structure is exceptionally simple and
straightforward. Message fields include integer and floating-point
numbers, single letter codes, and field separator characters. Each
individual message type has a narrowly defined structure, needs no look
ahead, and typically has at most 2 possible tokens for any location in
the message.
My problem is that one of the fields is a 2-digit (in ASCII)
representation of a hex number. This is known purely from context. It
seems there should be a simple technique (probably a predicate), to
force this behavior. I just can't seem to find it.
Here is a short sample grammar to illustrate:
grammar sample;
prog : test+ ;
test : 'TEST' COMMA INT COMMA FLOAT ( 'A' | 'B' )
COMMA HEX_DIGIT HEX_DIGIT ;
HEX_DIGIT : '0'..'9' | 'A'..'F' | 'a'..'f' ;
INT : '0'..'9'+ ;
FLOAT : '0'..'9'+ ('.' '0'..'9'*)? ;
COMMA : ',' ;
The associated test input is:
TEST,123,5.6A,2D
TEST,321,4.20A,3B
TEST,45,5.68B,78
For this example, the hex digits are the last 2 characters on each line.
For the first test statement, parsing is successful. For the second, I
get a MismatchedTokenException (0!=0) on the B (the last character).
For the third, I get a MismatchedTokenException(0!=0) on the 7 (the
next to last character). I am definitely confused.
Thanks,
Larry