[antlr-interest] Basic predicate question

Mikesell, Darin B. Darin.Mikesell at gd-ais.com
Thu Jul 1 13:25:23 PDT 2010


I believe that's why in most languages that support various literals, each literal usually has its own unique prefix/suffix.

(e.g. in C	0x54 - Hex
		54.0f - Float
		54 - Decimal
		054 - Octal)


Because if you think about it, there really is no way for the parser to recognize that a decimal value is supposed to be a hex value if that hex value is 54.
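
To make the ambiguity concrete, here is a small sketch in Python (used purely for illustration; it is not part of any grammar). The same two characters give different values depending on the base the reader assumes, and nothing in the text itself says which one is meant:

```python
# The text "54" is a valid numeral in both bases; only outside knowledge picks one.
text = "54"

as_decimal = int(text, 10)   # read the digits as base 10
as_hex = int(text, 16)       # read the same digits as base 16

print(as_decimal)  # 54
print(as_hex)      # 84

# With a C-style prefix the base can be recovered from the text alone.
# Passing base 0 asks Python to infer the base from the prefix.
print(int("0x54", 0))  # 84
print(int("54", 0))    # 54
```

Without a prefix or suffix, the choice of base has to come from somewhere other than the characters.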

You could do something like:

HEX_LITERAL : '0' ('x' | 'X') HexDigit+ ;

fragment
HexDigit : ('0'..'9'|'a'..'f'|'A'..'F') ;


That's how a Hex is defined in the C grammar.
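
As a rough cross-check outside ANTLR, the same shape can be written as a regular expression. This Python sketch is only an illustration of what the rule accepts; the names HEX_LITERAL_RE and is_hex_literal are made up for the example:

```python
import re

# Mirrors the rule: HEX_LITERAL : '0' ('x' | 'X') HexDigit+ ;
HEX_LITERAL_RE = re.compile(r"0[xX][0-9a-fA-F]+")

def is_hex_literal(text: str) -> bool:
    """Return True if the whole string has the form 0x... or 0X... ."""
    return HEX_LITERAL_RE.fullmatch(text) is not None

print(is_hex_literal("0x54"))  # True
print(is_hex_literal("0X2d"))  # True
print(is_hex_literal("54"))    # False: no prefix, so it reads as a plain integer
print(is_hex_literal("0x"))    # False: at least one hex digit is required
```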


- Darin



-----Original Message-----
From: Zeafla, Larry [mailto:zeaflal at aai.textron.com] 
Sent: Thursday, July 01, 2010 1:08 PM
To: Mikesell, Darin B.
Subject: RE: Basic predicate question

Thanks.  That looks like progress.  I am guessing that as I progress
further allowing the first integer to be parsed as either an integer or
a hex value will add some new complications when I need to evaluate the
underlying values.  I guess I will worry about that later.
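
One possible way to handle that evaluation step later: since the first field may arrive as either token type, the conversion can be chosen by the field's position instead of by the token type. A minimal Python sketch of the idea, assuming the first field is always meant as decimal and the last field as hex (both function names are hypothetical):

```python
def first_field_value(token_text: str) -> int:
    """First field: always decimal, whichever token type the lexer assigned."""
    return int(token_text, 10)

def last_field_value(token_text: str) -> int:
    """Last field: always a 2-digit hex value."""
    return int(token_text, 16)

# "45" may be lexed as a hex token, but in the first field it still means forty-five.
print(first_field_value("45"))   # 45
print(first_field_value("123"))  # 123
print(last_field_value("78"))    # 120
print(last_field_value("2D"))    # 45
```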

I also see that my example was a little over simplified.  The letter
following the float is totally unrelated to the float.  And overall
there are a lot of differing letter choices.  I just simplified that
aspect of the example to the point where the independence was lost.

Thanks

     Larry

 

-----Original Message-----
From: Mikesell, Darin B. [mailto:Darin.Mikesell at gd-ais.com] 
Sent: Thursday, July 01, 2010 2:53 PM
To: Zeafla, Larry
Subject: RE: Basic predicate question

Something like this will prevent ANTLR from getting confused:

grammar sample;

prog 	:	test+ ;

test	:	'TEST' COMMA (INT | HEX_OBJ) COMMA FLOAT_OBJ
		COMMA HEX_OBJ ;

HEX_OBJ	:	HEX_DIGIT HEX_DIGIT ;

fragment	// helper only, so a lone digit is never emitted as its own token
HEX_DIGIT
	:	'0'..'9' | 'A'..'F' | 'a'..'f' ;

INT	
	:	'0'..'9'+ ;
	
FLOAT_OBJ	
	:	'0'..'9'+ ('.' '0'..'9'*)? ('A' | 'B');
	
COMMA	:	',' ;
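
To see which token type each field would get under these rules, here is a rough Python model. It is only an approximation: it assumes longest match with ties going to the earlier rule, and it treats HEX_DIGIT as a helper instead of a token in its own right. ANTLR's actual lexer prediction may differ in detail, and RULES and tokenize are names invented for this sketch.

```python
import re

# Token rules in grammar order; earlier rules win ties (an assumption of this model).
RULES = [
    ("TEST", re.compile(r"TEST")),
    ("HEX_OBJ", re.compile(r"[0-9A-Fa-f]{2}")),
    ("INT", re.compile(r"[0-9]+")),
    ("FLOAT_OBJ", re.compile(r"[0-9]+(\.[0-9]*)?[AB]")),
    ("COMMA", re.compile(r",")),
]

def tokenize(line):
    """Split line into (type, text) pairs: longest match, earliest rule on ties."""
    tokens, pos = [], 0
    while pos < len(line):
        best = None
        for name, pattern in RULES:
            m = pattern.match(line, pos)
            # Only a strictly longer match replaces the current best,
            # so on a tie the earlier rule is kept.
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m)
        if best is None:
            raise ValueError(f"no token matches at position {pos}: {line[pos:]!r}")
        tokens.append((best[0], best[1].group()))
        pos = best[1].end()
    return tokens

for line in ["TEST,123,5.6A,2D", "TEST,321,4.20A,3B", "TEST,45,5.68B,78"]:
    print(tokenize(line))
```

In this model "123" comes out as INT while "45" comes out as HEX_OBJ, which is why the test rule accepts (INT | HEX_OBJ) in the first numeric field.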




- Darin



-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Zeafla, Larry
Sent: Thursday, July 01, 2010 11:03 AM
To: antlr-interest at antlr.org
Subject: [antlr-interest] Basic predicate question

I am new to ANTLR, which I am trying to use to parse simple existing
messages.  The message structure is exceptionally simple and
straightforward.  Message fields include integer and floating-point
numbers, single letter codes, and field separator characters.  Each
individual message type has a narrowly defined structure, needs no look
ahead, and typically has at most 2 possible tokens for any location in
the message.

 

My problem is that one of the fields is a 2-digit (in ASCII)
representation of a hex number.  This is known purely from context.  It
seems there should be a simple technique (probably a predicate), to
force this behavior.  I just can't seem to find it.

 

Here is a short sample grammar to illustrate:

          grammar sample;
          prog   :   test+ ;
          test    :   'TEST' COMMA INT COMMA FLOAT ( 'A' | 'B' )
                              COMMA HEX_DIGIT  HEX_DIGIT    ;

          HEX_DIGIT   :  '0'..'9' | 'A'..'F' | 'a'..'f'  ;
          INT         :  '0'..'9'+ ;
          FLOAT       :  '0'..'9'+ ('.' '0'..'9'*)? ; 
          COMMA       :  ',' ;

The associated test input is:

          TEST,123,5.6A,2D
          TEST,321,4.20A,3B
          TEST,45,5.68B,78



For this example, the hex digits are the last 2 characters on each line.
For the first test statement, parsing is successful.  For the second, I
get a MismatchedTokenException (0!=0) on the B (the last character).
For the third, I get a MismatchedTokenException(0!=0)  on the 7 (the
next to last character).  I am definitely confused.
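
For reference, the intended reading of each line, where the last field is hex purely because of its position, can be sketched outside ANTLR. This is a minimal Python illustration of the message layout and not a fix for the grammar; the function name and the dictionary keys (count, value, code, hex) are hypothetical labels chosen for the example.

```python
import re

# A float followed by a single letter code, e.g. "5.6A" (only A and B occur in the sample).
FLOAT_WITH_CODE = re.compile(r"([0-9]+(?:\.[0-9]*)?)([AB])")

def parse_test_line(line):
    """Parse one 'TEST,<int>,<float><code>,<hex>' line by field position."""
    keyword, int_text, float_text, hex_text = line.strip().split(",")
    if keyword != "TEST":
        raise ValueError(f"unexpected keyword: {keyword!r}")
    m = FLOAT_WITH_CODE.fullmatch(float_text)
    if m is None:
        raise ValueError(f"bad float field: {float_text!r}")
    if not re.fullmatch(r"[0-9A-Fa-f]{2}", hex_text):
        raise ValueError(f"bad hex field: {hex_text!r}")
    return {
        "count": int(int_text, 10),   # second field: decimal integer
        "value": float(m.group(1)),   # third field: float part
        "code": m.group(2),           # third field: trailing letter
        "hex": int(hex_text, 16),     # last field: hex, known only from position
    }

print(parse_test_line("TEST,123,5.6A,2D")["hex"])   # 45
print(parse_test_line("TEST,321,4.20A,3B")["hex"])  # 59
print(parse_test_line("TEST,45,5.68B,78")["hex"])   # 120
```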

 

Thanks,

 

    Larry

 

