[antlr-interest] How does INTEGER+ '.' INTEGER+ match "2."?
John B. Brodie
jbb at acm.org
Sun Aug 8 18:21:46 PDT 2010
On Sun, 2010-08-08 at 20:50 -0400, Kevin J. Cummings wrote:
> On 08/08/2010 08:35 PM, Ken Klose wrote:
> > Thanks for replying.
> >
> > 2. is not a valid PRICE. PRICE should have at least 1 digit following the
> > '.'. In the context of the string that I am trying to match "2." doesn't
> > have any particular significance, it is neither an INTEGER nor a PRICE. It
> > is simply an INTEGER followed by a SYMBOL token. What I don't understand
> > is why ANTLR is getting hung up trying to match it as a PRICE when it
> > doesn't conform to the PRICE specification. PRICE specifies another INTEGER
> > following the '.' which this input doesn't have.
>
> Ken,
> What you are missing is that PRICE is a token. Tokens
> get matched based on longest possible match. Once the lexer sees that
> it has an INTEGER followed by a '.', its path is chosen. It's either a
> PRICE or it's an error (which you are seeing). If that is not your
> intent, then you need to fix your lexer so that it knows better.
>
> Gerald poses a possible solution. But, perhaps he doesn't go far
> enough. Would the following work for you?
>
> INTEGER: DIGIT+ ( '.' DIGIT+ { $type=PRICE; } )?
> ;
>
> Now, if the lexer sees an INTEGER followed by a '.', it *must* be
> followed by DIGITs; otherwise, it will just lex an INTEGER, and then try
> and deal with the '.' character....
>
this is (i think) one of the very rare instances where a Syntactic
Predicate is appropriate -- because the implicit look-ahead involved is
clearly bounded. generally you should avoid any predicates and/or
back-tracking because of the potential unbounded look-ahead. but that is
not an issue in this instance.
so try:
INTEGER: DIGIT+ ( ('.' DIGIT)=> '.' DIGIT+ {$type=PRICE;} )? ;
where PRICE is an imaginary token defined in a tokens{} block before any
rule in your grammar.
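to make the bounded look-ahead concrete, here is a rough hand-written sketch (in Python, purely for illustration) of the decision that rule asks the lexer to make. this is NOT the code ANTLR generates; the names lex_number and is_digit are made up for this example, and it assumes DIGIT is '0'..'9'.

```python
# Hand-written illustration of the INTEGER rule's decision, not ANTLR output.
# lex_number and is_digit are hypothetical names; DIGIT is assumed to be '0'..'9'.

def is_digit(ch):
    """True for a single ASCII digit character."""
    return len(ch) == 1 and ch in "0123456789"


def lex_number(text, pos=0):
    """Return (token_type, lexeme, next_pos) for the number starting at pos."""
    start = pos

    # DIGIT+
    while pos < len(text) and is_digit(text[pos]):
        pos += 1
    if pos == start:
        raise ValueError("expected a digit at position %d" % start)

    token_type = "INTEGER"

    # ('.' DIGIT)=>  peek at exactly two characters before committing to '.'
    if (pos + 1 < len(text)
            and text[pos] == "."
            and is_digit(text[pos + 1])):
        pos += 1                      # consume '.'
        while pos < len(text) and is_digit(text[pos]):
            pos += 1                  # DIGIT+
        token_type = "PRICE"          # plays the role of {$type=PRICE;}

    return token_type, text[start:pos], pos


if __name__ == "__main__":
    for sample in ("2.", "2.50", "42"):
        print(sample, "->", lex_number(sample))
```

with the input "2." the sketch stops after the "2" and leaves the '.' for whatever rule comes next, which is the behaviour Ken was after; the look-ahead never inspects more than two characters past the digits.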
also, as an aside, ... i would be *VERY* worried by your SYMBOL lexer
rule --- use of the negation meta-syntax has always given me more
problems than solutions. please be sure to unit-test the heck out of
that puppy ;-) YMMV
> > On Sun, Aug 8, 2010 at 7:28 PM, Gerald Rosenberg <gerald at certiv.net> wrote:
> >
> >>
> >> ------ Original Message (Sunday, August 08, 2010 6:42:55 PM) From: Ken
> >> Klose ------
> >> Subject: [antlr-interest] How does INTEGER+ '.' INTEGER+ match "2."?
> >>
> >> INTEGER: DIGIT+;
> >>> PRICE: INTEGER '.' INTEGER;
> >>>
> >> INTEGER and PRICE are ambiguous and, if "2." is a valid price, you need to
> >> make the decimal field optional.
> >>
> >> Try:
> >>
> >> INTEGER : DIGIT+
> >>           ( '.' (DIGIT+)? { $type=PRICE; } // define PRICE in the token block
> >>           |                                // just an integer
> >>           )
> >>         ;
> >>