[antlr-interest] lexer rule matching problem

Sun Jan 8 20:46:09 PST 2006

Hi John,
   Seems like this should do just what I want. I'll test it out and
let you know.
Thanks,
Tinker
:)

On 1/6/06, John B. Brodie <jbb at acm.org> wrote:
> Tinker Tailor asked:
> >  I am trying to parse a subset of the VBScript language, and have run
> >into the following problem:
> >   The '&' in VBS can be used in two ways -
> >       1. As a concatenation operator
> >              e.g.:  a = b & c    or   a=b&c
> >       2. As part of the prefix ("&H") and optional suffix ('&') for
> >hexadecimal numbers
> >             e.g.:  a=&H9Abc    or  a=&H9Abc&
> >
> >So, here are the rules I made in my lexer (lookahead=3):
> >
> >CONCAT : '&';
> >HEX : "&h" (HEX_DIGIT)+ (('&')?)! ;
> >HEX_DIGIT : '0'..'9' | 'a'..'f' ;
> >
> >Now what I want the lexer to do is to first try and match a hex
> >number, and only when that fails, to try and match for the CONCAT
> >token. But I am not really sure how to tell antlr that. :(
> > As things stand, the lexer first matches CONCAT, and as a result
> >throws the 'unexpected token' exception when I give it the following
> >valid input:
> >     a = &H345ad&
> >
> >Any suggestions?
>
> untested, but perhaps this might do it:
>
> tokens { HEX; }
> CONCAT : '&' (( 'h' (HEX_DIGIT)+ (('&')?)! ){ $setType(HEX); })? ;
> protected HEX_DIGIT : '0'..'9' | 'a'..'f' ;
>
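
The decision the combined CONCAT rule is meant to make can be sketched
by hand. This is plain Python, not ANTLR output, and only an
illustration: the function name lex_ampersand and the (type, lexeme,
next position) return shape are made up for this example, and it
assumes a case-insensitive lexer so that "&H" and "&h" are treated
alike.

```python
HEX_DIGITS = set("0123456789abcdef")


def lex_ampersand(text, pos):
    """Tokenize the '&' found at text[pos].

    Hypothetical helper that mirrors the intent of the combined rule:
    always consume '&', then retype the token as HEX only when 'h' and
    at least one hex digit follow. Returns (type, lexeme, next_pos).
    """
    if text[pos] != "&":
        raise ValueError("expected '&' at position %d" % pos)

    lowered = text.lower()  # assumption: the lexer ignores case
    i = pos + 1
    if i < len(lowered) and lowered[i] == "h":
        j = i + 1
        while j < len(lowered) and lowered[j] in HEX_DIGITS:
            j += 1
        if j > i + 1:  # saw at least one hex digit after 'h'
            end = j
            if end < len(lowered) and lowered[end] == "&":
                end += 1  # optional suffix: consumed but left out of the lexeme
            return ("HEX", text[pos:j], end)

    # No hex prefix followed, so this is the concatenation operator.
    return ("CONCAT", "&", pos + 1)
```

For the failing input from the original question,
lex_ampersand("a = &H345ad&", 4) yields a HEX token and skips the
trailing '&', while lex_ampersand("a=b&c", 3) yields CONCAT.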