[antlr-interest] Help with pesky Lexer determinism

Mark Bednarczyk voytechs at yahoo.com
Fri Jun 10 07:19:23 PDT 2005


Interesting code, thank you.

One thing I would suggest, and this comes from experience: don't
let the lexer tokenize the '/' as part of a NETRANGE and return
the whole thing as a single token. Doing so takes all the
flexibility out of an IP-knowledgeable language. In my NPL
language I overload the '/' operator in expressions: if the left
operand is of type ipaddress (in my case, the IP_V4 or IP_V6
token types), then the right operand must evaluate to a netmask,
given either as another IP_V4/IP_V6 value or as an integer.
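
To make the idea concrete, here is a minimal sketch in Python
rather than NPL. It is not the actual NPL implementation: the
function name apply_slash is hypothetical, and Python's standard
ipaddress module stands in for the IP_V4/IP_V6 token types. It
only shows the type check happening at expression-evaluation
time instead of inside the lexer.

```python
import ipaddress

ADDRESS_TYPES = (ipaddress.IPv4Address, ipaddress.IPv6Address)


def apply_slash(left, right):
    """Evaluate `left / right` once both operands have been evaluated.

    Hypothetical helper: if `left` is an IP address, '/' builds a network;
    otherwise it falls back to ordinary division.
    """
    if not isinstance(left, ADDRESS_TYPES):
        return left / right  # plain arithmetic, '/' is not special here

    bits = left.max_prefixlen  # 32 for IPv4, 128 for IPv6

    if isinstance(right, int) and not isinstance(right, bool):
        # Right operand is an integer prefix length, e.g. 10.0.0.1 / 8
        prefix = right
    elif isinstance(right, type(left)):
        # Right operand is a netmask written as an address of the same family
        mask = int(right)
        prefix = bin(mask).count("1")
        expected = ((1 << bits) - 1) ^ ((1 << (bits - prefix)) - 1)
        if mask != expected:
            raise ValueError(f"{right} is not a contiguous netmask")
    else:
        raise TypeError("right operand of '/' must be a netmask or an integer")

    # strict=False lets host bits be set in the left operand;
    # an out-of-range prefix is rejected here with a ValueError.
    return ipaddress.ip_network(f"{left}/{prefix}", strict=False)
```

Because the right operand is an ordinary expression, it could
just as well come from a variable or a computed value, which is
exactly what is lost when the lexer swallows "address/range" as
one token.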

Check out the examples of this that I had working with a
hand-written lexer/parser I built. If I let the lexer build the
IP address together with its range, I wouldn't have this
flexibility:

http://netrepository.org/jnetstream/releases/0.3.0/index.htm#typecasting_example

Cheers,
mark...



>-----Original Message-----
>From: antlr-interest-bounces at antlr.org
>[mailto:antlr-interest-bounces at antlr.org] On Behalf Of
>Nigel Sheridan-Smith
>Sent: Friday, June 10, 2005 1:31 AM
>To: 'ANTLR Interest'
>Subject: RE: [antlr-interest] Help with pesky Lexer determinism
>
>
>
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org
>> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of
>> Mark Bednarczyk
>> Sent: Tuesday, 7 June 2005 10:16 AM
>> To: ANTLR Interest
>> Subject: RE: [antlr-interest] Help with pesky Lexer determinism
>>
>> Well I have another problem that is a little more involved so
>> maybe I can just get a couple of quick pointers. Same issue but
>> now with IPv6 addresses, which actually step on the toes of the
>> IDENT rule, since an IPv6 address is composed of HEX digits, so
>> 'a'..'f' overlaps with the IDENT rule's 'a'..'z'.
>>
>
>
>Here's a first-cut attempt at solving these issues
>(attached)... obviously you still need some more checks inserted
>to guarantee that the tokens are valid.
>
>Also, it doesn't deal with negative numbers (mantissa or
>exponent). Furthermore, I have an inkling that IPv4 numbers can
>be in different forms (decimal, hexadecimal, dotted and
>non-dotted forms, etc).
>
>Not sure how much time I will have to finish this for you...
>
>Nigel
>
>--
>Nigel Sheridan-Smith
>PhD research student
>
>Faculty of Engineering
>University of Technology, Sydney
>Phone: 02 9514 7946
>Fax: 02 9514 2435
>
>
>
>



