[antlr-interest] Lexer ambiguities

Paul Bouché paul.bouche at apertio.com
Tue Feb 17 15:51:20 PST 2009


Hi,

that does not work. The problem appears when I define a rule for unquoted 
strings like this (where the comma is a delimiter):

Ustring : Integer ' '+ ~('\n' | '{' | ',')
        | Name ' '+ ~('\n' | '{' | ',')
        | ~(' ' | '\n' | ',')+
        ;

The lexer matches >>3<< as an integer, but >>3 << now causes an error, 
whereas before it was fine. Of course, how should the lexer know that in 
one case a blank is whitespace and in another case it is part of the 
value, as in >>3 a<<?
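
One possible reading of the two cases, sketched in Python rather than ANTLR and meant only as an illustration: a run of blanks counts as part of the value only when another value character follows it on the same line. The names `VALUE` and `scan_value` are made up for this sketch, and the delimiter set (blank, newline, comma, `{`) is an assumption based on the rule above.

```python
import re

# Hypothetical pattern: one or more non-delimiter characters, optionally
# followed by groups of "blanks + more non-delimiter characters".
# Trailing blanks are never consumed, so they remain ordinary whitespace.
VALUE = re.compile(r"[^ \n,{]+(?: +[^ \n,{]+)*")


def scan_value(text, pos=0):
    """Return the value starting at pos, or None if no value starts there."""
    match = VALUE.match(text, pos)
    return match.group(0) if match else None


print(repr(scan_value("3")))    # '3'
print(repr(scan_value("3 ")))   # '3'
print(repr(scan_value("3 a")))  # '3 a'
```

With this shape the trailing blank in >>3 << is left over for the whitespace rule, while the inner blank in >>3 a<< is kept inside the value.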

What I would like to write is:

Ustring : ~Name | ~Integer;

but this is not possible.
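
The idea of "anything that is not one of the other tokens" can be sketched as a fallback classification applied to value text that has already been isolated. This is Python, not ANTLR, and only an illustration: `classify`, `PATTERNS`, and the simplified patterns (no escape handling in quoted strings, no digit limit in IP addresses) are assumptions and not the actual grammar.

```python
import re

# Simplified stand-ins for the specific token rules, tried in this order.
PATTERNS = [
    ("Ipaddress", re.compile(r"\d+\.\d+\.\d+\.\d+")),
    ("Integer", re.compile(r"[+-]?\d+")),
    ("String", re.compile(r"'[^'\n]*'|\"[^\"\n]*\"")),
]


def classify(raw):
    """Classify a raw value; fall back to Ustring if nothing specific fits."""
    value = raw.strip(" \t")  # surrounding blanks are not part of the value
    for name, pattern in PATTERNS:
        if pattern.fullmatch(value):
            return name, value
    return "Ustring", value


print(classify("3 "))   # ('Integer', '3')
print(classify("3 a"))  # ('Ustring', '3 a')
```

Because every specific pattern must match the whole trimmed value, >>3 << is still an Integer, and only text that fits none of the specific patterns ends up as an unquoted string.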

BR,
Paul

Sidharth Kuruvila wrote:
> Try moving the rule for Name below Ipaddress.
>
> Regards,
> Sidharth
>
> On Wed, Feb 18, 2009 at 1:23 AM, "Paul Bouché (NSN)" 
> <paul.bouche at nsn.com> wrote:
>
>     Hi,
>
>     I have a lexer which already recognizes valid tokens of different
>     types, e.g. an integer generates an Integer token, a quoted string
>     a String token, an IP address an Ipaddress token, etc.
>     E.g.:
>
>     property : key '=' value;
>     key : Name;
>     value : Integer | String | Ipaddress;
>     Name : ('a'..'z' | 'A'..'Z' | '0'..'9' | '_' | '-' | ':' | '%')+;
>     Integer : ('+'|'-')? ('0'..'9')+;
>     Ipaddress : ('0'..'9')+ '.' ('0'..'9')+ '.' ('0'..'9')+ '.' ('0'..'9')+;
>     // simplified; the actual grammar correctly allows at most three
>     // digits per group
>     String :  ( '\'' ( STRING_ | '`' | '"' | '\\' '\'' )* '\''
>             | '"' ( STRING_ | '`' | '\'' | '\\' '"' )* '"'
>             );
>     WHITESPACE
>           :
>           ( ' ' | '\t' | '\n' )+
>           { skip(); }
>           ;
>
>     All works fine. Now I need to include unquoted strings with
>     blanks. The problem is '0 ' (zero blank, without the quotes of
>     course). I cannot get the lexer to match this as an Integer as
>     before. Basically I want a rule which says: if the input is not one
>     of the previous tokens, try whether it is an unquoted string. Of
>     course an unquoted string may not contain newlines.
>     Any hints on how to achieve this?
>     I tried everything and ran several times into "code too large"
>     exceptions because the actual grammar is much more complex (there
>     are more unquoted values which are recognized by certain prefix
>     characters such as <, 0x, ::, etc.).
>
>     Thanks a bunch!
>     Paul
>
>     --
>     Paul Bouché
>
> -- 
> I am but a man.
