[antlr-interest] Lexer ambiguities
Paul Bouché
paul.bouche at apertio.com
Tue Feb 17 15:51:20 PST 2009
Hi,
that does not work. The problem arises when I define a rule for unquoted
strings like this (where the comma is a delimiter):
Ustring : Integer ' '+ ~('\n' | '{' | ',')
        | Name ' '+ ~('\n' | '{' | ',')
        | ~(' ' | '\n' | ',')+;
With this rule the lexer still matches >>3<< as an integer, but >>3 <<
(with a trailing blank) now causes an error, whereas before it was fine.
Of course, how should the lexer know that the blank is plain whitespace in
one case and part of the value in another, e.g. >>3 a<<?
What I would like to write is:
Ustring : ~Name | ~Integer;
but this is not possible.
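
The "try the specific token types first, otherwise treat it as an unquoted
string" behaviour can be pictured outside ANTLR with a small sketch. This is
plain Python, not ANTLR grammar, and not a fix for the lexer itself: it assumes
the value text has already been split off at the delimiter, and the function
name classify_value and the regular expressions are hypothetical stand-ins for
the Integer, Ipaddress and Name rules quoted further down in this thread.

```python
import re

# Hypothetical regex equivalents of the token rules quoted in this thread.
INTEGER = re.compile(r"[+-]?[0-9]+")
IPADDRESS = re.compile(r"[0-9]{1,3}(?:\.[0-9]{1,3}){3}")
NAME = re.compile(r"[A-Za-z0-9_:%-]+")

# Specific token types are tried in this order; Integer comes before Name
# because a plain number also satisfies the Name pattern.
SPECIFIC_TOKENS = (
    ("Integer", INTEGER),
    ("Ipaddress", IPADDRESS),
    ("Name", NAME),
)


def classify_value(raw):
    """Return (token_type, text) for one already-delimited value."""
    if "\n" in raw:
        raise ValueError("an unquoted value may not contain a newline")
    # Leading and trailing blanks are treated as insignificant whitespace.
    text = raw.strip(" \t")
    for token_type, pattern in SPECIFIC_TOKENS:
        # fullmatch: the whole value must fit the pattern, not just a prefix.
        if pattern.fullmatch(text):
            return token_type, text
    # Nothing specific matched, so fall back to an unquoted string.
    return "Ustring", text
```

With these assumed patterns, both 3 and 3 followed by a blank classify as
Integer, while 3 a falls through to Ustring because the inner blank stops
every specific pattern from matching the whole value.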
BR,
Paul
Sidharth Kuruvila wrote:
> Try moving the rule for Name below Ipaddress.
>
> Regards,
> Sidharth
>
> On Wed, Feb 18, 2009 at 1:23 AM, "Paul Bouché (NSN)"
> <paul.bouche at nsn.com> wrote:
>
> Hi,
>
> I have a lexer which already recognizes valid tokens of different types,
> e.g. an integer will generate an Integer token, a quoted string a String
> token, an IP address an Ipaddress token, etc.
> For example:
>
> property : key '=' value;
> key : Name;
> value : Integer | String | Ipaddress;
> Name : ('a'..'z' | 'A'..'Z' | '0'..'9' | '_' | '-' | ':' | '%')+;
> Integer : ('+'|'-')? ('0'..'9')+;
> Ipaddress : ('0'..'9')+ '.' ('0'..'9')+ '.' ('0'..'9')+ '.' ('0'..'9')+;
> // simplified; the actual grammar correctly allows at most three digits per group
> String : ( '\'' ( STRING_ | '`' | '"' | '\\' '\'' )* '\''
> | '"' ( STRING_ | '`' | '\'' | '\\' '"' )* '"'
> );
> WHITESPACE
> :
> ( ' ' | '\t' | '\n' )+
> { skip(); }
> ;
>
> All works fine. Now I need to include unquoted strings with blanks. The
> problem is '0 ' (zero blank, without the quotes of course). I cannot get
> the lexer to match this as an Integer as before. Basically I want a rule
> which says: if the input is not one of the previous tokens, check whether
> it is an unquoted string. Of course an unquoted string may not contain
> newlines.
> Any hints on how to achieve this?
> I tried everything and ran several times into "code too large" exceptions
> because the actual grammar is much more complex (there are more unquoted
> values which are recognized by certain prefix characters such as <, 0x,
> :: etc.).
>
> Thanks a bunch!
> Paul
>
> --
> Paul Bouché
> Voice: +49 30 590080-1284
>
>
> --
> I am but a man.