[antlr-interest] Newbie Q: lexer behaviour and lookahead
Mark Junker
m.junker at hm-software.de
Mon Aug 7 01:46:38 PDT 2006
Hi,
First, I want to say that ANTLR looks like a great product that will most
likely fit our needs. But I have a (probably stupid) question about
the lexer and the expressions it supports.
I use ANTLR 2.7.6 with the C# code generator on a Windows XP Pro SP2 box.
Let's assume that we have a file containing one
"property_id=property_value" per line. We want to allow "==" as an
escape sequence that stands for a literal "=" inside the property_id.
I first tried the following token specification:
protected PROPERTY_ID_PART : (~('='|'"'|'\''|'{'|'}'|';'))+;
protected EQUALS_ESCAPE : "==";
SEMICOLON : ';';
PROPERTY_ID : PROPERTY_ID_PART (EQUALS_ESCAPE (PROPERTY_ID_PART)?)*;
In the parser I used something like this:
property :
    property_id EQUALS property_value SEMICOLON
    ;
property_id :
    PROPERTY_ID
    ;
property_value :
    // more stuff ...
    ;
But I ran into a problem when I tried to parse the following input:
A=B
I always get the message that '=' was expected, but 'B' was found. What
went wrong? I expected that when the lexer doesn't find a second '=', it
would simply leave the first '=' alone and return the text matched so
far as the token text.
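The reported error is consistent with the lexer deciding to enter the (EQUALS_ESCAPE ...)* loop after looking at only one character. The following is a minimal hand-written Python model of the PROPERTY_ID rule, not ANTLR-generated code. It assumes the loop decision uses k characters of lookahead and that a lexer rule does not backtrack once it has committed; the names `match_property_id`, `STOP`, and `k` are made up for this illustration.

```python
# Hand-written model of the PROPERTY_ID rule above; NOT ANTLR output.
# Assumption: the loop decision looks at k characters and never backtracks.
STOP = set("=\"'{};")  # characters excluded by PROPERTY_ID_PART


def match_property_id(text, k):
    """Return (token_text, next_pos); raise SyntaxError on a mismatch."""
    pos = 0

    def part():
        # PROPERTY_ID_PART: one or more characters not in STOP.
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in STOP:
            pos += 1
        return pos > start

    if not part():
        raise SyntaxError("expecting PROPERTY_ID_PART at %d" % pos)

    # Enter the loop when the next k characters are a prefix of "==".
    while text[pos:pos + k] == "=="[:k]:
        pos += 1  # committed: consume the first '=' of EQUALS_ESCAPE
        if text[pos:pos + 1] != "=":
            found = text[pos:pos + 1] or "<EOF>"
            raise SyntaxError("expecting '=', found '%s'" % found)
        pos += 1  # consume the second '='
        part()    # optional trailing PROPERTY_ID_PART

    return text[:pos], pos
```

In this model, `match_property_id("A=B", 1)` commits to the escape on the single '=' and fails on 'B', which mirrors the message above, while `match_property_id("A=B", 2)` stops before the '=' and returns "A" as the token text. If the real lexer behaves the same way, a larger lookahead depth for the lexer or a syntactic predicate in front of EQUALS_ESCAPE would be the things to try.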
--
Kind regards,
Mark Junker
HM-Software