[antlr-interest] Newbie Q: lexer behaviour and lookahead

Mark Junker m.junker at hm-software.de
Mon Aug 7 01:46:38 PDT 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

First, I want to say that ANTLR seems to be a great product that will
most likely fit our needs. But I have a (probably stupid) question about
the lexer and the expressions it supports.

My setup: ANTLR 2.7.6 with the C# code generator on a Windows XP Pro SP2 box.

Let's assume that we have a file containing one
"property_id=property_value" pair per line. We want to allow "==" as an
escape sequence, so that a literal "=" can appear inside the property_id.
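
To make the intended format concrete, here is a small hand-written Python
sketch of the splitting rule described above. It is only an illustration
under assumptions: the function name split_property is hypothetical, the
returned id has "==" already collapsed to "=", and a value that itself
starts with "=" is not handled.

```python
def split_property(line):
    """Split 'id=value' at the first '=' that is not part of a '==' escape."""
    id_chars = []
    i = 0
    while i < len(line):
        if line[i] == "=":
            # Look one character further: '==' is an escaped '=' inside the id.
            if line[i + 1:i + 2] == "=":
                id_chars.append("=")
                i += 2
                continue
            # A single '=' separates the id from the value.
            return "".join(id_chars), line[i + 1:]
        id_chars.append(line[i])
        i += 1
    raise ValueError("no unescaped '=' separator in: %r" % line)
```

With this reading, "A=B" splits into the id "A" and the value "B", while
"a==b=c" splits into the id "a=b" and the value "c".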

I first tried to use the following token specification:

protected PROPERTY_ID_PART : (~('='|'"'|'\''|'{'|'}'|';'))+;
protected EQUALS_ESCAPE : "==";
SEMICOLON : ';';
PROPERTY_ID : PROPERTY_ID_PART (EQUALS_ESCAPE (PROPERTY_ID_PART)?)*;

In the parser I used something like this:
property :
	property_id EQUALS property_value SEMICOLON
;

property_id :
	PROPERTY_ID
;

property_value :
	// more stuff ...
;

But I ran into a problem when I tried to parse the following input:
A=B

I always get an error saying that '=' was expected but 'B' was found.
What went wrong? I expected that when the lexer does not find a second
'=', it would leave the first '=' unconsumed and return the text matched
so far as the PROPERTY_ID token.
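
The behaviour expected from the PROPERTY_ID rule can be sketched by hand
as follows. This is not code generated by ANTLR, only a Python
illustration of the expectation; the names EXCLUDED and scan_property_id
are hypothetical, and the excluded character set is copied from
PROPERTY_ID_PART in the grammar above.

```python
# Characters that PROPERTY_ID_PART excludes in the grammar above.
EXCLUDED = set("=\"'{};")


def scan_property_id(text, pos=0):
    """Return (token_text, next_pos) for a PROPERTY_ID starting at pos."""
    start = pos
    # PROPERTY_ID_PART: one or more non-excluded characters.
    while pos < len(text) and text[pos] not in EXCLUDED:
        pos += 1
    if pos == start:
        raise ValueError("expected PROPERTY_ID_PART at offset %d" % start)
    # (EQUALS_ESCAPE (PROPERTY_ID_PART)?)*: enter the loop only when the
    # next TWO characters are '=='; a lone '=' ends the token unconsumed.
    while text.startswith("==", pos):
        pos += 2
        while pos < len(text) and text[pos] not in EXCLUDED:
            pos += 1
    return text[start:pos], pos
```

For the input "A=B" this returns the token text "A" and stops at the
'=', which is then left for the next token. The decision to continue the
token needs two characters of lookahead rather than one.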

- --
Kind regards,
Mark Junker
HM-Software
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.4 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE1v3t+vrgfQU/RLsRAhkFAJ972L2JdlLWhJWKaIhvcHuNNrLjmQCcDgTT
bk+/ZoZ5xFuXt9U2hWKibwk=
=+kgO
-----END PGP SIGNATURE-----