[stringtemplate-interest] What is in an identifier

John Snyders jjsnyders at rcn.com
Thu Oct 26 15:48:49 PDT 2006


Is this a bug?

Why do action.g and group.g/interface.g have different definitions for ID?
The difference is that action.g allows _ as the first character of an
identifier and allows / within the identifier, while group.g/interface.g
allows - within the identifier, which action.g does not.

group.g/interface.g

ID : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9'|'-'|'_')*
;

action.g

ID
options {
testLiterals=true;
}
: ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_'|'/')*
;
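
To make the difference concrete, here is a minimal sketch that transcribes the
two ID rules above into Java regular expressions and tries a few sample
strings against each. The class and method names are made up for
illustration and are not part of ST. A whole-string regex match is only an
approximation of what an ANTLR lexer rule does, since the lexer matches a
prefix of its input.

```java
import java.util.regex.Pattern;

// Hypothetical helper for comparing the two ID rules; not part of ST.
final class IdDefinitions {
    // Transcription of ID from group.g/interface.g: letter first, then
    // letters, digits, '-' or '_'.
    static final Pattern GROUP_ID = Pattern.compile("[a-zA-Z][a-zA-Z0-9_-]*");

    // Transcription of ID from action.g: letter or '_' first, then
    // letters, digits, '_' or '/'.
    static final Pattern ACTION_ID = Pattern.compile("[a-zA-Z_][a-zA-Z0-9_/]*");

    static boolean isGroupId(String s) {
        return GROUP_ID.matcher(s).matches();
    }

    static boolean isActionId(String s) {
        return ACTION_ID.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Sample identifiers chosen to hit each point where the rules differ.
        for (String s : new String[] {"name", "_name", "a/b", "a-b"}) {
            System.out.println(s + " group=" + isGroupId(s) + " action=" + isActionId(s));
        }
    }
}
```

Under this transcription, "_name" and "a/b" are accepted only by the action.g
rule, and "a-b" only by the group.g/interface.g rule.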

It seems to me that, since identifiers are used to reference Java members or
properties, the syntax for ID should be close to (if not the same as)
what Java allows for identifiers.

The Java language spec is a little vague about what it allows in an
identifier, but it is basically any character that some language would use
to form words.
I only speak English, so I only use a-z, A-Z and 0-9, but I can define a
property getter such as: public int get\u0391Fooa() {...}. I would not be
able to access this property using ST.

Specifically, Java allows _ and $ in any position of an identifier, including
the first. Using $ is discouraged, which is good because it would conflict
with ST's use of $.

How did / sneak in?

For consistency and to reduce confusion I think that all the grammars should
use the same definition for ID.

I think that _ should be allowed as the first character in ID since Java
allows it.

For better i18n support, ID should accept the same set of characters that
Character.isJavaIdentifierStart and Character.isJavaIdentifierPart accept.
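
As a rough sketch of what that rule would accept, the following check applies
Character.isJavaIdentifierStart to the first code point and
Character.isJavaIdentifierPart to the rest. The class and method names are
hypothetical, and this is not how the ST lexers are written. It also does not
reject Java keywords, and an ST grammar would presumably still need to
exclude $ because of the conflict mentioned above.

```java
// Hypothetical helper; shows which strings the Java character rules accept.
final class JavaIdCheck {
    // Returns true if s is non-empty, starts with a Java identifier start
    // character, and continues with Java identifier part characters.
    static boolean isJavaIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Walk by code point so characters outside the BMP are handled.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        // "\u0391Fooa" is the property name from the getter example above.
        for (String s : new String[] {"_foo", "\u0391Fooa", "$x", "a/b", "a-b", "9a"}) {
            System.out.println(s + " -> " + isJavaIdentifier(s));
        }
    }
}
```

With this check, "_foo", "\u0391Fooa" and "$x" are accepted, while "a/b",
"a-b" and "9a" are rejected, so neither / nor - would survive in ID.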

-John