[antlr-interest] Parsing Strings with placeholders
Joern Gebhardt
joern.gebhardt at gmail.com
Wed Feb 25 02:16:51 PST 2009
Hi,
my language lets users define and use variables/placeholders, similar to
UNIX shell scripts:
a = "wonderful"
b = "The weather is ${a}."
These placeholder variables may only be used inside STRING
expressions.
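For illustration, the intended expansion can be sketched in plain Java, independent of ANTLR. This is only a rough sketch of the desired result, not a lexer solution. The class name PlaceholderExpander and the method expand are made up for this example. Two things are assumptions rather than part of the language description above: unknown variables are left untouched, and identifiers may contain any digit after the first character. Escaped dollar signs are not handled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: expands ${name} and $name using a variable table.
final class PlaceholderExpander {

    // Group 1 captures the name in ${name}, group 2 the name in $name.
    // Assumed identifier shape: letter or underscore, then letters, digits, underscores.
    private static final Pattern PLACEHOLDER = Pattern.compile(
        "\\$(?:\\{([_a-zA-Z][_a-zA-Z0-9]*)\\}|([_a-zA-Z][_a-zA-Z0-9]*))");

    static String expand(String text, Map<String, String> variables) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = (m.group(1) != null) ? m.group(1) : m.group(2);
            String value = variables.get(name);
            // Assumption: an undefined variable is left exactly as written.
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> vars = new HashMap<>();
        vars.put("a", "wonderful");
        // prints: The weather is wonderful.
        System.out.println(expand("The weather is ${a}.", vars));
    }
}
```

Doing the substitution as a separate pass like this works on the text of an already matched STRING token. The open point remains how to get the lexer to recognize the placeholders itself.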
My question is: how do I define the lexer/parser rules so that I can
easily replace the placeholders with their content?
Without the placeholders, my STRING lexer rule looks like this:
STRING
    : '"' ( EscapeSequence | ~( '\\' | '"' | '\r' | '\n' ) )* '"'
    ;

fragment
EscapeSequence
    : '\\' ( 'b' | 't' | 'n' | 'f' | 'r' | '\"' | '\'' | '\\'
           | ('0'..'3') ('0'..'7') ('0'..'7')
           | ('0'..'7') ('0'..'7')
           | ('0'..'7')
           )
    ;
Can anybody please give me a hint on how to fit the placeholders into that rule?
I tried this:
IDENTIFIER
    : ( '_' | 'a'..'z' | 'A'..'Z' ) ( '_' | 'a'..'z' | 'A'..'Z' | '0'..'9' )*
    ;
STRING
    : '"' ( LITERAL | PLACEHOLDER )* '"'
    ;

LITERAL
    : ( EscapeSequence | ~( '\\' | '"' | '\r' | '\n' ) )*
    ;

fragment
EscapeSequence
    : '\\' ( 'b' | 't' | 'n' | 'f' | 'r' | '\"' | '\'' | '\\'
           | ('0'..'3') ('0'..'7') ('0'..'7')
           | ('0'..'7') ('0'..'7')
           | ('0'..'7')
           )
    ;

PLACEHOLDER
    : '$' IDENTIFIER
    | '${' IDENTIFIER '}'
    ;
However, the lexer now has no idea that a LITERAL can only occur inside a
STRING, so the rules above are no longer unambiguous.
Thanks in advance for any useful hints,
Joe