[antlr-interest] Parsing Strings with placeholders
Gavin Lambert
antlr at mirality.co.nz
Wed Feb 25 11:04:30 PST 2009
At 23:16 25/02/2009, Joern Gebhardt wrote:
>a = "wonderful"
>b = "The weather is ${a}."
>
>The usage of these placeholder variables is only allowed inside
>of STRING expressions.
[...]
>My question is now, how do I define the lexer/parser rules in an
>intelligent way so that I can easily replace the placeholders by
>their content?
Personally, I wouldn't bother altering the rules -- just lex it
exactly as you did before (as a monolithic string) and then at
parse or tree-walk time put in some custom code to find and
replace the placeholders.
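As a rough illustration of the "custom code" step, here is a minimal Java sketch that expands placeholders in the text of an already-lexed string. The class name PlaceholderExpander, the expand method and the variables map are hypothetical names, not part of ANTLR. It assumes identifiers consist of letters, digits 0-9 and underscores, that unknown names are left untouched, and that escape sequences are handled elsewhere.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ANTLR: call it from a parser or
// tree-walker action with the text of the STRING token.
final class PlaceholderExpander {
    // Matches ${name} (group 1) or $name (group 2).
    // Assumption: identifiers are letters, digits and underscores,
    // and do not start with a digit.
    private static final Pattern PLACEHOLDER = Pattern.compile(
        "\\$(?:\\{([_a-zA-Z][_a-zA-Z0-9]*)\\}|([_a-zA-Z][_a-zA-Z0-9]*))");

    static String expand(String text, Map<String, String> variables) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1) != null ? m.group(1) : m.group(2);
            String value = variables.get(name);
            // Assumption: an unknown variable is left as written
            // rather than reported as an error.
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With a map containing a = "wonderful", PlaceholderExpander.expand("The weather is ${a}.", map) returns "The weather is wonderful.". Doing the substitution after lexing keeps the STRING rule unchanged, at the cost of scanning each string's text a second time.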
>I tried this:
>
>IDENTIFIER
> : ('_' | 'a'..'z' | 'A'..'Z' ) ( '_' | 'a'..'z' | 'A'..'Z' |
> '1'..'9' )*
> ;
>
>STRING
> : '"' ( LITERAL | PLACEHOLDER )* '"'
> ;
>
>LITERAL
> : ( EscapeSequence | ~( '\\' | '"' | '\r' | '\n' ) )*
> ;
>
>
>fragment
>EscapeSequence
> : '\\' ( 'b' | 't' | 'n' | 'f' | 'r' | '\"'
> | '\'' | '\\' | ('0'..'3') ('0'..'7') ('0'..'7') |
> ('0'..'7') ('0'..'7') | ('0'..'7') )
> ;
>
>PLACEHOLDER
> : '$' IDENTIFIER
> | '${' IDENTIFIER '}'
> ;
>
>However, now the Lexer has no idea that a "LITERAL" can only
>exist inside a STRING and the matching for the above rules is not
>unambiguous any more.
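Your LITERAL and PLACEHOLDER rules should be fragments as well
(like EscapeSequence), since you don't want them matched as
tokens at the top level. A fragment rule is only matched when
another lexer rule, such as STRING, refers to it.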