[antlr-interest] Parsing Strings with placeholders

Johannes Luber JALuber at gmx.de
Wed Feb 25 03:47:30 PST 2009


> Hi,
> 
> my language can define and use variables/placeholders, similar to UNIX
> shell scripts:
> 
> a = "wonderful"
> b = "The weather is ${a}."
> 
> These placeholder variables are allowed only inside STRING expressions.
> 
> My question: how do I define the lexer/parser rules so that I can easily
> replace the placeholders with their contents?

I think island grammars may be the solution. See <http://www.antlr.org/wiki/display/ANTLR3/Island+Grammars+Under+Parser+Control> as well as the example mentioned there.
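
If an island grammar turns out to be more machinery than the problem needs, a simpler alternative is to keep STRING as a single token and expand the placeholders afterwards, for example in a parser action. The following is only a rough sketch, assuming the Java target; PlaceholderExpander and expand are hypothetical names and not part of ANTLR, and the identifier shape (letter or underscore, then letters, digits or underscores) is an assumption as well. Escape sequences are not interpreted here.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ANTLR: expands ${name} and $name
// inside the text of an already-matched STRING token.
final class PlaceholderExpander {
    // Group 1 captures the name in ${name}, group 2 the name in $name.
    private static final Pattern PLACEHOLDER = Pattern.compile(
        "\\$\\{([A-Za-z_][A-Za-z_0-9]*)\\}|\\$([A-Za-z_][A-Za-z_0-9]*)");

    static String expand(String body, Map<String, String> vars) {
        Matcher m = PLACEHOLDER.matcher(body);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1) != null ? m.group(1) : m.group(2);
            String value = vars.get(name);
            // Assumption: unknown variables are left in the text unchanged.
            String replacement = value != null ? value : m.group();
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With a map containing a = "wonderful", calling PlaceholderExpander.expand("The weather is ${a}.", vars) would yield "The weather is wonderful.". This avoids the lexer ambiguity entirely, at the cost of not exposing the placeholders as separate tokens to the parser.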

Johannes
> 
> Without the placeholders, my STRING lexer rule looks like this:
> 
> STRING
>     :   '"' (  EscapeSequence | ~( '\\' | '"' | '\r' | '\n'  )  )*  '"'
>     ;
> 
> fragment
> EscapeSequence
>     :   '\\' ( 'b' | 't' | 'n' | 'f' | 'r' | '\"' | '\'' | '\\'
>              | ('0'..'3') ('0'..'7') ('0'..'7')
>              | ('0'..'7') ('0'..'7')
>              | ('0'..'7')
>              )
>     ;
> 
> Can anybody please give me a hint on how to get the placeholders into
> that rule?
> 
> I tried this:
> 
> IDENTIFIER
>     : ( '_' | 'a'..'z' | 'A'..'Z' ) ( '_' | 'a'..'z' | 'A'..'Z' | '0'..'9' )*
>   ;
> 
> STRING
>     :    '"' ( LITERAL | PLACEHOLDER )* '"'
>     ;
> 
> LITERAL
>     :    (  EscapeSequence | ~( '\\' | '"' | '\r' | '\n'  )  )*
>     ;
> 
> 
> fragment
> EscapeSequence
>     :   '\\' ( 'b' | 't' | 'n' | 'f' | 'r' | '\"' | '\'' | '\\'
>              | ('0'..'3') ('0'..'7') ('0'..'7')
>              | ('0'..'7') ('0'..'7')
>              | ('0'..'7')
>              )
>     ;
> 
> PLACEHOLDER
>     :    '$' IDENTIFIER
>     |    '${' IDENTIFIER '}'
>     ;
> 
> However, the lexer now has no idea that a LITERAL can only occur inside a
> STRING, so the matching for the above rules is no longer unambiguous.
> 
> Thanks in advance for any useful hints,
> Joe


