[antlr-interest] Expression embedded in arbitrary Text
Dmitry Gusev
dmitry.gusev at gmail.com
Tue Apr 1 04:18:00 PDT 2008
I'd recommend using regular expressions to extract the "#{bla}" parts.
You can then pass each match to your parser as input.
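
A minimal sketch of that idea, in Python for brevity (the grammar itself targets C#, where a regular expression could be used the same way). The names EXPR and split_template are made up for illustration. It assumes expressions are not nested inside one another; it does treat single-quoted strings as a unit, so a '}' inside a string literal does not end the match.

```python
import re

# Matches one "#{ ... }" occurrence. Single-quoted strings (with backslash
# escapes) are consumed as a unit so a '}' inside them is not a terminator.
# Assumption: no "#{...}" nested inside another "#{...}".
EXPR = re.compile(r"#\{(?:'(?:\\.|[^'\\])*'|[^}'])*\}")


def split_template(text):
    """Split text into ("text", s) and ("expr", s) pieces, in source order.

    "expr" pieces keep their "#{" and "}" delimiters, since the posted
    `statement` rule expects them.
    """
    pieces = []
    pos = 0
    for m in EXPR.finditer(text):
        if m.start() > pos:
            # Plain text between the previous match and this one.
            pieces.append(("text", text[pos:m.start()]))
        pieces.append(("expr", m.group(0)))
        pos = m.end()
    if pos < len(text):
        # Trailing plain text after the last expression.
        pieces.append(("text", text[pos:]))
    return pieces
```

Each "expr" piece can then be handed to the generated lexer/parser unchanged, while the "text" pieces are copied through as they are. If nested expressions are needed, a plain regular expression is probably not enough and the matching would have to count delimiters instead.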
On Tue, Apr 1, 2008 at 1:21 PM, Joachim Rosskopf <antlr at b0nz0.de> wrote:
> Hello List,
>
> currently I'm working on a small grammar to build an expression language
> for an ETL tool. This works very well for the expression itself (e.g.
> #{foo.bar('test')}); it gets parsed to the desired AST.
>
> But I'm not able to figure out lexer/parser rules that make it possible
> to embed the expression in arbitrary text (e.g. a URI,
> http://www.dom.com/#{res.uri()}). Every character not consumed by
> the expression should end up in one rule.
>
> Can someone please give me a hint? I attached the grammar.
> Thank you in advance.
>
> Best regards
> ---
> Joachim
>
> grammar el;
>
> options {
> backtrack=true;
> output=AST;
> ASTLabelType=CommonTree;
> language=CSharp;
> }
>
> tokens {
> OBJECT_IDENTIFIER;
> LOGICAL_EXPRESSION;
> FUNCTIONAL_EXPRESSION;
> VALUE_EXPRESSION;
> ARGUMENT_LIST;
> }
>
> @lexer::namespace {
> DataPumper.AntlrExpressionLanguage
> }
>
> @parser::namespace {
> DataPumper.AntlrExpressionLanguage
> }
>
> statement
> : ( options { greedy=true; } : EXPRESSION_OPEN!
> expression EXPRESSION_CLOSE! )+
> ;
>
> expression
> : functionalExpression -> ^( FUNCTIONAL_EXPRESSION
> functionalExpression )
> | valueExpression -> ^( VALUE_EXPRESSION
> valueExpression )
> | literal
> ;
>
> valueExpression
> : objectIdentifier
> ;
>
>
> functionalExpression
> : objectIdentifier BRACE_OPEN! (argumentList)? BRACE_CLOSE!
> ;
>
>
> argumentList
> : argument (SEMICOLON argument )* -> ^( ARGUMENT_LIST
> argument+ )
> ;
>
> argument
> : ( literal | statement )
> ;
>
>
> objectIdentifier
> : IDENTIFIER ( '.' IDENTIFIER )* -> ^( OBJECT_IDENTIFIER
> IDENTIFIER+ )
> ;
>
> fragment
> literal
> : HEX_LITERAL -> ^( HEX_LITERAL )
> | DECIMAL_LITERAL -> ^( DECIMAL_LITERAL )
> | OCTAL_LITERAL -> ^( OCTAL_LITERAL )
> | FLOATING_POINT_LITERAL -> ^( FLOATING_POINT_LITERAL )
> | STRING_LITERAL -> ^( STRING_LITERAL )
> ;
>
> IDENTIFIER
> : LETTER ( LETTER | '0'..'9')*
> ;
>
> fragment
> LETTER
> : 'A'..'Z'
> | 'a'..'z'
> ;
>
> HEX_LITERAL
> : '0' ('x'|'X') HEX_DIGIT+
> ;
>
> DECIMAL_LITERAL
> : ('0' | '1'..'9' '0'..'9'*)
> ;
>
> OCTAL_LITERAL
> : '0' ('0'..'7')+
> ;
>
> fragment
> HEX_DIGIT
> : ('0'..'9' | 'a'..'f' | 'A'..'F')
> ;
>
>
> FLOATING_POINT_LITERAL
> : ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
> | '.' ('0'..'9')+ EXPONENT?
> | ('0'..'9')+ EXPONENT?
> ;
>
> fragment
> EXPONENT
> : ('e'|'E') ('+'|'-')? ('0'..'9')+
> ;
>
>
> STRING_LITERAL
> : '\'' STRING '\''
> ;
>
> fragment
> STRING
> : ( ESCAPESEQ | ~('\'' | '\\') )*
> ;
>
> fragment
> ESCAPESEQ
> : '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
> ;
>
>
> WS
> : (' '|'\r'|'\t'|'\u000C'|'\n') { channel=99; }
> ;
>
> SEMICOLON
> : ','
> ;
>
> EXPRESSION_OPEN
> : '#{'
> ;
>
> EXPRESSION_CLOSE
> : '}'
> ;
>
> BRACE_OPEN
> : '('
> ;
>
> BRACE_CLOSE
> : ')'
> ;
>
> COMMENT
> : '/*' ( options {greedy=false;} : . )* '*/' { channel=99; }
> ;
>
> LINE_COMMENT
> : '//' ~('\n'|'\r')* '\r'? '\n' { channel=99; }
> ;
>
>
--
Dmitry Gusev