[antlr-interest] Expression embedded in arbitrary Text

Dmitry Gusev dmitry.gusev at gmail.com
Tue Apr 1 04:18:00 PDT 2008


I'd recommend using regular expressions to extract the "#{bla}" parts.

You can then feed each match to your parser as its input.
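
A minimal sketch of that idea in Python, not a definitive implementation.
The name split_template and the pattern are assumptions made for
illustration. The pattern deliberately rejects braces inside an expression,
so it does not handle nested "#{...}" arguments or a '}' inside a string
literal; those cases would need a real lexer. The same pattern should carry
over to the C# target with little change.

```python
import re

# Matches "#{...}" whose body contains no braces; group 1 is the expression body.
EXPR = re.compile(r"#\{([^{}]*)\}")


def split_template(text):
    """Split text into ordered ("text", s) and ("expr", s) segments."""
    segments = []
    pos = 0
    for match in EXPR.finditer(text):
        # Everything between the previous match and this one is plain text.
        if match.start() > pos:
            segments.append(("text", text[pos:match.start()]))
        # The captured body is what would be handed to the expression parser.
        segments.append(("expr", match.group(1)))
        pos = match.end()
    # Trailing plain text after the last expression, if any.
    if pos < len(text):
        segments.append(("text", text[pos:]))
    return segments


print(split_template("http://www.dom.com/#{res.uri()}"))
# prints [('text', 'http://www.dom.com/'), ('expr', 'res.uri()')]
```

Each "expr" segment can then be wrapped back into "#{" and "}" (or passed
to the expression rule directly), while the "text" segments are kept as
they are.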


On Tue, Apr 1, 2008 at 1:21 PM, Joachim Rosskopf <antlr at b0nz0.de> wrote:

> Hello List,
>
> Currently I'm working on a small grammar to build an expression language
> for an ETL tool. This works very well for the expression itself (e.g.
> #{foo.bar('test')}). It gets parsed to the desired AST.
>
> But I'm not able to figure out lexer/parser rules that make it possible
> to embed the expression in arbitrary text (e.g. a URI,
> http://www.dom.com/#{res.uri()}). Every character not consumed by
> the expression should end up in one rule.
>
> Can someone please give me a hint? I attached the grammar.
> Thank you in advance.
>
> Best regards
> ---
> Joachim
>
> grammar el;
>
> options {
>        backtrack=true;
>        output=AST;
>        ASTLabelType=CommonTree;
>        language=CSharp;
> }
>
> tokens {
>        OBJECT_IDENTIFIER;
>        LOGICAL_EXPRESSION;
>        FUNCTIONAL_EXPRESSION;
>        VALUE_EXPRESSION;
>        ARGUMENT_LIST;
> }
>
> @lexer::namespace {
>        DataPumper.AntlrExpressionLanguage
> }
>
> @parser::namespace {
>        DataPumper.AntlrExpressionLanguage
> }
>
> statement
>        :       ( options { greedy=true; }  :    EXPRESSION_OPEN! expression EXPRESSION_CLOSE! )+
>        ;
>
> expression
>        :       functionalExpression            -> ^( FUNCTIONAL_EXPRESSION functionalExpression )
>        |       valueExpression                 -> ^( VALUE_EXPRESSION valueExpression )
>        |       literal
>        ;
>
> valueExpression
>        :       objectIdentifier
>        ;
>
>
> functionalExpression
>        :       objectIdentifier BRACE_OPEN! (argumentList)? BRACE_CLOSE!
>        ;
>
>
> argumentList
>        :       argument (SEMICOLON argument )*         -> ^( ARGUMENT_LIST argument+ )
>        ;
>
> argument
>        :        ( literal | statement )
>        ;
>
>
> objectIdentifier
>        :       IDENTIFIER ( '.' IDENTIFIER )* -> ^( OBJECT_IDENTIFIER IDENTIFIER+ )
>        ;
>
> fragment
> literal
>        :       HEX_LITERAL             -> ^( HEX_LITERAL )
>        |       DECIMAL_LITERAL         -> ^( DECIMAL_LITERAL )
>        |       OCTAL_LITERAL           -> ^( OCTAL_LITERAL )
>        |       FLOATING_POINT_LITERAL  -> ^( FLOATING_POINT_LITERAL )
>        |       STRING_LITERAL          -> ^( STRING_LITERAL )
>        ;
>
> IDENTIFIER
>        :       LETTER ( LETTER | '0'..'9')*
>        ;
>
> fragment
> LETTER
>        :       'A'..'Z'
>        |       'a'..'z'
>        ;
>
> HEX_LITERAL
>        :       '0' ('x'|'X') HEX_DIGIT+
>        ;
>
> DECIMAL_LITERAL
>        :       ('0' | '1'..'9' '0'..'9'*)
>        ;
>
> OCTAL_LITERAL
>        :       '0' ('0'..'7')+
>        ;
>
> fragment
> HEX_DIGIT
>        :       ('0'..'9' | 'a'..'f' | 'A'..'F')
>        ;
>
>
> FLOATING_POINT_LITERAL
>        :       ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
>        |       '.' ('0'..'9')+ EXPONENT?
>        |       ('0'..'9')+ EXPONENT?
>        ;
>
> fragment
> EXPONENT
>        :       ('e'|'E') ('+'|'-')? ('0'..'9')+
>        ;
>
>
> STRING_LITERAL
>        :       '\'' STRING '\''
>        ;
>
> fragment
> STRING
>        :       ( ESCAPESEQ | ~('\'' | '\\') )*
>        ;
>
> fragment
> ESCAPESEQ
>        :       '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
>        ;
>
>
> WS
>        :       (' '|'\r'|'\t'|'\u000C'|'\n') { channel=99; }
>        ;
>
> SEMICOLON
>        :       ','
>        ;
>
> EXPRESSION_OPEN
>        :       '#{'
>        ;
>
> EXPRESSION_CLOSE
>        :       '}'
>        ;
>
> BRACE_OPEN
>        :       '('
>        ;
>
> BRACE_CLOSE
>        :       ')'
>        ;
>
> COMMENT
>        :       '/*' ( options {greedy=false;} : . )* '*/' { channel=99; }
>        ;
>
> LINE_COMMENT
>        :       '//' ~('\n'|'\r')* '\r'? '\n' { channel=99; }
>        ;
>
>
--
Dmitry Gusev