[antlr-interest] Expression embedded in arbitrary Text
Dmitry Gusev
dmitry.gusev at gmail.com
Tue Apr 1 04:55:17 PDT 2008
On Tue, Apr 1, 2008 at 3:45 PM, Joachim Rosskopf <antlr at b0nz0.de> wrote:
> Hello Dmitry,
>
> that was the approach I was using previously. I parsed the expressions
> solely with regex. But that was getting pretty ugly and doesn't work well
> with nested or consecutive expressions. So I searched for something
> like:
>
> statement
> : ( options { greedy=false; } : . )+
> | ( options { greedy=true; } : EXPRESSION_OPEN! expression EXPRESSION_CLOSE! )+
> ;
>
> But the dot '.' in the above example stands for any token (any defined
> lexer rule) and not for any character, as I would like to have.
> Is that possible with ANTLR?
You can always declare a token that matches any character you want and
refer to that token in your parser rule.
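
A rough sketch of that idea in ANTLR 3 notation, reusing the token and rule
names from your grammar. The names `template` and `ANY_CHAR` are made up for
this example, and I have not run it against your grammar, so treat it as a
starting point under those assumptions rather than a finished solution:

```
// Hypothetical top-level rule: plain text mixed with embedded expressions.
// In a parser rule, ~EXPRESSION_OPEN matches any single token other than
// EXPRESSION_OPEN, so every token outside "#{ ... }" counts as text.
template
    : ( ~EXPRESSION_OPEN
      | EXPRESSION_OPEN! expression EXPRESSION_CLOSE!
      )+
    ;

// Hypothetical catch-all token: a single character that no other lexer
// rule matches. Keep it as the last lexer rule in the grammar so the more
// specific rules are preferred.
ANY_CHAR
    : .
    ;
```

Two things to watch for with your current lexer rules: WS is sent to a
hidden channel, so whitespace in the surrounding text will not reach the
parser, and the "//" in a URI such as http://www.dom.com/ would most likely
be lexed as the start of a LINE_COMMENT. Those rules would probably need to
change (or be dropped) for the text part to survive intact.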
>
> Regards
> ---
> Joachim
>
> Dmitry Gusev schrieb:
> > I'd recommend using regular expressions to extract the "#{bla}"
> > parts.
> >
> > Then you'll be able to use these match results as input to your
> > parser.
> >
> >
> > On Tue, Apr 1, 2008 at 1:21 PM, Joachim Rosskopf <antlr at b0nz0.de> wrote:
> >
> > Hello List,
> >
> > I'm currently working on a small grammar to build an expression
> > language for an ETL tool. This works very nicely for the expression
> > itself (e.g. #{foo.bar('test')} ); it gets parsed to the desired AST.
> >
> > But I'm not able to figure out lexer/parser rules that make it
> > possible to embed the expression in arbitrary text (e.g. a URI,
> > http://www.dom.com/#{res.uri()} ). So every character not consumed
> > by the expression should end up in one rule.
> >
> > Can someone please give me a hint? I attached the grammar.
> > Thank you in advance.
> >
> > Best regards
> > ---
> > Joachim
> >
> > grammar el;
> >
> > options {
> > backtrack=true;
> > output=AST;
> > ASTLabelType=CommonTree;
> > language=CSharp;
> > }
> >
> > tokens {
> > OBJECT_IDENTIFIER;
> > LOGICAL_EXPRESSION;
> > FUNCTIONAL_EXPRESSION;
> > VALUE_EXPRESSION;
> > ARGUMENT_LIST;
> > }
> >
> > @lexer::namespace {
> > DataPumper.AntlrExpressionLanguage
> > }
> >
> > @parser::namespace {
> > DataPumper.AntlrExpressionLanguage
> > }
> >
> > statement
> > : ( options { greedy=true; } : EXPRESSION_OPEN!
> > expression EXPRESSION_CLOSE! )+
> > ;
> >
> > expression
> > : functionalExpression -> ^(
> > FUNCTIONAL_EXPRESSION functionalExpression )
> > | valueExpression -> ^(
> > VALUE_EXPRESSION valueExpression )
> > | literal
> > ;
> >
> > valueExpression
> > : objectIdentifier
> > ;
> >
> >
> > functionalExpression
> > : objectIdentifier BRACE_OPEN! (argumentList)?
> > BRACE_CLOSE!
> > ;
> >
> >
> > argumentList
> > : argument (SEMICOLON argument )* -> ^(
> > ARGUMENT_LIST argument+ )
> > ;
> >
> > argument
> > : ( literal | statement )
> > ;
> >
> >
> > objectIdentifier
> > : IDENTIFIER ( '.' IDENTIFIER )* -> ^(
> > OBJECT_IDENTIFIER IDENTIFIER+ )
> > ;
> >
> > fragment
> > literal
> > : HEX_LITERAL -> ^( HEX_LITERAL )
> > | DECIMAL_LITERAL -> ^( DECIMAL_LITERAL )
> > | OCTAL_LITERAL -> ^( OCTAL_LITERAL )
> > | FLOATING_POINT_LITERAL -> ^( FLOATING_POINT_LITERAL )
> > | STRING_LITERAL -> ^( STRING_LITERAL )
> > ;
> >
> > IDENTIFIER
> > : LETTER ( LETTER | '0'..'9')*
> > ;
> >
> > fragment
> > LETTER
> > : 'A'..'Z'
> > | 'a'..'z'
> > ;
> >
> > HEX_LITERAL
> > : '0' ('x'|'X') HEX_DIGIT+
> > ;
> >
> > DECIMAL_LITERAL
> > : ('0' | '1'..'9' '0'..'9'*)
> > ;
> >
> > OCTAL_LITERAL
> > : '0' ('0'..'7')+
> > ;
> >
> > fragment
> > HEX_DIGIT
> > : ('0'..'9' | 'a'..'f' | 'A'..'F')
> > ;
> >
> >
> > FLOATING_POINT_LITERAL
> > : ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
> > | '.' ('0'..'9')+ EXPONENT?
> > | ('0'..'9')+ EXPONENT?
> > ;
> >
> > fragment
> > EXPONENT
> > : ('e'|'E') ('+'|'-')? ('0'..'9')+
> > ;
> >
> >
> > STRING_LITERAL
> > : '\'' STRING '\''
> > ;
> >
> > fragment
> > STRING
> > : ( ESCAPESEQ | ~('\'' | '\\') )*
> > ;
> >
> > fragment
> > ESCAPESEQ
> > : '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
> > ;
> >
> >
> > WS
> > : (' '|'\r'|'\t'|'\u000C'|'\n') { channel=99; }
> > ;
> >
> > SEMICOLON
> > : ','
> > ;
> >
> > EXPRESSION_OPEN
> > : '#{'
> > ;
> >
> > EXPRESSION_CLOSE
> > : '}'
> > ;
> >
> > BRACE_OPEN
> > : '('
> > ;
> >
> > BRACE_CLOSE
> > : ')'
> > ;
> >
> > COMMENT
> > : '/*' ( options {greedy=false;} : . )* '*/' {
> > channel=99; }
> > ;
> >
> > LINE_COMMENT
> > : '//' ~('\n'|'\r')* '\r'? '\n' { channel=99; }
> > ;
> >
> >
> > --
> > Dmitry Gusev
>
>