[antlr-interest] A newbie having problems creating his first grammar...

Niklas Söderberg niklas.soderberg at gmail.com
Thu Sep 27 00:41:23 PDT 2007


Bruce,
Thanks a lot for your tips! As you say, when I output the tree in C# code, I
also got the expected results regarding parens. I guess I was too focussed
on the ANTLRWorks diagram to test the actual generated code. Adding the
"EOF!" that you mentioned also made it look correct in the ANTLRWorks
interpreter diagram. Thanks!
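
As an illustration of checking the tree in code rather than in the diagram,
here is a minimal sketch that prints the reduced tree from the quoted message
in nested-parenthesis form. It is Python rather than the generated C#, and
the Node class and to_string_tree function are hypothetical stand-ins, not
the ANTLR runtime's tree type:

```python
# Hypothetical stand-in for an AST node; NOT the ANTLR runtime's tree class.
class Node:
    def __init__(self, text, children=None):
        self.text = text
        self.children = list(children or [])


def to_string_tree(node):
    """Render a node as (root child child ...), or just its text for a leaf."""
    if not node.children:
        return node.text
    inner = " ".join(to_string_tree(child) for child in node.children)
    return "(" + node.text + " " + inner + ")"


# The reduced tree from the quoted message, built by hand for illustration.
tree = Node("<OR>", [
    Node("<WILDCARD>", [Node("abc*")]),
    Node("<WILDCARD>", [Node("qwe*")]),
])

print(to_string_tree(tree))
```

A balanced pair of parentheses around every subtree makes a missing closing
paren easy to spot in plain text.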

I would be very interested in looking at the start of your Lisp parser; it
would probably give me some fresh ideas on how to attack my problem.

The AST rewrite rules certainly simplified the tree. I searched for
some documentation on the rewrite syntax, but couldn't find any. Do you
happen to know where I should look?
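
To show why the simpler tree is easier to act on, here is a minimal sketch of
walking it. Everything in it is an assumption made for illustration: the tree
is hand-built as nested tuples (not an ANTLR tree), <OR> is taken to mean
"any child matches", <AND> "all children match", and <WILDCARD> is
approximated with Python's shell-style fnmatchcase, which also treats ? and
[...] as special characters:

```python
from fnmatch import fnmatchcase

# The reduced tree from the quoted message as nested (text, children) tuples.
# This is a hand-built stand-in, not an ANTLR runtime tree.
TREE = ("<OR>", [
    ("<WILDCARD>", [("abc*", [])]),
    ("<WILDCARD>", [("qwe*", [])]),
])


def matches(node, word):
    """Return True if word satisfies the query tree (assumed semantics)."""
    text, children = node
    if text == "<OR>":
        return any(matches(child, word) for child in children)
    if text == "<AND>":
        return all(matches(child, word) for child in children)
    if text == "<WILDCARD>":
        pattern = children[0][0]  # text of the single string-literal child
        return fnmatchcase(word, pattern)
    raise ValueError("unhandled node: " + text)


for word in ("abcdef", "qwerty", "xyz"):
    print(word, matches(TREE, word))
```

Because each keyword is the root of its own subtree, the walker needs only
one branch per keyword and no bookkeeping for parentheses or commas.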

Thanks again!
Niklas



On 9/25/07, Bruce Pierson <bpierson at theglobal.net> wrote:
>
>  Niklas:
>
>
>
> I also noticed that you didn't include any "AST rewrite" rules in your
> grammar. They are a bit difficult to grok, but worth the time to figure out.
> Adding them to your grammar reduces the tree to:
>
>
>
> 0: <OR>
>    0: <WILDCARD>
>       0: `abc*`
>    1: <WILDCARD>
>       0: `qwe*`
>
>
>
> Which would look like:
>
>
>
>                 <OR>
>                /    \
>      <WILDCARD>      <WILDCARD>
>          |               |
>       'abc*'          'qwe*'
>
>
>
> It will then be much easier for you to traverse this and take appropriate
> action.
>
>
>
> Here's the rewritten grammar:
>
>
>
> grammar Query;
>
> options
> {
> language = 'CSharp';
> output=AST;
> }
>
> fragment KEYWORD_BEGIN : '<';
> fragment KEYWORD_END : '>';
> fragment KEYWORD_LIST
>  : ('WILDCARD' | 'OR' | 'AND' | 'WORD' | 'CASE' )
>  ;
>
> KEYWORD
>  : (KEYWORD_BEGIN KEYWORD_LIST KEYWORD_END)
>  ;
>
> query
>  : opExpr EOF!
>  ;
>
> opExpr  : operation (',' operation)* -> ^(operation operation*) // rewrite AST
>  | KEYWORD '(' opExpr ')' -> ^(KEYWORD opExpr) // rewrite AST
>  ;
>
> operation
>  : (KEYWORD '(' STRING_LITERAL ')') -> ^(KEYWORD STRING_LITERAL) // rewrite AST
>  ;
>
> INTLIT  : ('0'..'9')+;
> STRING_LITERAL : ('`'! (~('`'|'\n'|'\r'))+ '`'!);
> WS : (' ' | '\t' | '\f' | '\r\n' | '\r' | '\n') { channel = HIDDEN; };
>
>
>
> --Bruce
>
>
>  ------------------------------
>
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Niklas Söderberg
> Sent: Tuesday, September 25, 2007 3:23 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] A newbie having problems creating his
> first grammar...
>
>
>
>
>
> Hi all! A newbie here, just discovered ANTLR and it seems like an awesome
> tool! I'm trying to create a grammar for a query language I need to parse in
> C# and I'm totally stuck. Perhaps some kind soul out there could point me in
> the right direction... Feel free to comment on my grammar as well; perhaps
> I'm going at this all wrong?
>
>
>
> My problem is that when I run my grammar in the ANTLRWorks interpreter,
> the last parenthesis in my input is "lost". The interpreter runs without
> error and displays the grammar tree graph, but the tree is "unbalanced" with
> the last parenthesis missing, and I can't understand what in my rules is
> causing this behaviour.
>
>
>
> It's probably a silly mistake on my part, but obviously I can't see it.
>
>
>
> I figured it should be easier to start small, so I picked a piece of the
> input and started working on that, but I can't even get this to work as
> expected :-/ The small sample input I'm trying to parse is this:
>
>
>
> <OR>(<WILDCARD>(`abc*`),<WILDCARD>(`qwe*`))
>
>
>
> using this grammar:
>
> grammar Query;
>
> options
> {
> language = 'CSharp';
> output=AST;
> }
>
> fragment KEYWORD_BEGIN : '<';
> fragment KEYWORD_END : '>';
> fragment KEYWORD_LIST
>  : ('WILDCARD' | 'OR' | 'AND' | 'WORD' | 'CASE' )
>  ;
>
> KEYWORD
>  : (KEYWORD_BEGIN KEYWORD_LIST KEYWORD_END)
>  ;
>
> opExpr  : operation (',' operation)*
>  | KEYWORD '(' opExpr ')'
>  ;
>
> operation
>  : (KEYWORD '(' STRING_LITERAL ')')
>  ;
>
> INTLIT  : ('0'..'9')+;
> STRING_LITERAL : ('`'! (~('`'|'\n'|'\r'))+ '`'!);
> WS : (' ' | '\t' | '\f' | '\r\n' | '\r' | '\n') { channel = HIDDEN; };
>
> Thanks in advance for any help,
> Niklas
>
> If anyone is interested, here is a sample of a complete query that I want
> to parse:
>
> <AND>(((<OR>(<WILDCARD>(`string1*`),<NEAR/5>(<OR>(<WILDCARD>(`string2*`)),<OR>(<WILDCARD>(`string3*`))),<NEAR/5>(<OR>(<CASE><WORD>(`string4`),<CASE><WORD>(`string5`)),<OR>(<WILDCARD>(`string6*`),<WILDCARD>(`string7*`),<WILDCARD>(`string8*`),<WILDCARD>(`string9*`),<WILDCARD>(`string10*`),<WILDCARD>(`string11*`)))))<IN>(MAINTITLE,SUBTITLE,INGRESS,ARTICLETEXT)),(<OR>(<WORD>4617,<WORD>4619,<WORD>4620)<IN>(SOURCE_ID)))
>
>
>
>