[antlr-interest] A newbie having problems creating his first grammar...
Bruce Pierson
bpierson at theglobal.net
Tue Sep 25 08:35:24 PDT 2007
Niklas:
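I would recommend adding a top-level rule that ends in EOF, so the parser
has to consume the whole input instead of stopping quietly at the first
token it cannot match:

query
    : opExpr EOF!
    ;

However, when I run your grammar and do a recursive tree printout (even
without the query rule), the parens balance just fine: 3 lefts and 3 rights
on the sample you sent: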
0: nil
0: <OR>
1: (
2: <WILDCARD>
3: (
4: `abc*`
5: )
6: ,
7: <WILDCARD>
8: (
9: `qwe*`
10: )
11: )
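This was output with:

// Needs: using System; using Antlr.Runtime; using Antlr.Runtime.Tree;
public static void Main(string[] args)
{
    //--Query
    Lexer lexer = new QueryLexer(new ANTLRStringStream(
        "<OR>(<WILDCARD>(`abc*`),<WILDCARD>(`qwe*`))"));
    QueryParser parser = new QueryParser(new CommonTokenStream(lexer));
    // opExpr() returns a rule-return object; its Tree property is the AST root.
    ITree tree = (ITree)parser.opExpr().Tree;
    WriteTree(tree, 0);
}

// Prints a node, then recurses into its children. The number printed is
// the node's index within its parent, not its depth in the tree.
private static void WriteTree(ITree tree, int index)
{
    Console.WriteLine(String.Format("{0}: {1}", index, tree));
    for (int i = 0; i < tree.ChildCount; i++)
        WriteTree(tree.GetChild(i), i);
}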
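
The printout above labels each node with its child index, which is fine for a flat tree but gets hard to read once the AST is nested. Below is a minimal sketch of a depth-indented variant. It assumes a hypothetical Node class as a stand-in for the ANTLR ITree interface, so it runs without the generated parser. Node and TreePrinter are illustrative names, not part of ANTLR.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for an AST node (NOT the ANTLR ITree interface).
class Node
{
    public string Text;
    public List<Node> Children = new List<Node>();

    public Node(string text, params Node[] children)
    {
        Text = text;
        Children.AddRange(children);
    }
}

static class TreePrinter
{
    // Writes each node on its own line, indented two spaces per level of depth.
    public static void WriteTree(Node node, int depth, TextWriter output)
    {
        output.WriteLine(new string(' ', depth * 2) + node.Text);
        foreach (Node child in node.Children)
            WriteTree(child, depth + 1, output);   // pass depth, not child index
    }
}
```

Calling TreePrinter.WriteTree(root, 0, Console.Out) on a nested tree then shows the structure by indentation. With real ANTLR trees the same idea would apply by recursing over ChildCount and GetChild(i) and passing depth + 1.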
Your grammar looks rather Lisp-like (i.e., prefix notation), so you may
want to consider what would be a much easier parser to write, one that
would look for:

(and expr expr (or expr expr (not expr)))

where (and, (or, and (not are functions that always begin with "(". I
have the beginnings of a Lisp parser that does some AST rewrites if you want
to look at it and modify it for your own use.
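
To make the prefix-notation idea concrete, here is a minimal hand-written sketch of a reader for that form. It is not the Lisp parser mentioned above and does not use ANTLR. SExpr and PrefixParser are hypothetical names, atoms are assumed to be whitespace-delimited words, and quoted strings are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical node type: Head is the function name (or the atom text),
// Args holds the nested expressions. An atom has no Args.
class SExpr
{
    public string Head;
    public List<SExpr> Args = new List<SExpr>();

    public override string ToString()
    {
        if (Args.Count == 0) return Head;
        string[] parts = Args.ConvertAll(a => a.ToString()).ToArray();
        return "(" + Head + " " + string.Join(" ", parts) + ")";
    }
}

static class PrefixParser
{
    public static SExpr Parse(string input)
    {
        List<string> tokens = Tokenize(input);
        int pos = 0;
        SExpr result = ParseExpr(tokens, ref pos);
        // Same role as EOF in a grammar rule: reject leftover input.
        if (pos != tokens.Count)
            throw new FormatException("Unexpected trailing input at token " + pos);
        return result;
    }

    // Splits the input into "(", ")" and whitespace-delimited words.
    static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();
        foreach (char c in input)
        {
            if (c == '(' || c == ')' || char.IsWhiteSpace(c))
            {
                if (current.Length > 0) { tokens.Add(current.ToString()); current.Length = 0; }
                if (!char.IsWhiteSpace(c)) tokens.Add(c.ToString());
            }
            else current.Append(c);
        }
        if (current.Length > 0) tokens.Add(current.ToString());
        return tokens;
    }

    static SExpr ParseExpr(List<string> tokens, ref int pos)
    {
        if (pos >= tokens.Count) throw new FormatException("Unexpected end of input");
        string tok = tokens[pos++];
        if (tok == ")") throw new FormatException("Unexpected ')'");
        if (tok != "(") return new SExpr { Head = tok };   // plain atom

        // Every list starts with "(" followed by the function name.
        if (pos >= tokens.Count || tokens[pos] == "(" || tokens[pos] == ")")
            throw new FormatException("Expected a function name after '('");
        var node = new SExpr { Head = tokens[pos++] };
        while (pos < tokens.Count && tokens[pos] != ")")
            node.Args.Add(ParseExpr(tokens, ref pos));
        if (pos >= tokens.Count) throw new FormatException("Missing ')'");
        pos++;   // consume ")"
        return node;
    }
}
```

For example, PrefixParser.Parse("(and a b (or c d (not e)))") yields a node with Head "and" and three arguments, the last being the nested "or" expression. Unbalanced input such as "(and a b" or "(and a))" raises a FormatException rather than being silently accepted.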
--Bruce
_____
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Niklas Söderberg
Sent: Tuesday, September 25, 2007 3:23 AM
To: antlr-interest at antlr.org
Subject: [antlr-interest] A newbie having problems creating his first
grammar...
Hi all! A newbie here, just discovered ANTLR and it seems like an awesome
tool! I'm trying to create a grammar for a query language I need to parse in
C# and I'm totally stuck. Perhaps some kind soul out there could point me in
the right direction. Feel free to comment on my grammar as well; perhaps I'm
going at this all wrong?
My problem is that when I run my grammar in the ANTLRWorks interpreter, the
last parenthesis in my input is "lost". The interpreter runs without error
and displays the tree graph, but the tree is "unbalanced", with the
last parenthesis missing, and I can't understand what in my rules is
causing this behaviour.
It's probably a silly mistake on my part, but obviously I can't see it.
I figured it should be easier to start small, so I picked a piece of the
input and started working on that, but I can't even get this to work as
expected :-/ The small sample input I'm trying to parse is this:
<OR>(<WILDCARD>(`abc*`),<WILDCARD>(`qwe*`))
using this grammar:
grammar Query;

options
{
    language = 'CSharp';
    output=AST;
}

fragment KEYWORD_BEGIN : '<';
fragment KEYWORD_END : '>';
fragment KEYWORD_LIST
    : ('WILDCARD' | 'OR' | 'AND' | 'WORD' | 'CASE' )
    ;

KEYWORD
    : (KEYWORD_BEGIN KEYWORD_LIST KEYWORD_END)
    ;

opExpr
    : operation (',' operation)*
    | KEYWORD '(' opExpr ')'
    ;

operation
    : (KEYWORD '(' STRING_LITERAL ')')
    ;

INTLIT : ('0'..'9')+;
STRING_LITERAL : ('`'! (~('`'|'\n'|'\r'))+ '`'!);
WS : (' ' | '\t' | '\f' | '\r\n' | '\r' | '\n') { channel = HIDDEN; };

Thanks in advance for any help,
Niklas
If anyone is interested, here is a sample of a complete query that I want to
parse:
<AND>(((<OR>(<WILDCARD>(`string1*`),<NEAR/5>(<OR>(<WILDCARD>(`string2*`)),<OR>(<WILDCARD>(`string3*`))),<NEAR/5>(<OR>(<CASE><WORD>(`string4`),<CASE><WORD>(`string5`)),<OR>(<WILDCARD>(`string6*`),<WILDCARD>(`string7*`),<WILDCARD>(`string8*`),<WILDCARD>(`string9*`),<WILDCARD>(`string10*`),<WILDCARD>(`string11*`)))))<IN>(MAINTITLE,SUBTITLE,INGRESS,ARTICLETEXT)),(<OR>(<WORD>4617,<WORD>4619,<WORD>4620)<IN>(SOURCE_ID)))