[antlr-interest] simple query language EBNF
Harald M. Müller
harald_m_mueller at gmx.de
Tue Jan 1 09:45:18 PST 2008
Did you succeed?
I see at least the following problem with your grammar: WS is to be hidden
from the parser ...
WS
: (' '|'\t'|'\r'? '\n')+ {$channel=HIDDEN;} ;
... but you use it in your rules, e.g.
fromSpec returns [IDateRange result]
: FROM WS SPECTEXT
The rule should instead be
fromSpec returns [IDateRange result]
: FROM SPECTEXT
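To illustrate why the parser rule must not mention WS, here is a minimal sketch in Python rather than ANTLR. The names `TOKEN_RE`, `tokenize` and `parse_from_spec` are made up for this illustration and are not part of the grammar. It only mimics the idea of a hidden channel: whitespace is recognized by the tokenizer but never handed to the parser.

```python
import re

# Hypothetical token table for this sketch only; it is not the ANTLR lexer.
TOKEN_RE = re.compile(r"""
    (?P<WS>\s+)
  | (?P<SPECTEXT>'[^']+')
  | (?P<FROM>\bfrom\b)
""", re.VERBOSE)


def tokenize(text):
    """Return (type, text) pairs, dropping WS the way a hidden channel would."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        if m.lastgroup != "WS":  # WS is matched, but the parser never sees it
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def parse_from_spec(tokens):
    """Counterpart of `fromSpec : FROM SPECTEXT`; no WS token is expected."""
    kinds = [kind for kind, _ in tokens]
    if kinds != ["FROM", "SPECTEXT"]:
        raise ValueError(f"expected FROM SPECTEXT, got {kinds}")
    return tokens[1][1].strip("'")
```

A rule written as FROM WS SPECTEXT could never match this token stream, because the WS token is simply not in it.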
For the rest, I would say that you do NOT want "everything behind the
keyword" - at least that would be a very bad language design (have you done
language design for a few languages already??).
A good language should allow the human reader to understand where the
boundaries between "parsed text" and "non-parsed text" are - therefore you
would design the language e.g. so that the "raw text" is embedded in some
delimiters:
from <LastMonth MultipliedBy 3>
filter <WeekDays>
filter <Not Holidays>
set <EachDay 8-hours>
with <Expectations>
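As a rough sketch of what the delimited variant buys you (in Python, purely for illustration; `CLAUSE_RE` and `parse_delimited` are hypothetical names, and nested delimiters are assumed not to occur), the boundaries of the raw text become trivial to find:

```python
import re

# Keywords from the example query; 'display' is left out because it takes no text.
KEYWORDS = ("from", "filter", "set", "with")

# A keyword, optional whitespace, then raw text between '<' and '>' (no nesting).
CLAUSE_RE = re.compile(r"\s*(%s)\s*<([^<>]*)>" % "|".join(KEYWORDS))


def parse_delimited(query):
    """Split a query into (keyword, raw_text) pairs using the <...> delimiters."""
    clauses = []
    pos = 0
    while pos < len(query):
        m = CLAUSE_RE.match(query, pos)
        if m is None:
            if query[pos:].strip() == "":
                break  # only trailing whitespace is left
            raise ValueError(f"cannot parse clause at position {pos}")
        clauses.append((m.group(1), m.group(2)))
        pos = m.end()
    return clauses
```

With delimiters, it no longer matters whether the clauses are on separate lines or all on one line.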
But no! - you'll exclaim at this ... my users can readily find out the
boundaries by ... what? Maybe it's the newlines? - is the following ok??
from LastMonth MultipliedBy 3 filter WeekDays filter Not Holidays set
EachDay 8-hours with Expectations
If it is not, then you have at least an "end delimiter", and you can define
a symbol
REST_OF_TEXT : (~NL)* NL ;
where NL is your definition of an NL character.
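The newline-as-end-delimiter idea can be sketched like this (Python for illustration only; `parse_lines` is a hypothetical helper, and it assumes exactly one clause per line): the keyword is the first word, and everything up to the end of the line is raw text.

```python
KEYWORDS = {"from", "filter", "set", "with"}


def parse_lines(query):
    """Treat each non-empty line as: keyword, then raw text up to the newline."""
    clauses = []
    for line in query.splitlines():
        if not line.strip():
            continue  # skip blank lines
        parts = line.split(None, 1)  # first word, then the rest of the line
        keyword = parts[0]
        rest = parts[1].strip() if len(parts) > 1 else ""
        if keyword not in KEYWORDS:
            raise ValueError(f"line does not start with a keyword: {line!r}")
        clauses.append((keyword, rest))
    return clauses
```

Note that under this scheme a keyword inside the tail is just text: the line "set EachDay with 'u'" yields a single set clause.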
If the above one-liner IS ok (i.e. there need not be new-line separations
between clauses), then you should decree that at least the tokenization of
those "tails" is clear - so that you do NOT allow e.g.
set EachDay with 'u'
with Expectations
(even though it looks nice: days with 'u' are tUesday, ThUrsday, satUrday
and sUnday ;-) ).
In that case, you define a list of tokens for those tails - e.g.,
identifiers (which in your case include dashes), numbers, and whatever. And
the specText then becomes
specText : ( ID | NUMBER | ...)*
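A small sketch of this second variant, again in Python and only as an illustration: the token classes and the names `tokenize` and `parse` are assumptions, and splitting on whitespace is a simplification of a real lexer. Keywords end the current tail, and anything that is not a known token is rejected.

```python
import re

KEYWORDS = {"from", "filter", "set", "with", "display"}

# Assumed token shapes: numbers are plain digits; identifiers may contain dashes.
NUMBER_RE = re.compile(r"[0-9]+")
ID_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]*")


def tokenize(query):
    """Classify whitespace-separated words as KEYWORD, NUMBER or ID."""
    tokens = []
    for word in query.split():
        if word in KEYWORDS:
            tokens.append(("KEYWORD", word))
        elif NUMBER_RE.fullmatch(word):
            tokens.append(("NUMBER", word))
        elif ID_RE.fullmatch(word):
            tokens.append(("ID", word))
        else:
            raise ValueError(f"not a valid token: {word!r}")
    return tokens


def parse(query):
    """Group tokens into (keyword, tail_tokens) clauses; a keyword ends a tail."""
    clauses = []
    for kind, text in tokenize(query):
        if kind == "KEYWORD":
            clauses.append((text, []))
        elif not clauses:
            raise ValueError("query must start with a keyword")
        else:
            clauses[-1][1].append((kind, text))
    return clauses
```

With this, the one-liner parses into five clauses, while "set EachDay with 'u'" is rejected because 'u' is not one of the defined tokens.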
To sum up:
* Either you define delimiters around the "open language", between which
"everything goes" (even there, you may want to track nested parentheses
etc.)
* Or you do not delimit the open segments - then you should define the
tokens allowed in them.
Everything else is not so good, and usually comes under the heading "badly
designed language" ... ... ... ... IMVHO.
Regards
Harald
_____
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Pieter Breed
Sent: Friday, December 14, 2007 7:19 AM
To: antlr-interest at antlr.org
Subject: [antlr-interest] simple query language EBNF
Hi,
I am trying to get a small special purpose query language working with
ANTLR, and I am having some trouble sorting out the right way to do some
things.
The basic domain problem is this:
you have some keywords: 'from', 'with', 'display', 'filter', 'set'
an example of a valid "query" is this:
from LastMonth MultipliedBy 3
filter WeekDays
filter Not Holidays
set EachDay 8-hours
with Expectations
The idea is that ANTLR only takes care of the big structure of the query
(sorting out what string value goes with from, what string value goes with
filter etc) and then I will use these strings and do custom parsing on them.
(Using reflection. E.g., LastMonth is a method on a specific object, it has a
method MultipliedBy which takes a parameter 3, and so on.)
My ANTLR problem is that I want the raw text "LastMonth MultipliedBy 3" as
output from ANTLR, but I don't know how to specify that rule. I don't know
how to say "everything but one of the command words". Below I tried to use
string quoting to delimit the text I am interested in, but that also doesn't
work.
This is what I have at the moment (I am troubleshooting at the moment, so I
put the comments in queryLine rule to help with this.):
grammar WorkLogQL;
tokens {
FROM = 'from';
WITH = 'with';
FILTER = 'filter';
SET = 'set';
DISPLAY = 'display';
}
queryLine
: fromSpec
//(WS filterSpec)*
//WS actionSpec
//WS withSpec
;
fromSpec returns [IDateRange result]
: FROM WS SPECTEXT
{
result = ParseDateRangeSpecification($SPECTEXT.value);
}
;
withSpec
: WITH WS SPECTEXT
;
actionSpec
: DISPLAY
| SET WS SPECTEXT
;
filterSpec
: FILTER WS SPECTEXT
;
SPECTEXT
: '\'' .+ '\''
;
WS
: (' '|'\t'|'\r'? '\n')+ {$channel=HIDDEN;} ;
As is (i.e., with the comments) and with this input:
from 'Today'
The parser falls over in SPECTEXT. When I am running in ANTLRWorks, in the
Interpreter mode, I get a tree that looks something like this:
<grammar worklogql>
<queryLine>
<fromSpec>
<from> - <MismatchedTokenException>
How can I get this working? Any ideas?
Regards,
Pieter
--
Tempus est mensura motus rerum mobilium.
Time is the measure of movement.
-- Auctoritates Aristotelis
+27 82 567 6207
http://pieterbreed.blogspot.com/