[antlr-interest] ANTLR NUB
Jan Nielsen
jan.sture.nielsen at gmail.com
Mon Jan 21 22:25:51 PST 2008
Hi Gavin and Andy,
Thanks for the input! Per your suggestions, I modified my grammar (and
DSL a bit) and the resulting parser now passes these test cases:
"from 1/January/2008"
"from 1/January/2008 to 1/January/2009"
"from 1/January/2008 to 1/January/2009 exclude 1/January/2008"
"from 1/January/2008 to 1/January/2009 exclude 21/January/2008"
"from 1/January/2008 to 1/January/2009 exclude Thursday-Sunday"
"from 1/January/2008 to 1/January/2009 exclude Monday, Wednesday, Friday"
"from 1/January/2008 to 1/January/2009 exclude Thursday[4]/November"
"from 1/January/2008 to 1/January/2009 exclude Thursday-Sunday include June-July"
"from 1/January/2008 to 1/January/2009 exclude Monday-Thursday include 21/January/2008"
"from 1/January exclude 1/January"
"from 1/January exclude 21/January"
"from 1/January exclude Thursday-Sunday"
"from 1/January exclude Monday, Wednesday, Friday"
"from 1/January exclude Thursday[4]/November"
"from 1/January exclude Thursday-Sunday include June-July"
"from 1/January exclude Monday-Thursday include 21/January/2008"
I initially envisioned repeated exclusion and inclusion clauses, but I
don't think I need to support them yet; I'll probably have a go at
that once I get the parser and tie-ins working.
Thanks, again, for your help.
-Jan
grammar T;
options {
    output = AST;
    ASTLabelType = CommonTree;
}
// EOF makes the parser report trailing input instead of silently
// stopping at the first token it cannot match.
prog
    : 'from' date ('to' date)? exclude_clause? include_clause? EOF
    ;
date
    : day_of_month '/' MONTH ('/' year)?
    ;
exclude_clause
    : 'exclude' period (',' period)*
    ;
include_clause
    : 'include' period (',' period)*
    ;
period
    : day_of_month_period
    | day_of_week_period
    | month_period
    ;
day_of_month_period
    : date ('-' date)?
    ;
// ('/' MONTH)? is needed for inputs such as "Thursday[4]/November";
// without it the "/November" part is never consumed.
day_of_week_period
    : DAY_OF_WEEK ('[' occurrence ']')? ('/' MONTH)? ('-' DAY_OF_WEEK)?
    ;
month_period
    : MONTH ('-' MONTH)?
    ;
occurrence
    : NUMBER
    ;
year
    : NUMBER
    ;
MONTH
    : 'January'
    | 'February'
    | 'March'
    | 'April'
    | 'May'
    | 'June'
    | 'July'
    | 'August'
    | 'September'
    | 'October'
    | 'November'
    | 'December'
    ;
day_of_month
    : NUMBER
    ;
DAY_OF_WEEK
    : 'Monday'
    | 'Tuesday'
    | 'Wednesday'
    | 'Thursday'
    | 'Friday'
    | 'Saturday'
    | 'Sunday'
    ;
NUMBER
    : ('0'..'9')+
    ;
WS  : (' '|'\r'|'\t'|'\u000C'|'\n') {$channel=HIDDEN;}
    ;
COMMENT
    : '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
    ;
LINE_COMMENT
    : '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
    ;
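
The grammar above accepts any NUMBER as a day of month, year, or
occurrence. Range checks along the lines Gavin suggests in the quoted
message below could live in small helper methods. The following is only
a sketch: the class name DateFieldChecks is hypothetical, and the
ranges (occurrence 1-4, four-digit years, days 1-31) are assumptions
taken from the earlier OCCURRENCE, YEAR and DAY_OF_MONTH lexer rules.

```java
// Hypothetical helpers; the ranges are assumptions, not part of the grammar.
final class DateFieldChecks {
    private DateFieldChecks() {}

    // Converts digits-only token text to an int; returns -1 if it does not fit.
    static int toInt(String text) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // Assumption: an occurrence picks the 1st to 4th weekday of a month.
    static boolean validateOccurrence(String text) {
        int n = toInt(text);
        return n >= 1 && n <= 4;
    }

    // Assumption: only four-digit years are allowed.
    static boolean validateYear(String text) {
        int n = toInt(text);
        return n >= 1000 && n <= 9999;
    }

    // Checks 1..31 only; it does not know which month the day belongs to.
    static boolean validateDayOfMonth(String text) {
        int n = toInt(text);
        return n >= 1 && n <= 31;
    }
}
```

Because each method returns a boolean, it can be used directly as the
expression of a semantic predicate. A per-month day check (30 versus 31
days, leap years) would need the month and year as well, so it is
better done once the whole date has been recognised.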
On Jan 21, 2008 12:31 PM, Gavin Lambert <antlr at mirality.co.nz> wrote:
> At 06:32 22/01/2008, Jan Nielsen wrote:
> >
> >Here are a few examples of valid expressions:
> >
> > "from 1/January/2008"
> > "from 1/January/2008 to 1/January/2009"
> > "from 1/January/2008 to 1/January/2009 excluding 21/January/2008"
> > "from 1/January/2008 to 1/January/2009 excluding 21/January/2008"
> > "from 1/January/2008 to 1/January/2009 excluding Thursday-Sunday"
> > "from 1/January/2008 to 1/January/2009 excluding Thursday-Sunday including June-July"
> > "from 1/January/2008 to 1/January/2009 excluding Monday-Thursday including 21/January/2008"
> > "from 1/January/2008 to 1/January/2009 excluding Monday-Thursday including 'Dr. Martin Luther King Day'"
> >
> >An "including" after (i.e., to the right of) an "excluding"
> >overrides the exclusion.
>
> Is it permitted to have repeated clauses? I.e., "from X including A
> excluding B including C"?
>
> >prog
> > : 'from' date ('to' date)?
> > ('including' period)? (',' period)*
> > ('excluding' period)? (',' period)*
> > ;
>
> This enforces an order between "including" and "excluding"; one
> which doesn't match your examples above. At minimum to get the
> examples to work (and assuming repeated clauses are not permitted)
> you'll need to reverse these.
>
> Also your scoping on the comma-separated bits is wrong; this
> should be inside the optional clause (otherwise it doesn't make
> much sense). So:
>
> prog
> : 'from' date ('to' date)? excluding_clause? including_clause?
> ;
>
> excluding_clause
> : 'excluding' period (',' period)*
> ;
>
> including_clause
> : 'including' period (',' period)*
> ;
>
> >day_of_month_period
> > : DAY_OF_MONTH (MONTH)? (YEAR)?
> > ;
> >
> >day_of_week_period
> > : DAY_OF_WEEK ('[' OCCURRENCE ']')? (YEAR)?
> > ;
>
> Shouldn't these have slashes? You're also not covering other
> types of constructs permitted by your examples.
>
> >OCCURRENCE
> > : '1'..'4'
> > ;
> >
> >YEAR
> > : '1'..'9' '0'..'9' '0'..'9' '0'..'9'
> > ;
> [...]
> >DAY_OF_MONTH : '1'..'9' | '1'..'2' '0'..'9' | '30' | '31';
>
> You can't do this. The most important thing to remember is that
> all lexing is done up front with no input from the parser (since
> the parser doesn't even exist yet). Thus any non-fragment tokens
> become independent candidates for output. Facing a '3' in the
> input stream, it could match any one of these rules; given no
> clear preference ANTLR will choose the first listed and generate
> an OCCURRENCE token, which you're not accepting when it finally
> does reach the parser.
>
> At the lexer level you should just recognise basic integers, and
> then validate them based on context in the parser:
>
> NUMBER: ('0'..'9')+;
>
> occurrence: n=NUMBER { validateOccurrence($n.text) }?;
> year: n=NUMBER { validateYear($n.text) }?;
> day_of_month: n=NUMBER { validateDayOfMonth($n.text) }?;
>
> > But once I have a parser for my expression, how do I actually
> > use the parser to implement my API???
>
> There are two common ways to do this. One is to output an AST
> from the parser, which in your case could end up looking something
> like this (expressed in string form):
>   ^(FROM ^(DATE 1 January 2008) ^(DATE 1 January 2009)
>     ^(EXCLUDING ^(DAY ^(RANGE Monday Thursday)))
>     ^(INCLUDING ^(DAY "Dr. Martin Luther King Day")))
>
> (Of course the exact syntax is variable; you can put in what you
> want for the most part, although a certain structure will be
> dictated by how the rules are organised.) Then you can just write
> tree-walking code to call your various API functions as
> appropriate. (Or even write a tree parser, though that's usually
> unnecessary.)
>
> Another approach is to simply include the action code as you are
> parsing. For example:
>
> excluding_clause
> : 'excluding' p=period { addExclusion($p.result); }
> (',' p=period { addExclusion($p.result); } )*
> ;
>
> Of course for this to work, you'll need to also enhance the period
> and date rules with 'returns' clauses, which create a data
> structure that describes what they have just recognised.
>
>
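
For the action-code approach described above, the period rule needs
something to return (in ANTLR 3, a clause such as
"period returns [Period result]") and addExclusion needs somewhere to
store it. The following is only a sketch of what that data structure
might look like: Period, Schedule and their fields are hypothetical
names, not part of ANTLR or of the grammar in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value type describing what a 'period' rule recognised.
final class Period {
    enum Kind { DATE, DAY_OF_WEEK, MONTH }

    final Kind kind;
    final String start;
    final String end; // null when the period is a single value, not a range

    Period(Kind kind, String start, String end) {
        this.kind = kind;
        this.start = start;
        this.end = end;
    }

    boolean isRange() {
        return end != null;
    }

    @Override
    public String toString() {
        return kind + "(" + (isRange() ? start + "-" + end : start) + ")";
    }
}

// Hypothetical collector that the grammar's actions could call into.
final class Schedule {
    private final List<Period> exclusions = new ArrayList<>();

    // Called once per recognised period in an exclusion clause.
    void addExclusion(Period p) {
        exclusions.add(p);
    }

    List<Period> exclusions() {
        return exclusions;
    }
}
```

With something like this in place, "exclude Thursday-Sunday" would
result in one call to addExclusion with a DAY_OF_WEEK period whose
start is "Thursday" and whose end is "Sunday". An inclusion list could
be collected the same way.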