[antlr-interest] ANTLR NUB

Gavin Lambert antlr at mirality.co.nz
Mon Jan 21 11:31:30 PST 2008


At 06:32 22/01/2008, Jan Nielsen wrote:
 >
 >Here are a few examples of valid expressions:
 >
 > "from 1/January/2008"
 > "from 1/January/2008 to 1/January/2009"
 > "from 1/January/2008 to 1/January/2009 excluding 21/January/2008"
 > "from 1/January/2008 to 1/January/2009 excluding 21/January/2008"
 > "from 1/January/2008 to 1/January/2009 excluding Thursday-Sunday"
 > "from 1/January/2008 to 1/January/2009 excluding Thursday-Sunday including June-July"
 > "from 1/January/2008 to 1/January/2009 excluding Monday-Thursday including 21/January/2008"
 > "from 1/January/2008 to 1/January/2009 excluding Monday-Thursday including 'Dr. Martin Luther King Day'"
 >
 >A "including" after an "excluding", i.e., to the right of,
 >overrides the exclusion.

Is it permitted to have repeated clauses, i.e. something like 
"from X including A excluding B including C"?

 >prog
 >    : 'from' date ('to' date)?
 >      ('including' period)? (',' period)*
 >      ('excluding' period)? (',' period)*
 >    ;

This enforces an order between "including" and "excluding", and it 
is the opposite of the order used in your examples above.  At 
minimum, to get the examples to work (and assuming repeated clauses 
are not permitted), you'll need to swap the two clauses.

Also, the scoping of your comma-separated lists is wrong: each 
(',' period)* loop should be inside its optional clause.  As 
written, a ", period" can appear even when the keyword is absent, 
which doesn't make much sense.  So:

prog
   : 'from' date ('to' date)? excluding_clause? including_clause?
   ;

excluding_clause
   : 'excluding' period (',' period)*
   ;

including_clause
   : 'including' period (',' period)*
   ;

 >day_of_month_period
 >    : DAY_OF_MONTH (MONTH)? (YEAR)?
 >    ;
 >
 >day_of_week_period
 >    : DAY_OF_WEEK ('[' OCCURRENCE ']')? (YEAR)?
 >    ;

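Shouldn't these have slashes between the parts, as in 
"1/January/2008"?  You're also not covering the other constructs 
your examples use, such as ranges ("Thursday-Sunday", "June-July") 
and quoted names ('Dr. Martin Luther King Day').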

 >OCCURRENCE
 >    : '1'..'4'
 >    ;
 >
 >YEAR
 >    : '1'..'9' '0'..'9' '0'..'9' '0'..'9'
 >    ;
[...]
 >DAY_OF_MONTH : '1'..'9' | '1'..'2' '0'..'9' | '30' | '31';

You can't do this.  The most important thing to remember is that 
all lexing is done up front with no input from the parser (the 
parser hasn't seen any tokens yet, so it can't tell the lexer what 
it expects).  Thus all non-fragment lexer rules are independent 
candidates for every piece of input.  A '3' in the input stream 
could match either OCCURRENCE or DAY_OF_MONTH; given no clear 
preference, ANTLR will choose the first listed and generate an 
OCCURRENCE token, which your parser then rejects wherever it was 
expecting a DAY_OF_MONTH.
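
The problem can be illustrated with a toy model in Python.  This is 
only a sketch: it is not ANTLR's actual prediction algorithm, and 
the regular expressions are my own hand translation of the quoted 
rules.  It assumes "longest match wins, ties go to the rule listed 
first", which is enough to show that the choice is made without 
any knowledge of what the parser wants.

```python
import re

# Token rules in the order they appear in the quoted grammar.
# The patterns are hand-translated approximations of the ANTLR rules.
RULES = [
    ("OCCURRENCE", r"[1-4]"),
    ("YEAR", r"[1-9][0-9]{3}"),
    ("DAY_OF_MONTH", r"30|31|[12][0-9]|[1-9]"),
]

def first_token(text):
    """Pick a token type for the start of text: longest match, then rule order."""
    best = None
    for name, pattern in RULES:
        m = re.match(pattern, text)
        # Strictly longer matches replace the current best; ties keep the earlier rule.
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best

print(first_token("3"))     # -> ('OCCURRENCE', '3')
print(first_token("21"))    # -> ('DAY_OF_MONTH', '21')
print(first_token("2008"))  # -> ('YEAR', '2008')
```

Note that "3" comes out as OCCURRENCE even in "3/January/2008", 
where a day of the month was intended.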

At the lexer level you should just recognise basic integers, and 
then validate them based on context in the parser:

NUMBER: ('0'..'9')+;

occurrence: n=NUMBER { validateOccurrence($n.text) }?;
year: n=NUMBER { validateYear($n.text) }?;
day_of_month: n=NUMBER { validateDayOfMonth($n.text) }?;
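
The validate functions are left for you to write in your target 
language.  As a rough sketch of what they might check, here they 
are in Python; the snake_case names are just a rendering of the 
names used in the rules above, and the accepted ranges are taken 
from your original lexer rules.  Each returns a boolean, as a 
predicate needs.

```python
import re

_DIGITS = re.compile(r"[0-9]+")

def _as_int(text):
    """Return the integer value of an all-digit string, or None otherwise."""
    return int(text) if _DIGITS.fullmatch(text) else None

def validate_occurrence(text):
    """True for 1..4, the range of the quoted OCCURRENCE rule."""
    n = _as_int(text)
    return n is not None and 1 <= n <= 4

def validate_year(text):
    """True for four digits with no leading zero, as in the quoted YEAR rule."""
    return _as_int(text) is not None and len(text) == 4 and text[0] != "0"

def validate_day_of_month(text):
    """True for 1..31.

    Assumption: leading zeros ("01") are accepted here, unlike the quoted
    DAY_OF_MONTH rule, and the day is not checked against a specific month.
    """
    n = _as_int(text)
    return n is not None and 1 <= n <= 31
```

Checking the day against the actual month (30 vs. 31 days, leap 
years) would need the month and year as well, so that is better 
done once the whole date has been recognised.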

 > But once I have a parser for my expression, how do I actually
 > use the parser to implement my API???

There are two common ways to do this.  One is to output an AST 
from the parser, which in your case could end up looking something 
like this (expressed in string form):
   ^(FROM ^(DATE 1 January 2008) ^(DATE 1 January 2009)
      ^(EXCLUDING ^(DAY ^(RANGE Monday Thursday)))
      ^(INCLUDING ^(DAY "Dr. Martin Luther King Day")))

(Of course the exact syntax is variable; you can put in what you 
want for the most part, although a certain structure will be 
dictated by how the rules are organised.)  Then you can just write 
tree-walking code to call your various API functions as 
appropriate.  (Or even write a tree parser, though that's usually 
unnecessary.)
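
As a rough illustration of the tree-walking idea, here is a sketch 
in Python that uses nested tuples in place of ANTLR's tree nodes.  
The Schedule class and its method names are hypothetical stand-ins 
for your API (it only records the calls it receives), and the tree 
shape is the one shown above.

```python
# AST as nested tuples: (node_type, child, child, ...); leaves are plain values.
TREE = (
    "FROM",
    ("DATE", 1, "January", 2008),
    ("DATE", 1, "January", 2009),
    ("EXCLUDING", ("DAY", ("RANGE", "Monday", "Thursday"))),
    ("INCLUDING", ("DAY", "Dr. Martin Luther King Day")),
)

class Schedule:
    """Hypothetical stand-in for the real API; it just records each call."""

    def __init__(self):
        self.calls = []

    def set_start(self, date):
        self.calls.append(("set_start", date))

    def set_end(self, date):
        self.calls.append(("set_end", date))

    def add_exclusion(self, period):
        self.calls.append(("add_exclusion", period))

    def add_inclusion(self, period):
        self.calls.append(("add_inclusion", period))

def walk(node, schedule):
    """Visit a FROM tree and issue one API call per date and per period."""
    kind, *children = node
    if kind != "FROM":
        raise ValueError(f"expected FROM node, got {kind!r}")
    dates = [c for c in children if c[0] == "DATE"]
    schedule.set_start(dates[0][1:])
    if len(dates) > 1:
        schedule.set_end(dates[1][1:])
    # Clauses are visited in source order, so a later clause can override an earlier one.
    for child in children:
        if child[0] == "EXCLUDING":
            for period in child[1:]:
                schedule.add_exclusion(period)
        elif child[0] == "INCLUDING":
            for period in child[1:]:
                schedule.add_inclusion(period)

schedule = Schedule()
walk(TREE, schedule)
```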

Another approach is to simply include the action code as you are 
parsing.  For example:

excluding_clause
   : 'excluding' p=period { addExclusion($p.result); }
     (',' p=period { addExclusion($p.result); } )*
   ;

Of course for this to work, you'll need to also enhance the period 
and date rules with 'returns' clauses, which create a data 
structure that describes what they have just recognised.
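
To give an idea of the kind of data structure meant here, the 
following Python sketch builds a small value object from the text 
a period rule has matched.  Everything in it is an assumption for 
illustration: the Period class, its "kind" labels, and 
make_period() are invented names, and in a real grammar the rule 
would declare something like "returns [Period result]" and fill 
it in from its own actions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Period:
    """Hypothetical result object describing one recognised period."""
    kind: str                  # "named", "date", "range", or "single"
    start: str
    end: Optional[str] = None  # only set for ranges

def make_period(text):
    """Build the value a period rule might return for the text it matched."""
    if len(text) >= 2 and text.startswith("'") and text.endswith("'"):
        return Period("named", text[1:-1])      # e.g. a quoted holiday name
    if "/" in text:
        return Period("date", text)             # e.g. 21/January/2008
    if "-" in text:
        start, end = text.split("-", 1)
        return Period("range", start, end)      # e.g. Monday-Thursday
    return Period("single", text)               # e.g. Monday

exclusions = []

def add_exclusion(period):
    """Stand-in for the addExclusion() call made from the grammar action."""
    exclusions.append(period)

add_exclusion(make_period("Monday-Thursday"))
```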


