[antlr-interest] suggested ANTLR projects?

Pete Forman pete.forman at westerngeco.com
Tue Aug 12 02:01:34 PDT 2003


At 2003-08-11 11:36 -0700, Terence Parr wrote:
>Also, I'm going to see if I can get students to build grammars.  Can
>people suggest grammars they want built?  They might have to describe
>it to the students. ;)

One pet grammar of mine is that of the international date and time
format ISO 8601:2000.  Most people will have come across dates such
as 2003-08-12 but the standard covers many other formats.  A summary
can be found at
http://www.iso.org/iso/en/prods-services/popstds/datesandtime.html

A final draft of the standard can be found via
http://www.qsl.net/g1smd/temp/PDF_Links.html

Here is a summary of the grammar that might form the basis of a parser.
The goal ought to be to recognize all the examples in the standard.

5 Representations
5.1 Explanations
5.1.1 Characters used in place of digits or signs: YMDwhmsn+
   [+ should be plus_or_minus]
5.1.2 Characters used as designators: PRTWZDHMS
   [D and M are used both in place of digits and as designators in durations]
4.4 The space character shall not be used in the representations
   [but a common misuse of ISO8601 uses space instead of T]
   Lower case characters may be substituted for upper case
4.5 Characters used as separators: -:/#,.
   [the FDIS is inconsistent, # is probably not used at all]
5.2 Dates
5.2.1 Calendar date
5.2.1.1 Complete representation
5.2.1.1.B: YYYYMMDD
5.2.1.1.E: YYYY-MM-DD
5.2.1.2 Representations with reduced precision
5.2.1.2.a.B: YYYY-MM
5.2.1.2.b.B: YYYY
5.2.1.2.c.B: YY
5.2.1.3 Truncated representations
5.2.1.3.a.B: YYMMDD
5.2.1.3.a.E: YY-MM-DD
5.2.1.3.b.B: -YYMM
5.2.1.3.b.E: -YY-MM
5.2.1.3.c.B: -YY
5.2.1.3.d.B: --MMDD
5.2.1.3.d.E: --MM-DD
5.2.1.3.e.B: --MM
5.2.1.3.f.B: ---DD
5.2.1.4 Expanded representations (optional, here year has 2 extra digits)
5.2.1.4.a.B: +YYYYYYMMDD
5.2.1.4.a.E: +YYYYYY-MM-DD
5.2.1.4.b.B: +YYYYYY-MM
5.2.1.4.c.B: +YYYYYY
5.2.1.4.d.B: +YYYY
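
As a rough illustration of 5.2.1.1 and 5.2.1.2 above, here is a minimal
sketch using a Python regular expression rather than an ANTLR rule. The names
CALENDAR_DATE and parse_calendar_date are made up for this example. It
assumes only the complete and reduced-precision forms matter: the century
form YY, the truncated forms and the expanded forms are ignored, and no range
checking is done on month or day.

```python
import re

# Complete (5.2.1.1) and reduced-precision (5.2.1.2) calendar dates only.
CALENDAR_DATE = re.compile(
    r"""
    (?P<year>[0-9]{4})
    (?:
        (?P<month_b>[0-9]{2})(?P<day_b>[0-9]{2})          # YYYYMMDD
      | -(?P<month_e>[0-9]{2})(?:-(?P<day_e>[0-9]{2}))?   # YYYY-MM-DD or YYYY-MM
    )?
    """,
    re.VERBOSE,
)


def parse_calendar_date(text):
    """Return (year, month, day) with None for omitted parts, or None."""
    m = CALENDAR_DATE.fullmatch(text)
    if m is None:
        return None
    month = m.group("month_b") or m.group("month_e")
    day = m.group("day_b") or m.group("day_e")
    return (
        int(m.group("year")),
        int(month) if month else None,
        int(day) if day else None,
    )
```

Note that YYYYMM is deliberately rejected, since the summary lists YYYY-MM as
the only year-and-month form.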
5.2.2 Ordinal date
5.2.2.1 Complete representation
5.2.2.1.B: YYYYDDD
5.2.2.1.E: YYYY-DDD
5.2.2.2 Truncated representations
5.2.2.2.B: YYDDD
5.2.2.2.E: YY-DDD
5.2.2.3 Expanded representations (optional, here year has 2 extra digits)
5.2.2.3.B:  +YYYYYYDDD
5.2.2.3.E:  +YYYYYY-DDD
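
To make the ordinal date of 5.2.2.1 concrete, the following Python sketch
converts a complete representation to a calendar date. ORDINAL_DATE and
ordinal_to_date are hypothetical names, and only the complete forms are
handled. It relies on Python's datetime.date, which does not accept year
0000, so that case raises ValueError here even though the standard allows it.

```python
import calendar
import re
from datetime import date, timedelta

# Complete ordinal dates only: YYYYDDD (basic) or YYYY-DDD (extended).
ORDINAL_DATE = re.compile(r"([0-9]{4})-?([0-9]{3})")


def ordinal_to_date(text):
    """Convert an ordinal date to a datetime.date, or raise ValueError."""
    m = ORDINAL_DATE.fullmatch(text)
    if m is None:
        raise ValueError("not a complete ordinal date: %r" % text)
    year, day_of_year = int(m.group(1)), int(m.group(2))
    days_in_year = 366 if calendar.isleap(year) else 365
    if not 1 <= day_of_year <= days_in_year:
        raise ValueError("day of year out of range: %r" % text)
    # Day 001 is 1 January, so add (day_of_year - 1) days to it.
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)
```

With this sketch both 2003224 and 2003-224 come out as 2003-08-12.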
5.2.3 Week date
5.2.3.1 Complete representation
5.2.3.1.B: YYYYWwwD
5.2.3.1.E: YYYY-Www-D
5.2.3.2 Representation with reduced precision
5.2.3.2.a.B: YYYYWww
5.2.3.2.a.E: YYYY-Www
5.2.3.3 Truncated representations
5.2.3.3.a.B: YYWwwD
5.2.3.3.a.E: YY-Www-D
5.2.3.3.b.B: YYWww
5.2.3.3.b.E: YY-Www
5.2.3.3.c.B: -YWwwD
5.2.3.3.c.E: -Y-Www-D
5.2.3.3.d.B: -YWww
5.2.3.3.d.E: -Y-Www
5.2.3.3.e.B: -WwwD
5.2.3.3.e.E: -Www-D
5.2.3.3.f.B: -Www
5.2.3.3.g.B: -W-D
5.2.3.4 Expanded representations (optional, here year has 2 extra digits)
5.2.3.4.a.B: +YYYYYYWwwD
5.2.3.4.a.E: +YYYYYY-Www-D
5.2.3.4.b.B: +YYYYYYWww
5.2.3.4.b.E: +YYYYYY-Www
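
The week date forms of 5.2.3.1 and 5.2.3.2 can be sketched the same way.
WEEK_DATE and week_date_to_date are hypothetical names. The sketch assumes
Python 3.8 or later for date.fromisocalendar, and it maps a
reduced-precision week (no day) to the Monday of that week, which is a
choice made for this example and not something the standard says.

```python
import re
from datetime import date

# YYYYWwwD / YYYYWww (basic) or YYYY-Www-D / YYYY-Www (extended).
# Mixed forms such as YYYYWww-D are rejected.
WEEK_DATE = re.compile(
    r"([0-9]{4})(?:[Ww]([0-9]{2})([1-7])?|-[Ww]([0-9]{2})(?:-([1-7]))?)"
)


def week_date_to_date(text):
    """Convert a week date to a datetime.date, or raise ValueError."""
    m = WEEK_DATE.fullmatch(text)
    if m is None:
        raise ValueError("not a week date: %r" % text)
    year = int(m.group(1))
    week = int(m.group(2) or m.group(4))
    # Assumption for this sketch: a missing day means Monday (day 1).
    day = int(m.group(3) or m.group(5) or 1)
    # fromisocalendar raises ValueError for a week number the year lacks.
    return date.fromisocalendar(year, week, day)
```

For example 2003-W33-2 is the Tuesday of week 33, i.e. 2003-08-12.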
5.3 Time of the day
5.3.1 Local time of the day
5.3.1.1 Complete representation
5.3.1.1.B: hhmmss
5.3.1.1.E: hh:mm:ss
5.3.1.2 Representations with reduced precision
5.3.1.2.a.B: hhmm
5.3.1.2.a.E: hh:mm
5.3.1.2.b.B: hh
5.3.1.3 Representation of decimal fractions (may use . instead of ,)
   (fractions shown here with two places, spec is one or more)
5.3.1.3.a.B: hhmmss,ss
5.3.1.3.a.E: hh:mm:ss,ss
5.3.1.3.b.B: hhmm,mm
5.3.1.3.b.E: hh:mm,mm
5.3.1.3.c.B: hh,hh
5.3.1.4 Truncated representations
   (fractions shown here with one place, spec is one or more)
5.3.1.4.a.B: -mmss
5.3.1.4.a.E: -mm:ss
5.3.1.4.b.B: -mm
5.3.1.4.c.B: --ss
5.3.1.4.d.B: -mmss,s
5.3.1.4.d.E: -mm:ss,s
5.3.1.4.e.B: -mm,m
5.3.1.4.f.B: --ss,s
5.3.1.5 Representation with time designator
   If the time of the day is represented in basic format in a context that does
   not clearly identify a time only expression, the time designator [T] shall be
   used immediately in front of the representations defined in 5.3.1.1 through
   5.3.1.3.
5.3.2 Midnight
   In 5.3.1.* hh is either 00 or 24 and mm is 00.
5.3.3 Coordinated Universal Time (UTC)
   To express the time of the day in Coordinated Universal Time, the
   representations specified in 5.3.1.1 through 5.3.1.3 shall be used, followed
   immediately, without spaces, by the UTC designator [Z].
5.3.4 Local time and Coordinated Universal Time
5.3.4.1 Difference between local time and Coordinated Universal Time
5.3.4.1.a.B: +hhmm
5.3.4.1.a.E: +hh:mm
5.3.4.1.b.B: +hh
5.3.4.2 Local time and the difference with Coordinated Universal Time
5.3.1.*.B plus 5.3.4.1.*.B
5.3.1.*.E plus 5.3.4.1.a.E or 5.3.4.1.b.B
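
Here is a hedged Python sketch of 5.3.1.1 to 5.3.1.3 together with the zone
forms of 5.3.3 and 5.3.4. TIME_OF_DAY and parse_time are hypothetical names.
It assumes no leading T, ignores the truncated forms, does no range checking
(so 24:00 and 25:61 are both accepted), and is looser than the standard in
one respect: it accepts either +hhmm or +hh:mm after any time, without
checking that the offset matches the basic or extended style of the time.

```python
import re

TIME_OF_DAY = re.compile(
    r"""
    (?P<hour>[0-9]{2})
    (?:
        (?P<colon>:?)(?P<minute>[0-9]{2})     # ':' only in extended format
        (?:(?P=colon)(?P<second>[0-9]{2}))?   # must repeat the same style
    )?
    (?:[,.](?P<fraction>[0-9]+))?             # fraction of the lowest component
    (?P<zone>[Zz]|[+-][0-9]{2}(?::?[0-9]{2})?)?
    """,
    re.VERBOSE,
)

SECONDS_PER_UNIT = (3600, 60, 1)  # hour, minute, second


def parse_time(text):
    """Return (seconds since midnight, zone text or None), or None."""
    m = TIME_OF_DAY.fullmatch(text)
    if m is None:
        return None
    parts = [
        int(m.group(name))
        for name in ("hour", "minute", "second")
        if m.group(name) is not None
    ]
    seconds = sum(v * s for v, s in zip(parts, SECONDS_PER_UNIT))
    if m.group("fraction"):
        # The fraction belongs to whichever component came last.
        fraction = float("0." + m.group("fraction"))
        seconds += fraction * SECONDS_PER_UNIT[len(parts) - 1]
    return seconds, m.group("zone")
```

The backreference on the colon is what rejects mixed forms such as hh:mmss.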
5.4 Combinations of date and time of the day
5.4.1 Complete representation
5.4.1.a: year month day timeDesignator hour minute second zoneDesignator
5.4.1.b: year day timeDesignator hour minute second zoneDesignator
5.4.1.c: year weekDesignator week day timeDesignator hour minute second zoneDesignator
5.4.2 Representations other than complete
5.2.* plus T plus 5.3.4.2
   provided that
   a) the rules specified in those sections are applied;
   b) the resulting expression does not qualify as a complete representation in
      accordance with 5.4.1;
   c) the date component shall not be represented with reduced precision and the
      time component shall not be truncated. Note that this excludes the date
      representations in 5.2.1.3 and 5.2.3.3 that are truncated and reduced and
      the date representations in 5.2.1.4 and 5.2.3.4 that are expanded and
      reduced;
   d) the expression shall either be completely in basic format, in which case
      the minimum number of separators necessary for the required expression is
      used, or completely in extended format, in which case additional separators
      shall be used in accordance with 5.2 and 5.3.
5.5 Time-intervals
5.5.1 Means of specifying time-intervals
   A time-interval shall be expressed in one of the following ways:
   a) by a start and an end;
   b) by a duration not associated with any start or end;
   c) by a start and a duration;
   d) by a duration and an end.
5.5.2 Separators and designators
   A time interval is expressed according to the following rules:
   a) a solidus [/] shall be used to separate the two components in each of 5.5.1
      a), c) and d).
   b) for 5.5.1 b), c) and d) the designator [P] shall precede, without spaces,
      the representation of the duration.
   c) other designators (and the hyphen when used to indicate omitted components)
      shall be used as shown in 5.5.4 and 5.5.5 below.
   NOTE In certain application areas a double hyphen is used as a separator
      instead of a solidus.
5.5.3 Representation of duration
5.5.3.1 Format with time-unit designators
   In expressions of time-interval or recurring time-interval duration can be
   represented by a data element using time unit designators. The number of years
   shall be followed by the designator [Y], the number of months by [M], the
   number of weeks by [W], and the number of days by [D]. The part including time
   components shall be preceded by the designator [T]; the number of hours shall
   be followed by [H], the number of minutes by [M] and the number of seconds by
   [S]. In the examples [n] represents one or more digits, constituting a
   positive integer or zero.

   In basic and extended format the complete representation for duration shall be
   nYnMnDTnHnMnS or nW.

   For reduced precision, decimal or truncated representations of this format the
   following rules apply.
   a) If necessary for a particular application the lowest order components may
      be omitted to represent duration with reduced precision.
   b) If necessary for a particular application the lowest order component may
      have a decimal fraction. The decimal fraction shall be divided from the
      integer part by the decimal sign specified in ISO 31-0: i.e. the comma [,]
      or full stop [.]. Of these, the comma is the preferred sign. The decimal
      fraction shall at least have one digit. If the magnitude of the number is
      less than unity, the decimal sign shall be preceded by a zero (see ISO
      31-0).
   c) If the number of years, months, days, hours, minutes or seconds in any of
      these expressions equals zero, the number and the corresponding designator
      may be absent; however, at least one number and its designator shall be
      present. Note that the removal of leading non-zero components is not
      allowed.
   d) The designator T shall be absent if all of the time components are absent.
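
The duration rules of 5.5.3.1 can be sketched in Python as below. DURATION
and parse_duration are hypothetical names. The sketch assumes integer
components only (the decimal fraction of rule b is left out) and includes
the leading P from 5.5.2 b). It shows how the position of T decides whether
an M means months or minutes, which is the reason for a single
MONTH_OR_MINUTE token in a lexer.

```python
import re

DURATION = re.compile(
    r"[Pp](?:"
    r"(?P<weeks>[0-9]+)[Ww]"
    r"|"
    r"(?:(?P<years>[0-9]+)[Yy])?"
    r"(?:(?P<months>[0-9]+)[Mm])?"      # M before T: months
    r"(?:(?P<days>[0-9]+)[Dd])?"
    r"(?:[Tt]"
    r"(?:(?P<hours>[0-9]+)[Hh])?"
    r"(?:(?P<minutes>[0-9]+)[Mm])?"     # M after T: minutes
    r"(?:(?P<seconds>[0-9]+)[Ss])?"
    r")?"
    r")"
)


def parse_duration(text):
    """Return a dict of the components present, or None if invalid."""
    m = DURATION.fullmatch(text)
    if m is None:
        return None
    fields = {k: int(v) for k, v in m.groupdict().items() if v is not None}
    if not fields:
        return None  # rule c: at least one number and designator
    if text[-1] in "Tt":
        return None  # rule d: T must be absent without time components
    return fields
```

So P1M is one month and PT1M is one minute, while P1W2D is rejected because
weeks cannot be combined with the other designators.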
5.5.3.2 Alternative format (optional)
5.5.4 Complete representations
5.5.4.1 Representation of time-intervals identified by start and end
5.4.1.* / 5.4.1.*
5.5.4.2 Representation of time-interval by duration only
5.5.4.2.1 Format with time-unit designators
5.5.4.2.1.a.BE: PnYnMnDTnHnMnS
5.5.4.2.1.b.BE: PnW
5.5.4.2.2 Alternative format (optional)
5.5.4.2.2.B: PYYYYMMDDThhmmss
5.5.4.2.2.E: PYYYY-MM-DDThh:mm:ss
5.5.4.3 Representation of time-interval identified by its start and its duration
5.5.4.3.B: 5.4.1.*.B / 5.5.3.*.B
5.5.4.3.E: 5.4.1.*.E / 5.5.3.*.E
5.5.4.4 Representation of time-interval identified by its duration and its end
5.5.4.4.B: 5.5.3.*.B / 5.4.1.*.B
5.5.4.4.E: 5.5.3.*.E / 5.4.1.*.E
5.5.5 Representations other than complete
   A representation other than complete of a time-interval shall be an expression
   in accordance with 5.5.1 and 5.5.2, where time-points are represented in
   accordance with 5.2, 5.3 or 5.4 and where duration is represented in
   accordance with 5.5.3.1 or 5.5.3.2, provided that:
   a) the rules specified in those sections are applied;
   b) the result is not a complete representation in accordance with 5.5.4, and
   c) for which the resulting expression is either consistently in basic format
      or consistently in extended format;
   d) the use of a representation needs to be agreed by the partners in
      information interchange, if the use of any of its constituent parts needs
      to be agreed by the partners in information interchange.
   In the representation of time-intervals in accordance with 5.5.1 a),
   - if higher order components are omitted from the expression following the
     solidus (i.e. the representation for "end of time-interval"), it shall be
     assumed that the corresponding components from the "start of time-interval"
     expression apply (e.g. if [YYYYMM] are omitted by using a derived
     representation, the end of the time-interval is in the same year and month
     as the start of the time-interval);
   - representations for time-zones and Coordinated Universal Time included with
     the component preceding the solidus shall be assumed to apply to the
     component following the solidus, unless a corresponding alternative is
     included.
5.6 Recurring time-intervals
5.6.1 Means of specifying recurring time-intervals
   A recurring time-interval shall be expressed in one of the following ways:
   a) By a number of recurrences (optional), a start and an end. This represents
      a recurring time-interval of which the first time-interval is identified by
      the first two components of the expression and the number of recurrences by
      the last component. If the last component is absent the number of
      occurrences is unbounded.
   b) By a number of recurrences (optional) and a duration. This represents a
      recurring time interval with the indicated duration for each time-interval
      and with the indicated number of recurrences. If the number of recurrences
      is absent the number of occurrences is unbounded.
   c) By a number of recurrences (optional), a start and a duration. This
      represents a recurring time-interval of which the first time-interval is
      identified by the first two components of the expression and the number of
      recurrences by the last component. If the last component is absent the
      number of occurrences is unbounded.
   d) By a number of recurrences (optional), a duration and an end. This
      represents a recurring time-interval of which the last time-interval is
      identified by the first two components of the expression and the number of
      recurrences by the last component. If the last component is absent the
      number of occurrences is unbounded.
5.6.2 Separators and designators
   All representations start with the designator [R], followed, without spaces,
   by the number of recurrences, if present, followed, without spaces, by a
   solidus [/], followed, without spaces, by the expression of a time interval
   in accordance with 5.5.1. For the representation 5.6.1 a), 5.6.1 b), 5.6.1 c)
   and 5.6.1 d) the time interval in accordance with 5.5.1 a), 5.5.1 b), 5.5.1 c)
   and 5.5.1 d) shall be used respectively.
5.6.3 Complete representations
5.6.3: Rn / 5.5.*
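
The outer structure of 5.6.2 (and the four interval shapes of 5.5.1) can be
checked before any dates are parsed. The Python sketch below only splits
and classifies the components; split_recurring is a hypothetical name. It
assumes a component is a duration exactly when it starts with P, and it
leaves parsing of the individual time-points and durations to other code.

```python
def split_recurring(text):
    """Return (count or None, [components]), or None if malformed.

    A count of None means the number of recurrences is unbounded.
    """
    head, slash, rest = text.partition("/")
    if not slash or head[:1] not in ("R", "r"):
        return None
    count_text = head[1:]
    if any(ch not in "0123456789" for ch in count_text):
        return None
    count = int(count_text) if count_text else None
    parts = rest.split("/")
    if len(parts) > 2 or any(not part for part in parts):
        return None
    is_duration = [part[0] in "Pp" for part in parts]
    if len(parts) == 1 and not is_duration[0]:
        return None  # 5.5.1 b): a lone component must be a duration
    if len(parts) == 2 and all(is_duration):
        return None  # 5.5.1 a), c), d): at most one duration
    return count, parts
```

For instance R5/20030812T020134/P1D splits into a count of 5, a start and
a duration.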

DIGIT:   '0'..'9';
HYPHEN_OR_MINUS:  '-';
COLON:   ':';
SOLIDUS: '/';
DECIMAL: ',' | '.';
PERIOD:  'P' | 'p';
RECUR:   'R' | 'r';
TIME:    'T' | 't' | ' '; // space is illegal but commonly used
WEEK:    'W' | 'w';
ZULU:    'Z' | 'z';
PLUS:    '+';
// HYPHEN_OR_MINUS is done above
YEAR:    'Y' | 'y';
MONTH_OR_MINUTE: 'M' | 'm';
// WEEK is done above
DAY:     'D' | 'd';
HOUR:    'H' | 'h';
// MONTH_OR_MINUTE is done above
SECOND:  'S' | 's';
// HASH is probably not part of the Standard



-- 
Pete Forman                -./\.-  Disclaimer: This post is originated
WesternGeco                  -./\.-   by myself and does not represent
pete.forman at westerngeco.com    -./\.-   opinion of Schlumberger, Baker
http://petef.port5.com           -./\.-   Hughes or their divisions.


 
