[antlr-interest] suggested ANTLR projects?
Matthew Ford
Matthew.Ford at forward.com.au
Tue Aug 12 03:06:16 PDT 2003
I think this is a great project and I would be interested in the results.
Usually I just specify the date format the user can use.
Another small project would be complex number parsing (with good error
messages). Examples include:
1
1.0
+1.0
1+i 1+j
1-i 1-j
-i
1-5.0i
j5 (perhaps)
etc
I tried this without ANTLR, quickly gave up, and went back to ANTLR to do
the parsing and give useful error messages.
The trick I used for error messages was to code some common errors into the
parser, so that each one is parsed and reported with an explicit message,
e.g.
1 5.0i
"missing sign for imaginary part"
1+5
"missing imaginary part - did you forget the 'i' or 'j'?"
1j + 2
"real part should be first" (perhaps this is too strict??)
matthew
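
A minimal sketch of the error-message trick described above, written in
Python rather than ANTLR so it can be run standalone. It is not the original
parser: the names TERM, ComplexSyntaxError and parse_complex are hypothetical,
and only the three error cases listed above (plus a few obvious ones) are
handled.

```python
import re

# One term: optional sign, then "j5", "5.0i"/"5.0", or a bare "i"/"j".
TERM = re.compile(r"""
    \s*(?P<sign>[+-])?\s*
    (?: (?P<pre>[ij])(?P<after>\d+(?:\.\d+)?)    # unit first, e.g. j5
      | (?P<num>\d+(?:\.\d+)?)(?P<unit>[ij])?    # 1, 1.0, 5.0i
      | (?P<bare>[ij])                           # i or j alone
    )
""", re.VERBOSE)


class ComplexSyntaxError(ValueError):
    """Raised with an explicit message for each recognised mistake."""


def parse_complex(text):
    text = text.strip()
    if not text:
        raise ComplexSyntaxError("empty input")
    terms = []  # (has_sign, signed_value, is_imaginary)
    pos = 0
    while pos < len(text):
        m = TERM.match(text, pos)
        if m is None:
            raise ComplexSyntaxError(f"cannot parse {text[pos:]!r}")
        is_imag = bool(m["pre"] or m["unit"] or m["bare"])
        magnitude = float(m["after"] or m["num"] or "1")
        value = -magnitude if m["sign"] == "-" else magnitude
        terms.append((m["sign"] is not None, value, is_imag))
        pos = m.end()

    if len(terms) > 2:
        raise ComplexSyntaxError("too many terms")
    if len(terms) == 1:
        _, value, is_imag = terms[0]
        return complex(0.0, value) if is_imag else complex(value, 0.0)

    # Two terms: the common errors are accepted by the scanner above and
    # then rejected here, each with its own message.
    (_, first, first_imag), (signed, second, second_imag) = terms
    if first_imag and not second_imag:
        raise ComplexSyntaxError("real part should be first")
    if first_imag and second_imag:
        raise ComplexSyntaxError("two imaginary parts")
    if not second_imag:
        raise ComplexSyntaxError(
            "missing imaginary part - did you forget the 'i' or 'j'?")
    if not signed:
        raise ComplexSyntaxError("missing sign for imaginary part")
    return complex(first, second)
```

The design choice is the same as in the grammar version: the scanner
deliberately accepts more than the legal syntax, so that a wrong input
reaches a specific check instead of a generic "syntax error".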
----- Original Message -----
From: "Pete Forman" <pete.forman at westerngeco.com>
To: <antlr-interest at yahoogroups.com>
Sent: Tuesday, August 12, 2003 7:01 PM
Subject: Re: [antlr-interest] suggested ANTLR projects?
> At 2003-08-11 11:36 -0700, Terence Parr wrote:
> >Also, I'm going to see if I can get students to build grammars. Can
> >people suggest grammars they want built? They might have to describe
> >it to the students. ;)
>
> One pet grammar of mine is that of the international date and time
> format ISO 8601:2000. Most people will have come across dates such
> as 2003-08-12 but the standard covers many other formats. A summary
> can be found at
> http://www.iso.org/iso/en/prods-services/popstds/datesandtime.html
>
> A final draft of the standard can be found via
> http://www.qsl.net/g1smd/temp/PDF_Links.html
>
> Here is a summary of the grammar that might form the basis of a parser.
> The goal ought to be to recognize all the examples in the standard.
>
> 5 Representations
> 5.1 Explanations
> 5.1.1 Characters used in place of digits or signs: YMDwhmsn+
> [+ should be plus_or_minus]
> 5.1.2 Characters used as designators: PRTWZDHMS
> [D and M are used both in place of digits and as designators in
> durations]
> 4.4 The space character shall not be used in the representations
> [but a common misuse of ISO8601 uses space instead of T]
> Lower case characters may be substituted for upper case
> 4.5 Characters used as separators: -:/#,.
> [the FDIS is inconsistent, # is probably not used at all]
> 5.2 Dates
> 5.2.1 Calendar date
> 5.2.1.1 Complete representation
> 5.2.1.1.B: YYYYMMDD
> 5.2.1.1.E: YYYY-MM-DD
> 5.2.1.2 Representations with reduced precision
> 5.2.1.2.a.B: YYYY-MM
> 5.2.1.2.b.B: YYYY
> 5.2.1.2.c.B: YY
> 5.2.1.3 Truncated representations
> 5.2.1.3.a.B: YYMMDD
> 5.2.1.3.a.E: YY-MM-DD
> 5.2.1.3.b.B: -YYMM
> 5.2.1.3.b.E: -YY-MM
> 5.2.1.3.c.B: -YY
> 5.2.1.3.d.B: --MMDD
> 5.2.1.3.d.E: --MM-DD
> 5.2.1.3.e.B: --MM
> 5.2.1.3.f.B: ---DD
> 5.2.1.4 Expanded representations (optional, here year has 2 extra digits)
> 5.2.1.4.a.B: +YYYYYYMMDD
> 5.2.1.4.a.E: +YYYYYY-MM-DD
> 5.2.1.4.b.B: +YYYYYY-MM
> 5.2.1.4.c.B: +YYYYYY
> 5.2.1.4.d.B: +YYYY
> 5.2.2 Ordinal date
> 5.2.2.1 Complete representation
> 5.2.2.1.B: YYYYDDD
> 5.2.2.1.E: YYYY-DDD
> 5.2.2.2 Truncated representations
> 5.2.2.2.B: YYDDD
> 5.2.2.2.E: YY-DDD
> 5.2.2.3 Expanded representations (optional, here year has 2 extra digits)
> 5.2.2.3.B: +YYYYYYDDD
> 5.2.2.3.E: +YYYYYY-DDD
> 5.2.3 Week date
> 5.2.3.1 Complete representation
> 5.2.3.1.B: YYYYWwwD
> 5.2.3.1.E: YYYY-Www-D
> 5.2.3.2 Representation with reduced precision
> 5.2.3.2.a.B: YYYYWww
> 5.2.3.2.a.E: YYYY-Www
> 5.2.3.3 Truncated representations
> 5.2.3.3.a.B: YYWwwD
> 5.2.3.3.a.E: YY-Www-D
> 5.2.3.3.b.B: YYWww
> 5.2.3.3.b.E: YY-Www
> 5.2.3.3.c.B: -YWwwD
> 5.2.3.3.c.E: -Y-Www-D
> 5.2.3.3.d.B: -YWww
> 5.2.3.3.d.E: -Y-Www
> 5.2.3.3.e.B: -WwwD
> 5.2.3.3.e.E: -Www-D
> 5.2.3.3.f.B: -Www
> 5.2.3.3.g.B: -W-D
> 5.2.3.4 Expanded representations (optional, here year has 2 extra digits)
> 5.2.3.4.a.B: +YYYYYYWwwD
> 5.2.3.4.a.E: +YYYYYY-Www-D
> 5.2.3.4.b.B: +YYYYYYWww
> 5.2.3.4.b.E: +YYYYYY-Www
> 5.3 Time of the day
> 5.3.1 Local time of the day
> 5.3.1.1 Complete representation
> 5.3.1.1.B: hhmmss
> 5.3.1.1.E: hh:mm:ss
> 5.3.1.2 Representations with reduced precision
> 5.3.1.2.a.B: hhmm
> 5.3.1.2.a.E: hh:mm
> 5.3.1.2.b.B: hh
> 5.3.1.3 Representation of decimal fractions (may use . instead of ,)
> (fractions shown here with two places, spec is one or more)
> 5.3.1.3.a.B: hhmmss,ss
> 5.3.1.3.a.E: hh:mm:ss,ss
> 5.3.1.3.b.B: hhmm,mm
> 5.3.1.3.b.E: hh:mm,mm
> 5.3.1.3.c.B: hh,hh
> 5.3.1.4 Truncated representations
> (fractions shown here with one place, spec is one or more)
> 5.3.1.4.a.B: -mmss
> 5.3.1.4.a.E: -mm:ss
> 5.3.1.4.b.B: -mm
> 5.3.1.4.c.B: --ss
> 5.3.1.4.d.B: -mmss,s
> 5.3.1.4.d.E: -mm:ss,s
> 5.3.1.4.e.B: -mm,m
> 5.3.1.4.f.B: --ss,s
> 5.3.1.5 Representation with time designator
> If the time of the day is represented in basic format in a context that does
> not clearly identify a time only expression, the time designator [T] shall be
> used immediately in front of the representations defined in 5.3.1.1 through
> 5.3.1.3.
> 5.3.2 Midnight
> In 5.3.1.* hh is either 00 or 24 and mm is 00.
> 5.3.3 Coordinated Universal Time (UTC)
> To express the time of the day in Coordinated Universal Time, the
> representations specified in 5.3.1.1 through 5.3.1.3 shall be used, followed
> immediately, without spaces, by the UTC designator [Z].
> 5.3.4 Local time and Coordinated Universal Time
> 5.3.4.1 Difference between local time and Coordinated Universal Time
> 5.3.4.1.a.B: +hhmm
> 5.3.4.1.a.E: +hh:mm
> 5.3.4.1.b.B: +hh
> 5.3.4.2 Local time and the difference with Coordinated Universal Time
> 5.3.1.*.B plus 5.3.4.1.*.B
> 5.3.1.*.E plus 5.3.4.1.a.E or 5.3.4.1.b.B
> 5.4 Combinations of date and time of the day
> 5.4.1 Complete representation
> 5.4.1.a: year month day timeDesignator hour minute second zoneDesignator
> 5.4.1.b: year day timeDesignator hour minute second zoneDesignator
> 5.4.1.c: year weekDesignator week day timeDesignator hour minute second
> zoneDesignator
> 5.4.2 Representations other than complete
> 5.2.* plus T plus 5.3.4.2
> provided that
> a) the rules specified in those sections are applied;
> b) the resulting expression does not qualify as a complete representation in
> accordance with 5.4.1;
> c) the date component shall not be represented with reduced precision and the
> time component shall not be truncated. Note that this excludes the date
> representations in 5.2.1.3 and 5.2.3.3 that are truncated and reduced and
> the date representations in 5.2.1.4 and 5.2.3.4 that are expanded and
> reduced;
> d) the expression shall either be completely in basic format, in which case
> the minimum number of separators necessary for the required expression is
> used, or completely in extended format, in which case additional separators
> shall be used in accordance with 5.2 and 5.3.
> 5.5 Time-intervals
> 5.5.1 Means of specifying time-intervals
> A time-interval shall be expressed in one of the following ways:
> a) by a start and an end;
> b) by a duration not associated with any start or end;
> c) by a start and a duration;
> d) by a duration and an end.
> 5.5.2 Separators and designators
> A time interval is expressed according to the following rules:
> a) a solidus [/] shall be used to separate the two components in each of
> 5.5.1 a), c) and d).
> b) for 5.5.1 b), c) and d) the designator [P] shall precede, without spaces,
> the representation of the duration.
> c) other designators (and the hyphen when used to indicate omitted components)
> shall be used as shown in 5.5.4 and 5.5.5 below.
> NOTE In certain application areas a double hyphen is used as a separator
> instead of a solidus.
> 5.5.3 Representation of duration
> 5.5.3.1 Format with time-unit designators
> In expressions of time-interval or recurring time-interval duration can be
> represented by a data element using time unit designators. The number of years
> shall be followed by the designator [Y], the number of months by [M], the
> number of weeks by [W], and the number of days by [D]. The part including time
> components shall be preceded by the designator [T]; the number of hours shall
> be followed by [H], the number of minutes by [M] and the number of seconds by
> [S]. In the examples [n] represents one or more digits, constituting a
> positive integer or zero.
>
> In basic and extended format the complete representation for duration shall be
> nYnMnDTnHnMnS or nW.
>
> For reduced precision, decimal or truncated representations of this format the
> following rules apply.
> a) If necessary for a particular application the lowest order components may
> be omitted to represent duration with reduced precision.
> b) If necessary for a particular application the lowest order component may
> have a decimal fraction. The decimal fraction shall be divided from the
> integer part by the decimal sign specified in ISO 31-0: i.e. the comma [,]
> or full stop [.]. Of these, the comma is the preferred sign. The decimal
> fraction shall at least have one digit. If the magnitude of the number is
> less than unity, the decimal sign shall be preceded by a zero (see ISO
> 31-0).
> c) If the number of years, months, days, hours, minutes or seconds in any of
> these expressions equals zero, the number and the corresponding designator
> may be absent; however, at least one number and its designator shall be
> present. Note that the removal of leading non-zero components is not
> allowed.
> d) The designator T shall be absent if all of the time components are
> absent.
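
The duration rules a) to d) above can be sketched as a small recognizer.
This is an illustration in Python, not a replacement for a grammar: the names
NUM, DURATION, ORDER and parse_duration are hypothetical, the leading [P]
comes from 5.5.2 b), and the rule that leading non-zero components must not be
removed cannot be checked from the text alone, so it is not enforced.

```python
import re

NUM = r"\d+(?:[.,]\d+)?"  # digits with an optional decimal fraction

# PnW, or PnYnMnDTnHnMnS with every component optional at this stage.
DURATION = re.compile(
    rf"[Pp](?:(?P<weeks>{NUM})[Ww]|"
    rf"(?:(?P<years>{NUM})[Yy])?(?:(?P<months>{NUM})[Mm])?(?:(?P<days>{NUM})[Dd])?"
    rf"(?:(?P<t>[Tt])(?:(?P<hours>{NUM})[Hh])?(?:(?P<minutes>{NUM})[Mm])?"
    rf"(?:(?P<seconds>{NUM})[Ss])?)?)"
)

ORDER = ("years", "months", "days", "hours", "minutes", "seconds")


def _num(digits):
    # Both decimal signs are allowed; the comma is the preferred one.
    return float(digits.replace(",", "."))


def parse_duration(text):
    m = DURATION.fullmatch(text)
    if m is None:
        raise ValueError(f"not a duration: {text!r}")
    if m["weeks"] is not None:
        return {"weeks": _num(m["weeks"])}
    present = [(name, m[name]) for name in ORDER if m[name] is not None]
    if not present:
        # rule c): at least one number and its designator
        raise ValueError("at least one number and its designator must be present")
    if m["t"] and not any(m[name] is not None for name in ORDER[3:]):
        # rule d): no T without time components
        raise ValueError("designator T must be absent if there are no time components")
    for name, digits in present[:-1]:
        # rule b): only the lowest order component may carry a fraction
        if "," in digits or "." in digits:
            raise ValueError(f"only the lowest order component may have a fraction, not {name}")
    return {name: _num(digits) for name, digits in present}
```

Note that [M] means months before the T and minutes after it, so the
position relative to T is what disambiguates the designator.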
> 5.5.3.2 Alternative format (optional)
> 5.5.4 Complete representations
> 5.5.4.1 Representation of time-intervals identified by start and end
> 5.4.1.* / 5.4.1.*
> 5.5.4.2 Representation of time-interval by duration only
> 5.5.4.2.1 Format with time-unit designators
> 5.5.4.2.1.a.BE: PnYnMnDTnHnMnS
> 5.5.4.2.1.b.BE: PnW
> 5.5.4.2.2 Alternative format (optional)
> 5.5.4.2.2.B: PYYYYMMDDThhmmss
> 5.5.4.2.2.E: PYYYY-MM-DDThh:mm:ss
> 5.5.4.3 Representation of time-interval identified by its start and its
> duration
> 5.5.4.3.B: 5.4.1.*.B / 5.5.3.*.B
> 5.5.4.3.E: 5.4.1.*.E / 5.5.3.*.E
> 5.5.4.4 Representation of time-interval identified by its duration and its
> end
> 5.5.4.4.B: 5.5.3.*.B / 5.4.1.*.B
> 5.5.4.4.E: 5.5.3.*.E / 5.4.1.*.E
> 5.5.5 Representations other than complete
> A representation other than complete of a time-interval shall be an expression
> in accordance with 5.5.1 and 5.5.2, where time-points are represented in
> accordance with 5.2, 5.3 or 5.4 and where duration is represented in
> accordance with 5.5.3.1 or 5.5.3.2, provided that:
> a) the rules specified in those sections are applied;
> b) the result is not a complete representation in accordance with 5.5.4, and
> c) for which the resulting expression is either consistently in basic format
> or consistently in extended format;
> d) the use of a representation needs to be agreed by the partners in
> information interchange, if the use of any of its constituent parts needs
> to be agreed by the partners in information interchange.
> In the representation of time-intervals in accordance with 5.5.1 a),
> - if higher order components are omitted from the expression following the
> solidus (i.e. the representation for "end of time-interval"), it shall be
> assumed that the corresponding components from the "start of time-interval"
> expression apply (e.g. if [YYYYMM] are omitted by using a derived
> representation, the end of the time-interval is in the same year and month
> as the start of the time-interval);
> - representations for time-zones and Coordinated Universal Time included with
> the component preceding the solidus shall be assumed to apply to the
> component following the solidus, unless a corresponding alternative is
> included.
> 5.6 Recurring time-intervals
> 5.6.1 Means of specifying recurring time-intervals
> A recurring time-interval shall be expressed in one of the following ways:
> a) By a number of recurrences (optional), a start and an end. This represents
> a recurring time-interval of which the first time-interval is identified by
> the first two components of the expression and the number of recurrences by
> the last component. If the last component is absent the number of
> occurrences is unbounded.
> b) By a number of recurrences (optional) and a duration. This represents a
> recurring time interval with the indicated duration for each time-interval
> and with the indicated number of recurrences. If the number of recurrences
> is absent the number of occurrences is unbounded.
> c) By a number of recurrences (optional) a start and a duration. This
> represents a recurring time-interval of which the first time-interval is
> identified by the first two components of the expression and the number of
> recurrences by the last component. If the last component is absent the
> number of occurrences is unbounded.
> d) By a number of recurrences (optional), a duration and an end. This
> represents a recurring time-interval of which the last time-interval is
> identified by the first two components of the expression and the number of
> recurrences by the last component. If the last component is absent the
> number of occurrences is unbounded.
> 5.6.2 Separators and designators
> All representations start with the designator [R], followed, without spaces,
> by the number of recurrences, if present, followed, without spaces, by a
> solidus [/], followed, without spaces, by the expression of a time interval
> in accordance with 5.5.1. For the representation 5.6.1 a), 5.6.1 b), 5.6.1 c)
> and 5.6.1 d) the time interval in accordance with 5.5.1 a), 5.5.1 b), 5.5.1 c)
> and 5.5.1 d) shall be used respectively.
> 5.6.3 Complete representations
> 5.6.3: Rn / 5.5.*
>
> DIGIT: '0'..'9';
> HYPHEN_OR_MINUS: '-';
> COLON: ':';
> SOLIDUS: '/';
> DECIMAL: ',' | '.';
> PERIOD: 'P' | 'p';
> RECUR: 'R' | 'r';
> TIME: 'T' | 't' | ' '; // space is illegal but commonly used
> WEEK: 'W' | 'w';
> ZULU: 'Z' | 'z';
> PLUS: '+';
> // HYPHEN_OR_MINUS is done above
> YEAR: 'Y' | 'y';
> MONTH_OR_MINUTE: 'M' | 'm';
> // WEEK is done above
> DAY: 'D' | 'd';
> HOUR: 'H' | 'h';
> // MONTH_OR_MINUTE is done above
> SECOND: 'S' | 's';
> // HASH is probably not part of the Standard
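
To see what the lexer rules above produce, here is a rough Python
equivalent that classifies one character at a time. It is only a sketch:
the token names are taken from the rules above, while TOKEN_TYPES and
tokenize are hypothetical names, and no attempt is made to decide whether
an M is a month or a minute, since that needs the parser's context.

```python
# Single-character token classes, mirroring the lexer rules above.
# Letters are looked up in upper case because lower case may be substituted.
TOKEN_TYPES = {
    "-": "HYPHEN_OR_MINUS",
    ":": "COLON",
    "/": "SOLIDUS",
    ",": "DECIMAL",
    ".": "DECIMAL",
    "P": "PERIOD",
    "R": "RECUR",
    "T": "TIME",
    " ": "TIME",  # space is illegal but commonly used
    "W": "WEEK",
    "Z": "ZULU",
    "+": "PLUS",
    "Y": "YEAR",
    "M": "MONTH_OR_MINUTE",
    "D": "DAY",
    "H": "HOUR",
    "S": "SECOND",
}


def tokenize(text):
    """Return a list of (token_type, character) pairs."""
    tokens = []
    for column, ch in enumerate(text, start=1):
        if "0" <= ch <= "9":
            tokens.append(("DIGIT", ch))
            continue
        kind = TOKEN_TYPES.get(ch.upper())
        if kind is None:
            raise ValueError(f"unexpected character {ch!r} at column {column}")
        tokens.append((kind, ch))
    return tokens
```

For example, tokenize("PT1h") yields PERIOD, TIME, DIGIT, HOUR, and a
"#" is rejected with its column number.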
>
>
>
> --
> Pete Forman -./\.- Disclaimer: This post is originated
> WesternGeco -./\.- by myself and does not represent
> pete.forman at westerngeco.com -./\.- opinion of Schlumberger, Baker
> http://petef.port5.com -./\.- Hughes or their divisions.
>
>
>
>
> Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/