[antlr-interest] Can one identify the type of parser needed for a given BNF grammar

Douglas Godfrey douglasgodfrey at gmail.com
Mon Jul 11 16:38:55 PDT 2011


When I converted the ISO SQL grammar from EBNF to ANTLR, I handled the
multiple alternative character set rules by using semantic predicates in
the lexer. The rule for an SQL identifier has 8 variations, depending on
which SQL variant you are parsing, and each variation is guarded by its
own predicate. The same predicate functions are available to the parser
to handle the syntactic differences between the dialects.
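
A minimal sketch of the idea in plain Python rather than ANTLR syntax.
The dialect names ("strict", "relaxed"), the two identifier patterns and
the function names are hypothetical, chosen only to show the shape; the
real grammar has 8 identifier variations and its own predicates.

```python
import re

# Hypothetical dialects and identifier shapes, for illustration only.
# Each entry pairs a predicate on the active dialect with the pattern
# that is tried when that predicate holds.
IDENTIFIER_RULES = [
    (lambda d: d == "strict", re.compile(r"[A-Za-z][A-Za-z0-9_]*")),
    (lambda d: d == "relaxed", re.compile(r"[A-Za-z_#][A-Za-z0-9_#$]*")),
]


def match_identifier(text, pos, dialect):
    """Return the identifier starting at pos, or None if no enabled rule matches."""
    for predicate, pattern in IDENTIFIER_RULES:
        if not predicate(dialect):
            continue  # this alternative is switched off for the dialect
        m = pattern.match(text, pos)
        if m:
            return m.group(0)
    return None


def allows_hash_names(dialect):
    """The same kind of predicate, reused to gate a parser alternative."""
    return dialect == "relaxed"
```

With these assumptions, match_identifier("#temp x", 0, "relaxed") yields
"#temp", while the same input under "strict" yields None. One lexer
serves every dialect and only the predicates change.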

On Mon, Jul 11, 2011 at 5:21 PM, The Researcher <researcher0x00 at gmail.com> wrote:

> Douglas,
>
> Thanks.
>
> This is not meant to be a real-time parser. Ease of learning is more
> important than speed, but quality is more important than ease of
> learning, so what you suggest bears some weight. But I would be willing
> to trade off better documentation if a more complex solution is used.
>
> Here are some specifics of what keeps me up at night on parsing C++, or
> food for thought. While I have some ideas on this, I haven't come to a
> conclusion.
>
> The preprocessor section of the ISO C++ grammar contains the lparen rule:
>
> lparen: the left_parenthesis character without preceding white_space.
>
> Another rule is h_char:
>
> h_char: any member of the source character set except new_line and > .
>
> If the machine is ASCII, the source character set means one thing. If it
> is Unicode, it means another. If it is country specific, it means still
> another.
>
> Also, I plan to be able to parse ISO, GCC and Microsoft C++ with this.
> Should it be one parser with options, three parsers, one parser with the
> back end sorting it all out, or something else?
>
> Thanks Eric
>
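
The lparen and h_char rules quoted above are described in words, so they
can also be written as predicates. A rough sketch in Python follows. The
function names, the white-space set and the two character set tests are
assumptions made for illustration; they are not taken from the ISO text.

```python
# Assumed white-space characters: space, tab, vertical tab, form feed
# and new-line.
WHITE_SPACE = " \t\v\f\n"


def is_lparen(text, pos):
    """True if text[pos] is '(' and the character before it is not white space."""
    if text[pos] != "(":
        return False
    return pos == 0 or text[pos - 1] not in WHITE_SPACE


def in_ascii(ch):
    """Hypothetical source character set: 7-bit ASCII only."""
    return ord(ch) < 128


def in_unicode(ch):
    """Hypothetical source character set: any Unicode code point."""
    return True


def is_h_char(ch, in_source_charset):
    """Any member of the given source character set except new-line and '>'."""
    return ch not in ("\n", ">") and in_source_charset(ch)
```

Here is_lparen("F(x)", 1) is true and is_lparen("F (x)", 2) is false.
Passing the character set test in as a parameter keeps h_char itself
unchanged across ASCII, Unicode and country-specific character sets.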

