[antlr-interest] Can one identify the type of parser needed for a given BNF grammar

Mon Jul 11 14:16:42 PDT 2011

On 3.) :  The parser just recognizes syntax and ignores semantic ambiguities; 
then in a first tree walker pass, the symbol tables are constructed; a second 
tree walker pass uses the symbol table information to resolve ambiguities (which 
of several interpretations of valid syntax is correct) and rewrites the tree 
into an unambiguous form.  Then you have finished recognizing C++ and can 
continue with further processing.  Never, never, never would I suggest adding a 
feedback mechanism that couples parser to lexer.

--Loring

----- Original Message ----
> From: The Researcher <researcher0x00 at gmail.com>
> To: antlr-interest at antlr.org
> Sent: Mon, July 11, 2011 12:55:05 PM
> Subject: Re: [antlr-interest] Can one identify the type of parser needed for a 
>given BNF grammar
> 
> >
> > Loring,
> 
> 
> Thanks for the feedback.
> 
> While I now can  understand your answer, I still don't have enough
> experience  to  implement any approach correctly.
> 
> As a help to others who are working on  this in private or interested in
> creating a C++  parser I will gloss  what Loring said because last week it
> would have mostly been Greek to me. If  I say something wrong, please
> correct. I am not afraid to make mistakes in  public so that others may learn
> from them.
> 
> The big thing that one has  to learn about parsing C++ is how to handle the
> ambiguities. That is the  dragon to be slayed here.
> 
> 1. And then you have to figure out how to prune  the GLR-generated "forests".
> 
> GLR http://en.wikipedia.org/wiki/GLR_parser
> 
> The reason GLR is considered  for C++ is that do to the ambiguities of C++
> most parsers in a typical  fashion have to couple the feedback from the
> parser to the lexer and  reference a symbol table to correctly parse C++.
> This is contrary to what is  taught in school and most of the early examples
> of parsing you will come  across. This allows them to produce an AST without
> ambiguities after the  parser. This is pushing the work to the front. GLR is
> one of the most  powerful parsing strategies and can parse C++ without
> coupling the parser to  the lexer, but does so by generating multiple
> subtrees for each ambiguity,  for which you must later prune out the
> ambiguities from the forests. This is  pushing the work to the back.
> 
> 2. C++ is nasty; it can be parsed with  ANTLR (as shown by NEXT and David
> Wigg's
> adaptions of that  grammar).
> 
> The ANTLR C++ grammar does so by creating a symbol table and  using
> predicates during the parse. In my initial conversions of the ANTLR  C
> grammar to C# I was able to convert it, but did not understand why it  did
> certain things at first. After researching it was clear that a lot of  the
> ambiguities of C++ arise when an name is encountered and then it's use (  a
> type, an identifier, etc)  has to be know in order to take the  appropriate
> branch. Thus the creation of the symbol tables and other tables  and the use
> of predicates. This pushes the work to the front.
> 
> 3. I  believe that the right strategy with ANTLR is actually to use
> multi-pass  recognition to sort out the ambiguities.  That has not been  done
> yet.
> The problem is that C++ cannot be adequately described with a  context-free
> grammar; you have to do some context-sensitive processing to  resolve the
> syntax
> that is semantically ambiguous.
> 
> It could/should  be possible to parse C++ with ANTLR without a symbol table
> and associated  predicates, and then in the AST analysis verify that the
> input C++ syntax is  valid C++ semantics.
> Remember that the BNF for a grammar does not guarantee  valid code/semantics
> only correct syntax. So the AST is analyzed to find  invalid semantics. I am
> not sure if this will require coupling the parser to  the lexer or the AST to
> the parser, but if I go this route I will soon  know.
> 
> I know this went off topic but was worth it.
> 
> Any answer to  the original question is still  appreciated.
> 
> Thanks
> 
> Eric
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: 
>http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>