[antlr-interest] Error reporting with ANTLR tree grammar

Terence Parr parrt at cs.usfca.edu
Wed Nov 24 16:35:19 PST 2010


First question: are you backtracking a lot? That can cause bad error reporting, like this:

a : foo
   | bar
   ;

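If it's backtracking and can't match foo or bar, it'll say "can't do nothin' with first token". The error is reported at the first token of the decision, not at the token where an alternative actually went wrong.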
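
To make the effect concrete, here is a minimal hand-written sketch in Java. It is not the code ANTLR generates; the alternatives (foo as "x y z", bar as "x y w"), the class name BacktrackDemo and its methods are all assumptions made up for illustration. It contrasts a backtracking-style decision, which rewinds and blames the first token, with one that remembers the deepest mismatch.

```java
import java.util.Arrays;
import java.util.List;

// Hand-written illustration only; this is NOT the code ANTLR generates.
final class BacktrackDemo {
    // Hypothetical alternatives for rule a: foo = x y z, bar = x y w.
    static final List<String> FOO = Arrays.asList("x", "y", "z");
    static final List<String> BAR = Arrays.asList("x", "y", "w");

    /** Index of the first token that does not match alt, or -1 if alt matches fully. */
    static int firstMismatch(List<String> alt, List<String> input, int start) {
        for (int i = 0; i < alt.size(); i++) {
            int at = start + i;
            if (at >= input.size() || !alt.get(i).equals(input.get(at))) {
                return at;
            }
        }
        return -1;
    }

    /**
     * Backtracking style: try each alternative speculatively and rewind on
     * failure. If nothing fits, the only position left to report is the
     * token where the decision started.
     */
    static int errorIndexWithBacktracking(List<String> input, int start) {
        for (List<String> alt : Arrays.asList(FOO, BAR)) {
            if (firstMismatch(alt, input, start) == -1) {
                return -1; // this alternative matched, no error
            }
            // On failure the mismatch position is thrown away by the rewind.
        }
        return start;
    }

    /** Contrast: keep the deepest mismatch seen across all alternatives. */
    static int errorIndexFarthestFailure(List<String> input, int start) {
        int farthest = start;
        for (List<String> alt : Arrays.asList(FOO, BAR)) {
            int miss = firstMismatch(alt, input, start);
            if (miss == -1) {
                return -1;
            }
            farthest = Math.max(farthest, miss);
        }
        return farthest;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("x", "y", "q");
        System.out.println("backtracking reports token index "
                + errorIndexWithBacktracking(input, 0));
        System.out.println("farthest-failure reports token index "
                + errorIndexFarthestFailure(input, 0));
    }
}
```

For the input "x y q" both alternatives fail at the third token, yet the backtracking-style decision can only point at the first one, which matches the "line 1:0 no viable alternative" style of message.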

Ter

On Nov 24, 2010, at 3:26 PM, Arthur Goldberg wrote:

> Hello All
> 
> I'm writing a parser for a fairly simple language (14 rules & 10 tokens) 
> that reads a description of a graph -- like this OncoPrint 
> <http://cbio.mskcc.org/cancergenomics-dataportal/index.do?case_set_id=gbm_3way_complete&tab_index=tab_visualize&action=Submit&genetic_profile_ids=gbm_mutations&genetic_profile_ids=gbm_cna_rae&genetic_profile_ids=gbm_mrna_zscores&case_ids=&Z_SCORE_THRESHOLD=1.0&cancer_type_id=gbm&gene_list=EGFR+ERBB2+PDGFRA+MET+KRAS+NRAS+HRAS+NF1+SPRY2+FOXO1+FOXO3+AKT1+AKT2+AKT3+PIK3R1+PIK3CA+PTEN&gene_set_choice=glioblastoma:_rtk/ras/pi3k/akt_signaling_%2817_genes%29&> 
> -- of cancer data and produces a data structure that will be used to 
> select, organize and filter the data to be shown in the graph. Users 
> will enter the language on our web site.
> 
> I have a working one-pass grammar, but after building it I found that it's 
> very difficult to produce error messages in one pass. For example, one 
> might think that a failed semantic predicate would be a good place to 
> report an error, but that doesn't work because exceptions are not thrown 
> when predicates are hoisted and predicates are called multiple times as 
> the parser backtracks to find a parse. (See my previous message on use 
> of semantic predicates and hoisting 
> <http://www.antlr.org/pipermail/antlr-interest/2010-November/040091.html>.)
> 
> I simply want to say things like
> "Syntax error at 'xyz' at char <c> on line <l>"   // when the input 
> syntax is wrong (I can't say "line 1:0 no viable alternative at input 
> 'xyz'"), and
> "<input> is not a valid <type> at char <c> on line <l>"   // when the 
> input semantics is wrong, for example when <input> should be a word that 
> fits a pattern that describes a genetic data type
> 
> Therefore, I'm told that one should postpone error reporting until 
> later, and that I need a two pass grammar -- 1) build AST, 2) walk the 
> tree -- to easily and accurately report errors. I've started down that 
> path, and have a few productions in each grammar and a driver program 
> that connects them and handles bits of input.
> 
> I think that I can report the syntax errors by overriding
>    public void displayRecognitionError(String[] tokenNames, RecognitionException e)
> and
>    public String getErrorMessage(RecognitionException e, String[] tokenNames)
> in Phase 1.
> 
> But it isn't clear how one accesses data in the AST with the tree 
> grammar. That is, inside the tree grammar how do I get the data I need 
> to produce the semantic error message above?
> 
> Is that documented? I don't see it in The Definitive ANTLR Reference, 
> Chapters 8 or 10.
> 
> Thanks & Thanksgiving
> Arthur
> 
> -- 
> Senior Research Scientist
> Computational Biology
> Memorial Sloan-Kettering Cancer Center
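
As a rough sketch of the two message formats asked for above: in the ANTLR 3 Java runtime a RecognitionException carries token, line and charPositionInLine fields, and tree nodes expose getText(), getLine() and getCharPositionInLine(), so both phases have the position data the messages need. The code below deliberately uses a plain stand-in class so it runs without the ANTLR runtime. SourceNode, ErrorMessages, the method names and the validity pattern are all hypothetical, not part of ANTLR or of the original grammar.

```java
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical stand-in for an AST node; with ANTLR 3 these values would
// come from the node's getText(), getLine() and getCharPositionInLine().
final class SourceNode {
    final String text;
    final int line;               // 1-based, as ANTLR reports it
    final int charPositionInLine; // 0-based column

    SourceNode(String text, int line, int charPositionInLine) {
        this.text = text;
        this.line = line;
        this.charPositionInLine = charPositionInLine;
    }
}

// Hypothetical helper that builds the two message formats from the post.
final class ErrorMessages {
    /** Phase 1: could be called from an overridden getErrorMessage(). */
    static String syntaxError(String text, int line, int charPos) {
        return "Syntax error at '" + text + "' at char " + charPos
                + " on line " + line;
    }

    /** Phase 2: message for a node whose text is not a valid typeName. */
    static String semanticError(SourceNode node, String typeName) {
        return node.text + " is not a valid " + typeName + " at char "
                + node.charPositionInLine + " on line " + node.line;
    }

    /**
     * Checks a node against a caller-supplied pattern. Returns an error
     * message if the text does not match, or empty if it is valid.
     */
    static Optional<String> check(SourceNode node, Pattern valid, String typeName) {
        if (valid.matcher(node.text).matches()) {
            return Optional.empty();
        }
        return Optional.of(semanticError(node, typeName));
    }

    public static void main(String[] args) {
        // The pattern here is an invented placeholder, not a real rule.
        Pattern upperCaseWord = Pattern.compile("[A-Z]+");
        SourceNode node = new SourceNode("xyz", 3, 7);
        System.out.println(syntaxError(node.text, node.line, node.charPositionInLine));
        check(node, upperCaseWord, "data type").ifPresent(System.out::println);
    }
}
```

Because the check runs once per node during the tree walk, rather than inside a predicate that may be hoisted or re-evaluated while backtracking, each semantic error is produced exactly once.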