[antlr-interest] Syntactic predicates cause unexplainable
compilation errors in different parts of the code
Loring Craymer
Loring.G.Craymer at jpl.nasa.gov
Wed Jan 26 14:54:25 PST 2005
At 02:11 PM 1/26/2005, Peter Robinson wrote:
>On Wed, 2005-01-26 at 22:48, Loring Craymer wrote:
> > This is one of those cases that is usually handled by factoring out the
> comma:
> >
> > gene_ref:
> > gene_refline ("," gene_refline)*
> > ;
> >
>
>Thanks, I did consider that but was stumped because of the following.
>When I parse a GeneRef object, I create a Java object to store the info
>that in turn gets passed further up. I need to initialize up to 6
>different variables in this object depending on which type of
>gene_refline are found (mainly strings, but also more complicated
>structures). This is why I wanted to try to do everything in one rule so
>that I would always have the reference to the GeneRef object handy...
I'd handle this case by passing gene_refline a GeneRef argument, unless
there is a reason for insisting on the order of keywords (either the order
is needed for evaluation, or you are producing a syntax checker).
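A minimal sketch of that suggestion, written as plain Java rather than as ANTLR grammar so that it runs standalone. It mirrors a rule of the form gene_refline ("," gene_refline)* in which every line receives the same GeneRef and fills in whichever field it matches. The GeneRef fields, the GeneRefSketch class and the pre-split token list are all assumptions made for illustration; only the keywords A, B and E come from the thread.

```java
import java.util.Arrays;
import java.util.List;

// Hand-written mirror of: gene_ref : gene_refline[gr] ("," gene_refline[gr])* ;
// (hypothetical class; the real parser would be generated by ANTLR)
class GeneRefSketch {
    private final List<String> tokens;
    private int pos = 0;

    GeneRefSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    // The enclosing rule creates the object once and hands it to every line.
    GeneRef geneRef() {
        GeneRef gr = new GeneRef();
        geneRefLine(gr);
        // The comma is factored out: it only ever appears between two lines.
        while (pos < tokens.size() && tokens.get(pos).equals(",")) {
            pos++;
            geneRefLine(gr);
        }
        return gr;
    }

    // Each line decides which field to set; keyword order is not enforced.
    private void geneRefLine(GeneRef gr) {
        String keyword = next();
        String value = next();
        switch (keyword) {
            case "A": gr.a = value; break;
            case "B": gr.b = value; break;
            case "E": gr.e = Boolean.parseBoolean(value); break;
            default: throw new IllegalStateException("unexpected keyword: " + keyword);
        }
    }

    private String next() {
        if (pos >= tokens.size()) {
            throw new IllegalStateException("unexpected end of input");
        }
        return tokens.get(pos++);
    }

    static GeneRef parse(String... tokens) {
        return new GeneRefSketch(Arrays.asList(tokens)).geneRef();
    }

    public static void main(String[] args) {
        GeneRef gr = parse("A", "some-locus", ",", "E", "TRUE");
        System.out.println(gr.a + " " + gr.b + " " + gr.e);
    }
}

// Hypothetical stand-in for the poster's GeneRef class; field names are assumed.
class GeneRef {
    String a;          // value of the A line, if present
    String b;          // value of the B line, if present
    boolean e = false; // E BOOLEAN DEFAULT FALSE
}
```

In the grammar itself the same idea would be a rule declared with a parameter and invoked with the object under construction, so the six kinds of line can each set their own field without the enclosing rule needing to know which one matched.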
>gene_ref returns [GeneRef gr=null]
>{
> String s;
>} GENEREF_KW { gr = new GeneRef(); }
> s=gene_refline { gr.addString(s); } // But there are 6 different types of
> // String variable to be initialized,
> // depending on what type of gene_refline
>
>
>
> > However, I think what you are really running into is ANTLR 2's approximate
> > LLk. If you look at the generated code (without the synpreds), I think
> > that you will find that it does the right thing.
> >
>
>It was actually (nearly) correct, but since a bunch of downstream
Would it do the right thing, though? Usually, the if statements have
cross-product conditionals (if ((LA(1) == A || LA(1) == B) && (LA(2) ==
STRING || LA(2) == BOOLEAN)))
which are not strictly correct, but the case statements impose the correct
orderings when matching tokens.
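To make the cross-product point concrete, here is a small Java sketch comparing an exact two-token lookahead test with the per-depth approximation. It assumes, purely for illustration, that the only valid sequences are A STRING and B BOOLEAN; the LookaheadSketch class and its method names are hypothetical and are not ANTLR-generated code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; the valid sequences below are an assumed example.
class LookaheadSketch {
    // Assumed valid two-token sequences: A STRING and B BOOLEAN.
    static final String[][] SEQUENCES = { {"A", "STRING"}, {"B", "BOOLEAN"} };

    // Exact LL(2): the pair (la1, la2) must be one of the listed sequences.
    static boolean fullLL2(String la1, String la2) {
        for (String[] seq : SEQUENCES) {
            if (seq[0].equals(la1) && seq[1].equals(la2)) {
                return true;
            }
        }
        return false;
    }

    // Approximate LL(2): one token set per lookahead depth, tested
    // independently, which admits the cross product of the two sets.
    static boolean approxLL2(String la1, String la2) {
        Set<String> depth1 = new HashSet<>();
        Set<String> depth2 = new HashSet<>();
        for (String[] seq : SEQUENCES) {
            depth1.add(seq[0]);
            depth2.add(seq[1]);
        }
        return depth1.contains(la1) && depth2.contains(la2);
    }

    public static void main(String[] args) {
        System.out.println("full   (A, BOOLEAN): " + fullLL2("A", "BOOLEAN"));
        System.out.println("approx (A, BOOLEAN): " + approxLL2("A", "BOOLEAN"));
    }
}
```

The pair (A, BOOLEAN) passes the approximate test even though it is not a valid sequence. That is the sense in which the conditional is not strictly correct; the token-by-token matching that follows the decision would still reject such input.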
>analyses depend on correct parsing, I would somehow like to get rid of
>all error messages...
A good goal--every time I run into this consequence of the LLk-approx, I
get paranoid. There is an inline option to suppress the nondeterminism
warnings when you can verify that the generated code would work properly,
but I tend to avoid that myself.
--Loring
>Monty, thanks for your reply. Yes I did look at the ASN.1 grammar on the
>website, but (given that I am not that well versed in ASN.1, at least
>not yet) I was not able to adapt that to my needs...
>
>
>
>
> > --Loring
> >
> >
> > At 12:34 PM 1/26/2005, Peter Robinson wrote:
> >
> >
> > >Dear ANTLR list,
> > >
> > >First of all thanks to you all for being a helpful and informative list.
> > >I recently have been trying to learn antlr and cannot now imagine using
> > >things like lex/yacc with which I previously occasionally did things.
> > >
> > >I am now trying to parse a file structure from NCBI in ASN.1 format. The
> > >specification of a small part of the entire thing is as follows ( I
> > >have replaced some keywords with the letters A-H for clarity). Any one
> > >of the entries is optional and is followed by a comma if there is going
> > >to be another line. There are Gene-ref entries with only one entry (and
> > >no comma).
> > >
> > >
> > >Gene-ref ::= SEQUENCE {
> > > A VisibleString OPTIONAL ,
> > > B VisibleString OPTIONAL ,
> > > C VisibleString OPTIONAL ,
> > > D VisibleString OPTIONAL ,
> > > E BOOLEAN DEFAULT FALSE ,
> > > F SET OF Dbtag OPTIONAL ,
> > > G SET OF VisibleString OPTIONAL ,
> > > H VisibleString OPTIONAL }
> > > END
> > >
> > >After trying constructs such as (",")? and getting nondeterminism
> > >warnings, I tried my hand at a syntactic predicate as follows:
> > >
> > >generef_line returns [myJavaObject ... ]
> > >{
> > > String s;
> > > Dbtag d;
> > >}: GENE_KW "{"
> > > ( ( A STRING ",")=>
> > > A s1:STRING { System.out.println(s1.getText()); } ","
> > > | A s2:STRING { System.out.println(s2.getText()); }
> > > )?
> > > ( (B STRING ",")=>
> > > B s3:STRING { System.out.println(s3.getText()); } ","
> > > | B s4:STRING { System.out.println(s4.getText()); }
> > > )?
> > > AND SO ON...
> > >
> > > "}"
> > >;
> > >
> > >
> > >However, this now causes unexplainable compilation errors in other parts
> > >of the code (about 400 lines of grammar etc) to appear, in code that
> > >**worked perfectly fine** before. What is going on?? and is there a
> > >better way to parse the above construct? Thanks, Peter
> > >
> > >--
> > >Peter N. Robinson
> > >peter.robinson at t-online.de
> > >peter.robinson at charite.de
> > >http://www.charite.de/ch/medgen/robinson/
>--
>Peter N. Robinson
>peter.robinson at t-online.de
>peter.robinson at charite.de
>http://www.charite.de/ch/medgen/robinson/