[antlr-interest] Invalid parser generation

mark4 at voila.fr mark4 at voila.fr
Thu Sep 6 06:11:02 PDT 2012


But I read that ANTLR gives precedence to the first rule it encounters. So, in that case, I assumed it would match a DIMENSION.

Anyway, I have changed:

DIMENSION: ID;

into

dimension : ID;

to remove the issue. What troubles me in ANTLR is the mix of lexer rules and parser rules within the same grammar. I have used tools like Lex/Yacc and JLex/CUP and, as I recall, the lexer and the parser were separate. So, within the parser, you just had to write what you -expected-, for instance:

myrule : LPAREN DIMENSION RPAREN;

So, the token DIMENSION was scanned ONLY if '(' was found. But within ANTLR, it seems that both parser rules and lexer rules compete at the same time, which makes the grammars much trickier to design. At least, that's the way I see it.
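
As a rough illustration of the Lex/Yacc-style split inside a single ANTLR 3 grammar file, here is a minimal sketch. The grammar name Dims, the item rule, and the WS rule are assumptions made for the example, and the action on WS assumes the Java target. ANTLR generates a separate lexer and parser from a combined grammar, so the lexer here only ever produces ID, LPAREN, and RPAREN, and the parser rules say what an ID means in each position.

```antlr
grammar Dims;   // hypothetical grammar name

// Parser rules (lowercase): written in terms of tokens only.
myrule    : LPAREN dimension RPAREN ;
dimension : ID ;   // an ID counts as a dimension because of where it appears
item      : ID ;   // hypothetical second role for the same token type

// Lexer rules (uppercase): no two rules match the same text.
LPAREN : '(' ;
RPAREN : ')' ;
ID     : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
WS     : (' '|'\t'|'\r'|'\n')+ { $channel = HIDDEN; } ;  // assumed whitespace handling
```

With this layout, dimension and item can later diverge as parser rules without any two lexer rules matching the same text.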

Anyway, I still have much difficulty resolving ambiguities in my lexer rules because they compete with one another. My grammar compiles, but the built AST shows errors recognizing tokens. Maybe I should only use IDs and check whether the strings are valid within the program...
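
For lexer rules that compete with one another, the ordering advice quoted below (keywords first, then operators, then general rules such as COMPTE and ID) could be laid out roughly as follows. This is only a sketch: the grammar name and the two keyword tokens are hypothetical and are not taken from the grammar discussed in this thread.

```antlr
lexer grammar OrderedTokens;   // hypothetical grammar name

// 1. Keywords: fixed strings, the most specific rules, listed first.
KW_DIMENSION : 'dimension' ;   // hypothetical keyword
KW_ITEM      : 'item' ;        // hypothetical keyword

// 2. Operators and punctuation: also fixed strings.
LPAREN : '(' ;
RPAREN : ')' ;

// 3. General rules last: each can match many different strings.
COMPTE : ('0'..'9')+ ;
ID     : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
```

Each character sequence is claimed by at most one intended rule, and where a keyword also fits the ID pattern, the earlier position of the keyword rule is what is meant to decide.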

> Message of 06/09/12 at 14:54
> From: "Jesse McGrew"
> To: "mark4 at voila.fr"
> Cc: "Stefan Mätje", antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Invalid parser generation
>
> You can't have two lexer rules that match the same input. When the
> lexer sees a string like "foo", how is it supposed to know whether it
> should return DIMENSION or ITEM (or ID)? You should probably be using
> parser rules instead.
>
> Jesse
>
> On Thu, Sep 6, 2012 at 2:44 AM, mark4 at voila.fr wrote:
> > Hi Stefan,
> >
> > I wanted to come back to your post. You recommended putting the most specific lexer rules first. But what can I do if two rules are close, or even identical?
> >
> > For instance:
> > DIMENSION : ID;
> > ITEM : ID;
> >
> > They automatically generate an error in ANTLR. Of course, this duplication seems pointless, but in the future I may modify these rules and make them different. That's the reason why I'd like to distinguish them in the grammar file.
> >
> > Thanks in advance,
> > Mark
> >
> >> Message of 04/09/12 at 15:40
> >> From: "Stefan Mätje"
> >> To: antlr-interest at antlr.org
> >> Cc: "mark4 at voila.fr"
> >> Subject: Re: [antlr-interest] Invalid parser generation
> >>
> >> On 04.09.2012 14:35, mark4 at voila.fr wrote:
> >> > Hi Stefan,
> >> >
> >> > Thanks for your reply. I didn't understand the difference between
> >> > lexer rules and parser rules because,
> >> > in the end, a parser rule will always resolve into a series of lexer
> >> > rules...
> >>
> >> Please don't mix the lexer and the parser phase in your mind. The lexer
> >> deals with single characters and groups them into tokens.
> >>
> >> The parser doesn't know anything about single characters and deals only
> >> with tokens.
> >>
> >> > Anyway, I applied the modification but I now get an error:
> >> >
> >> > COMPTE : ('0'..'9')+;
> >> >
> >> > ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
> >> >
> >> > The following token definitions can never be matched because prior
> >> > tokens match the same input: COMPTE,ID
> >>
> >> You have rules in your grammar before COMPTE and ID that define a
> >> superset of the character sequences that COMPTE and ID can match.
> >>
> >> > Well, I have several entities in my grammar that have different
> >> > encoding forms, so how can I specify them one after the other?
> >>
> >> If in the end a single type of token should be produced, all the needed
> >> regular expressions have to go into one rule.
> >>
> >> > Thanks,
> >> > Mark
> >> >
> >>
> >> As a rule of thumb, write the most specific lexer rules first and then
> >> follow them with the less specific rules. The lexer gives the rules
> >> written first a higher precedence.
> >>
> >> So put your keywords first (which are fixed strings). Then follow them
> >> with something like operators (also fixed strings). Rules that can match
> >> many different strings, like ID and COMPTE, come last.
> >>
> >> See what Antlrworks tells you about multiple matches and which rules are
> >> involved.
> >>
> >> Don't know if this may help but the rule that matches both COMPTE and ID
> >> would be most interesting.
> >>
> >> Best regards,
> >> Stefan
> >>
> >> PS.: Please reply also to the list.
> >>
> >>
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>


