[antlr-interest] Antlr lexer does not try other possible matches when it fails to match a token

Jim Idle jimi at temporal-wave.com
Wed Dec 1 11:05:22 PST 2010


Well, these rules are ambiguous; you should read the getting started
documents. As you have no rules that can catch the wrong paths, you just
need to left-factor these into one rule and use a predicate, then set $type
to whatever you need it to be.

Note that the lexer runs independently of the parser, so it cannot use
what the parser expects next to choose between token rules.

Your S input alone is enough to trigger STATION, so try this:

STATION : 'S'
            (   ('TATION')=>'TATION'
              | { $type = LETTER; }
            )
;

You just have to be more specific is all. I think it is easier to see what
the intent is anyway.
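
For readers who want to see the difference in behaviour without running
ANTLR, here is a rough toy model in Python. It is only a sketch and not
ANTLR's actual prediction algorithm: the function names are made up for
illustration, and the "commit on S" behaviour is an assumption taken from
the wording above and from the failure described in the original question.
The first function commits to the keyword and has nothing to fall back on;
the second mirrors the left-factored rule, where the lookahead check plays
the role of the ('TATION')=> predicate and the fallback plays the role of
{ $type = LETTER; }.

```python
# Toy model of the lexer behaviour discussed in this thread.
# This is NOT ANTLR's real algorithm; all names here are hypothetical.

KEYWORD = "STATION"


def classify(ch):
    """Return the token type of one character that is not a keyword start."""
    if "a" <= ch <= "z" or "A" <= ch <= "Z":
        return "LETTER"
    if "0" <= ch <= "9":
        return "DIGIT"
    if ch == " ":
        return "SPACE"
    raise ValueError(f"no token matches {ch!r}")


def tokenize_committed(text):
    """Assumed failing behaviour: an 'S' commits the lexer to the keyword,
    and a partial match raises instead of falling back to LETTER."""
    tokens, i = [], 0
    while i < len(text):
        if text[i] == "S":
            if not text.startswith(KEYWORD, i):
                raise ValueError(f"mismatched keyword at offset {i}")
            tokens.append("KEYWORD_STATION")
            i += len(KEYWORD)
        else:
            tokens.append(classify(text[i]))
            i += 1
    return tokens


def tokenize_left_factored(text):
    """Left-factored rule: match 'S', then look ahead for 'TATION';
    if it is not there, emit the single 'S' as a LETTER instead."""
    tokens, i = [], 0
    while i < len(text):
        if text[i] == "S":
            if text.startswith("TATION", i + 1):  # like ('TATION')=>
                tokens.append("KEYWORD_STATION")
                i += len(KEYWORD)
            else:                                 # like { $type = LETTER; }
                tokens.append("LETTER")
                i += 1
        else:
            tokens.append(classify(text[i]))
            i += 1
    return tokens


if __name__ == "__main__":
    print(tokenize_left_factored("STATION EST1"))
```

On the example message "STATION EST1" the left-factored version yields
KEYWORD_STATION SPACE LETTER LETTER LETTER DIGIT, which is what the
stationParameter rule expects, while the committed version raises at the
second S.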

jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Nick Vlassopoulos
> Sent: Wednesday, December 01, 2010 6:10 AM
> To: COUJOULOU, Philippe
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Antlr lexer does not try other possible
> matches when it fails to match a token
> 
> Hello Philippe,
> 
> Although I am not an expert, I think you should let the lexer sort out
> the "3 letters 1 digit" in the station name. Alternatively, you could
> probably add the station name as an identifier and check if it is in
> the correct format after parsing it.
> 
> Without being sure if it is a good solution, the following seems to
> work:
> 
> Best regards,
> 
> Nikos
> 
> -------------------------
> grammar Stations;
> 
> stationParameter :
> KEYWORD_STATION SPACE stationName;
> 
> stationName
> : STATION_NAME;
> 
> STATION_NAME
> : LETTER LETTER LETTER DIGIT;
> 
> KEYWORD_STATION : 'STATION';
> LETTER : 'a'..'z' | 'A'..'Z';
> DIGIT : '0'..'9';
> SPACE : ' ';
> -------------------------
> 
> 
> On Wed, Dec 1, 2010 at 2:18 PM, COUJOULOU, Philippe <
> philippe.coujoulou at airbus.com> wrote:
> 
> > Dear all,
> >
> > I am trying to parse a message that contains parameters values like
> > <PARAM_NAME> <VALUE>, for instance "STATION EST1".
> > Here is a very simple extract of my grammar for one of these
> > parameters (the one given in the above example):
> >
> > grammar test;
> >
> > KEYWORD_STATION :       'STATION';
> > DIGIT    :        '0'..'9';
> > LETTER  :        'a'..'z' | 'A'..'Z';
> > SPACE   :       ' ';
> >
> > stationParameter        :       KEYWORD_STATION SPACE stationName;
> > stationName     :       LETTER LETTER LETTER DIGIT;
> >
> >
> > The point is that when I try to parse my example message (STATION
> > EST1), I get a MismatchTokenException at the point where the parser
> > attempts to read the last "ST1". After some analysis, I understood
> > that the lexer generated the following tokens: KEYWORD_STATION SPACE
> > LETTER for the string "STATION E"  and then attempted to match the
> > remaining "ST1" with KEYWORD_STATION but failed to complete it.
> >
> > At this point, I would expect the lexer to backtrack to the beginning
> > of 'ST1' and then match it with LETTER LETTER DIGIT, but it doesn't.
> >
> > I have tried various combinations of "backtrack", "memoize" and "k"
> > options without any success. I must have missed something. (Should it
> > help, I use ANTLRWorks 1.4).
> >
> > Please could you tell me how to proceed in order to make the lexer
> > backtrack and try other alternatives when a keyword of my language is
> > not exactly matched?
> >
> > Thanks in advance for your help.
> >
> > Best Regards,
> >
> > Philippe Coujoulou.
> >
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe:
> > http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> >
> 


