[antlr-interest] antlr should throw NoViableAltException

femto gary femtowin at gmail.com
Sun Apr 15 08:19:41 PDT 2007


Actually, I found out why SEMI is ambiguous.
When matching statement through alt 1,
after matching expression and modifier_line,
if the parser meets a SEMI, it can either match it with SEMI?
or end this statement and start matching a new one
(meaning the new statement matches just SEMI through alt 2),
so this is an ambiguity.
Anyway, thanks to David,
you pointed me in the right direction, thanks.
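
To make the two readings concrete, here is a minimal sketch in plain Java.
It is not ANTLR-generated code: the class name GreedySemiSketch, the string
token names, and the "(bare SEMI)" marker are all made up for illustration,
and modifier_line is left out. It assumes the decision is resolved in favour
of matching SEMI? right away, which is what "alternative(s) 2 were disabled"
in the warning appears to mean.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GreedySemiSketch {
    // Hand-written stand-in for:
    //   statement_list : statement* ;
    //   statement      : expression SEMI? | SEMI ;
    // Tokens are plain strings; "EXPR" stands for any expression.
    static List<String> parse(List<String> tokens) {
        List<String> statements = new ArrayList<>();
        int pos = 0;
        while (pos < tokens.size()) {
            String t = tokens.get(pos);
            if (t.equals("EXPR")) {
                pos++;
                // Greedy SEMI?: if the next token is SEMI, consume it here
                // instead of leaving it for the next loop iteration.
                if (pos < tokens.size() && tokens.get(pos).equals("SEMI")) {
                    pos++;
                }
                statements.add("(STATEMENT EXPR)");
            } else if (t.equals("SEMI")) {
                pos++;
                // Alt 2: a statement that is only a SEMI (illustrative marker).
                statements.add("(bare SEMI)");
            } else {
                throw new IllegalArgumentException("unexpected token: " + t);
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        System.out.println(parse(Arrays.asList("EXPR", "SEMI")));
        System.out.println(parse(Arrays.asList("EXPR", "SEMI", "SEMI")));
    }
}
```

With the greedy choice, "EXPR SEMI" comes out as a single statement, and a
bare-SEMI statement only appears when a second SEMI follows.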

On 4/15/07, femto gary <femtowin at gmail.com> wrote:
> Besides, do you know how to conditionally send a token to a
> different channel?
>   I mean, for the following grammar:
> grammar Rubyv3;
>
> options {
>        output=AST;
> }
> tokens {
>        // 'imaginary' tokens
>        STATEMENT_LIST;
>        STATEMENT;
>        IF;
>        RPAREN_IN_METHOD_DEFINATION;
>        BODY;
>        CALL;
>        ARG;
>        //COMPSTMT;
>        SYMBOL;
>        BLOCK;
>        MULTIPLE_ASSIGN;
>        MULTIPLE_ASSIGN_WITH_EXTRA_COMMA;
>        BLOCK_ARG;
>        BLOCK_ARG_WITH_EXTRA_COMMA;
>        MRHS;
>        NESTED_LHS;
>        SINGLETON_METHOD;
>        STRING;
> }
>
> @rulecatch {
> catch (RecognitionException e) {
> throw e;
> }
> }
>
> @header {
> package com.xruby.compiler.parser;
> }
> @lexer::header {
> package com.xruby.compiler.parser;
> }
>
> @members{
>        private int can_be_command_ = 0;
>        private Rubyv3Lexer lexer;
>        protected void enterScope()     {assert(false);}
>        protected void enterBlockScope()        {assert(false);}
>        protected void leaveScope()     {assert(false);}
>        protected void addVariable(Token id)    {assert(false);}
>        protected void setIsInNestedMultipleAssign(boolean v)   {assert(false);}
>        protected void tellLexerWeHaveFinishedParsingMethodparameters()        {assert(false);}
>        protected void tellLexerWeHaveFinishedParsingSymbol()   {assert(false);}
>        protected void tellLexerWeHaveFinishedParsingStringExpressionSubstituation()   {assert(false);}
>        protected void tellLexerWeHaveFinishedParsingRegexExpressionSubstituation()    {assert(false);}
>        protected void tellLexerWeHaveFinishedParsingHeredocExpressionSubstituation()  {assert(false);}
> }
>
> @lexer::members
> {
>        //The following methods are to be implemented in the subclass.
>        //In fact they should be 'abstract', but antlr refuses to generate
>        //an abstract class. We can either insert the 'abstract' keyword manually
>        //after the lexer is generated, or simply use assert() to prevent
>        //these functions from running (so you have to override them). I chose
>        //the latter approach.
>        public int line_break_channel = HIDDEN;
>        public void openLineBreakChannel() {
>        line_break_channel = DEFAULT_TOKEN_CHANNEL;
>        }
>
>        protected boolean expectOperator(int k) throws Exception        {assert(false);return false;}
>        protected boolean expectUnary() throws Exception                {assert(false);return false;}
>        protected boolean expectHash()                                  {assert(false);return false;}
>        protected boolean expectHeredoc()                               {assert(false);return false;}
>        protected boolean expectLeadingColon2()                         {assert(false);return false;}
>        protected boolean expectArrayAccess()                           {assert(false);return false;}
>        protected boolean lastTokenIsDotOrColon2()                      {assert(false);return false;}
>        protected boolean lastTokenIsSemi()                             {assert(false);return false;}
>        protected boolean lastTokenIsKeywordDefOrColonWithNoFollowingSpace()    {assert(false);return false;}
>        protected boolean lastTokenIsColonWithNoFollowingSpace()        {assert(false);return false;}
>        protected boolean shouldIgnoreLinebreak()                       {assert(false);return false;}
>        protected int trackDelimiterCount(char next_char, char delimeter, int delimeter_count)  {assert(false);return 0;}
>        protected boolean isDelimiter(String next_line, String delimiter)       {assert(false);return false;}
>        protected boolean isAsciiValueTerminator(char value)            {assert(false);return false;}
>        protected boolean justSeenWhitespace()                          {assert(false);return false;}
>        protected void setSeenWhitespace()                              {assert(false);}
>        protected boolean expressionSubstitutionIsNext() throws Exception       {assert(false);return false;}
>        protected boolean spaceIsNext() throws Exception                {assert(false);return false;}
>        protected void setCurrentSpecialStringDelimiter(char delimiter, int delimiter_count)    {assert(false);}
>        protected void updateCurrentSpecialStringDelimiterCount(int delimiter_count)    {assert(false);}
> }
>
>
> program
> @init
> {
>  lexer = (Rubyv3Lexer) getTokenStream().getTokenSource();
> }               :       statement_list
>                ;
>
> statement_list
>                :       statement* -> ^(STATEMENT_LIST statement*)
>                        ;
>
> /*terminal
>                :       SEMI!
>                |       LINE_BREAK!
>                ;*/
> statement
>        :       expression (modifier_line)* SEMI? -> ^(STATEMENT expression (modifier_line)*)
>        |       SEMI!
>        ;
>
> modifier_line
>        :       (IF_MODIFIER|UNLESS_MODIFIER|WHILE_MODIFIER|UNTIL_MODIFIER|RESCUE_MODIFIER)^ expression
>                ;
> IF_MODIFIER     :  'if';
> UNLESS_MODIFIER :  'unless';
> WHILE_MODIFIER  :  'while';
> UNTIL_MODIFIER  :  'until';
> RESCUE_MODIFIER :  'rescue';
>
> SEMI    :';'
>        ;
>
> LINE_BREAK
>        :'\r'? '\n'{$channel=line_break_channel;}
>        ;
> //OMIT_LINE_BREAK
> //      :       LINE_BREAK* {skip();}
> //      ;
> //emptyable_expression
> //      :       expression|;
> expression
>        :       'expression0' | 'expression1' | 'expression2' | boolean_expression
>        |       block_expression | if_expression | unless_expression;
>
> block_expression
>        :       'begin' body 'end';
> body    :       statement_list;
> boolean_expression
>        :       'false'|'nil'|'true';
> if_expression
>        :       'if' b=boolean_expression {lexer.openLineBreakChannel();} ('then'|':'|LINE_BREAK)
>                body0=body ('elsif' b1=boolean_expression ('then'|':'|LINE_BREAK) body1+=body)*
>                ('else' body2=body)?
>                'end' -> ^(IF $b $body0 $b1* $body1* $body2? )
>                ;
> unless_expression
>        :       'unless' boolean_expression ('then'|':'|LINE_BREAK)
>                body
>                ('else' body)?
>                'end';
>
> WS      :       (' ' | '\t') { skip(); }
>        ;
> ID      :       ('a'..'z' | 'A'..'Z') (('a'..'z' | 'A'..'Z') | ('0'..'9'))*
>        ;
> I want to call openLineBreakChannel in if_expression, via
> {lexer.openLineBreakChannel();}, so that after 'if' boolean_expression
> a LINE_BREAK or 'then'|':' is mandatory, not skipped or sent to channel HIDDEN.
> I've tried this, but nothing happened.
> It seems the lexer tokenizes the whole input first and only then hands the
> token stream to the parser, so the parser can't affect the lexer through a call to
> lexer.openLineBreakChannel();
>
> On 4/15/07, femto gary <femtowin at gmail.com> wrote:
> > Hi David, thanks for the information,
> > I'll check it out.
> >
> > On 4/15/07, David Holroyd <dave at badgers-in-foil.co.uk> wrote:
> > > On Sun, Apr 15, 2007 at 08:11:39PM +0800, femto gary wrote:
> > > > > Also, generating the parser produces the following warning:
> > > > [20:08:36] warning(200): Rubyv3.g:101:32: Decision can match input
> > > > such as "SEMI" using multiple alternatives: 1, 2
> > > > As a result, alternative(s) 2 were disabled for that input
> > > >
> > > > but for the grammar:
> > > > statement
> > > > >       :       expression (modifier_line)* SEMI? -> ^(STATEMENT expression (modifier_line)*)
> > > >       |       SEMI!
> > > >       ;
> > > > > the input SEMI shouldn't cause an ambiguity, because expression can't be empty,
> > > > > so it matches either alt 1 or alt 2. Why does ANTLR report that warning?
> > > > > Does anybody have any ideas? Thanks.
> > >
> > > I think that since the grammar allows,
> > >
> > >  statement*
> > >
> > > the ambiguity is between the alternatives of,
> > >
> > >  1) matching 'SEMI?' right now, in this invocation of 'statement', and,
> > >
> > >  2) not matching 'SEMI?', exiting this invocation of the 'statement' rule
> > >    and then matching the 'SEMI!' alternative the next time around the
> > >    'statement*' loop
> > >
> > > Is "1;" to be parsed as
> > >
> > >  (STATEMENT 1) (STATEMENT ;)
> > > or
> > >  (STATEMENT 1;)
> > >
> > >
> > >
> > > i.e. the decision being referred to in the message is probably the one
> > > at the '?', not the one at the '|', if that helps at all :)
> > >
> > >
> > > ta,
> > > dave
> > >
> > > --
> > > http://david.holroyd.me.uk/
> > >
> >
> >
> > --
> > Best Regards
> > XRuby http://xruby.com
> > femto http://hi.baidu.com/femto
> >


-- 
Best Regards
XRuby http://xruby.com
femto http://hi.baidu.com/femto

