[antlr-interest] antlr should throw NoViableAltException

Sun Apr 15 07:51:50 PDT 2007

Besides, do you know how to conditionally route a token to a
different channel?
   I mean, for the following grammar:
grammar Rubyv3;

options {
        output=AST;
}
tokens {
	// 'imaginary' tokens
	STATEMENT_LIST;
	STATEMENT;
	IF;
	RPAREN_IN_METHOD_DEFINATION;
	BODY;
	CALL;
	ARG;
	//COMPSTMT;
	SYMBOL;
	BLOCK;
	MULTIPLE_ASSIGN;
	MULTIPLE_ASSIGN_WITH_EXTRA_COMMA;
	BLOCK_ARG;
	BLOCK_ARG_WITH_EXTRA_COMMA;
	MRHS;
	NESTED_LHS;
	SINGLETON_METHOD;
	STRING;
}

@rulecatch {
catch (RecognitionException e) {
throw e;
}
}

@header {
package com.xruby.compiler.parser;
}
@lexer::header {
package com.xruby.compiler.parser;
}

@members{
	private int can_be_command_ = 0;
	private Rubyv3Lexer lexer;
        protected void enterScope()	{assert(false);}
	protected void enterBlockScope()	{assert(false);}
	protected void leaveScope()	{assert(false);}
	protected void addVariable(Token id)	{assert(false);}
	protected void setIsInNestedMultipleAssign(boolean v)	{assert(false);}
	protected void tellLexerWeHaveFinishedParsingMethodparameters()	{assert(false);}
	protected void tellLexerWeHaveFinishedParsingSymbol()	{assert(false);}
	protected void tellLexerWeHaveFinishedParsingStringExpressionSubstituation()	{assert(false);}
	protected void tellLexerWeHaveFinishedParsingRegexExpressionSubstituation()	{assert(false);}
	protected void tellLexerWeHaveFinishedParsingHeredocExpressionSubstituation()	{assert(false);}
}

@lexer::members
{
	//The following methods are to be implemented in the subclass.
	//In fact they should be 'abstract', but antlr refuses to generate
	//an abstract class. We can either insert the 'abstract' keyword manually
	//after the lexer is generated, or simply use assert() to prevent
	//these functions from running (so you have to override them). I chose
	//the latter approach.
	public int line_break_channel = HIDDEN;
	public void openLineBreakChannel() {
        line_break_channel = DEFAULT_TOKEN_CHANNEL;
        }

	protected boolean expectOperator(int k) throws Exception	{assert(false);return false;}
	protected boolean expectUnary() throws Exception		{assert(false);return false;}
	protected boolean expectHash()					{assert(false);return false;}
	protected boolean expectHeredoc()				{assert(false);return false;}
	protected boolean expectLeadingColon2()		{assert(false);return false;}
	protected boolean expectArrayAccess()				{assert(false);return false;}
	protected boolean lastTokenIsDotOrColon2()		{assert(false);return false;}
	protected boolean lastTokenIsSemi()				{assert(false);return false;}
	protected boolean lastTokenIsKeywordDefOrColonWithNoFollowingSpace()	{assert(false);return false;}
	protected boolean lastTokenIsColonWithNoFollowingSpace()		{assert(false);return false;}
	protected boolean shouldIgnoreLinebreak()			{assert(false);return false;}
	protected int trackDelimiterCount(char next_char, char delimeter, int delimeter_count)	{assert(false);return 0;}
	protected boolean isDelimiter(String next_line, String delimiter)	{assert(false);return false;}
	protected boolean isAsciiValueTerminator(char value)	{assert(false);return false;}
	protected boolean justSeenWhitespace()	{assert(false);return false;}
	protected void setSeenWhitespace()			{assert(false);}
	protected boolean expressionSubstitutionIsNext()	throws Exception	{assert(false);return false;}
	protected boolean spaceIsNext()	throws Exception	{assert(false);return false;}
	protected void setCurrentSpecialStringDelimiter(char delimiter, int delimiter_count)	{assert(false);}
	protected void updateCurrentSpecialStringDelimiterCount(int delimiter_count)	{assert(false);}
}

program
@init
{
  lexer = (Rubyv3Lexer) getTokenStream().getTokenSource();
}               :	statement_list
		;

statement_list
		:	statement* -> ^(STATEMENT_LIST statement*)
			;

/*terminal
		:	SEMI!
		|	LINE_BREAK!
		;*/
statement
	:	expression (modifier_line)* SEMI? -> ^(STATEMENT expression (modifier_line)*)
	|       SEMI!
	;

modifier_line
	:	(IF_MODIFIER|UNLESS_MODIFIER|WHILE_MODIFIER|UNTIL_MODIFIER|RESCUE_MODIFIER)^ expression
	;
IF_MODIFIER     :  'if';
UNLESS_MODIFIER :  'unless';
WHILE_MODIFIER  :  'while';
UNTIL_MODIFIER  :  'until';
RESCUE_MODIFIER :  'rescue';

SEMI	:';'
	;

LINE_BREAK
	:'\r'? '\n'{$channel=line_break_channel;}
	;
//OMIT_LINE_BREAK
//	:	LINE_BREAK* {skip();}
//	;
//emptyable_expression
//	:	expression|;
expression
	:	'expression0' | 'expression1' | 'expression2' | boolean_expression
	|	block_expression | if_expression | unless_expression;

block_expression
	:	'begin' body 'end';
body	:	statement_list;
boolean_expression
	:	'false'|'nil'|'true';
if_expression
	:	'if' b=boolean_expression {lexer.openLineBreakChannel();} ('then'|':'|LINE_BREAK)
	        body0=body ('elsif' b1=boolean_expression ('then'|':'|LINE_BREAK) body1+=body)*
	        ('else' body2=body)?
	        'end' -> ^(IF $b $body0 $b1* $body1* $body2? )
	        ;
unless_expression
	:	'unless' boolean_expression ('then'|':'|LINE_BREAK)
	        body
	        ('else' body)?
	        'end';	

WS	:	(' ' | '\t') { skip(); }
	;
ID	:	('a'..'z' | 'A'..'Z') (('a'..'z' | 'A'..'Z') | ('0'..'9'))*
	;
I want to call openLineBreakChannel in if_expression, via
{lexer.openLineBreakChannel();}, so that after 'if' boolean_expression
a LINE_BREAK or 'then'|':' is mandatory, instead of the line break being
sent to skip() or channel HIDDEN. I've tried this, but nothing happened.
It seems the lexer tokenizes the whole input first and only then hands
the token stream to the parser, so the parser can't affect the lexer
through the call to
lexer.openLineBreakChannel();
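
To make my suspicion concrete, here is a small stand-alone sketch of the
timing problem. It does not use ANTLR at all: ToyLexer, Tok and the two
visible... methods are made-up names, and the channel numbers are just
placeholders. It only assumes that a token's channel is fixed at the moment
the token is created, which is what {$channel=line_break_channel;} does in
the LINE_BREAK rule above.

```java
import java.util.ArrayList;
import java.util.List;

public class ChannelTimingSketch {
    // Placeholder channel numbers for this toy model only.
    static final int DEFAULT_CHANNEL = 0;
    static final int HIDDEN = 99;

    /** Toy token: its channel is decided once, when it is lexed. */
    static final class Tok {
        final String text;
        final int channel;
        Tok(String text, int channel) { this.text = text; this.channel = channel; }
    }

    /** Toy lexer with the same kind of switch as line_break_channel. */
    static final class ToyLexer {
        int lineBreakChannel = HIDDEN;
        private final String[] words;
        private int pos = 0;
        ToyLexer(String... words) { this.words = words; }
        boolean hasNext() { return pos < words.length; }
        Tok next() {
            String w = words[pos++];
            // Line breaks get whatever channel the switch says *right now*.
            int ch = w.equals("\n") ? lineBreakChannel : DEFAULT_CHANNEL;
            return new Tok(w, ch);
        }
    }

    /** All tokens are lexed first; the "parser" flips the switch afterwards. */
    public static List<String> visibleWhenBufferedUpFront(String... input) {
        ToyLexer lexer = new ToyLexer(input);
        List<Tok> buffer = new ArrayList<>();
        while (lexer.hasNext()) buffer.add(lexer.next()); // channels fixed here
        List<String> seen = new ArrayList<>();
        for (Tok t : buffer) {
            if (t.channel != DEFAULT_CHANNEL) continue;   // parser skips hidden tokens
            seen.add(t.text);
            // Stand-in for the action after boolean_expression: too late to matter.
            if (t.text.equals("true")) lexer.lineBreakChannel = DEFAULT_CHANNEL;
        }
        return seen;
    }

    /** Tokens are lexed one at a time, only when the "parser" asks for them. */
    public static List<String> visibleWhenPulledOnDemand(String... input) {
        ToyLexer lexer = new ToyLexer(input);
        List<String> seen = new ArrayList<>();
        while (lexer.hasNext()) {
            Tok t = lexer.next();
            if (t.channel != DEFAULT_CHANNEL) continue;
            seen.add(t.text);
            // Same stand-in action, but the line break has not been lexed yet.
            if (t.text.equals("true")) lexer.lineBreakChannel = DEFAULT_CHANNEL;
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] input = {"if", "true", "\n", "expression0", "end"};
        System.out.println(visibleWhenBufferedUpFront(input).size());
        System.out.println(visibleWhenPulledOnDemand(input).size());
    }
}
```

In the buffered case the line break stays hidden even though the switch was
flipped, which matches what I am seeing. Even with on-demand lexing, any
lookahead past the line break before the action runs would presumably cause
the same problem.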

On 4/15/07, femto gary <femtowin at gmail.com> wrote:
> Hi David, thanks for the information,
> I'll check it out.
>
> On 4/15/07, David Holroyd <dave at badgers-in-foil.co.uk> wrote:
> > On Sun, Apr 15, 2007 at 08:11:39PM +0800, femto gary wrote:
> > > also, generating parser will also produce the following warning:
> > > [20:08:36] warning(200): Rubyv3.g:101:32: Decision can match input
> > > such as "SEMI" using multiple alternatives: 1, 2
> > > As a result, alternative(s) 2 were disabled for that input
> > >
> > > but for the grammar:
> > > statement
> > >       :       expression (modifier_line)* SEMI? -> ^(STATEMENT expression
> > > (modifier_line)*)
> > >       |       SEMI!
> > >       ;
> > > input SEMI shouldn't cause an ambiguity, because expression can't be empty,
> > > so it matches either alt1 or alt2. Why does it report that warning?
> > > Anybody has any ideas? Thanks.
> >
> > I think that since the grammar allows,
> >
> >  statement*
> >
> > the ambiguity is between the alternatives of,
> >
> >  1) matching 'SEMI?' right now, in this invocation of 'statement', and,
> >
> >  2) not matching 'SEMI?', exiting this invocation of the 'statement' rule
> >    and then matching the 'SEMI!' alternative the next time around the
> >    'statement*' loop
> >
> > Is "1;" to be parsed as
> >
> >  (STATEMENT 1) (STATEMENT ;)
> > or
> >  (STATEMENT 1;)
> >
> >
> >
> > i.e. the decision being referred to in the message is probably the one
> > at the '?', not the one at the '|', if that helps at all :)
> >
> >
> > ta,
> > dave
> >
> > --
> > http://david.holroyd.me.uk/
> >
>
>
> --
> Best Regards
> XRuby http://xruby.com
> femto http://hi.baidu.com/femto
>

-- 
Best Regards
XRuby http://xruby.com
femto http://hi.baidu.com/femto