[antlr-interest] NoSuchElementException

Thu Sep 6 12:44:51 PDT 2012

Jim,

Thanks for your input. Based on your advice, I refactored the grammar
from earlier into the following:
----
grammar AerobasicPreprocessor;

preprocess : line (NEWLINE_ line)* EOF;

line : (PP_directive_ | ANY_*);

PP_directive_ : '#define';

NEWLINE_ : '\r'? '\n';

ANY_ : .;
----

This compiles, and I believe it does what I need for lines and
newlines, at least in principle. It is, of course, only a subset of my
grammar, and I am having similar problems elsewhere in it. There is
something that I do not understand about the ~ operator in lexer rules.
Why can't I replace the '.' in the ANY_ rule above with '~NEWLINE_'?
Doing so causes the tool to crash. I want to use this construct
elsewhere in my grammar. Here is an example snippet:

----
fragment PP_define
	:	'define' WS_ PP_define_name (WS_ PP_define_value)?
	;

fragment PP_define_name
	:	~WS_+
	;

fragment PP_define_value
	:	~NEWLINE_+ (PP_line_continuation ~NEWLINE_+)*
	;

fragment PP_line_continuation
	:	BACKSLASH_ WS_? NEWLINE_
	;
----

This sort of logic makes perfect sense to me, but it seems to choke
the lexer. What am I not understanding here?
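
To make the intent concrete, here is a rough Python sketch of what I
expect the PP_define fragments to match. It is only an illustration
using a regular expression, not anything ANTLR generates. WS_ and
BACKSLASH_ are not shown in the snippet above, so I am assuming WS_
means one or more spaces or tabs and BACKSLASH_ is a single '\'. I am
also reading '~NEWLINE_' as "any character other than \r or \n". The
names PP_DEFINE and parse_define are made up for this example.

```python
import re

# Assumptions (not taken from the grammar snippet): WS_ is spaces/tabs,
# BACKSLASH_ is a single backslash, and ~NEWLINE_ means "not \r or \n".
WS = r"[ \t]+"
NEWLINE = r"\r?\n"
NOT_NEWLINE = r"[^\r\n]"

# PP_define_name : ~WS_+  (here: one or more non-whitespace characters)
NAME = r"[^ \t\r\n]+"

# PP_line_continuation : BACKSLASH_ WS_? NEWLINE_
CONTINUATION = r"\\" + r"[ \t]*" + NEWLINE

# PP_define_value : ~NEWLINE_+ (PP_line_continuation ~NEWLINE_+)*
VALUE = rf"{NOT_NEWLINE}+(?:{CONTINUATION}{NOT_NEWLINE}+)*"

# PP_define : 'define' WS_ PP_define_name (WS_ PP_define_value)?
PP_DEFINE = re.compile(
    rf"define{WS}(?P<name>{NAME})(?:{WS}(?P<value>{VALUE}))?"
)


def parse_define(text):
    """Return (name, value) if the whole text is one define, else None.

    The value is returned raw, with any line continuations still in it;
    it is None when the define has no value.
    """
    m = PP_DEFINE.fullmatch(text)
    if m is None:
        return None
    return m.group("name"), m.group("value")


if __name__ == "__main__":
    print(parse_define("define MAX 10"))
    print(parse_define("define FLAG"))
    print(parse_define("define SUM 1 + \\\n  2"))
```

Note that the sketch relies on fullmatch to force the regex engine to
backtrack, so that the trailing backslash ends up in the continuation
rather than in the value. The value pattern on its own will happily
swallow the backslash, and whether the ANTLR lexer resolves this the
same way is part of what I am unsure about.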

Thanks,

- Justin

-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jim Idle
Sent: Thursday, September 06, 2012 12:07 PM
Cc: antlr-interest at antlr.org
Subject: Re: [antlr-interest] NoSuchElementException

You need to use the parser:

line : .* NL;

L1 : 'dfddfdf';
L2 : 'dfdfdfd';
NL: '\n';
ANY : . ;

Should get you a little nearer to what you want.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest- 
> bounces at antlr.org] On Behalf Of Justin Murray
> Sent: Thursday, September 06, 2012 7:54 AM
> To: Mike Lischke
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] NoSuchElementException
>
> Mike,
>
> Thank you for the suggestion. I think I had tried something similar to
> this initially, but this also gives me problems. Here is the new
> grammar:
>
> ----
>
> grammar Test;
>
> // Parser rules
> preprocess
> 	:	line* EOF
> 	;
>
> line
> 	:	PP_directive_
> 	|	SOURCE_LINE_
> 	;
>
> // Lexer rules
>
> PP_directive_
> 	:	'#define'
> 	;
>
> NEWLINE_
> 	:	'\u000D'? '\u000A'
> 	|	'\u0085'
> 	|	'\u2028'
> 	|	'\u2029'
> 	;
>
> SOURCE_LINE_
> 	:	.* (EOF | NEWLINE_)
> 	;
>
> ----
>
> This one does not crash, but does give me the following error:
>
> error(201): AerobasicPreprocessor.g:27:4: The following alternatives 
> can never be matched: 1
>
> Line 27 corresponds to the SOURCE_LINE_ rule. This error doesn't
> really make any sense to me. If I remove the EOF from the SOURCE_LINE_
> rule, the grammar builds successfully. However, this doesn't give me
> what I need, which is the possibility of a line at the end of a file,
> without a newline. Any other ideas?
>
> Thanks,
>
> - Justin
>
> -----Original Message-----
> From: Mike Lischke [mailto:mike at lischke-online.de]
> Sent: Thursday, September 06, 2012 10:11 AM
> To: Justin Murray
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] NoSuchElementException
>
>
> Justin,
>
> your grammar came over in an ugly format...
>
>
> Try something like this for lines instead:
>
> SOURCE_LINE_: .* (NEWLINE_ | EOF);
>
> Then your preprocess rule could go like this:
>
> preprocess:
> 	line* EOF
> ;
>
>
> ANTLR is clever enough to exclude the token after the Kleene operator 
> from what the dot matches, which is very convenient.
>
>
>
> > grammar Test;
> >
> > options
> > {
> > 	language=C;
> > }
> >
> > // Parser rules
> > preprocess
> > 	:	(line? NEWLINE_)* line? EOF
> > 	;
> >
> > line
> > 	:	PP_directive_
> > 	|	SOURCE_LINE_
> > 	;
> >
> > // Lexer rules
> >
> > PP_directive_
> > 	:	'#define'
> > 	;
> >
> > NEWLINE_
> > 	:	'\u000D'? '\u000A'
> > 	|	'\u0085'
> > 	|	'\u2028'
> > 	|	'\u2029'
> > 	;
> >
> > SOURCE_LINE_
> > 	:	~NEWLINE_+
> > 	;
> >
> > So I have two questions. It seems to me that the tool should never
> > crash, so is this an ANTLR bug? Secondly, there is clearly a problem
> > with what I am trying to do here. Is it not possible to capture
> > everything on a line (that is not a newline) as a token? Does anyone
> > have a workaround?
> >
> >
> >
> > Thanks,
> >
> >
> >
> > - Justin
> >
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe:
> > http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>
> Mike
> --
> www.soft-gems.net
>
>
>
