[antlr-interest] EOF in Lexer- how to?

Alexey Demakov demakov at ispras.ru
Tue Jan 17 00:38:18 PST 2006


As far as I understand, the reason you need to define EOF in the lexer
is that you need to handle single-line comments that may not be followed by a NewLine.
My definition of SingleLineComment handles both cases: if the comment is followed
by a NewLine, that NewLine is included in the comment. If the comment is followed
by EOF, the comment is still recognized, just without a NewLine.
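
To see the behaviour outside ANTLR, here is a minimal sketch in Python that mirrors the SingleLineComment rule with a regular expression. It is only an illustration of the optional-NewLine idea, not ANTLR code; the names SINGLE_LINE_COMMENT and match_comment are hypothetical.

```python
import re

# Mirrors: SingleLineComment : "//" ( ~('\r' | '\n') )* ( NewLine )? ;
# NewLine is "\r\n" | '\r' | '\n'; the trailing '?' makes it optional.
SINGLE_LINE_COMMENT = re.compile(r"//[^\r\n]*(?:\r\n|\r|\n)?")

def match_comment(text, pos=0):
    """Return the comment text matched at pos, or None if no comment starts there."""
    m = SINGLE_LINE_COMMENT.match(text, pos)
    return m.group(0) if m else None

# Comment followed by a NewLine: the NewLine becomes part of the comment.
print(repr(match_comment("// comment text\nint x;")))
# Comment followed directly by end of input: still matched, without a NewLine.
print(repr(match_comment("// comment text")))
```

Because the NewLine is optional, the rule never has to look at end of input at all, which is why no EOF token is needed here.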

It works, so what else do we need?

Btw, EOF cannot be included in EndOfLine, especially when
whitespace is skipped :)

Regards,
Alexey

-----
Alexey Demakov
TreeDL: Tree Description Language: http://treedl.sourceforge.net
RedVerst Group: http://www.unitesk.com



----- Original Message ----- 
From: Tomasz Jastrzebski 
To: antlr-interest at antlr.org 
Sent: Monday, January 16, 2006 7:31 PM
Subject: Re: [antlr-interest] EOF in Lexer- how to?


Thank you Alexey, but what I want is to solve EXACTLY this problem.
That is, I need to be able to match:
// comment text <EOF>
In other words, I would like to be able to define NewLine, or better yet, EndOfLine, as:
EndOfLine :(options{greedy=true;}:"\r\n" | '\r' | '\n' )  | EOF;
but I cannot; the above definition obviously would not work.
-Tomasz

Alexey Demakov <demakov at ispras.ru> wrote:
Make NewLine at the end of single line comment optional:

SingleLineComment :"//" ( ~('\r' | '\n') )* ( NewLine )? ;

It will match NewLine everywhere except

// comment text <EOF>

Regards,
Alexey



----- Original Message ----- 
From: Tomasz Jastrzebski
To: antlr-interest at antlr.org
Sent: Monday, January 16, 2006 12:03 PM
Subject: [antlr-interest] EOF in Lexer- how to?


Hi Everybody,

Is it possible to recognize EOF in the lexer?

OK, why would someone want to do that in the first place?
Let's suppose I want my lexer to recognize a SingleLineComment, say in the Java "// comment" style. My lexer rules should look more or less like this:
NewLine :(options{greedy=true;}:"\r\n" | '\r' | '\n' ) ;
SingleLineComment :"//" ( ~('\r' | '\n') )* NewLine ;

But there is a problem here. What if my input stream consists of only a single comment and no NewLine? E.g.
// comment text 
This lexer will not recognize such input correctly.
That is why I want my lexer to be able to treat EOF as NewLine.
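
The problem can be reproduced outside ANTLR with a small Python sketch. It is only an analogy using regular expressions, and the names STRICT_COMMENT, EOF_AWARE_COMMENT and matches are hypothetical. The strict pattern requires a NewLine, as the rules above do; the second pattern accepts end of input (\Z in Python's re module) in place of a NewLine, which is the behaviour being asked for.

```python
import re

# Strict rule: the comment must end with a NewLine ("\r\n", '\r' or '\n').
STRICT_COMMENT = re.compile(r"//[^\r\n]*(?:\r\n|\r|\n)")

# Relaxed rule: end of input (\Z) is accepted in place of a NewLine.
EOF_AWARE_COMMENT = re.compile(r"//[^\r\n]*(?:\r\n|\r|\n|\Z)")

def matches(pattern, text):
    """True if the pattern matches a comment at the start of text."""
    return pattern.match(text) is not None

source = "// comment text "  # no NewLine before end of input

print(matches(STRICT_COMMENT, source))     # strict rule rejects the input
print(matches(EOF_AWARE_COMMENT, source))  # EOF-aware rule accepts it
```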

However, it seems that I cannot use or define an EOF token within the lexer. An attempt to use '\uFFFF' within the NewLine rule seems to block the lexer and leads to unpredictable results.

I would appreciate any help.






