[antlr-interest] Re: Why No Error?

genericised trigonometric at softhome.net
Thu Aug 15 06:56:42 PDT 2002


Actually, your solution is incorrect:

file : (line)+ EOF ;

would be wrong because the line rule would still expect a
NEWLINE token at the end of the last line. The correct solution is:

file   : (line)+ ;
line   : (record)+ (NEWLINE|EOF) ;
record : (r:RECORD) (COMMA)? ;

Well, at least I think this is the correct solution. It
looks right, and it is hard to see how something
so simple could be wrong. I am still interested
in knowing why no error was generated by the grammar in
the original post, however.
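
As a rough check of the two variants of the line rule, here is a minimal
hand-written Java sketch. It is not ANTLR-generated code and does not use the
ANTLR runtime; the names CsvSketch, Tok, lex and parse are made up for
illustration, and the lexer is only an approximation of CSVLexer. It just
shows that a last line without a trailing newline is accepted when EOF may
end a line and rejected when NEWLINE is mandatory.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written illustration only; every name here is hypothetical.
class CsvSketch {
    enum Tok { RECORD, COMMA, NEWLINE, EOF }

    // Approximation of CSVLexer: RECORD, COMMA, NEWLINE, with WS skipped.
    static List<Tok> lex(String s) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == ' ' || c == '\t') {
                i++;                                   // WS is skipped
            } else if (c == ',') {
                out.add(Tok.COMMA);
                i++;
            } else if (c == '\r' || c == '\n') {
                // "\r\n" is treated as a single NEWLINE token
                if (c == '\r' && i + 1 < s.length() && s.charAt(i + 1) == '\n') {
                    i++;
                }
                out.add(Tok.NEWLINE);
                i++;
            } else {
                // RECORD: a run of characters up to a comma, newline or blank
                while (i < s.length() && ",\r\n \t".indexOf(s.charAt(i)) < 0) {
                    i++;
                }
                out.add(Tok.RECORD);
            }
        }
        out.add(Tok.EOF);                              // always terminate with EOF
        return out;
    }

    // file   : (line)+ ;
    // line   : (record)+ (NEWLINE | EOF) ;   when eofEndsLine is true
    // line   : (record)+ NEWLINE ;           when eofEndsLine is false
    // record : RECORD (COMMA)? ;
    // Returns the number of lines matched; throws on a token mismatch.
    static int parse(List<Tok> toks, boolean eofEndsLine) {
        int pos = 0;
        int lines = 0;
        do {
            if (toks.get(pos) != Tok.RECORD) {
                throw new IllegalStateException("expecting RECORD, found " + toks.get(pos));
            }
            while (toks.get(pos) == Tok.RECORD) {      // (record)+
                pos++;
                if (toks.get(pos) == Tok.COMMA) {      // (COMMA)?
                    pos++;
                }
            }
            Tok end = toks.get(pos);
            if (end == Tok.NEWLINE) {
                pos++;
            } else if (!(end == Tok.EOF && eofEndsLine)) {
                throw new IllegalStateException("expecting NEWLINE, found " + end);
            }
            lines++;
        } while (toks.get(pos) == Tok.RECORD);         // another line follows
        return lines;
    }
}
```

With the input "blah, blah\nblah, blah, blah" (no newline after the last
line), parse(lex(input), true) returns 2, while parse(lex(input), false)
throws because the last line ends in EOF rather than NEWLINE.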

--- In antlr-interest at y..., "genericised" <trigonometric at s...> wrote:
> Oh, I didn't realise it was so easy, and I did want the
> comma to be optional. Check out my latest post, however;
> it is a bit more tricky, hehe ;)
> 
> --- In antlr-interest at y..., Bogdan Mitu <bogdan_mt at y...> wrote:
> > Hi,
> > 
> > If you want to be sure that all the input has been parsed, you should
> > finish the main rule with EOF:
> > 
> > file : (line)+ EOF ; 
> > 
> > As a side note, the way you defined the grammar, Comma between records
> > is optional. If you want Comma to be mandatory between records, try:
> > 
> > line : rec (COMMA rec)* NEWLINE ;
> > rec  : r:RECORD { action ... } ;
> > 
> > Cheers,
> > Bogdan
> > 
> > --- genericised <trigonometric at s...> wrote:
> > > I created the following parser, as an example of how to
> > > parse comma separated variable (CSV) files:
> > > 
> > > class CSVParser extends Parser;
> > > file : (line)+ ;
> > > line : (rec)+ NEWLINE ;
> > > rec  : (r:RECORD) (COMMA)?
> > >        {System.out.println(r.getText());}
> > >      ;
> > > 
> > > The corresponding Lexer is:
> > > 
> > > class CSVLexer extends Lexer;
> > > options { charVocabulary='\3'..'\377'; }
> > > RECORD  : (~(','|'\r'|'\n'|' '|'\t'))+ ;
> > > COMMA   : ',' ;
> > > NEWLINE : ( ('\r''\n')=> '\r''\n' //DOS
> > >           | '\r'                  //MAC
> > >           | '\n'                  //UNIX
> > >           )
> > >         { newline(); } // grouped so the action runs for every alternative
> > >         ;
> > > WS      : (' '|'\t') { $setType(Token.SKIP); } ;
> > > 
> > > Pretty straightforward, but, when I run this on a
> > > CSV it produces no error.
> > > 
> > > The last line of a CSV is:
> > > 
> > > blah, blah, blah
> > > 
> > > so the line does not consist of
> > > 
> > > rec+ NEWLINE
> > > 
> > > but
> > > 
> > > rec+
> > > 
> > > When 
> > > 
> > > match(NEWLINE)
> > > 
> > > is called from the parser, why does it not throw
> > > a MismatchedTokenException?
> > > 
> > > Or does it throw some kind of exception that is
> > > caught and causes the parsing of the inputstream
> > > to terminate gracefully?
> > > 
> > > The parser is invoked from some main file like this:
> > > 
> > > csvParser.file();
> > > 
> > > I have spent a couple of hours investigating this,
> > > looking through the ANTLR source and so on, but I
> > > have not yet found where this is dealt with.
> > > 
> > > I might do a bit of weekend investigation into this
> > > because of what I will learn in the process of
> > > determining this but at the moment I am supposed to
> > > be writing this ANTLR tutorial and then got side
> > > tracked trying to explain why it is OK that the
> > > parser does not match the final NEWLINE.
> > > 
> > > Well actually, is it ok, or should the rule for file
> > > be defined something like:
> > > 
> > > file : (line)+ EOFCHAR;
> > > 
> > > Regards
> > > 
> > > A Person