[antlr-interest] Re: Why No Error?
genericised
trigonometric at softhome.net
Thu Aug 15 06:56:42 PDT 2002
Actually, your solution is incorrect:
file : (line)+ EOF ;
would be wrong because a line would still expect a
NEWLINE token at the end. The correct solution is:
file : (line)+ ;
line : (record)+ (NEWLINE|EOF) ;
record : (r:RECORD) (COMMA)? ;
Well, at least I think this is the correct solution. It
looks right, and it is hard to see how something so
simple could be wrong anyway. I am still interested in
knowing why no error was generated in the original
post, however.
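
To make the difference between the two variants concrete, here is a
small hand-written Java sketch. It is not ANTLR-generated code: the
class name CsvLineSketch, the parses method and the eofEndsLine flag
are all made up for illustration, and the tokenizer only roughly
imitates the lexer from the original post. Unlike the grammar above,
the sketch also insists that the whole input is consumed.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written sketch, not ANTLR output. Token names mirror the grammar;
// the class and method names are hypothetical.
public class CsvLineSketch {
    enum Tok { RECORD, COMMA, NEWLINE, EOF }

    // Rough tokenizer: commas and line ends are tokens, blanks and tabs are skipped.
    static List<Tok> lex(String in) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < in.length()) {
            char c = in.charAt(i);
            if (c == ' ' || c == '\t') {
                i++;
            } else if (c == ',') {
                out.add(Tok.COMMA);
                i++;
            } else if (c == '\r') {
                // "\r\n" is one NEWLINE; a lone '\r' is also a NEWLINE
                i += (i + 1 < in.length() && in.charAt(i + 1) == '\n') ? 2 : 1;
                out.add(Tok.NEWLINE);
            } else if (c == '\n') {
                out.add(Tok.NEWLINE);
                i++;
            } else {
                // everything up to the next separator is one RECORD
                while (i < in.length() && ",\r\n \t".indexOf(in.charAt(i)) < 0) i++;
                out.add(Tok.RECORD);
            }
        }
        out.add(Tok.EOF);
        return out;
    }

    // file   : (line)+ ;
    // line   : (record)+ (NEWLINE|EOF) ;   if eofEndsLine is true
    // line   : (record)+ NEWLINE ;         if eofEndsLine is false
    // record : RECORD (COMMA)? ;
    // Returns true only if at least one line matched and all input was consumed.
    static boolean parses(String in, boolean eofEndsLine) {
        List<Tok> t = lex(in);
        int p = 0;
        int lines = 0;
        while (t.get(p) == Tok.RECORD) {           // (line)+
            while (t.get(p) == Tok.RECORD) {       // (record)+
                p++;
                if (t.get(p) == Tok.COMMA) p++;    // (COMMA)?
            }
            if (t.get(p) == Tok.NEWLINE) {
                p++;
            } else if (t.get(p) == Tok.EOF && eofEndsLine) {
                // EOF terminates the last line; nothing to consume
            } else {
                return false;                      // mismatched token
            }
            lines++;
        }
        return lines > 0 && t.get(p) == Tok.EOF;
    }
}
```

For a file whose last line has no trailing newline,
parses("a, b\nblah, blah, blah", true) returns true, while the same
call with false returns false, which is the mismatch I was expecting
to see reported.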
--- In antlr-interest at y..., "genericised" <trigonometric at s...> wrote:
> Oh, I didn't realise it was so easy, and I did want the
> comma to be optional. Check out my latest post, however;
> it is a bit more tricky, hehe ;)
>
> --- In antlr-interest at y..., Bogdan Mitu <bogdan_mt at y...> wrote:
> > Hi,
> >
> > If you want to be sure that all the input has been parsed, you should finish
> > the main rule with EOF:
> >
> > file : (line)+ EOF ;
> >
> > As a side note, the way you defined the grammar, Comma between records is
> > optional. If you want Comma to be mandatory between records, try:
> >
> > line : rec (COMMA rec)* NEWLINE ;
> > rec : r:RECORD { action ... } ;
> >
> > Cheers,
> > Bogdan
> >
> > --- genericised <trigonometric at s...> wrote:
> > > I created the following parser, as an example of how to
> > > parse comma separated variable (CSV) files:
> > >
> > > class CSVParser extends Parser;
> > > file : (line)+ ;
> > > line : (rec)+ NEWLINE ;
> > > rec : (r:RECORD) (COMMA)?
> > > {System.out.println(r.getText());}
> > > ;
> > >
> > > The corresponding Lexer is:
> > >
> > > class CSVLexer extends Lexer;
> > > options { charVocabulary='\3'..'\377'; }
> > > RECORD : (~(','|'\r'|'\n'|' '|'\t'))+ ;
> > > COMMA : ',' ;
> > > NEWLINE : ( ('\r''\n')=> '\r''\n' //DOS
> > > | '\r' //MAC
> > > | '\n' //UNIX
> > > )
> > > { newline(); } // outside the subrule so it runs for every alternative
> > > ;
> > > WS : (' '|'\t') { $setType(Token.SKIP); } ;
> > >
> > > Pretty straightforward, but when I run this on a
> > > CSV file it produces no error.
> > >
> > > The last line of the CSV file is:
> > >
> > > blah, blah, blah
> > >
> > > so the line does not consist of
> > >
> > > rec+ NEWLINE
> > >
> > > but
> > >
> > > rec+
> > >
> > > When
> > >
> > > match(NEWLINE)
> > >
> > > is called from the parser, why does it not throw
> > > a MismatchedTokenException?
> > >
> > > Or does it throw some kind of exception that is
> > > caught and causes the parsing of the inputstream
> > > to terminate gracefully?
> > >
> > > The parser is invoked from some main file like this:
> > >
> > > csvParser.file();
> > >
> > > I have spent a couple of hours investigating this,
> > > looking through the ANTLR source and so on, but I
> > > have not yet found where this is dealt with.
> > >
> > > I might investigate this over the weekend because of
> > > what I will learn in the process, but at the moment I am
> > > supposed to be writing an ANTLR tutorial and got
> > > sidetracked trying to explain why it is OK that the
> > > parser does not match the final NEWLINE.
> > >
> > > Well, actually, is it OK, or should the rule for file
> > > be defined something like:
> > >
> > > file : (line)+ EOFCHAR;
> > >
> > > Regards
> > >
> > > A Person
> > >
> > >
> > >
> > >
> > > Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/