[antlr-interest] Re: C++ Parsers - charVocabulary option
therealtootalltimmy
therealtootalltimmy at yahoo.com
Tue Jan 8 10:38:40 PST 2002
--- In antlr-interest at y..., Ric Klaren <klaren at c...> wrote:
> Hi,
>
> On Mon, Jan 07, 2002 at 09:11:37PM -0000, therealtootalltimmy wrote:
> > I have a simple grammar that just handles comments.
> >
> > When I generate a Java parser and feed it a comment with a copyright symbol
> > in it, it works (does not complain about unexpected tokens).
> >
> > When I generate a C++ parser and feed it a comment with a copyright symbol
> > in it, it complains about an unexpected token.
>
> Is your input file unicode? If so then you're unlucky.
Ric,
Thanks a lot for replying to my question. I failed to mention
that 1) I am parsing ASCII input only and 2) I am running on
Windows 2000.
Here is the grammar that I'm having problems with:
/*
header "post_include_hpp"
{
#include <iostream>
using namespace std;
}
options
{
language="Cpp"; // Generate C++ Code
namespaceAntlr="antlr";
}
*/
class MyParser extends Parser;
foo
: (COMMENT)+
;
class MyLexer extends Lexer;
options {
charVocabulary='\003'..'\377';
}
WS
: ( ' ' | '\t' )+
;
COMMENT
: '\'' (~('\n'|'\r'))* (NEWLINE)?
;
NEWLINE
: ( '\n' | '\r' '\n' )
;
By uncommenting the C++-specific settings I can build a C++ parser.
Here is my input:
' © lll
When I run my C++ parser on this file, I get:
unexpected char: <a character that looks like an upper left corner of
an ASCII box>
Running my Java parser on this file, I get no output.
In the C++ parser's MyLexer::mCOMMENT method, when LA(1) returns the
copyright symbol, the else branch of:
if ((_tokenSet_0.member(LA(1))))
is executed and the loop is exited.
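One possible explanation, which I have not confirmed against the generated code, is that the character is widened through a signed char before the set membership test. The sketch below is not ANTLR code. It assumes the copyright symbol is stored as the single byte 0xA9 (as in Latin-1 or Windows-1252) and uses hypothetical helpers (inVocabulary, widenSigned, widenUnsigned) to show how such a byte can fall outside a '\003'..'\377' range check on a platform where signed char is 8-bit two's complement:

```cpp
#include <cassert>

// Hypothetical stand-in for a vocabulary check over '\003'..'\377',
// i.e. character codes 3 through 255. Not the ANTLR implementation.
bool inVocabulary(int c) {
    return c >= 3 && c <= 255;
}

// Widening through signed char: on the usual 8-bit two's-complement
// platforms, byte 0xA9 becomes the negative value -87.
int widenSigned(unsigned char byte) {
    return static_cast<int>(static_cast<signed char>(byte));
}

// Widening through unsigned char keeps byte 0xA9 as 169.
int widenUnsigned(unsigned char byte) {
    return static_cast<int>(byte);
}
```

With these helpers, inVocabulary(widenSigned(0xA9)) is false while inVocabulary(widenUnsigned(0xA9)) is true, which would match the else branch being taken for the copyright symbol only. Whether the C++ runtime actually widens characters this way is an assumption on my part.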
Thanks again for your help.
Tim