[antlr-interest] Pls remove my id(mitram_8@yahoo.com), I don't want to be in this list.
raj b
mitram_8 at yahoo.com
Wed May 14 06:43:43 PDT 2003
--- cgodfrey86 <cgodfrey at epnet.com> wrote:
> Hello,
>
> I am trying to write a grammar file which recognizes
> a subset of
> tokens only if in a specific state.
>
> For example AND is recognized as token AND_OP if NOT
> appearing within
> quotes. If appearing within quotes, AND is
> recognized as a PATTERN
> token. I've included the grammar file which I have
> defined. Any
> suggestions as to what I am doing wrong would be
> appreciated.
>
> When I run a test program using the generated lexer,
> tokens are
> recognized properly when appearing in quotes:
>
> "WAR AND PEACE";
> *************************************************
> > lexer mQUOTE; c=="
> < lexer mQUOTE; c==w
> Token: [""",<17>,line=1,col=1]
> Token Type: 17
> Token Text: "
> > lexer mTERM; c==w
> > lexer mALLOWCHARS; c==w
> < lexer mALLOWCHARS; c==a
> > lexer mALLOWCHARS; c==a
> < lexer mALLOWCHARS; c==r
> > lexer mALLOWCHARS; c==r
> < lexer mALLOWCHARS; c==
> < lexer mTERM; c==
> Token: ["WAR",<16>,line=1,col=2]
> Token Type: 16
> Token Text: WAR
> > lexer mWS; c==
> < lexer mWS; c==a
> > lexer mTERM; c==a
> > lexer mALLOWCHARS; c==a
> < lexer mALLOWCHARS; c==n
> > lexer mALLOWCHARS; c==n
> < lexer mALLOWCHARS; c==d
> > lexer mALLOWCHARS; c==d
> < lexer mALLOWCHARS; c==
> < lexer mTERM; c==
> Token: ["AND",<16>,line=1,col=6]
> Token Type: 16
> Token Text: AND
> > lexer mWS; c==
> < lexer mWS; c==p
> > lexer mTERM; c==p
> > lexer mALLOWCHARS; c==p
> < lexer mALLOWCHARS; c==e
> > lexer mALLOWCHARS; c==e
> < lexer mALLOWCHARS; c==a
> > lexer mALLOWCHARS; c==a
> < lexer mALLOWCHARS; c==c
> > lexer mALLOWCHARS; c==c
> < lexer mALLOWCHARS; c==e
> > lexer mALLOWCHARS; c==e
> < lexer mALLOWCHARS; c=="
> < lexer mTERM; c=="
> Token: ["PEACE",<16>,line=1,col=10]
> Token Type: 16
> Token Text: PEACE
> > lexer mQUOTE; c=="
> < lexer mQUOTE; c==;
> Token: [""",<17>,line=1,col=15]
> Token Type: 17
> Token Text: "
> > lexer mSEMI; c==;
> < lexer mSEMI; c==
> Token: [";",<26>,line=1,col=16]
> Token Type: 26
> Token Text: ;
> done lexing...
> *************************************************
>
> When appearing without quotes, tokens are not
> recognized as expected:
> WAR AND PEACE;
> *************************************************
> > lexer mTERM; c==w
> > lexer mWS; c==r
> < lexer mWS; c==r
> < lexer mTERM; c==w
> exception: line 1:1: unexpected char: 'w'
> *************************************************
> AND PEACE;
> *************************************************
> > lexer mTERM; c==a
> < lexer mTERM; c==
> Token: ["AND",<6>,line=1,col=1]
> Token Type: 6
> Token Text: AND
> > lexer mWS; c==
> < lexer mWS; c==p
> > lexer mTERM; c==p
> > lexer mWS; c==a
> < lexer mWS; c==a
> < lexer mTERM; c==p
> exception: line 1:5: unexpected char: 'p'
> *************************************************
>
> options
> {
> language = "CSharp";
> }
>
> class UserLexer extends Lexer;
> options {
> k=3;
> caseSensitive=false;
> caseSensitiveLiterals=false;
> }
>
> tokens {
> S_TAG;
> OR_OP;
> AND_OP;
> NOT_OP;
> GT_OP;
> GE_OP;
> LT_OP;
> LE_OP;
> EQ_OP;
> DASH;
> W_OP;
> N_OP;
> PATTERN;
> }
>
>
> {
>
>
> public bool isQuoted = false;
>
> }
>
>
> QUOTE : '"' {if (this.isQuoted) {this.isQuoted =
> false;} else
> {this.isQuoted = true;} };
>
> OPEN_PAREN : '(';
>
> CLOSE_PAREN : ')';
>
>
> TERM :
> {!this.isQuoted}?
> (
> ("gt")=> "gt"
> {$setType(GT_OP);}
> | (">")=> ">"
> {$setType(GT_OP);}
> |("ge")=> "ge"
> {$setType(GE_OP);}
> |(">=")=> ">="
> {$setType(GE_OP);}
> |("lt")=>"lt"
> {$setType(LT_OP);}
> |("<")=>"<"
> {$setType(LT_OP);}
> |("le")=>"le"
> {$setType(LE_OP);}
> |("<=")=>"<="
> {$setType(LE_OP);}
> |("eq")=>"eq"
> {$setType(EQ_OP);}
> |("=")=>"="
> {$setType(EQ_OP);}
> |("-")=>"-"
> {$setType(DASH);}
> | ("or") => "or"
> {$setType(OR_OP);}
> | ("and") => "and"
> {$setType(AND_OP);}
> | ("not") => "not"
> {$setType(NOT_OP);}
> |(('a'..'z')('a'..'z') WS) => ('a'..'z')('a'..'z')
> {
> $setType(S_TAG);
> }
> | ('w'INT)=>'w'INT
> {$setType(W_OP);}
> | ('n'INT)=>'n'INT
> {$setType(N_OP);}
> )
> |
> (ALLOWCHARS)+
> {$setType(PATTERN);}
> ;
>
>
> protected
> REAL : INT'.'INT;
>
>
=== message truncated ===
__________________________________
Do you Yahoo!?
The New Yahoo! Search - Faster. Easier. Bingo.
http://search.yahoo.com
Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
More information about the antlr-interest
mailing list