[antlr-interest] Lexer not putting colon back
Sriram Durbha
cintyram at yahoo.com
Fri Nov 15 06:49:42 PST 2002
Hi Lucas,
I have a similar problem.
The reason, I guess, is that ANTLR generates LL(k) recognizers: the lexer
reads characters from left to right and commits to an alternative at the
earliest possible opportunity, without backtracking on its own. So once it
sees the ':' after an NCName, it commits to the optional (':' NCName) part
of QName.
With ANTLR we should be able to use a syntactic predicate to make sure that
the ':' is matched as part of the QName only if it is followed by an NCName;
otherwise the lexer should take the second alternative and leave the ':' for
the next token.
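Here is a rough, untested sketch of what I mean. It assumes the ANTLR 2
syntactic predicate syntax "( ... ) =>" and reuses the rule names from your
grammar. Using ':' Letter as the predicate is my own assumption; it should be
enough to tell "x:z" apart from "x:=".

```antlr
// Sketch only: QName consumes ':' only when a Letter follows it.
protected QName
    : NCName
      ( ( ':' Letter ) => ':' NCName   // "x:z" is matched as one QName
      |                                // "x:=" leaves ':' for ASSIGN
      )
    ;
```

With this change, "let $x:= $y" should give QNAME for "x" and then ASSIGN for
":=", but I have not run it against your grammar.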
Hope this helps.
Cheers,
Ram
--- "Paul J. Lucas" <dude at darkfigure.org> wrote:
> Assume I want to parse a statement of the form:
>
> let $x := $y
>
> or:
>
> LET DOLLAR QNAME ASSIGN DOLLAR QNAME
>
> where the lexer is defined as:
>
> tokens { LET; QNAME; }
>
> protected Digit : '0'..'9' ;
> protected Letter : 'A'..'Z' | 'a'..'z' | '_' ;
> protected NCName : Letter (NCNameChar)* ;
> protected NCNameChar : Letter | Digit | '.' | '-' ;
> protected QName : NCName (':' NCName)? ;
> protected WhiteSpace : ' ' | '\t' | '\r' | '\n' ;
>
> ASSIGN : ":=" ;
> DOLLAR : '$' ;
> EQUAL : '=' ;
> S : (WhiteSpace)+ { $setType( Token.SKIP ); } ;
>
> Keywords
> : "let" { $setType( LET ); }
> | QName { $setType( QNAME ); }
> ;
>
> This works fine as given above. But if I remove the whitespace
> after the $x like:
>
> let $x:= $y
>
> Then it gets it wrong. An excerpt of the trace output is:
>
> > lexer mKeywords; c==x
> > lexer mQName; c==x
> > lexer mNCName; c==x
> > lexer mLetter; c==x
> < lexer mLetter; c==:
> < lexer mNCName; c==:
> > lexer mNCName; c===
> > lexer mLetter; c===
> < lexer mLetter; c===
> < lexer mNCName; c===
> < lexer mQName; c===
> < lexer mKeywords; c===
> < varRef; > lexer mEQUAL; c===
> < lexer mEQUAL; c==1
> LA(1)===
> < startRule; LA(1)===
> exception: line 1:8: unexpected char: '='
>
> When it encounters the ':', it tries to make it part of a
> QName, e.g, "x:z"; but since the next character is an '=', it
> can't do that. What it SHOULD do is put the ':' back, return
> 'x' as the QNAME, then pick up with ':' as part of ":=". But
> it doesn't. Why not? And how can I fix this so that it
> correctly returns the right tokens regardless of whether
> whitespace is there?
>
> - Paul