[antlr-interest] Lexing problem I cannot resolve

Raphael Reitzig r_reitzi at cs.uni-kl.de
Sun Aug 3 05:15:52 PDT 2008


"Gavin Lambert" <antlr at mirality.co.nz> wrote (Sun Aug  3 14:01:32 2008):
> The trouble with doing this kind of thing in the parser is that you  
> no longer have single tokens.  In addition to giving extra work for  
> the tree walker to deal with

I can't see the extra work. If anything, I might save some: if the lexer
gave me a RANGE token with text "5..10", I'd have to split that string
to create a proper AST. My way, I have the components directly at hand.
But how is an imagined tree walker affected by how we create the AST?
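
To illustrate the splitting step a single RANGE token would force on me,
here is a minimal sketch in plain Java. The class and method names
(RangeText, split) are made up for this example and are not part of
ANTLR; it also assumes both bounds are plain decimal integers.

```java
final class RangeText {
    // Hypothetical helper: takes the text of a single RANGE token,
    // e.g. "5..10", and returns its two bounds as {low, high}.
    static int[] split(String text) {
        int dots = text.indexOf("..");
        if (dots < 0) {
            throw new IllegalArgumentException("not a range: " + text);
        }
        int low = Integer.parseInt(text.substring(0, dots));
        int high = Integer.parseInt(text.substring(dots + 2));
        return new int[] { low, high };
    }

    public static void main(String[] args) {
        int[] bounds = split("5..10");
        System.out.println(bounds[0] + " to " + bounds[1]); // prints "5 to 10"
    }
}
```

With INT, the dots, and INT as separate tokens in the parser, none of
this is needed, since both bounds are already available on their own.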

> this also means that any HIDDEN or off-channel tokens produced by  
> the lexer could have been silently inserted.
> This isn't always a bad thing -- after all, it will let you parse  
> "1./*foo*/5" as if it were "1.5" -- but that might turn out to be  
> more confusing than helpful.  And it would similarly parse "1.    5"  
> as "1.5" as well, which is usually less desirable.  (Assuming that  
> comments and whitespace are being hidden.)

This is, of course, true. We don't want another spacecraft to explode in midair.
One could get around this by not ignoring whitespace et al. But that, I
suppose, would lead to far more trouble.
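
Another possibility, which I have not tried, would be to keep whitespace
hidden but check that the tokens touch each other in the input. Below is
a minimal sketch of that check in plain Java. Span and adjacent are
hypothetical names; in an ANTLR 3 parser I would expect the character
indices to come from the tokens themselves (I believe CommonToken offers
getStartIndex() and getStopIndex(), but treat that as an assumption).

```java
final class Adjacency {
    // Character span of one token: index of its first and last
    // character in the input, both inclusive.
    static final class Span {
        final int start;
        final int stop;

        Span(int start, int stop) {
            this.start = start;
            this.stop = stop;
        }
    }

    // True if 'right' begins immediately after 'left' ends, i.e. no
    // hidden whitespace or comment characters lie between them.
    static boolean adjacent(Span left, Span right) {
        return left.stop + 1 == right.start;
    }

    public static void main(String[] args) {
        // "1.5": '1' at index 0, '.' at index 1, '5' at index 2
        Span one = new Span(0, 0), dot = new Span(1, 1), five = new Span(2, 2);
        System.out.println(adjacent(one, dot) && adjacent(dot, five)); // true

        // "1. 5": the '5' is now at index 3, after a hidden space
        Span farFive = new Span(3, 3);
        System.out.println(adjacent(dot, farFive)); // false
    }
}
```

A check like this would let the parser reject "1.    5" and "1./*foo*/5"
while still ignoring whitespace everywhere else.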

I wonder, anyway, why neither

| (a=INT)? ONE_DOT b=INT  -> ^(FLOAT {new CommonToken(FLOAT, $a.text + "." + $b.text)})

nor

| (a=INT)? ONE_DOT b=INT  -> {new CommonTree(new CommonToken(FLOAT, $a.text + "." + $b.text))}

works, given that page 170 of the ANTLR reference proposes this approach.


