[antlr-interest] Changing/affecting the Lexer from the Parser?

Sat Nov 10 08:11:33 PST 2012

Bernard,

On Sat, Nov 10, 2012 at 10:48 AM, Bernard Kaiflin
<bkaiflin.ruby at gmail.com> wrote:

> I still don't see the relationship between 2 ARR(1:5) ARR(1.2:4) ARR(1.#I:#J)
> and a Python CommonTokenStream. Is it a special version of Natural ? Do
> you have the specifications for this language ?
>

With the existing CommonTokenStream, the 1.2 in ARR(1.2:4) has already
been lexed as a float before the parser starts, long before the parser
reaches that expression. The Python CommonTokenStream bootstraps itself by
tokenizing all of the input on the first call to any of the methods that
return a token.
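
To make the problem concrete, here is a minimal sketch in plain Python of
what "tokenize everything up front" does to that input. It does not use
ANTLR; the token names, the rules, and the tokenize_all helper are all
made up for illustration.

```python
import re

# Hypothetical token rules. FLOAT is tried before INT, as in a typical lexer.
TOKEN_RE = re.compile(r"""
    (?P<FLOAT>\d+\.\d+)
  | (?P<INT>\d+)
  | (?P<NAME>[A-Za-z][A-Za-z0-9]*)
  | (?P<PUNCT>[():.])
  | (?P<WS>\s+)
""", re.VERBOSE)


def tokenize_all(text):
    """Tokenize the whole input up front, with no knowledge of parser state."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":  # whitespace is skipped, not emitted
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


# The lexer has already committed to FLOAT before any parser rule runs.
eager_tokens = tokenize_all("ARR(1.2:4)")
```

By the time a parser sees eager_tokens, the 1.2 is a single FLOAT token,
and nothing the parser knows about being inside an array reference can
change that.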

I built the grammar for Natural from the reference material, which includes
informal, grammar-like descriptions of the syntax.

I think that a language like Ruby requires a parser-guided lexer. I've
built some of those by hand before, and they are very efficient. But
Natural's grammar was too big (~3000 lines) to attempt by hand.
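
As a rough illustration of what "parser-guided" means here, the following
is a minimal hand-written sketch in plain Python. It is not the Natural
grammar and not ANTLR's API; GuidedLexer, parse_reference, and the
allow_float flag are hypothetical names. It also assumes, purely for the
sake of the example, that inside the parentheses 1.2 should be read as
INT '.' INT rather than as a float.

```python
import re


class GuidedLexer:
    """Produces one token per call; the caller says whether floats are allowed."""

    FLOAT = re.compile(r"\d+\.\d+")
    INT = re.compile(r"\d+")
    NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*")

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def next_token(self, allow_float=True):
        # Skip whitespace before the next token.
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1
        if self.pos >= len(self.text):
            return ("EOF", "")
        rules = [("INT", self.INT), ("NAME", self.NAME)]
        if allow_float:
            # FLOAT must be tried first, or INT would match its prefix.
            rules.insert(0, ("FLOAT", self.FLOAT))
        for kind, rx in rules:
            m = rx.match(self.text, self.pos)
            if m:
                self.pos = m.end()
                return (kind, m.group())
        # Anything else is returned as a single punctuation character.
        ch = self.text[self.pos]
        self.pos += 1
        return ("PUNCT", ch)


def parse_reference(text):
    """Toy parser for NAME '(' ... ')': floats are turned off inside the parens."""
    lexer = GuidedLexer(text)
    tokens = [lexer.next_token()]      # the array name
    tokens.append(lexer.next_token())  # the opening '('
    while True:
        # Assumption: in index context, digits and '.' are separate tokens.
        tok = lexer.next_token(allow_float=False)
        tokens.append(tok)
        if tok == ("PUNCT", ")") or tok[0] == "EOF":
            return tokens
```

The point is only that tokens are requested one at a time, so the parser
can tell the lexer which context it is in before each token is produced.
That is exactly what a token stream that tokenizes all input on first use
rules out.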

Cheers,

-- 
Juancarlo Añez