[antlr-interest] Changing/affecting the Lexer from the Parser?

Bernard Kaiflin bkaiflin.ruby at gmail.com
Sat Nov 10 07:18:43 PST 2012


Hello Juancarlo,

I am surprised that the whole input can be tokenized up front. In Ruby that
is impossible: seeing a regular expression like /(\.\'\d+)?/, a lexer that
knows nothing about the parser's context would emit a DIV, an LPAR, a MEMBER
and then, on reaching the apostrophe, would match an APOST_STRING and eat the
rest of the file until it encounters another apostrophe.
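
To make the problem concrete, here is a minimal sketch of such a
context-blind lexer in Python. The token names are the ones I used above, but
the patterns and the OTHER catch-all are assumptions made up for the
illustration, not the rules of any real Ruby grammar.

```python
import re

# Hypothetical token rules for a toy lexer. The names follow the message
# above; the patterns are illustrative assumptions only.
TOKEN_RULES = [
    ("APOST_STRING", r"'[^']*'?"),  # runs to the next apostrophe, or to end of input
    ("DIV", r"/"),
    ("LPAR", r"\("),
    ("MEMBER", r"\."),
    ("WS", r"\s+"),
    ("OTHER", r"."),                # anything else, one character at a time
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_RULES),
    re.DOTALL,
)

def tokenize(text):
    """Tokenize with no parser feedback, dropping whitespace tokens."""
    return [
        (m.lastgroup, m.group())
        for m in MASTER.finditer(text)
        if m.lastgroup != "WS"
    ]

# A regular expression literal followed by an unrelated string literal.
source = "x =~ /(\\.\\'\\d+)?/\nputs 'done'\n"

for kind, lexeme in tokenize(source):
    print(kind, repr(lexeme))
```

The apostrophe inside the regular expression opens an APOST_STRING that only
closes at the apostrophe in front of done, so the end of the regex and the
start of the next statement land in a single bogus string token.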

I still don't see the relationship between 2 ARR(1:5) ARR(1.2:4) ARR(1.#I:#J)
and a Python CommonTokenStream. Is it a special version of Natural? Do you
have the specifications for this language?

2012/11/10 Juancarlo Añez <apalala at gmail.com>

> Hello, Bernard,
>
> On Fri, Nov 9, 2012 at 7:23 PM, Bernard Kaiflin <bkaiflin.ruby at gmail.com> wrote:
>
> > No, the lexer answers to nextToken() requests from the parser. Starting at
> > the character position behind the last token consumed, it chooses the rule
> > that matches the most input characters. If the input can match two rules,
> > ANTLR resolves this lexical ambiguity by matching the input string to the
> > rule specified first in the grammar.
> >
>
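
The rule selection described in the quoted paragraph can be sketched in a few
lines of Python: take the longest match, and when two rules match the same
number of characters, keep the one listed first. The rule names and patterns
below are hypothetical, and this is not how ANTLR implements it internally
(it generates a lexer from the grammar); it only mimics the observable
behaviour.

```python
import re

# Hypothetical rules, listed in "grammar order"; earlier rules win ties.
RULES = [
    ("IF", re.compile(r"if")),
    ("ID", re.compile(r"[a-z]+")),
    ("INT", re.compile(r"[0-9]+")),
    ("WS", re.compile(r"\s+")),
]

def next_token(text, pos):
    """Return (rule_name, lexeme) for the longest match starting at pos."""
    best = None
    for name, pattern in RULES:
        m = pattern.match(text, pos)
        # Only a strictly longer match replaces the current best, so on
        # equal length the rule listed first is kept.
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    if best is None:
        raise ValueError(f"no rule matches at position {pos}")
    return best

def tokens(text):
    """Yield non-whitespace tokens, one next_token() call at a time."""
    pos = 0
    while pos < len(text):
        name, lexeme = next_token(text, pos)
        pos += len(lexeme)
        if name != "WS":
            yield name, lexeme

print(list(tokens("if iffy 42")))
```

Here "if" matches both IF and ID with the same length, so IF wins because it
comes first, while "iffy" goes to ID because that match is longer.
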
> The Python implementation of CommonTokenStream lexes all the input in one
> pass storing the tokens in a list that it later indexes to deliver tokens
> to the parser.
>
> To do what I suggested, I would have to write my own token stream, and
> probably resort to the "mark" family of methods to allow the parser to
> backtrack.
>
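
For what it is worth, a token stream that lexes on demand and still supports
backtracking does not need much code. The sketch below is only an assumption
about what such a class could look like: LazyTokenStream and its method names
are hypothetical, loosely inspired by the mark/rewind idea, and are not the
API of the ANTLR Python runtime.

```python
class LazyTokenStream:
    """Pulls tokens from the lexer only when the parser asks for them."""

    def __init__(self, next_token):
        # next_token: callable returning the next token, or None at end of input
        self._next_token = next_token
        self._buffer = []   # tokens fetched so far
        self._index = 0     # position of the next token to deliver

    def _fill(self, upto):
        # Fetch just enough tokens to make buffer[upto] available.
        while len(self._buffer) <= upto:
            self._buffer.append(self._next_token())

    def lt(self, k=1):
        """Look ahead k tokens without consuming them."""
        self._fill(self._index + k - 1)
        return self._buffer[self._index + k - 1]

    def consume(self):
        """Return the current token and advance past it."""
        token = self.lt(1)
        self._index += 1
        return token

    def mark(self):
        """Remember the current position so the parser can come back to it."""
        return self._index

    def rewind(self, marker):
        """Go back to a position previously returned by mark()."""
        self._index = marker


# Usage with a stand-in "lexer" that hands out words one at a time.
words = iter(["a", "b", "c"])
stream = LazyTokenStream(lambda: next(words, None))

first = stream.consume()
marker = stream.mark()
stream.consume()
stream.consume()
stream.rewind(marker)
replayed = stream.consume()
print(first, replayed)
```

Since nothing beyond the requested lookahead has been lexed, the parser could
in principle influence how later tokens are produced. Tokens that are already
in the buffer are replayed after a rewind exactly as they were first lexed.
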
> --
> Juancarlo *Añez*
>