[antlr-interest] pass state from parser to lexer

Mon Jul 2 12:30:28 PDT 2012

Thanks. Yes, here is the form of a statement in the language, which otherwise is context-free:

exec mode <delimiter><body><delimiter>

Statements always start at the beginning of a new line.
<delimiter> is a single character that marks off the <body> text. The opening and closing delimiters must be the same character, and the user can choose any character as the <delimiter>.
The <body> may span multiple lines and contain whitespace, but it cannot contain the <delimiter> character.

Example:
exec mode #Here is
Some body text.
#
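
As a rough illustration of the format only (this is plain Python, not an ANTLR construct), the matching-delimiter constraint can be written as a regular expression with a backreference. The names STATEMENT and find_exec_bodies are hypothetical, and the sketch assumes that one or more spaces or tabs separate "exec mode" from the delimiter and that the delimiter is not a whitespace character.

```python
import re

# Assumption: "exec mode" is followed by spaces/tabs, and the first
# non-whitespace character after them is the delimiter.
STATEMENT = re.compile(
    r"^exec mode[ \t]+"                   # statement starts at the beginning of a line
    r"(?P<delim>\S)"                      # user-chosen single-character delimiter
    r"(?P<body>(?:(?!(?P=delim)).)*)"     # any text that does not contain the delimiter
    r"(?P=delim)",                        # closing delimiter must equal the opening one
    re.MULTILINE | re.DOTALL,             # ^ matches at line starts; . matches newlines
)


def find_exec_bodies(text):
    """Return a list of (delimiter, body) pairs found in text."""
    return [(m.group("delim"), m.group("body")) for m in STATEMENT.finditer(text)]


if __name__ == "__main__":
    sample = "exec mode #Here is\nSome body text.\n#\n"
    print(find_exec_bodies(sample))
```

The backreference is what ties the closing delimiter to the opening one, which is the part a fixed set of lexer rules cannot express directly when the delimiter is chosen at run time.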

If I tokenize the <body> piecemeal, then the rest of the grammar is totally unmanageable. I need <body> as one token.
I could potentially detect this command in the lexer and emit the tokens manually, but that would in effect implement the parser rule for this command in the lexer.
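
A minimal sketch of that idea, written as a standalone Python scanner rather than with any ANTLR API: when "exec mode" appears at the start of a line, the scanner reads the delimiter and emits the whole <body> as one token. The function name tokenize_exec and the token names EXEC_MODE, BODY, and OTHER are hypothetical, and it assumes at least one space or tab between "exec mode" and the delimiter.

```python
def tokenize_exec(text):
    """Return (type, value) tokens, emitting each exec-mode body as one BODY token."""
    keyword = "exec mode"
    tokens = []
    pos = 0
    while pos < len(text):
        at_line_start = pos == 0 or text[pos - 1] == "\n"
        after_kw = pos + len(keyword)
        # Assumption: the keyword must be followed by at least one space or tab.
        is_exec = (
            at_line_start
            and text.startswith(keyword, pos)
            and text[after_kw:after_kw + 1] in (" ", "\t")
        )
        if is_exec:
            tokens.append(("EXEC_MODE", keyword))
            pos = after_kw
            while pos < len(text) and text[pos] in " \t":
                pos += 1                      # skip blanks before the delimiter
            if pos >= len(text):
                raise ValueError("missing delimiter after 'exec mode'")
            delim = text[pos]                 # user-chosen delimiter character
            end = text.find(delim, pos + 1)   # body runs up to the next same character
            if end == -1:
                raise ValueError("unterminated body, expected %r" % delim)
            tokens.append(("BODY", text[pos + 1:end]))
            pos = end + 1
        else:
            # Everything else is passed through one line at a time in this sketch;
            # a real lexer would apply its normal token rules here.
            nl = text.find("\n", pos)
            stop = len(text) if nl == -1 else nl + 1
            tokens.append(("OTHER", text[pos:stop]))
            pos = stop
    return tokens


if __name__ == "__main__":
    for tok in tokenize_exec("exec mode #Here is\nSome body text.\n#\n"):
        print(tok)
```

The scanner keeps the delimiter only as local state while it consumes the body, so the parser sees a single BODY token and needs no knowledge of which character was chosen.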

(I have another, more complicated situation in a different language: segments of text embedded in the main content follow different token rules. I run into this kind of problem almost every day now.)

Thanks for any insights.
Scobie

From: Bart Kiers [mailto:bkiers at gmail.com] 
Sent: Monday, July 02, 2012 12:15 PM
To: Scobie Smith (Insight Global)
Cc: antlr-interest at antlr.org
Subject: Re: [antlr-interest] pass state from parser to lexer

On Mon, Jul 2, 2012 at 9:02 PM, Scobie Smith (Insight Global) <v-scobis at microsoft.com> wrote:
Is there a way to pass state information from the parser to the lexer? I am continually facing situations where the lexer should tokenize differently based on the parser rule. I have seen this question about ANTLR discussed on the web, but so far I haven't seen any solutions.

ANTLR's lexer operates independently from the parser, so the "easy" answer would be: you can't. 
However, there are several ways to make the lexer (a bit) context sensitive. Could you explain what problem you're actually trying to solve here? 

Regards,

Bart.