[antlr-interest] pass state from parser to lexer

Benjamin S Wolf jokeserver at gmail.com
Tue Jul 3 09:13:55 PDT 2012


I believe you can also use ~ for negation, e.g.

BODY : '#' ~'#'* '#' ;

(if # is your delimiter, as an example)

or use greedy=false, e.g.

ML_COMMENT : '/*' ( options {greedy=false;} : . )* '*/' ;

The latter is an example from
http://www.antlr.org/wiki/display/ANTLR3/Quick+Starter+on+Parser+Grammars+-+No+Past+Experience+Required
and matches C-style multiline comments.
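
To see what the two rules accept, here is a rough sketch in plain Python
regexes rather than ANTLR. The names body_pattern, ML_COMMENT and first_match
are made up for this illustration. Unlike the BODY rule above, which
hard-codes '#', the sketch builds the pattern from a delimiter chosen at run
time. That is an assumption added here to mirror the original question.

```python
import re

def body_pattern(delimiter):
    """Negated-set form, analogous to  BODY : '#' ~'#'* '#' ;

    The delimiter is escaped because the user may pick a regex
    metacharacter such as '|' or '*'.
    """
    d = re.escape(delimiter)
    # delimiter, then any run of non-delimiter characters (newlines
    # included), then the closing delimiter
    return re.compile(d + "[^" + d + "]*" + d)

# Non-greedy form, analogous to
#   ML_COMMENT : '/*' ( options {greedy=false;} : . )* '*/' ;
# re.DOTALL lets '.' cross line breaks; '*?' stops at the first '*/'.
ML_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)

def first_match(pattern, text):
    """Return the first substring matched by pattern, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None
```

With '#' as the delimiter, first_match(body_pattern("#"), text) picks out the
whole multiline body including both delimiters. A body with no closing
delimiter does not match at all, so the caller has to treat None as an
unterminated statement.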

On Tue, Jul 3, 2012 at 5:22 AM, Bart Kiers <bkiers at gmail.com> wrote:
> On Mon, Jul 2, 2012 at 9:30 PM, Scobie Smith (Insight Global) <v-scobis at microsoft.com> wrote:
>
>> Thanks. Yes, here is the form of a statement in the language, which
>> otherwise is context-free:
>>
>> exec mode <delimiter><body><delimiter>
>>
>> Statements always start at the beginning of a new line.
>> <delimiter> is a single character that marks off the <body> text. The
>> start/end delims match. The user can choose any character to be the
>> <delimiter>.
>> The <body>, though, may be multiline and have whitespace. But it cannot
>> have the <delimiter> character in it.
>>
>
>> Example:
>> exec mode #Here is
>> Some body text.
>> #
>>
>>
> You can match such a token like this:
>
> BODY  : {input.LA(1) == delimiter}?=> . ({input.LA(1) != delimiter}?=> . )* . ;
>
>
> where 'delimiter' is a character field that you declare in the lexer's
> members section and set when instantiating the lexer, like this:
>
> @lexer::members {
>   private char delimiter = '#';
> }
>
>
> Note that in my suggestion above, the 'BODY' might not end with
> 'delimiter', but with an EOF instead. You will need to do an extra check at
> the end of the rule, if necessary.
>
> Regards,
>
> Bart.
