[antlr-interest] Reuse of same token in multiple lexer modes

Gerald Rosenberg gerald at certiv.net
Wed Oct 10 18:26:53 PDT 2012


Do you really need to recursively pushMode(PAR) when you are already in 
PAR mode?  Or, do you just need to match parens to know when to exit PAR 
mode?

If the latter, you should be able to use a redundant string literal 
(TDAR4, pg 275):

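// cnt = number of nested '(' still open inside PAR mode
@members { int cnt = 0; }

// default mode
OPAR : '(' -> pushMode(PAR) ;
CPAR : ')' ;

mode PAR;
OTHINGY  : '(' { cnt++; } ;
// the ')' that balances the '(' which entered PAR: emit as CPAR, leave the mode
PAR_EXIT : ')' { cnt == 0 }? -> type(CPAR), popMode ;
// any other ')' closes a nested '('
CTHINGY  : ')' { cnt--; } ;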

Lexer actions are limited in where they can appear.  Coding is from 
memory, but should be close.
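
To make the counting idea easier to check without running the ANTLR tool, here is a rough hand-written Java sketch of the same logic. It is not generated lexer code: the class name ParenDepthSketch and the method tokenize are made up for illustration, the mode stack is reduced to a single boolean because the default mode only ever pushes PAR once, and every character other than '(' and ')' is simply ignored.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written stand-in for the two-mode lexer idea; not generated by ANTLR. */
public class ParenDepthSketch {

    /** Returns the token type name chosen for each '(' and ')' in the input. */
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        boolean inPar = false; // false = default mode, true = PAR mode
        int cnt = 0;           // nested '(' still open inside PAR mode

        for (char c : input.toCharArray()) {
            if (c == '(') {
                if (!inPar) {
                    tokens.add("OPAR");     // default mode: pushMode(PAR)
                    inPar = true;
                } else {
                    tokens.add("OTHINGY");  // PAR mode: just count the nesting
                    cnt++;
                }
            } else if (c == ')') {
                if (!inPar) {
                    tokens.add("CPAR");     // ')' seen in default mode
                } else if (cnt == 0) {
                    tokens.add("CPAR");     // type(CPAR), popMode
                    inPar = false;
                } else {
                    tokens.add("CTHINGY");  // closes a nested '('
                    cnt--;
                }
            }
            // all other characters are ignored in this sketch
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [OPAR, OTHINGY, CTHINGY, CPAR]
        System.out.println(tokenize("(a(b)c)d"));
    }
}
```

The point of the counter is that PAR mode is entered once and left once, so the parser still sees a plain OPAR ... CPAR pair around the island, while the inner parens get their own token types.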



On 10/10/2012 5:32 AM, Kai Burjack (HBT) wrote:
> Hey Terence,
>
> Thanks. I did not know about the 'type' command. But it clutters up my grammar heavily.
>
> What would be totally awesome, I guess, would be the possibility to specify the tokens once with their definitions at the start of the lexer grammar and then enumerate the token names under all modes in which they should be active/recognized together with their respective commands, such as pushMode, popMode, skip, etc...
>
> Just a suggestion for ANTLR v5, though ;-)
>
> -----Original Message-----
> From: Terence Parr [mailto:parrt at cs.usfca.edu]
> Sent: Tue 09.10.2012 19:23
> To: Kai Burjack (HBT)
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Reuse of same token in multiple lexer modes
>   
> Hi. Since the parser needs each token to have a unique token type, ANTLR does not allow you to redefine token rules. You can, of course, use a lexer command to change the token type of something after you match it, with -> type(OPEN_PAREN).
>
>   I should also note that it sounds like what you really want is a recursive lexer rule, given that you are doing a push in the lexer mode as well.
> Ter
> On Oct 9, 2012, at 5:30 AM, Kai Burjack (HBT) wrote:
>
>> Hello Terence,
>>
>> first of all, many thanks for ANTLR and ANTLRv4 in particular, with its (among other things) greatly improved error reporting!
>>
>> I have a question about lexer modes. I want to write an "island grammar", as it is called in your ANTLR4 beta2 book, and have found that it does not seem to be possible to reference the same token rule in multiple lexer modes.
>>
>> Simplified example grammar:
>>
>> lexer grammar MyLexer;
>>
>> OPEN_PAREN : '(' -> pushMode(PAR) ;
>>
>> mode PAR ;
>>
>> OPEN_PAREN : '(' -> pushMode(PAR) ;
>> CLOSE_PAREN : ')' -> popMode ;
>> ...other tokens that are otherwise not allowed outside of parentheses...
>>
>> What I am trying to do there is to "know" when I am inside parentheses in order to allow more tokens (such as '>' for FreeMarker template language style) that are otherwise not allowed outside of that mode.
>>
>> The "Tool" runs through this grammar fine, but the generated Java code contains errors due to non-existing identifier "PAR".
>>
>> Can you help me on this one, please?
>>
>> Thanks.
>>
>> Best Regards,
>> Kai
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address



