[antlr-interest] Re: lexer labels in v3

Terence Parr parrt at cs.usfca.edu
Sat Sep 24 17:04:46 PDT 2005


Oliver, nice work.  Labels on lexer fragment rule refs now work great:

lexer grammar T;

A : 'a' x=B y="void" z='x' ;

fragment
B : 'b' ;

It still works, but is slightly inefficient when B is not a fragment.

I added similar support for char and string refs.  Note that a label
on a string should be a token label:

x="void"

should define x as a Token, right?  Whereas

c='x'

has label c as an int?

Another difference is that lexer labels will be unique to the
alternative, as they can have different types.  In parser/tree parser
rules, everything is one type (Token or AST), so I collect the labels and
put them at the start of the rule so you can ref them like this:

( x=A | x=B )
{print x;}

pretty handy.  Are you ok with the inconsistency?  It's rare to label  
stuff in the lexer so this is probably ok.
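
To make the scoping difference concrete, here is a rough standalone Java
sketch.  It is not generated code and does not use the ANTLR runtime; the
class and method names are made up for illustration, and a plain String
stands in for Token.

```java
// Hypothetical illustration only; nothing here is ANTLR API.
final class LabelScopeSketch {

    // Parser style for ( x=A | x=B ) {print x;}: every label has the same
    // type, so one declaration at the top of the rule serves both alternatives.
    static String parserRule(String[] tokens) {
        String x = null;                        // hoisted label (stand-in for Token x)
        if (tokens[0].equals("A")) {
            x = tokens[0];                      // alternative 1 assigns x
        } else if (tokens[0].equals("B")) {
            x = tokens[0];                      // alternative 2 assigns the same x
        } else {
            throw new IllegalArgumentException("expected A or B");
        }
        return "print " + x;                    // action after the subrule can see x
    }

    // Lexer style: the same label name can need a different type in each
    // alternative, so each alternative declares its own local.
    static String lexerRule(String input) {
        if (input.startsWith("void")) {
            String x = input.substring(0, 4);   // x="void": token-like label
            return "token:" + x;
        } else {
            int x = input.charAt(0);            // x='x': char label held as an int
            return "char:" + x;
        }
    }
}
```

Since the two x declarations in lexerRule live in sibling blocks, nothing
after the subrule can refer to x, which is the inconsistency in question.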

The code block for A is (comment auto generated):

         // t.g:3:5: 'a' x= B y= "void" z= 'x'
         {
         match('a');
         int xStart = getCharIndex();
         mB();
         Token x = new CommonToken(input, Token.INVALID_TOKEN_TYPE,
                                   Token.DEFAULT_CHANNEL, xStart, getCharIndex()-1);

         int yStart = getCharIndex();
         match("void");
         Token y = new CommonToken(input, Token.INVALID_TOKEN_TYPE,
                                   Token.DEFAULT_CHANNEL, yStart, getCharIndex()-1);

         int z = input.LA(1);
         match('x');
         }
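
For anyone who wants to poke at the start/stop bookkeeping without the
runtime, here is a minimal standalone sketch of the same pattern.
MiniLexer, SpanToken and LA1() are hypothetical stand-ins, not the real
Lexer, CommonToken or input.LA(1); only the shape of mA() follows the
generated block for A.

```java
// Hypothetical stand-ins; not the ANTLR v3 runtime classes.
final class SpanToken {
    final int start;  // index of first char, inclusive
    final int stop;   // index of last char, inclusive
    SpanToken(int start, int stop) { this.start = start; this.stop = stop; }
}

final class MiniLexer {
    static final int EOF = -1;
    private final String input;
    private int p = 0;              // index of the next unread char

    SpanToken x, y;                 // token-style labels
    int z;                          // char-style label

    MiniLexer(String input) { this.input = input; }

    int getCharIndex() { return p; }

    // Peek at the next char without consuming it.
    int LA1() { return p < input.length() ? input.charAt(p) : EOF; }

    void match(char c) {
        if (LA1() != c) throw new IllegalStateException("expected '" + c + "' at " + p);
        p++;
    }

    void match(String s) {
        for (int i = 0; i < s.length(); i++) match(s.charAt(i));
    }

    String text(SpanToken t) { return input.substring(t.start, t.stop + 1); }

    // Mirrors: fragment B : 'b' ;
    void mB() { match('b'); }

    // Mirrors: A : 'a' x=B y="void" z='x' ;
    void mA() {
        match('a');
        int xStart = getCharIndex();            // remember where B begins
        mB();
        x = new SpanToken(xStart, getCharIndex() - 1);

        int yStart = getCharIndex();            // remember where "void" begins
        match("void");
        y = new SpanToken(yStart, getCharIndex() - 1);

        z = LA1();                              // grab the char before consuming it
        match('x');
    }
}
```

Running mA() over "abvoidx" should leave x covering just the 'b', y
covering "void", and z holding the char code of 'x'.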

Oh, wildcard labels work too :)

Ter

On Sep 12, 2005, at 6:39 AM, Oliver Zeigermann wrote:

> Modifying template lexerRuleRef in Java.stg
> (org/antlr/codegen/templates/) to this
>
> lexerRuleRef(label,rule,args) ::= <<
> <if(label)>
> int <label>Start = getCharIndex();<\n>
> m<rule>(<args>);<\n>
> Token <label> = new CommonToken(input, Token.INVALID_TOKEN_TYPE,
> Token.DEFAULT_CHANNEL, <label>Start, getCharIndex()-1);<\n>
> <else>
> m<rule>(<args>);<\n>
> <endif>
>
>>>
>>>
>
> allowed me to have labeled lexer fragments like
>
> name=GENERIC_ID
>
> ...
>
> fragment GENERIC_ID     : ... ;
>
> Oliver
>
>
> 2005/9/12, Oliver Zeigermann <oliver.zeigermann at gmail.com>:
>
>> In codegen.g, rule atom the attribute "label" is set for
>> lexerStringRef, charRef and lexerRuleRef while there is no attribute
>> label for these rules. This causes an error upon generation.
>>
>> Obvious solution: either remove code that sets "label" (already done
>> at certain parts of rule atom) or add attribute "label" to the
>> templates.
>>
>> Oliver
>>
>

--
CS Professor & Grad Director, University of San Francisco
Creator, ANTLR Parser Generator, http://www.antlr.org
Cofounder, http://www.jguru.com


