[antlr-interest] Matching a token from only one rule?

Sun Oct 3 17:28:08 PDT 2010

Thanks for the reply, Martin.

I get the lexer/parser separation, and was looking for a way around it
for special cases.  I will try using rewrite rules.
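
For the archive, here is a minimal sketch of how I understand the
suggestion, assuming ANTLR 3 with output=AST.  The grammar name, the ID
character set, the placeholder expr rule, and the WS rule are my own
assumptions and not from this thread.  DECL is declared as an imaginary
token, so the lexer never produces it.

```antlr
grammar Elem;               // hypothetical grammar name

options { output = AST; }   // rewrite rules (->) need AST output

tokens { DECL; }            // imaginary token type: no lexer rule matches it

elem
    :   declaration
    |   assignment
    ;

declaration
    :   ID SEMI -> DECL[$ID]    // new DECL node built from the matched ID token
    ;

assignment
    :   ID EQUAL expr SEMI
    ;

expr                        // placeholder; the real expr rule is not shown in this thread
    :   ID
    ;

ID    : ('a'..'z' | 'A'..'Z' | '_')+ ;   // assumed character set
SEMI  : ';' ;
EQUAL : '=' ;
WS    : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```

Since DECL exists only in the tokens section, it can show up in the tree
only through the declaration rule.  This covers declarations that are a
single ID, not the arbitrary run of characters I originally wanted.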

Thanks,

Ryan Twitchell

On 10/01/2010 02:45 PM, Piper, Martin wrote:
> Tokens are decided by the lexer, without regard to how they are eventually used in parser rules.
> You really can't have tokens defined by what other tokens are around them; that is a parsing concern. So you can't have the lexer recognize a given string of characters as TOKEN1 in one portion of the input and TOKEN2 in another.
> What are the rules for ID?
> If ID is allowed the same characters as DECL, or a subset of them, it will never be matched, because DECL will match first.
>
> If they both allow the same characters, have one token definition and let the parser rules decide how that token is used. If in the end you want different token names, you can use rewrite rules to make that happen.
>
> elem
>     : declaration
>     | assignment
>     ;
> declaration
>     : ID ';' -> DECL[$ID]    // $ID refers to the matched ID token
>     ;
> assignment
>     : ID '=' expr ';'
>     ;
>
> Also, I'd recommend giving ';' and '=' their own named tokens:
>
> SEMI: ';' ;
> EQUAL: '=' ;
>
>
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Ryan Twitchell
> Sent: Monday, September 27, 2010 7:40 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Matching a token from only one rule?
>
> Hi all,
>
> At the start of one parser rule I would like, as one alternative, to
> match nearly any input ending before a certain character value.  I would
> like this to match as a single token if possible.  I am not sure how to
> achieve this, and have tried a number of things.  Here is my best shot
> so far:
>
> elem
>     :    DECL ';'
>     |    ID '=' expr ';'
>     ;
>
> DECL: (DECL_CHAR+ ';') => DECL_CHAR+
>     ;
>
> fragment
> DECL_CHAR
>     :    ~(';'|'=')
>     ;
>
> Working with the above, ANTLR reports that tokens such as ID can never
> be matched, since DECL matches them already.  I had not thought this
> would be the case with a syntactic predicate in front of the alternative.
>
>
> So far, I have only had success by incorporating the end character into
> the token, as follows.  But I believe this will lead to the token
> matching in other, unexpected places.
>
> DECL:  DECL_CHAR+ ';'
>     ;
>
> The important problem is that I don't want DECL to match at other parts
> of the grammar. 
>
> TIA for any advice,
>
> Ryan Twitchell