[antlr-interest] Need help with simple grammar

David Holroyd dave at badgers-in-foil.co.uk
Mon Apr 23 03:40:57 PDT 2007


On Mon, Apr 23, 2007 at 08:05:34PM +1200, Gavin Lambert wrote:
> At 08:57 23/04/2007, Johannes Luber wrote:
> >FILE : ID;
> >
> >3. Exchange all token rules with normal grammar rules like:
> >
> >GET : {input.LT(1).getText().equals("get")}? ID;
> 
> Another approach, which avoids the predicate (and so is slightly 
> more "pure" grammar) is to do something like this:
> 
> tokens {
>   GET = 'get';
>   PUT = 'put';
> }
> ...
> keyword: GET | PUT;
> target: ID | keyword;
> 
> This is a bit more work, since every time you add a keyword you 
> not only have to add it as a token but you also have to add it to 
> the keyword rule.  You also need to get the order of lexer token 
> definitions correct, because despite what someone posted last week 
> it does appear to make a difference (at least it did when I tried 
> it).
> 
> Plus, while this will match "get" as a filename, the token type 
> will still be GET, not ID.  If that's important to you, I think 
> you can override that with rewrite rules, but I haven't played 
> with those long enough to give you a concrete example.

I use rewrites like this in my grammar:

ident
	:	IDENT
	|	i=USE -> IDENT[$i]
	|	i=XML -> IDENT[$i]
	|	i=DYNAMIC -> IDENT[$i]
	|	i=NAMESPACE -> IDENT[$i]
	|	i=IS -> IDENT[$i]
	|	i=AS -> IDENT[$i]
	|	i=GET -> IDENT[$i]
	|	i=SET -> IDENT[$i]
	;

I switched to this method from the predicate style,

foo: {input.LT(1).getText().equals("foo")}? IDENT ;

because I found it helped me debug various ambiguity problems in my
grammar. With the predicates, lots of messages saying "could match
input such as IDENT IDENT IDENT using multiple alternatives" were just
confusing for me :)
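
To make the effect of the rewrite concrete, here is a minimal sketch in
plain Java of what the ident rule does to a token: the keyword keeps its
text but comes out with type IDENT. It does not use the ANTLR runtime at
all. The class KeywordAsIdent, the Type enum, the toy whitespace lexer
and the NUMBER token are all hypothetical names made up for this
illustration, not part of ANTLR or of my grammar.

```java
import java.util.ArrayList;
import java.util.List;

public class KeywordAsIdent {
    // Hypothetical token types; GET and SET stand in for the keyword list.
    enum Type { IDENT, GET, SET, NUMBER }

    static final class Token {
        final Type type;
        final String text;
        Token(Type type, String text) { this.type = type; this.text = text; }
    }

    // Toy lexer (an assumption for this sketch): splits on whitespace and
    // gives keywords their own token type, so "get" never lexes as IDENT.
    static List<Token> lex(String input) {
        List<Token> out = new ArrayList<>();
        for (String word : input.trim().split("\\s+")) {
            if (word.isEmpty()) continue;
            if (word.equals("get")) out.add(new Token(Type.GET, word));
            else if (word.equals("set")) out.add(new Token(Type.SET, word));
            else if (word.matches("\\d+")) out.add(new Token(Type.NUMBER, word));
            else out.add(new Token(Type.IDENT, word));
        }
        return out;
    }

    // Hand-written counterpart of the ident rule: accept IDENT or a
    // keyword, and always hand back an IDENT token with the original text.
    static Token ident(Token t) {
        switch (t.type) {
            case IDENT:
                return t;
            case GET:
            case SET:
                return new Token(Type.IDENT, t.text);
            default:
                throw new IllegalArgumentException("not an identifier: " + t.text);
        }
    }

    public static void main(String[] args) {
        for (Token t : lex("get foo set")) {
            Token id = ident(t);
            System.out.println(t.type + " -> " + id.type + " '" + id.text + "'");
        }
    }
}
```

The point of the design is that the parser still sees distinct token
types for the keywords, so its ambiguity warnings name them, while
anything downstream of ident only ever sees IDENT.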

ta,
dave

-- 
http://david.holroyd.me.uk/

