[antlr-interest] trouble with ids and keywords
Bob Marinier
bob.marinier at soartech.com
Sat Feb 7 05:16:34 PST 2009
Gavin Lambert wrote:
> At 11:07 7/02/2009, Bob Marinier wrote:
> >I'm using antlr 2.7.6 and I have a problem with keywords and
> >identifiers conflicting. Specifically, if I have an identifier
> >that starts with a keyword, then the beginning gets picked up
> >as a keyword, as opposed to the whole thing getting recognized
> >as an identifier. For example, one of my keywords is "new". If
> >the input contains "newX", then this gets tokenized as the
> >"new" keyword and an identifier "X", whereas I want just an
> >identifier "newX". That is, I want the identifier rule to
> >be greedy, and only check the literals table after it's read
> >as much as it can.
>
> One of the classic resolutions to this problem is to avoid matching
> the keywords in the lexer at all -- match them all just as IDs in the
> lexer, and then test the text of the ID in the parser to verify
> whether it's a keyword or not. (If you're outputting an AST, you can
> then swap it to a keyword token type, if you want.)
>
The problem is that IDs in my system don't allow dashes, but some
keywords have dashes in them. The lexer's ID rule never matches a
dashed keyword as a single token, so the parser can't recognize it by
testing ID text. It does seem to work if I put all of the keywords with
dashes in the lexer and the ones without dashes in the parser. Having
the keyword list spread over two locations makes me cringe a bit, but
maybe this is the cleanest solution?
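
To make the split concrete, here is a minimal sketch in plain Java
(not ANTLR grammar syntax, and not the actual grammar from this
thread). The class name, the token labels, and both keyword lists are
hypothetical; only "new" comes from the original question. Dashed
keywords are matched directly by the tokenizer, while dash-free
keywords are first read greedily as IDs and only then checked against
a table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class KeywordSplitSketch {
    // Hypothetical keyword lists; the real grammar's keywords are not shown in the thread.
    static final Set<String> DASHED_KEYWORDS =
            new HashSet<>(Arrays.asList("for-each", "end-of-input"));
    static final Set<String> PLAIN_KEYWORDS =
            new HashSet<>(Arrays.asList("new", "for"));

    static boolean isIdStart(char c) { return Character.isLetter(c) || c == '_'; }
    static boolean isIdPart(char c) { return Character.isLetterOrDigit(c) || c == '_'; }

    // Returns tokens as "KIND:text" strings to keep the sketch short.
    static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }

            // "Lexer" side: dashed keywords can never be IDs, so match them directly.
            String dashed = longestDashedKeywordAt(input, i);
            if (dashed != null) {
                out.add("KEYWORD:" + dashed);
                i += dashed.length();
            } else if (isIdStart(c)) {
                // Read the whole identifier greedily first...
                int j = i + 1;
                while (j < input.length() && isIdPart(input.charAt(j))) j++;
                String text = input.substring(i, j);
                // ...then do the "parser" side check: is the full text a keyword?
                out.add((PLAIN_KEYWORDS.contains(text) ? "KEYWORD:" : "ID:") + text);
                i = j;
            } else {
                out.add("CHAR:" + c);
                i++;
            }
        }
        return out;
    }

    // Longest dashed keyword starting at pos that ends on a word boundary, or null.
    static String longestDashedKeywordAt(String input, int pos) {
        String best = null;
        for (String kw : DASHED_KEYWORDS) {
            int end = pos + kw.length();
            boolean boundary = end >= input.length()
                    || !(isIdPart(input.charAt(end)) || input.charAt(end) == '-');
            if (input.startsWith(kw, pos) && boundary
                    && (best == null || kw.length() > best.length())) {
                best = kw;
            }
        }
        return best;
    }
}
```

With these assumed lists, tokenize("newX") returns [ID:newX] and
tokenize("new X") returns [KEYWORD:new, ID:X], which is the greedy
behavior asked for in the original question. The two keyword sets sit
side by side here, but they still play different roles, which is the
same split between lexer and parser described above.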
Thanks,
Bob