[antlr-interest] Conflicting lexer tokens?

Mon Feb 11 20:00:51 PST 2008

You're right - the lexer doesn't have any context. What is this:

Foo

Is that an Identifier or a DomainPart?

Your best bet is probably to allow both, and then signal an error if the 
wrong kind comes in. Allow either underscore or dash (or both) in a 
"Name", and then let the parser disambiguate (and check to make sure 
that there aren't any illegal characters in the Name).
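
As a rough sketch of that first option (in Python rather than ANTLR 
grammar syntax, just to show the shape): the lexer accepts one 
permissive Name, and the check happens later, once the context is known. 
The names NAME, lex_name and check_name are made up for illustration, 
and I'm assuming '-' is the character that is illegal in an Identifier 
and '_' the one that is illegal in a DomainPart.

```python
import re

# Hypothetical permissive token: letters, digits, '_' and '-' are all allowed.
NAME = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def lex_name(text):
    """Lexer side: accept text as a single Name token, with no context."""
    if not NAME.fullmatch(text):
        raise ValueError(f"not a Name: {text!r}")
    return text


def check_name(name, kind):
    """Parser side: reject the character that is illegal for this kind.

    Assumption: '-' is illegal in an identifier, '_' in a domain part.
    """
    illegal = "-" if kind == "identifier" else "_"
    if illegal in name:
        raise ValueError(f"{illegal!r} is not allowed in {kind} {name!r}")
    return name
```

With this, "Foo" passes as either kind, while "my-host" lexes fine as a 
Name but is rejected when the parser expects an identifier.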

Alternatively, you could have the lexer recognize NameParts, Dashes, 
and Underbars as separate tokens, and then assemble them in the parser.
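
A sketch of this second option, again in Python instead of grammar 
syntax. The token names NAME_PART, DASH and UNDERBAR and the helpers 
tokenize and assemble are hypothetical, and the rule shape 
"NAME_PART (separator NAME_PART)*" is my assumption about what each 
kind of name should look like.

```python
import re

# Hypothetical context-free tokens: the lexer never has to pick a "kind".
TOKEN = re.compile(r"(?P<NAME_PART>[A-Za-z0-9]+)|(?P<DASH>-)|(?P<UNDERBAR>_)")


def tokenize(text):
    """Lexer side: split text into (token_type, value) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def assemble(tokens, separator):
    """Parser side: accept NAME_PART (separator NAME_PART)* and join it."""
    kinds = [kind for kind, _ in tokens]
    expected = [
        "NAME_PART" if i % 2 == 0 else separator for i in range(len(kinds))
    ]
    # An even token count means the input is empty or ends on a separator.
    if len(kinds) % 2 == 0 or kinds != expected:
        raise ValueError(f"tokens do not match the {separator} rule: {kinds}")
    return "".join(value for _, value in tokens)
```

Here the parser would call assemble(tokens, "UNDERBAR") where it expects 
an identifier and assemble(tokens, "DASH") where it expects a domain 
part, so a lone "Foo" is valid in both places.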

=Austin

Dan Ellis wrote:
> On 12/02/2008, Dan Ellis <dan at remember.this.name> wrote:
>
>   
>> What have I done wrong?
>>     
>
> I think my problem here is forgetting that even though all the rules
> are in one file, the lexer and the parser are separate, so the lexer
> needs to be able to distinguish between Identifier and DomainPart
> without any clues from the parser part of the grammar.
>
> I see the Java example uses Identifier for DomainPart, which does rule
> out domains with '-' in them and allows ones with '_'. Is there a
> better way to do what I'm trying?
>
>
>