[antlr-interest] Understanding priorities in lexing (newbie)
Wincent Colaiuta
win at wincent.com
Thu Jul 12 17:05:31 PDT 2007
On 13/7/2007, at 0:18, Daniel Brosseau wrote:
> Hi,
>
> Love this:
>
>> Well, it does what I expected so it's "correct", just not what
>> you want ;)
>>
>
> Case 1:
> grammar lex;
> KEYWORD : 'a' 'b' 'c';
> OTHER : 'a' | 'b' | 'c';
> program : ( KEYWORD | OTHER )* ;
>
> Input: "aba" chokes on second a
>
> Case 2:
> grammar lex;
> kEYWORD : 'a' 'b' 'c';
> oTHER : 'a' | 'b' | 'c';
> program : ( kEYWORD | oTHER )* ;
>
> Input: "aba" outputs oTHER oTHER oTHER
>
> Same grammar, two different state machines.
>
> As I tried to say earlier, although the rules language used for the
> lexer and parser seems to be describing things in the same manner,
> they in fact describe very different state machines. So at the
> least this is an inconsistency which leads to confusion.
One thing to bear in mind is that lexing and parsing are completely
separate phases in ANTLR. Sure, the parser and lexer run at the same
time, because the parser just keeps saying "give me a token, give me
another token" until all tokens are produced. But there is no
communication from the parser back to the lexer, so conceptually you
can think of them as two completely separate phases.
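So when you take your first lexer, which has two rules (KEYWORD and
OTHER), and morph it into the second grammar, where kEYWORD and oTHER
are parser rules and the lexer is left with nothing but the implicit
token rules generated for the literals 'a', 'b' and 'c', you are
changing it in a fundamental way which completely changes the way it
operates. In Case 1 the decision between KEYWORD and OTHER is made on
characters inside the lexer; in Case 2 it is made on tokens inside the
parser.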
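To make the difference concrete, here is a rough toy model of the two
cases in Python. It is only a sketch: it is not ANTLR's generated code
and does not reproduce its real prediction algorithm, and every
function name in it is made up for illustration. It assumes that the
Case 1 lexer commits to KEYWORD once it has seen 'a' 'b' and does not
fall back to OTHER, which is consistent with the "chokes on second a"
behaviour reported above.

```python
# Toy model of the two grammars; NOT ANTLR's generated code or its real
# prediction algorithm. All names here are hypothetical.

class LexError(Exception):
    """Raised when the toy lexer cannot finish the token it committed to."""


def lex_case1(text):
    """Case 1: KEYWORD and OTHER are both lexer rules.

    Assumption: after seeing 'a' 'b' the lexer commits to KEYWORD and
    does not back off to OTHER.
    """
    i = 0
    while i < len(text):
        if text.startswith("ab", i):
            if not text.startswith("abc", i):
                raise LexError(f"KEYWORD: expected 'c' at offset {i + 2}")
            yield ("KEYWORD", "abc")
            i += 3
        elif text[i] in "abc":
            yield ("OTHER", text[i])
            i += 1
        else:
            raise LexError(f"unexpected character at offset {i}")


def lex_case2(text):
    """Case 2: no explicit lexer rules, so each literal is its own token."""
    for i, ch in enumerate(text):
        if ch not in "abc":
            raise LexError(f"unexpected character at offset {i}")
        yield (ch, ch)


def parse_case1(tokens):
    """program : ( KEYWORD | OTHER )* ; the parser only sees token types."""
    return [kind for kind, _ in tokens]


def parse_case2(tokens):
    """program : ( kEYWORD | oTHER )* ; decided on tokens, not characters."""
    toks = [kind for kind, _ in tokens]  # pull every token the lexer yields
    out, i = [], 0
    while i < len(toks):
        if toks[i:i + 3] == ["a", "b", "c"]:
            out.append("kEYWORD")
            i += 3
        else:
            out.append("oTHER")
            i += 1
    return out


def run(lexer, parser, text):
    """Return the parser's result, or the lexer's error message."""
    try:
        return parser(lexer(text))
    except LexError as err:
        return f"error: {err}"
```

In this model, run(lex_case1, parse_case1, "aba") reports an error at
offset 2 (the second 'a'), while run(lex_case2, parse_case2, "aba")
returns three oTHER entries. The parser functions never tell the lexer
what they expect; they only consume whatever tokens it hands over.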
Cheers,
Wincent