[antlr-interest] Lexer rule question
Jim Idle
jimi at temporal-wave.com
Fri Feb 8 10:34:44 PST 2008
> -----Original Message-----
> From: Johannes Luber [mailto:jaluber at gmx.de]
> Sent: Friday, February 08, 2008 8:10 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Lexer rule question
>
> Hi!
>
> I have never needed to know the answer before now, but what is the
> actual difference between:
>
> A : B ;
>
> B : 'B' ;
>
> and
>
> A : B ;
>
> fragment B : 'B' ;
In the first instance, you will get an error that B is unreachable. ANTLR
sees the non-fragment rule A first, and A calls B. Because B is not a
fragment, ANTLR tries to produce a token match for it as well as for A,
and finds that the spec for both A and B is exactly the same.
In the second instance, B is a fragment and so ANTLR knows not to try to
produce a real token B, as it is just a rule that is called by other
lexer token definitions. Hence there is only a spec for the token A,
which just calls the rule B.
Every rule produces a single token only, but may call other rules, whether
fragment rules or not, as part of its spec. However, if you don't use
the fragment modifier, then the lexer will try to produce a token for
that rule on its own, as well as for the other rules that use it in
combination.
So, basically, if your rule is just something for another rule to match
with, such as DIGIT, then use fragment and the lexer will not generate
code that matches and produces the token DIGIT. Always use fragment if
the parser is not expecting a token with that lexer rule's name.
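
As a rough illustration of the DIGIT case, here is a minimal sketch of an
ANTLR 3 lexer grammar. The grammar name and the INT, ID and LETTER rules are
made up for the example; only the use of fragment comes from the discussion
above.

```antlr
lexer grammar FragmentDemo;   // hypothetical grammar name

// Real tokens: the parser can refer to INT and ID.
INT : DIGIT+ ;
ID  : LETTER (LETTER | DIGIT)* ;

// Helper rules: only callable from other lexer rules, so the lexer
// never produces a DIGIT or LETTER token on its own.
fragment DIGIT  : '0'..'9' ;
fragment LETTER : 'a'..'z' | 'A'..'Z' | '_' ;
```

Without the fragment modifier on DIGIT, a single digit could be matched by
both INT and DIGIT, which is the same kind of clash as the A/B case in the
question.
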
To produce multiple tokens from one production you have to start
deriving the token stream and storing the tokens produced in a List that
you can consume/add to the token list (see source code comments here).
That would be an overhead that most lexers don't need, so it isn't the
default. There are few occasions where the only solution is to produce
two tokens from one lexer rule; it does happen, but I have always managed
to find another way.
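
For what it's worth, the list-based approach might look roughly like the
sketch below. It assumes the Java target, a combined grammar, and ANTLR 3.1
or later, where the current token lives in state.token (in 3.0.x I believe
it is a plain token field on the lexer, so adjust accordingly). The name
pending is made up for the example, and this illustrates the idea rather
than reproducing code from the ANTLR sources.

```antlr
// In a pure lexer grammar these sections would be @header and @members.
@lexer::header {
import java.util.ArrayList;
import java.util.List;
}

@lexer::members {
    // Hypothetical queue of tokens emitted but not yet handed to the parser.
    private final List<Token> pending = new ArrayList<Token>();

    // Queue every emitted token instead of keeping only the latest one.
    @Override
    public void emit(Token token) {
        state.token = token;   // assumption: ANTLR 3.1+; 3.0.x uses a 'token' field
        pending.add(token);
    }

    // Drain the queue before asking the generated lexer to match again.
    @Override
    public Token nextToken() {
        if (pending.isEmpty()) {
            Token t = super.nextToken();   // rule actions may call emit() several times
            if (pending.isEmpty()) {
                return t;                  // nothing was queued, e.g. at end of input
            }
        }
        return pending.remove(0);
    }
}
```

With something like this in place, an action in a lexer rule can call
emit(...) more than once and each token is delivered to the parser in
order.
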
Jim